exprecs, making json usable.

Erlang and Syntax

Many flames have been ignited over Erlang’s syntax. Erlang as a system is exceptional: easy concurrency, clustering and OTP design principles, to name just a few. However, its syntax leaves a lot to be desired.

There are minor annoyances, like the comma (“,”), period (“.”) and semicolon (“;”), which make diffs larger and harder to read than they should be, and cause irritating compile errors after a quick edit.

A simple contrived example:

add(First, Second) ->
    Result = First + Second.

Now if you want to store the result:

add(First, Second) ->
    Result = First + Second,
    store_and_return(Result).

And now there is a two-line diff, instead of one:

--- add1.erl    2011-09-20 11:19:18.000000000 -0400
+++ add2.erl    2011-09-20 11:20:34.000000000 -0400
@@ -1,2 +1,3 @@
 add(First, Second) ->
-    Result = First + Second.
+    Result = First + Second,
+    store_and_return(Result).

This is a minor nuisance, but the number of times I have forgotten to change a period to a comma approaches infinity.

Records

While I lack a rigorous statistical analysis, you would be hard pressed to find an erlang programmer who enjoys records. Records are essentially syntactic sugar on top of tagged tuples. This sugar does not taste good.

Defining records is easy and straightforward:

-record(poetry, {
          style  :: string(),
          line   :: string(),
          author :: string()
          }).
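To make the "tagged tuple" claim concrete, here is a minimal sketch. The module name tagged_tuple_demo and the untyped, shortened record are assumptions made only for this illustration.

```erlang
-module(tagged_tuple_demo).
-export([same_term/0, field_names/0]).

-record(poetry, {style, line, author}).

%% A record is a plain tuple whose first element is the record name.
same_term() ->
    Poem = #poetry{style = "vogon",
                   line = "Oh freddled gruntbuggly",
                   author = "Jeltz"},
    Poem =:= {poetry, "vogon", "Oh freddled gruntbuggly", "Jeltz"}.

%% Field names exist only at compile time. record_info/2 is expanded
%% by the compiler, so the record name must be a literal atom.
field_names() ->
    record_info(fields, poetry).
```

same_term() returns true and field_names() returns [style,line,author]. Nothing in the tuple itself carries the field names at runtime.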

However, using them is another story.

vogon_example() ->
    #poetry{style = "vogon",
            line = "Oh freddled gruntbuggly/thy micturations are to me/As plurdled gabbleblotchits on a lurgid bee.",
            author = "Jeltz"
            }.

echo_poem(Poem = #poetry{}) ->
    io:format("~s~nby ~s",[Poem#poetry.line,Poem#poetry.author]).

The need to specify the record type on a variable before using the element accessor can lead to some fairly ugly code:

contrived(Collection) ->
    %% in R14 you do not need the parens
    (Collection#collection.vogon)#poetry.author.

If the need to specify the record type were removed, you could write:

contrived(Collection) ->
    Collection.vogon.author.

Which looks much cleaner. However, ugly syntax is a trivial annoyance and is primarily a subjective aesthetic concern.

The need to specify the record type is a more pragmatic problem. Writing generic code that consumes records conforming to a pattern or interface is impossible.

While it is true that erlang:element/2 can be used to access records as tuples, the usability of named fields is lost. If the record definition is changed, your code that uses erlang:element/2 may break in interesting ways. (note: that is not a Good Thing[tm])
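A small sketch of that breakage, under stated assumptions: the module element_demo, the extra year field and its value are all hypothetical, standing in for "someone changed the record definition".

```erlang
-module(element_demo).
-export([author_by_position/1, author_by_name/1, demo/0]).

%% Hypothetical: the poetry record after a 'year' field was inserted
%% ahead of 'author'.
-record(poetry, {style, line, year, author}).

%% Written against the old three-field layout, where author was the
%% fourth tuple element (the record tag is element 1).
author_by_position(Poem) ->
    element(4, Poem).

%% #poetry.author is the field index, resolved at compile time, so it
%% follows the definition. It still needs the record name, though.
author_by_name(Poem) ->
    element(#poetry.author, Poem).

demo() ->
    Poem = #poetry{style = "vogon",
                   line = "Oh freddled gruntbuggly",
                   year = 1978,
                   author = "Jeltz"},
    {author_by_position(Poem), author_by_name(Poem)}.
```

demo() returns {1978,"Jeltz"}: the positional version keeps compiling and quietly hands back the wrong field.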

exprecs to the rescue

I stumbled onto an interesting chunk of code by Ulf Wiger called exprecs. Exprecs is a parse transform that allows you to work magic, freeing your code from the constraints of the erlang record while still maintaining the benefits derived from named fields.

At this point it may be beneficial for you to read over the exprecs edoc. To make things simple I have generated the edoc html. Go have a quick read, I will wait.

While it doesn’t remove the need to scatter # all over your code, exprecs does let you treat records as discoverable and fungible entities. This opens the door to more generic and reusable code.

The Problem

I have written a lot of erlang code dealing with json, primarily in REST interfaces built with webmachine. The json validation and parsing into erlang terms is made extremely easy thanks to mochijson2. mochijson2 has a lot of benefits: roundtrips are consistent, json strings are parsed to erlang binaries and a single line of code is generally all you need.

However, I do find the resulting erlang terms produced by mochijson2 to be confusing and difficult to remember. Recursing down a proplist tree is not my favorite activity and one can easily get lost in large data structures. This makes changing the data structure difficult, error prone and tedious, even with good unit tests.
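For readers who have not seen those terms, here is a small sketch. The decoded term is written out by hand in the {struct, Proplist} shape mochijson2:decode/1 produces, so the example has no dependencies. The module name, the JSON document and the helper function are assumptions for illustration.

```erlang
-module(struct_walk_demo).
-export([decoded/0, author_of_first_poem/1]).

%% The decoded form of
%% {"collection": {"poems": [{"author": "Jeltz", "style": "vogon"}]}}
decoded() ->
    {struct, [{<<"collection">>,
               {struct, [{<<"poems">>,
                          [{struct, [{<<"author">>, <<"Jeltz">>},
                                     {<<"style">>, <<"vogon">>}]}]}]}}]}.

%% Every level has to be unwrapped by hand, and every key is a binary.
author_of_first_poem({struct, Top}) ->
    {struct, Collection} = proplists:get_value(<<"collection">>, Top),
    [{struct, Poem} | _] = proplists:get_value(<<"poems">>, Collection),
    proplists:get_value(<<"author">>, Poem).
```

Three levels of nesting already take three unwrapping steps, and any rename in the json means touching each of them.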

The appropriate representation of large or complex data structures in erlang is a record. Due to the problems outlined above, abstracting out the mochijson2 code to create a generic json to record parser is impossible.

This means that I found myself writing json to record parsers frequently. I was violating DRY and becoming more and more frustrated.
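A hand-written parser of that kind looks roughly like the sketch below. The module and function names are hypothetical, and the record is the small poetry record from earlier without its type annotations. Note the literal record name in record_info/2: that is exactly what keeps the function from being reused for any other record.

```erlang
-module(hand_parser_sketch).
-export([poetry_from_json/1, demo/0]).

-record(poetry, {style, line, author}).

%% Fill a #poetry{} from a decoded json object, matching keys to
%% fields by name. Works for this one record only.
poetry_from_json({struct, Pairs}) ->
    Fields = record_info(fields, poetry),
    lists:foldl(fun({Key, Value}, Acc) ->
                        set_field(Key, Value, Fields, 2, Acc)
                end, #poetry{}, Pairs).

%% Walk the field list; Pos tracks the tuple slot (slot 1 is the tag).
%% Keys with no matching field are ignored.
set_field(_Key, _Value, [], _Pos, Rec) ->
    Rec;
set_field(Key, Value, [Field | Rest], Pos, Rec) ->
    case atom_to_binary(Field, utf8) of
        Key -> setelement(Pos, Rec, Value);
        _ -> set_field(Key, Value, Rest, Pos + 1, Rec)
    end.

demo() ->
    poetry_from_json({struct, [{<<"style">>, <<"vogon">>},
                               {<<"author">>, <<"Jeltz">>}]}).
```

demo() returns {poetry,<<"vogon">>,undefined,<<"Jeltz">>}. Comparing keys against atom_to_binary/2 rather than converting keys to atoms is a deliberate choice here, so that untrusted json cannot create new atoms.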

The Solution

Thanks to the awesomeness that is exprecs, I was able to write a module that would take the erlang terms produced by mochijson2:decode/1 and transform them into a record. The code can even roundtrip from a record to json.

I no longer have to write yet another proplist walker in order to get json into mnesia. I am quite excited about this.

The json_rec.erl module exports two functions: to_rec/3 and to_json/2. The following example illustrates the interface:

store_vogon_json(HttpBody) ->
    Json = mochijson2:decode(HttpBody),
    Record = json_rec:to_rec(Json, vogon_model, vogon_model:new(<<"poetry">>)),
    vogon_model:create(Record).

get_vogon(Author) ->
    Record = vogon_model:read(Author),
    json_rec:to_json(Record, vogon_model).
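The record-to-json direction can be sketched by hand as well. This is not how json_rec is implemented, only an illustration of the idea for a single flat record. The module name and the to_struct/1 helper are assumptions, and the result is a {struct, Proplist} term of the kind mochijson2:encode/1 accepts.

```erlang
-module(to_json_sketch).
-export([to_struct/1, demo/0]).

-record(poetry, {style, line, author}).

%% Pair each field name (as a binary key) with its value.
to_struct(#poetry{} = Poem) ->
    Keys = [atom_to_binary(F, utf8) || F <- record_info(fields, poetry)],
    [poetry | Values] = tuple_to_list(Poem),
    {struct, lists:zip(Keys, Values)}.

demo() ->
    to_struct(#poetry{style = <<"vogon">>,
                      line = <<"Oh freddled gruntbuggly">>,
                      author = <<"Jeltz">>}).
```

Again the record name is hard-coded twice, once in the pattern and once in record_info/2, so each new record needs its own copy of this function.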

exprecs Explained

In order to give some example usage of exprecs, I am going to provide lots of contrived examples. If you want a real world use case, see the code for json_rec.

We have two modules, poetry.erl and book.erl, that each have their own record, defined in poetry.hrl and book.hrl respectively.

%% include/poetry.hrl
-record(poetry, {
          style      :: atom(),
          excerpt    :: string(),
          author     :: string(),
          available  :: boolean(),
          count      :: integer()
         }).

%% include/book.hrl
-record(book, {
          style      :: atom(),
          count      :: integer(),
          available  :: boolean(),
          pages      :: integer(),
          excerpt    :: string(),
          author     :: string()
         }).

Now you have a massive inefficient database of 100 book records and 100 poetry records. Someone has just snuck in and stolen your entire library and you are pedantic enough to want to update this fact.

Since the two records have a different number of fields and they are in a different order, using element/2 is not an option. This is where exprecs comes in.

Housekeeping

First some basic housekeeping. The record needs to be ‘exported’ from a module:

-module(poetry).

%% include the record definition or put it inline
-include("poetry.hrl").

-compile({parse_transform, exprecs}).
-export_records([poetry]).

To make the above -compile work, exprecs.erl needs to be in your erlang code path. For simplicity I have put exprecs.erl in a basic erlang app that all our erlang projects depend on; that way I am certain to have it available. (I need a better way to do this besides having a ‘utils’/‘misc’ app.)

-export_records is a module attribute read by the exprecs parse transform. It tells exprecs which records to generate and export the funky ‘#get-’ style functions for, which is what makes the records usable.

The same goes for the book.erl module.
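To take some of the magic out of the generated functions, here is a hand-written approximation of the idea behind get and set by field name. It is only a sketch: the module accessor_sketch and the functions get_field/2 and set_fields/2 are hypothetical and are not what exprecs actually emits.

```erlang
-module(accessor_sketch).
-export([fields/1, get_field/2, set_fields/2, demo/0]).

-record(poetry, {style, excerpt, author, available, count}).

%% Field names per record name, captured at compile time.
fields(poetry) -> record_info(fields, poetry).

%% Tuple position of a field; the record tag occupies slot 1.
%% Crashes with function_clause if the field does not exist.
field_pos(Field, Rec) ->
    field_pos(Field, fields(element(1, Rec)), 2).

field_pos(Field, [Field | _], N) -> N;
field_pos(Field, [_ | Rest], N) -> field_pos(Field, Rest, N + 1).

%% Read a field by name without naming the record type at the call site.
get_field(Field, Rec) ->
    element(field_pos(Field, Rec), Rec).

%% Update several fields by name.
set_fields(Pairs, Rec) ->
    lists:foldl(fun({Field, Value}, Acc) ->
                        setelement(field_pos(Field, Acc), Acc, Value)
                end, Rec, Pairs).

demo() ->
    Poem = #poetry{author = "Jeltz", count = 3},
    Updated = set_fields([{count, 0}, {available, false}], Poem),
    {get_field(count, Poem), get_field(count, Updated),
     get_field(available, Updated)}.
```

demo() returns {3,0,false}. The parse transform spares you from writing and maintaining a fields/1 clause for every record by hand.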

update function

Now we need to write a function that updates the count field to zero in all records, since our collection has been stolen.

-type count_record() :: #poetry{} | #book{}.
-spec reset_count(Module :: atom(), Record :: count_record()) -> count_record().
reset_count(Module, Record) ->
    %% crash if there is not a count field. '#info-'/2 takes the record
    %% name, which is the tag in the first element of the tuple.
    true = lists:member(count, Module:'#info-'(element(1, Record), fields)),

    %% get the count value by specifying the field we want. notice how
    %% there is no explicit mention of what record is being used. We
    %% just care that there is a count.
    case Module:'#get-'(count, Record) of
        0 ->
            Record;
        _N ->
            Module:'#set-'([{count, 0}, {available, false}], Record)
    end.

In order to use this we write a simple loop over all books and poetry available, specifying the module and record.

reset_all() ->
    %% loop over all modules
    lists:foreach(fun(Module) ->
                          %% reset the count of all records in the module
                          lists:foreach(fun(Record) ->
                                                New = reset_count(Module, Record),
                                                Module:update(New)
                                        end, Module:all())
                  end, [poetry, book]).

The bane of all example code is that bad code can be easier to read. I hope the above illustrates the benefit of exprecs, namely, that it opens the door to generic record-based code.

json_rec, a walk through

As with all code, there are quite a few bits missing, namely internal documentation. It may prove difficult for others to hack on this until I get to that. The good news is that I have extensively documented the exported functions and even written an example model.

You can pull the current code from https://github.com/justinkirby/json_rec

Using

The goal of json_rec is to take json and provide a record ultimately destined for some type of datastore: mnesia, riak, couch, etc. As such, json_rec assumes that you have a model for interacting with the store, e.g. standard MVC.

json_rec places a few very simple requirements on your model’s interface:

  • it MUST export new/1
  • it MUST export rec/1
  • it MUST export the record through the exprecs parse transform (-export_records).

At this point, if you have not read the exprecs edoc I highly recommend that you do.

Keeping with the above example, let’s make book.erl a json_rec compatible module.

-module(book).

-export([
         new/1,
         rec/1
        ]).

-record(book, {
          style      :: atom(),
          count      :: integer(),
          available  :: boolean(),
          pages      :: integer(),
          excerpt    :: string(),
          author     :: string()
         }).

%% the exprecs export of the record interface
-compile({parse_transform, exprecs}).
-export_records([book]).

%% here we provide a mapping of the json key to a record.
new(<<"book">>) ->
    '#new-book'();

%% if the key is unknown, return undefined.
new(_RecName) ->
    undefined.

%% return true for the #book{} indicating that we support it.
rec(#book{}) -> true;
rec(_) -> false.

At this point we can take the following json and transform it into the #book{} record.

{
    "style": "fiction",
    "count": 1,
    "available": true,
    "pages": 42,
    "excerpt": "Good bye and thanks for all the fish.",
    "author":"Adams, Douglas"
}

We can get a #book{} record from the above with

-spec json_to_rec(Json :: string()) -> #book{}.
json_to_rec(Json) ->
    ErlJson = mochijson2:decode(Json),
    Record = book:new(<<"book">>),
    json_rec:to_rec(ErlJson, book, Record).

Other features

json_rec will try its best to transform into known records, e.g. ones exported from the module. However, if Module:new/1 returns ‘undefined’, then it will fall back to a proplist. The major downside of this is that it loses the clean round trip that mochijson2 gives you.

json_rec also supports nested records. Whenever a json dictionary key has a dictionary as a value, json_rec will call Module:new/1 to determine if it is a known record type. If it is, json_rec will create a record and make it the value of the parent record field.

json_rec supports a list of dictionaries as well.

In summary, I have tried to support all reasonable data structure combinations. json_rec makes a best effort to do what you expect. However, it is neither an AI nor Turing-complete, so I am sure there are various combinations of lists and dicts that will not work.

Summary

json_rec is an 80% solution that has saved me a ton of copy/paste coding. I have found it extremely useful in saving my sanity when transforming json into usable data.

I would like to thank Ulf Wiger for creating exprecs, making json_rec possible.

Comment (1)

  1. Ulf Wiger wrote:

    Thank you for a nice article. It’s great to see your own stuff be put to good use. :)

    Tuesday, November 8, 2011 at 3:57 pm #
