TL;DR: Elixir - a language that positively celebrates the use of macros

I’ve been picking up some Elixir so I can try my hand at a new project I want to build using the Erlang/OTP ecosystem.

Elixir uses lots and lots of compile-time macros. The language’s pervasive use of macros is unprecedented in my experience. Most (all?) of the core features of Elixir (its special forms) are (non-overridable) macros, e.g. the case statement.

Everybody likes and wants to use a language that has macros. But you are not supposed to use them, right? ☺ ☺ ☺

Possibly the single comment that has most endeared me to Elixir and its community is a passage from Chris McCord’s great book Metaprogramming Elixir:

Some of the greatest insights I’ve had with macros have been with irresponsible code that I would never ship to production. There’s no substitute for learning by experimentation.

Write irresponsible code, experiment, and have fun. Use the insight you gain to drive design decisions of things you would ship to a production system.

In the same vein as Chris’s just-give-it-a-go-and-learn-some-lessons meme, I recently watched Zach Tellman’s talk Some Things that Macros Do at last year’s Curry On! He made the point that macro usage in Clojure has tended to sit either at the low end, the standard fare of boilerplate removal, or up at the high end where the doyen of Clojure macros, core.async, towers. Zach reckoned the gap between the two extremes offered plenty of space for people to experiment with macros.

So, with that thought, this post is my first irresponsible experiment in designing, writing and using Elixir macros. Caveat emptor.

Elixir’s built-in pipe |> operator?

Of course anybody who has written even the merest snippet of Elixir knows it already has a statement threading operator: the famous pipe operator |>.

The docs say:

it simply takes the output from the expression on its left side and passes it as the first argument to the function call on its right side.

The |> is really syntactic sugar that enables this code, which creates the string HelloWorld:

Enum.join(Stream.map(String.split("hello world"), &String.capitalize/1))

… to be written like this:

"hello world" |> String.split |> Stream.map(&String.capitalize/1) |> Enum.join

… or even like this where each |> breaks onto a new line.

"hello world"
|> String.split
|> Stream.map(&String.capitalize/1)
|> Enum.join

No difference at all to the final, generated, code (and result) but again a welcome boost to visual clarity.
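As a quick check of that claim, here is a minimal sketch that evaluates both spellings and compares them. The module and function names (PipeSugarSketch, nested, piped) are mine, made up for illustration.

```elixir
defmodule PipeSugarSketch do
  # The nested call, written inside-out
  def nested do
    Enum.join(Stream.map(String.split("hello world"), &String.capitalize/1))
  end

  # The same computation written as a pipeline
  def piped do
    "hello world"
    |> String.split()
    |> Stream.map(&String.capitalize/1)
    |> Enum.join()
  end
end

# Both spellings should produce the same string
IO.inspect(PipeSugarSketch.nested() == PipeSugarSketch.piped())
```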

Elixir Metaprogramming

Explaining how the |> macro works needs a little background on how Elixir does code generation. Many people have already covered the structure of an AST, notably Chris’s book but also many fine blogs (see for example Sasa Juric’s series on macros), so I’m not going to delve too deep.

To set the scene, when you (run-time) metaprogram in Ruby you build the string representation of the new code and call e.g. class_eval to compile it.

Clojure, being a Lisp, is homoiconic so you’d write a (compile-time) macro to construct s-expressions with the new code.

Elixir’s lingua franca of code generation is a quoted abstract syntax tree (AST). ASTs are built from ordinary Elixir terms and are fractal-like: an AST can contain other ASTs.

Elixir macros take an AST, transform it and return the transformed AST which is then compiled by the compiler.
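To make "take an AST, transform it, return it" concrete, here is a deliberately silly sketch. The macro swap_args and both module names are hypothetical; the macro receives the AST of a two-argument call and hands back the same call with its arguments swapped. The tuple shape it pattern matches on is explained in the next section.

```elixir
defmodule SwapSketch do
  # Receives the AST of a two-argument call such as `10 - 3`
  # and returns the AST of the same call with the arguments swapped.
  defmacro swap_args({op, meta, [left, right]}) do
    {op, meta, [right, left]}
  end
end

defmodule SwapDemo do
  require SwapSketch

  def run do
    # The compiler ends up compiling `3 - 10` here,
    # because the macro rewrote the AST before compilation.
    SwapSketch.swap_args(10 - 3)
  end
end

IO.inspect(SwapDemo.run())
```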

What’s in an AST?

A call in an Elixir AST is a tuple with three elements:

{function-to-call, function-metadata, function-arguments}

  1. The function-to-call is often an atom naming the function to call e.g. :hd (to take the head of a list), or it may be another AST, for example one representing a call to a function in another module.

  2. The function-metadata is “red tape” which I won’t dwell on here.

  3. The function-arguments is a list containing the arguments to the function. The arguments may be literals e.g. 42 or, again, ASTs.

ASTs are regular, recursive structures made up of tuples, lists, atoms and “literals” such as strings, booleans (i.e. true or false) and numbers (1, 2, etc).
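The three parts can be picked out with ordinary pattern matching. This is only a sketch; the module name AstShapeSketch and its function names are made up for illustration.

```elixir
defmodule AstShapeSketch do
  # A call quotes to a three-element tuple: {name, metadata, arguments}
  def call_parts do
    {name, _metadata, arguments} = quote(do: hd([1, 2, 3]))
    {name, arguments}
  end

  # Literals such as numbers, strings, atoms and lists quote to themselves
  def literals do
    [quote(do: 42), quote(do: "hello"), quote(do: :ok), quote(do: [1, 2])]
  end

  # Arguments can themselves be ASTs: here the argument to hd is a call to tl
  def nested_argument do
    {:hd, _, [{inner_name, _, inner_arguments}]} = quote(do: hd(tl([1, 2, 3])))
    {inner_name, inner_arguments}
  end
end

IO.inspect(AstShapeSketch.call_parts())
IO.inspect(AstShapeSketch.literals())
IO.inspect(AstShapeSketch.nested_argument())
```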

Show me an AST!

The AST for any code can be created using the quote macro.

Try this in Elixir’s interactive REPL, iex, to see the AST for the first step of the example above, splitting the “hello world” string:

iex(1)> quote do: String.split("hello world")

and you’ll see a simple AST showing the . (dot) special form being used to call the split function in module String with the argument “hello world”:

{{:., [], [{:__aliases__, [alias: false], [:String]}, :split]}, [], ["hello world"]}
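If you want to convince yourself the tuple really is the code, a small sketch like the one below pulls the AST apart and then evaluates it with Code.eval_quoted. The module name QuoteEvalSketch is hypothetical, and I match loosely on the metadata because its contents vary between Elixir versions.

```elixir
defmodule QuoteEvalSketch do
  def run do
    ast = quote(do: String.split("hello world"))

    # The outer tuple is {callee, metadata, arguments};
    # the callee is itself an AST for the dot call.
    {{:., _, [_module_alias, function_name]}, _, arguments} = ast

    # Evaluating the AST runs the code it represents
    {value, _bindings} = Code.eval_quoted(ast)

    {function_name, arguments, value}
  end
end

IO.inspect(QuoteEvalSketch.run())
```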

Now have a look at the AST generated when using the pipe operator i.e.

iex(2)> quote do: "hello world" |> String.split

You should see something like this:

{:|>, 
 [context: Elixir, import: Kernel],
 ["hello world", 
  {{:., [], [{:__aliases__, [alias: false], [:String]}, :split]}, [], []}]}

This AST is quite hard to parse by eye, but if you stare at it for a while, you should be able to discern it’s the |> operator being called with two arguments:

  1. The first argument is the “hello world” string; and

  2. The second argument is essentially the same AST as above, when . (dot) was used to call split in String, but without the “hello world” string as the first argument i.e.

{{:., [], [{:__aliases__, [alias: false], [:String]}, :split]}, [], []}

To labour this point: the AST for the |> has two arguments, one of which is another AST.
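Rather than staring, you can let pattern matching do the parsing by eye for you. A minimal sketch (the module name PipeAstSketch is made up):

```elixir
defmodule PipeAstSketch do
  def run do
    # quote does not expand the |> macro, so we get the raw two-argument call
    {:|>, _metadata, [left, right]} = quote(do: "hello world" |> String.split())

    # right is the AST of a call to String.split with an empty argument list
    {{:., _, [_module_alias, function_name]}, _, right_arguments} = right

    {left, function_name, right_arguments}
  end
end

IO.inspect(PipeAstSketch.run())
```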

The |> is just an Elixir macro

The |> operator is just a macro, and a very brief one at that: only two lines, using Macro.unpipe and Macro.pipe.

defmacro left |> right do
  [{h, _}|t] = Macro.unpipe({:|>, [], [left, right]})
  :lists.foldl fn {x, pos}, acc -> Macro.pipe(acc, x, pos) end, h, t
end

How does so little code do so much?

The Macro.unpipe call

Let’s generate the AST for the complete HelloWorld pipeline:

# generate the ast for the complete HelloWorld pipeline
iex(n)> pipeline_ast = quote do: "hello world" |> String.split |> Stream.map(&String.capitalize/1) |> Enum.join

you’ll see

{:|>, [context: Elixir, import: Kernel],
 [{:|>, [context: Elixir, import: Kernel],
   [{:|>, [context: Elixir, import: Kernel],
     ["hello world",
      {{:., [], [{:__aliases__, [alias: false], [:String]}, :split]}, [], []}]},
    {{:., [], [{:__aliases__, [alias: false], [:Stream]}, :map]}, [],
     [{:&, [],
       [{:/, [context: Elixir, import: Kernel],
         [{{:., [], [{:__aliases__, [alias: false], [:String]}, :capitalize]},
           [], []}, 1]}]}]}]},
  {{:., [], [{:__aliases__, [alias: false], [:Enum]}, :join]}, [], []}]}

Now run Macro.unpipe on the pipeline_ast:

# run unpipe on the pipeline ast
pipeline_tuples = Macro.unpipe(pipeline_ast)

you’ll see a list - pipeline_tuples - with four items which I’ve laid out and annotated below to try to highlight the structure and contents.

[
  # The "source" - a tuple with two elements, the first being the initial string
  {"hello world", 0},

  # This is the String.split step
  # Note it is also a 2tuple, the first being the ast for the call to
  # String.split *without* any args (i.e. the "hello world").
  # BTW the second element of the 2tuple is 0
  {{{:., [], [{:__aliases__, [alias: false], [:String]}, :split]}, [], []}, 0},

  # This is the Stream.map step
  # As before it's a 2tuple where the first element is the call to
  # Stream.map with *only* one argument: the call to String.capitalize.
  # So it is missing the first argument: the result of the String.split
  {{{:., [], [{:__aliases__, [alias: false], [:Stream]}, :map]}, [],
    [{:&, [],
      [{:/, [context: Elixir, import: Kernel],
        [{{:., [], [{:__aliases__, [alias: false], [:String]}, :capitalize]}, [],
          []}, 1]}]}]}, 0},

  # The final step: the call to Enum.join
  # As before, a 2tuple and missing its first (and only) argument: the result of the Stream.map
  {{{:., [], [{:__aliases__, [alias: false], [:Enum]}, :join]}, [], []}, 0}
]

To sum up:

  1. the list has four items;

  2. all items are tuples of size 2 (2tuples);

  3. the first element of each 2tuple is an AST;

  4. for the 2nd and subsequent tuples, the AST is missing its first argument

  5. the second element of each 2tuple is 0; this is the position of the previous AST in the arguments to the current AST i.e. its missing first argument
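Those five points are easy to check mechanically. In this sketch the module name UnpipeSketch is hypothetical; Macro.unpipe itself is the real standard library function used by |>.

```elixir
defmodule UnpipeSketch do
  # Break the HelloWorld pipeline into its list of {ast, position} 2tuples
  def steps do
    pipeline_ast =
      quote do
        "hello world"
        |> String.split()
        |> Stream.map(&String.capitalize/1)
        |> Enum.join()
      end

    Macro.unpipe(pipeline_ast)
  end

  # The second element of every 2tuple
  def positions do
    Enum.map(steps(), fn {_ast, position} -> position end)
  end
end

IO.inspect(length(UnpipeSketch.steps()))
IO.inspect(UnpipeSketch.positions())
IO.inspect(hd(UnpipeSketch.steps()))
```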

The Macro.pipe call

The Macro.unpipe call has done most of the heavy lifting.

If you take another peek at the unpipe statement in the |> macro you’ll see it destructures the unpipe result - pipeline_tuples in my example code above - to pick out the first 2tuple and drop its second element i.e. 0 (the index):

iex(n)> [{first_ast, _index} | rest_tuples] = pipeline_tuples

The call to Macro.pipe has the relatively simpler job of creating an AST that looks like the code when |> has not been used i.e. from the above,

Enum.join(Stream.map(String.split("hello world"), &String.capitalize/1))

The call to Macro.pipe in the |> macro uses a call to Erlang’s foldl, which may not be familiar to many who only know Elixir. So I’m going to rewrite this step using the more familiar Enum.reduce:

# thread the asts together, the first being the inner most
iex(n)> final_ast = rest_tuples |> Enum.reduce(first_ast, 
fn {rest_ast, rest_index}, this_ast -> Macro.pipe(this_ast, rest_ast, rest_index) end)

Macro.pipe inserts this_ast (e.g. “hello world”) as the argument at position rest_index (e.g. the 0th) in rest_ast (e.g. the AST for the call to String.split).
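A single Macro.pipe call on its own may make the position argument clearer. In this sketch the module name PipeCallSketch is made up, and the Kernel.div example is mine rather than anything |> itself would generate, since |> always uses position 0.

```elixir
defmodule PipeCallSketch do
  # Insert "hello world" at position 0 of String.split(),
  # giving the AST for String.split("hello world")
  def first_position do
    ast = Macro.pipe("hello world", quote(do: String.split()), 0)
    {value, _bindings} = Code.eval_quoted(ast)
    value
  end

  # Insert 2 at position 1 of Kernel.div(10),
  # giving the AST for Kernel.div(10, 2)
  def second_position do
    ast = Macro.pipe(2, quote(do: Kernel.div(10)), 1)
    {value, _bindings} = Code.eval_quoted(ast)
    value
  end
end

IO.inspect(PipeCallSketch.first_position())
IO.inspect(PipeCallSketch.second_position())
```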

To see what the code for final_ast looks like, Macro.to_string will “convert” the AST back to Elixir:

iex(n)> Macro.to_string(final_ast)

and you should see:

"Enum.join(Stream.map(String.split(\"hello world\"), &String.capitalize/1))"

Voilà!
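The whole unpipe-then-pipe round trip fits in one small function. This is a sketch that reuses the variable names from the iex session; the module name RoundTripSketch is hypothetical, and I evaluate the rebuilt AST with Code.eval_quoted rather than just printing it.

```elixir
defmodule RoundTripSketch do
  def run do
    pipeline_ast =
      quote do
        "hello world"
        |> String.split()
        |> Stream.map(&String.capitalize/1)
        |> Enum.join()
      end

    # Take the pipeline apart ...
    [{first_ast, _index} | rest_tuples] = Macro.unpipe(pipeline_ast)

    # ... and thread the pieces back together as nested calls
    final_ast =
      Enum.reduce(rest_tuples, first_ast, fn {rest_ast, rest_index}, this_ast ->
        Macro.pipe(this_ast, rest_ast, rest_index)
      end)

    # Evaluate the nested-call AST to get the final value
    {value, _bindings} = Code.eval_quoted(final_ast)
    value
  end
end

IO.inspect(RoundTripSketch.run())
```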

Too many pipes!

Yeah well that’s all fine and awesome but, like, you know, too many pipes dude, too many pipes!

You may feel having to type the |> each time gets a bit old, after all you’ve already signalled your intention of creating a thread-first pipeline. What I’d really like to write is something like this:

|> "hello world" 
   String.split 
   Stream.map(&String.capitalize/1) 
   Enum.join

Of course this will give the parser kittens because each |> takes two arguments: one on the left and one on the right! Oh well. Lost cause? Maybe not?

How about a “Thread First” macro?

Consider the same code but written like this, using a familiar Elixir do block passed to a macro called thread_first (like a def with no arguments and just a do block):

thread_first do
  "hello world"
  String.split 
  Stream.map(&String.capitalize/1) 
  Enum.join
end

What would a thread_first macro need to do to make this work?

You may have wondered why I took so much time explaining how |> works. Well, it’s because its implementation holds the answer to my question - and it really is quite simple.

We’ve already seen how to thread ASTs together using Macro.pipe. How can we produce a list of ASTs from the do block to feed pipe?

As you may or may not know a do block is just the value of the :do key in the Keyword list passed as the last argument to the macro.
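You can see this without writing any macro at all, because quote happily quotes a call to a macro that does not exist yet. In this sketch the module name DoBlockSketch is made up; it shows the two shapes the :do value can take.

```elixir
defmodule DoBlockSketch do
  # Several statements: the :do value is a __block__ whose args are the statements
  def statement_count do
    {:thread_first, _, [[do: body]]} =
      quote do
        thread_first do
          "hello world"
          String.split()
        end
      end

    {:__block__, _, statements} = body
    length(statements)
  end

  # One statement: the :do value is just that statement's AST
  def single_statement do
    {:thread_first, _, [[do: body]]} =
      quote do
        thread_first do
          "hello world"
        end
      end

    body
  end
end

IO.inspect(DoBlockSketch.statement_count())
IO.inspect(DoBlockSketch.single_statement())
```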

Let’s have a stab at writing thread_first:

defmacro thread_first(args) do

  # get the statements' asts from the do block
  [first_ast | rest_asts] = args
  # fetch the :do key
  |> Keyword.fetch!(:do)
  |> case do

       # if the do block has multiple statements it will be a __block__
       # and the statements' asts will be its args
       {:__block__, _, args} -> args

       # if only one statement in the do block the value of the :do
       # key will be the statement's ast
       ast -> [ast]

     end

  # now use Macro.pipe to thread all the statements' asts together
  # and return the threaded ast
  rest_asts
  |> Enum.reduce(first_ast,
  fn rest_ast, this_ast ->
    # insert this_ast as the 0th argument of rest_ast
    Macro.pipe(this_ast, rest_ast, 0)
  end)

end
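A macro has to live in a module and be required or imported before use, so here is a sketch of how the pieces might be wired together. The module names ThreadFirstSketch and ThreadFirstDemo are hypothetical, and the macro body is a slightly condensed restatement of the one above. Note that it assumes the do block has at least one statement; an empty block would fail the list match.

```elixir
defmodule ThreadFirstSketch do
  defmacro thread_first(args) do
    # get the statements' asts from the do block
    [first_ast | rest_asts] =
      case Keyword.fetch!(args, :do) do
        # multiple statements arrive wrapped in a __block__
        {:__block__, _, statements} -> statements
        # a single statement arrives as its own ast
        ast -> [ast]
      end

    # thread the asts together, each becoming the 0th argument of the next
    Enum.reduce(rest_asts, first_ast, fn rest_ast, this_ast ->
      Macro.pipe(this_ast, rest_ast, 0)
    end)
  end
end

defmodule ThreadFirstDemo do
  import ThreadFirstSketch

  def hello_world do
    thread_first do
      "hello world"
      String.split()
      Stream.map(&String.capitalize/1)
      Enum.join()
    end
  end
end

IO.inspect(ThreadFirstDemo.hello_world())
```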

Proof of the Pudding!

Let’s do some quick tests. First the original example creating the HelloWorld string.

test "thread1: HelloWorld" do

  result = thread_first do
    "hello world" 
    String.split 
    Stream.map(&String.capitalize/1) 
    Enum.join
  end

  assert result == "HelloWorld"

end

It doesn’t make any sense really to have a single line do block but thread_first handles it:

test "thread2: single-line do block" do

  result = thread_first do
    "hello world" 
  end

  assert result == "hello world" 

end

A daft example: a calculator

test "thread3: a calculator" do

  result = thread_first do

    42
    + 5 # 47
    - 17 # 30
    rem 26 # remainder is 4 

  end

  assert 4 == result 

end

Final Words

A bit of fun but it demonstrates how easy it is to roll-your-own code using Elixir’s macros.

My thread_first macro has only about 10 lines of actual code. That’s not only a tribute to the power of Elixir’s macros but also the standard library that includes utility modules such as Macro and Code that do so much of the heavy lifting for you.

Bit of a fess-up. Those of you who know any Clojure will have spotted fairly quickly that thread_first is equivalent to Clojure’s own thread-first macro -> which I’ve written about here.

Final Final Words

People are generally “warned off” writing and using macros. Don’t be. Use your own judgement when they are necessary and when not. And when they are, use them effectively just like any other feature of Elixir. And remember: Sometimes only a macro will do.

Code on Github

There is not much to the code but it’s on GitHub if anybody wants it:

cd /tmp
git clone git@github.com:ianrumford/elixir-thread-first.git
cd elixir-thread-first
mix test