How does get_in() work when you pass it a function?

Question

In "Programming Elixir 1.6", there is this example:

authors = [
  %{name: "José", language: "Elixir"},
  %{name: "Matz", language: "Ruby"},
  %{name: "Larry", language: "Perl"}
]

languages_with_an_r = fn (:get, collection, next_fn) ->
  for row <- collection do
    if String.contains?(row.language, "r") do
      next_fn.(row)
    end
  end
end

IO.inspect get_in(authors, [languages_with_an_r, :name])
#=> ["José", nil, "Larry"]

I have some questions about the example:

1. The function that you pass to get_in() is called by Elixir, and the first argument that Elixir passes to the function is the atom :get. How is that useful?
2. The third argument that Elixir passes to the function is a function that gets bound to next_fn. Where in the docs does it say how many arguments that function takes? What does that function do? How are we supposed to use next_fn? It seems to me that the for construct is already iterating over each map in the list, so what does the name next_fn even mean? Is next_fn used to somehow tag a row for further consideration?
3. Where does nil in the result list come from?

And, I'll say this: that example is one of the poorest examples I've seen in any programming book--because there's not adequate discussion of the example, and the docs for get_in() suck. That means there are at least three people who don't understand get_in(): me, Dave Thomas, and whoever wrote the docs--because if you can't explain something, then you don't understand it yourself.

Edit: I found this in the source code:

def get_in(data, [h | t]) when is_function(h), 
  do: h.(:get, data, &get_in(&1, t))

What does &1 refer to there? data? Why not just use data, then?

7stud · Accepted Answer · 2018-08-04T03:47:11

Well, I've been playing around in iex a bit:

iex(13)> mymax = fn x -> &max(&1, x) end
#Function<6.99386804/1 in :erl_eval.expr/5>

iex(15)> max_versus_3 = mymax.(3)
#Function<6.99386804/1 in :erl_eval.expr/5>

iex(16)> max_versus_3.(4)
4

iex(17)> max_versus_3.(2)
3

It looks like the syntax &max(&1, 3) returns the anonymous function:

fn (arg) -> max(arg, 3) end

and when x=3 in the surrounding scope, the syntax &max(&1, x) returns an equivalent function:

fn (arg) -> max(arg, x) end

where &1 stands for whatever single argument the anonymous function is called with.
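A quick way to check that reading of the capture syntax is to put both forms side by side. This is only a minimal sketch; the names captured and spelled_out are made up for illustration:

```elixir
x = 3

# Capture syntax: &1 stands for the first argument of the new function.
captured = &max(&1, x)

# The same function written out with fn ... end.
spelled_out = fn arg -> max(arg, x) end

IO.inspect(captured.(4))     # 4
IO.inspect(spelled_out.(4))  # 4
IO.inspect(captured.(2))     # 3
IO.inspect(spelled_out.(2))  # 3
```

Both functions close over x from the surrounding scope, which is why t is still available inside &get_in(&1, t) in the source code.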

In this example:

authors = [
  %{name: "José", language: "Elixir"},
  %{name: "Matz", language: "Ruby"},
  %{name: "Larry", language: "Perl"}
]

languages_with_an_r = fn (:get, collection, next_fn) ->
  for row <- collection do
    if String.contains?(row.language, "r") do
      next_fn.(row)
    end
  end
end

IO.inspect get_in(authors, [languages_with_an_r, :name])
#=> ["José", nil, "Larry"]

The call to get_in() here:

IO.inspect get_in(authors, [languages_with_an_r, :name])

matches the following function definition in the source code:

def get_in(data, [h | t]) when is_function(h), 
    do: h.(:get, data, &get_in(&1, t))

which creates the following bindings:

data = authors
h = languages_with_an_r
t = [:name]

Then Elixir executes the body of the function and calls:

h.(:get, data, &get_in(&1, t))

which is equivalent to:

languages_with_an_r.(
  :get,
  authors,
  fn (arg) -> get_in(arg, [:name]) end
)

That creates the binding:

next_fn = fn (arg) -> get_in(arg, [:name]) end

Therefore, in the authors example the line:

next_fn.(row)

is equivalent to calling:

(fn (arg) -> get_in(arg, [:name]) end).(row)

which causes get_in() to execute with these arguments:

get_in(row, [:name])

and that call to get_in() returns the value corresponding to the :name key in row.

I think the authors example would be clearer if the parameter variables in the definition of languages_with_an_r() were renamed:

languages_with_an_r = fn (:get, collection, search_for_next_key_in) ->
  for row <- collection do
    if String.contains?(row.language, "r") do
      search_for_next_key_in.(row)
    end
  end
end

That code will only search for the :name key in row if row.language contains an "r".
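To see the whole recursion in one place, here is a stripped-down reimplementation. It is only a sketch: the module MyGetIn and the function my_get_in are hypothetical names, and the real Kernel.get_in/2 has more clauses than these three.

```elixir
defmodule MyGetIn do
  # No keys left: the data itself is the result.
  def my_get_in(data, []), do: data

  # Function key: call it with :get, the data, and a one-argument
  # function that continues the search with the remaining keys.
  def my_get_in(data, [h | t]) when is_function(h, 3),
    do: h.(:get, data, &my_get_in(&1, t))

  # Plain key: look it up, then continue with the remaining keys.
  def my_get_in(data, [h | t]),
    do: my_get_in(Access.get(data, h), t)
end

authors = [
  %{name: "José", language: "Elixir"},
  %{name: "Matz", language: "Ruby"},
  %{name: "Larry", language: "Perl"}
]

languages_with_an_r = fn :get, collection, next_fn ->
  for row <- collection do
    if String.contains?(row.language, "r") do
      next_fn.(row)
    end
  end
end

result = MyGetIn.my_get_in(authors, [languages_with_an_r, :name])
IO.inspect(result)
# ["José", nil, "Larry"]
```

As for the :get atom: as far as I can tell it just tells the function which operation is in progress, since get_and_update_in() calls the same kind of function with :get_and_update instead.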

Finally, the following snippet shows where nil comes from:

iex(5)> for x <- [1, 2, 3] do       
...(5)> if x == 1_000_000, do: x+1    
...(5)> end
[nil, nil, nil]

Like in Ruby, a do block in Elixir returns the value of the last expression that was evaluated. An if with no else clause returns nil when its condition is false, and a for comprehension collects the value of its body for every element. Therefore, if row.language does not contain an "r", the if expression evaluates to nil, and that nil becomes the row's entry in the result list.
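If the nil entries are unwanted, one option is to move the test into the comprehension as a filter, so that rows without an "r" are dropped instead of mapped to nil. A minimal sketch (only_languages_with_an_r, skipped and filtered are illustrative names, not from the book):

```elixir
# An if with no else evaluates to nil when the condition is false.
x = 2
skipped = if x == 1_000_000, do: x + 1
IO.inspect(skipped)
# nil

authors = [
  %{name: "José", language: "Elixir"},
  %{name: "Matz", language: "Ruby"},
  %{name: "Larry", language: "Perl"}
]

# The second clause of the comprehension is a filter: rows for which
# it is false are skipped and produce no entry in the result.
only_languages_with_an_r = fn :get, collection, next_fn ->
  for row <- collection, String.contains?(row.language, "r") do
    next_fn.(row)
  end
end

filtered = get_in(authors, [only_languages_with_an_r, :name])
IO.inspect(filtered)
# ["José", "Larry"]
```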
