Converting external strings to atoms

Question

On p. 25 in "Programming Phoenix 1.4 (ebook, beta)", there is an aside by Chris McCord that says:

In the world action in our controllers, the external parameters have string keys, "name" => name, while internally we use name: name. That’s a convention followed throughout Phoenix. External data can’t safely be converted to atoms, because the atom table isn’t garbage-collected. Instead, we explicitly match on the string keys, and then our application boundaries like controllers and channels will convert them into atom keys, which we’ll rely on everywhere else inside Phoenix.

Paraphrasing, the quote says:

External data can’t safely be converted to atoms...so you convert string keys to atom keys...

Huh? I think what he is trying to say is that if somebody sends you some JSON data with 100 million (string) keys, and you blindly convert the whole JSON to an Elixir map with atom keys, then you will be in danger of overflowing the atom table. On the other hand, if you use pattern matching to pick out the key/values that you are interested in from the JSON data, and then insert them into an Elixir map with atom keys, you will create only the handful of atoms that your own code names.
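To make that concrete, here is a minimal sketch of the pattern-matching approach. The module and function names are hypothetical stand-ins for a Phoenix controller action, not code from the book; the point is only that the atom :name is written in the source, so no atoms are created from the incoming keys.

```elixir
defmodule BoundarySketch do
  # Hypothetical stand-in for a controller action: match only the string
  # key we care about and build an atom-keyed map from it.
  def world(%{"name" => name}) do
    %{name: name}
  end

  # Any other shape is rejected; unknown keys never become atoms.
  def world(_params), do: :error
end

# External params arrive with string keys; extra keys are simply ignored.
params = %{"name" => "Frank", "some_unexpected_key" => "ignored"}
result = BoundarySketch.world(params)
IO.inspect(result)
```

However many keys the client sends, the atom table only grows by what the compiled code itself mentions.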

Sheharyar · Accepted Answer · 2018-11-27T01:29:01

That is correct. The Erlang garbage collector safely disposes of all data that's not being used by any process, except atoms. That's because once atoms are created, they're permanently stored in the Erlang atom table (which has a fixed limit).

From the Erlang manual:

Atoms are not garbage-collected. Once an atom is created, it is never removed. The emulator terminates if the limit for the number of atoms (1,048,576 by default) is reached.
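If you want to see how close a running node is to that limit, a quick check along these lines should work. It assumes the :atom_count and :atom_limit items of :erlang.system_info/1, which as far as I know are available on OTP 20 and later.

```elixir
# Number of atoms currently in the atom table, and the configured maximum.
count = :erlang.system_info(:atom_count)
limit = :erlang.system_info(:atom_limit)

IO.puts("atoms in use: #{count} of #{limit}")
```

The limit itself can be raised with the +t emulator flag, but that only postpones the problem if atoms are being created from untrusted input.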

That means if you use something like String.to_atom/1 on external data (for example, input received from a socket or during a web request), a malicious user (or even a regular one, unknowingly) could exhaust your atom table, crashing the VM and your application with it. If, for some reason, you do need to convert external strings to atoms, you should use String.to_existing_atom/1, which raises an ArgumentError unless the atom already exists, so it never creates new atoms.
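A minimal sketch of that safer conversion is below. The module name and the allowed list are hypothetical; the extra membership check is my own addition, since String.to_existing_atom/1 on its own accepts any atom that already exists in the VM, not just the ones you expect.

```elixir
defmodule SafeAtomSketch do
  # Naming the atoms here guarantees they exist once the module is loaded.
  @allowed [:name, :email]

  # Returns {:ok, atom} for an allowed key and :error for anything else.
  def to_known_atom(key) when is_binary(key) do
    # Raises ArgumentError if no such atom exists; never creates one.
    atom = String.to_existing_atom(key)
    if atom in @allowed, do: {:ok, atom}, else: :error
  rescue
    ArgumentError -> :error
  end
end

IO.inspect(SafeAtomSketch.to_known_atom("name"))
IO.inspect(SafeAtomSketch.to_known_atom("no_such_atom_created_anywhere_9f3a71"))
```

An unknown string is turned into :error rather than a new atom, so repeated bad input cannot grow the atom table.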

Other Resources:

Blog Post: Monitoring Erlang Atoms
Erlang: System Limits
Mailing List: Why is there an Atom table?

On a side note, I actually created a package for this very reason: I wanted to safely use atoms for user input in Phoenix web requests.
