I decided to build my own CSV parser in Elixir as a practice project and managed to get something working without too much hassle.
I know that this problem has already been solved by some of the "top" Elixir devs, so I decided to take a look at how they went about it.
I started by looking at the source code for the Elixir library NimbleCSV. It was written by José Valim, the creator of the language, with contributions from a few notable Elixir devs, so I thought it was a good choice.
In the parse_string function they check the string's length with byte_size(string). I think I understand how this function works, e.g.:
iex()> byte_size(<<104, 101, 108, 108, 111>>)
5
iex()> byte_size(<<104, 101, 108, 108, 111::9>>)
6
The first binary is 40 bits, which is 5 bytes (each value in the binary defaults to 8 bits in Elixir if not told otherwise). In the second I am assigning one of the values to be 9 bits, so the total is 41 bits. This means that it is 6 bytes (due to rounding up). Sorry if some of the language is not exactly right.
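To double-check that arithmetic, here is a minimal sketch using the Kernel functions bit_size/1 and is_binary/1 alongside byte_size/1. The variable names whole and padded are my own for illustration and have nothing to do with NimbleCSV.

```elixir
# The two values from the iex session above.
whole = <<104, 101, 108, 108, 111>>       # five segments, 8 bits each by default
padded = <<104, 101, 108, 108, 111::9>>   # last segment widened to 9 bits

IO.inspect(bit_size(whole))    # 40
IO.inspect(byte_size(whole))   # 5
IO.inspect(bit_size(padded))   # 41
IO.inspect(byte_size(padded))  # 6, i.e. 41 bits rounded up to whole bytes

# Only a bitstring whose bit count is a multiple of 8 counts as a binary.
IO.inspect(is_binary(whole))   # true
IO.inspect(is_binary(padded))  # false
```

So byte_size/1 reports the number of bytes needed to hold the bits, which is why the 41-bit value comes out as 6.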
That makes sense to me. My question is: why would they choose this function over String.length in this case? If they are just getting the length of a string, wouldn't both return the same result?