I am learning Erlang and am trying to build an HTTP server to better understand how Erlang works. I am able to get the request:

<<"GET / HTTP/1.1\r\nHost: 127.0.0.1:8000\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.51 Safari/534.3\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n">>

but I'm not sure how to begin pattern matching on it. Will I have to build an FSM or something similar to keep track of the parse state? Is there a simple way to extract the headers and body using pattern matching, maybe by splitting on \r\n? I'd prefer not to use something like mochiweb, since I am trying to learn the fundamentals.

2 Answers

1
votes

Dead simple solution: check inet:setopts/2 and the {packet, http_bin} option.
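
To make that concrete, here is a minimal sketch of reading one request from a passive socket with {packet, http_bin}. The module name http_bin_demo and the functions listen/1 and read_request/1 are made up for illustration; the tuples matched are the ones gen_tcp:recv/2 delivers in this packet mode. It assumes a well-formed request and will crash on anything else.

```erlang
-module(http_bin_demo).
-export([listen/1, read_request/1]).

%% Open a listening socket; accepted sockets inherit these options.
listen(Port) ->
    gen_tcp:listen(Port, [binary, {packet, http_bin},
                          {active, false}, {reuseaddr, true}]).

%% Read the request line and all headers from an accepted socket.
%% The socket decodes one HTTP element per recv call.
read_request(Sock) ->
    {ok, {http_request, Method, Uri, Version}} = gen_tcp:recv(Sock, 0),
    Headers = read_headers(Sock, []),
    %% Switch to raw mode so that a body, if any, can be read as plain bytes.
    ok = inet:setopts(Sock, [{packet, raw}]),
    {Method, Uri, Version, Headers}.

read_headers(Sock, Acc) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, {http_header, _, Name, _, Value}} ->
            read_headers(Sock, [{Name, Value} | Acc]);
        {ok, http_eoh} ->
            %% End of headers.
            lists:reverse(Acc)
    end.
```

Call listen/1, then gen_tcp:accept/1 on the result, and pass the accepted socket to read_request/1. Well-known methods and header names come back as atoms (for example 'GET' and 'Host'), other ones as binaries.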

The more involved solution:

You need to parse the binary. Something along the lines of:

-module(foo).
-export([x/0, parse/1]).

%% Sample request to experiment with.
x() ->
    <<"GET /hello/world/ HTTP/1.1\r\nHost: 127.0.0.1:8000\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.51 Safari/534.3\r\nAccept-Encoding: gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n">>.

%% Match the method prefix; R is the rest of the request.
parse(<<"GET ", R/binary>>) ->
    parse_get(R).

%% Split at the first space: the path, then the input still to be parsed.
parse_get(Bin) ->
    [Path, R] = binary:split(Bin, [<<" ">>]),
    {{get, Path}, R}.

The basic trick is to see a parser as a function binary() -> {parse(), binary()}, where the returned binary is the input that remains to be parsed. This sets you up nicely for a combinator parser or a recursive descent parser. An alternative is to convert the binary to a list and work on that, but it is going to be considerably slower. Look at the binary module, which can do a lot of the heavy lifting for you in this case.
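
As a rough sketch of that shape, the module below walks the request line by line with binary:split/2,3, each step returning what it parsed plus the rest of the input. The module name http_parse_sketch and all function names are made up for illustration. It assumes the whole request is already in one binary, that headers use ": " as the separator, and that the input is well formed; anything else crashes with a badmatch.

```erlang
-module(http_parse_sketch).
-export([parse_request/1]).

%% Returns {{Method, Path, Version}, Headers, Body}, all as binaries.
parse_request(Bin) ->
    {RequestLine, Rest} = parse_request_line(Bin),
    {Headers, Body} = parse_headers(Rest, []),
    {RequestLine, Headers, Body}.

%% Consume the first line and return {Parsed, Remaining}.
parse_request_line(Bin) ->
    [Line, Rest] = binary:split(Bin, <<"\r\n">>),
    [Method, Path, Version] = binary:split(Line, <<" ">>, [global]),
    {{Method, Path, Version}, Rest}.

%% An empty line ends the header block; whatever follows is the body.
parse_headers(<<"\r\n", Body/binary>>, Acc) ->
    {lists:reverse(Acc), Body};
parse_headers(Bin, Acc) ->
    [Line, Rest] = binary:split(Bin, <<"\r\n">>),
    %% Split only at the first ": " so values containing colons stay intact.
    [Name, Value] = binary:split(Line, <<": ">>),
    parse_headers(Rest, [{Name, Value} | Acc]).
```

No FSM is needed as long as the full header block has arrived. If the data comes in pieces from the socket, you would buffer until you have seen "\r\n\r\n" before calling parse_request/1.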

Another alternative is to read the source of Yaws or Mochiweb, which already have to do this, and work out how they do it.

1
votes

To parse HTTP requests you can use erlang:decode_packet/3.
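
A minimal sketch of how that can look, assuming the request is already in a binary. The module name decode_packet_demo and its functions are made up for illustration. The request line is decoded with the http_bin type and each header after it with httph_bin.

```erlang
-module(decode_packet_demo).
-export([parse/1]).

%% Returns {ok, {Method, Uri, Version}, Headers, Body},
%% more (input incomplete) or {error, Reason}.
parse(Bin) ->
    case erlang:decode_packet(http_bin, Bin, []) of
        {ok, {http_request, Method, Uri, Version}, Rest} ->
            parse_headers(Rest, {Method, Uri, Version}, []);
        {ok, Other, _Rest} ->
            %% For example a response line or an http_error tuple.
            {error, {unexpected, Other}};
        {more, _} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.

%% Headers must be decoded with httph_bin, not http_bin.
parse_headers(Bin, ReqLine, Acc) ->
    case erlang:decode_packet(httph_bin, Bin, []) of
        {ok, {http_header, _, Name, _, Value}, Rest} ->
            parse_headers(Rest, ReqLine, [{Name, Value} | Acc]);
        {ok, http_eoh, Body} ->
            %% End of headers; the remaining bytes are the body.
            {ok, ReqLine, lists:reverse(Acc), Body};
        {ok, {http_error, Line}, _Rest} ->
            {error, {bad_header, Line}};
        {more, _} ->
            more;
        {error, Reason} ->
            {error, Reason}
    end.
```

When parse/1 returns more, append the next chunk from the socket to the binary and call it again.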