How do we keep multiple semantic values during parsing with Happy/Haskell

Question

I'm trying to build a simple lexer/parser with Alex/Happy in Haskell, and I would like to carry some source-location information from the text file through to my final AST.

I managed to build a lexer using Alex that builds a list of Tokens with location information:

data Token = Token AlexPosn Foo Bar
lexer :: String -> [Token]

In my Happy file, when declaring the %token part, I can mark which part of the token is its semantic value with the $$ symbol:

%token FOO  { Token _ $$ _ }

and in the parsing rules, $i will refer to this $$.

foo_list: FOO  { [$1] }
        | foo_list FOO { $2 : $1 }

Is there a way to refer to both the AlexPosn part and the Foo part of the FOO token? Right now I only know how to refer to one of them. I can't find any information on a way to "add several $$" and refer to them afterwards.

Is there a way to do so?

V.

In fact, it doesn't seem possible even in C flex/bison, so it is probably not possible directly in Haskell or Caml. However, I could use a tuple, data Token = Token (AlexPosn,Foo,Bar), instead of several arguments. I'm leaving the question open for a few days, but I think I'll close it soon. — Vinz

Vinz Vinz · Accepted Answer · 2010-07-28T08:32:31

In the end, I found two solutions:

Pack all the meaningful data in a tuple so that $$ points to this tuple, then extract the data by projection:

data Token = Token (AlexPosn,Foo) Bar
%token FOO { Token $$ some_bar }
rule : FOO  { Ast (fst $1) (snd $1) }
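
As a rough sketch of what this first approach amounts to in plain Haskell (outside of Happy): the $$ in the %token pattern selects the tuple, and the rule's action projects out of it. The definitions of Foo, Bar and Ast, and the helper names semanticValue and mkAst, are hypothetical stand-ins chosen for illustration. AlexPosn is written out here to mirror what Alex's "posn" wrapper generates; in a real project it would come from the Alex-generated module.

```haskell
-- Stand-in for the type generated by Alex's "posn" wrapper:
-- absolute offset, line, column.
data AlexPosn = AlexPn !Int !Int !Int deriving (Eq, Show)

-- Hypothetical payload types, invented for this sketch.
newtype Foo = Foo String deriving (Eq, Show)
data Bar = SomeBar | OtherBar deriving (Eq, Show)

-- Position and Foo are packed together so that a single $$ can cover both.
data Token = Token (AlexPosn, Foo) Bar deriving (Eq, Show)

-- Hypothetical AST node that keeps the position.
data Ast = Ast AlexPosn Foo deriving (Eq, Show)

-- Roughly what `%token FOO { Token $$ SomeBar }` does: match the token
-- and hand only the $$ part (the tuple) to the grammar rule.
semanticValue :: Token -> Maybe (AlexPosn, Foo)
semanticValue (Token v SomeBar) = Just v
semanticValue _                 = Nothing

-- Equivalent of the action `{ Ast (fst $1) (snd $1) }`.
mkAst :: (AlexPosn, Foo) -> Ast
mkAst v = Ast (fst v) (snd v)
```

With a three-component tuple, fst and snd no longer apply, so the action would need a pattern match or small helper functions instead.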

Do not use $$ at all: if you don't use $$, Happy gives you the full token during parsing, so it is up to you to extract what you really need from it:

data Token = Token AlexPosn Foo Bar
%token FOO { Token _ _ some_bar }
rule : FOO  { Ast (get_pos $1) (get_foo $1) }

get_pos :: Token -> AlexPosn
get_foo :: Token -> Foo

...
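
The signatures above leave the bodies of get_pos and get_foo out. A minimal sketch of how they might be written by pattern matching follows. Foo, Bar and Ast are hypothetical definitions, mkAst is an invented name, and AlexPosn is spelled out to mirror what Alex's "posn" wrapper generates.

```haskell
-- Stand-in for the type generated by Alex's "posn" wrapper.
data AlexPosn = AlexPn !Int !Int !Int deriving (Eq, Show)

-- Hypothetical payload types, invented for this sketch.
newtype Foo = Foo String deriving (Eq, Show)
data Bar = SomeBar | OtherBar deriving (Eq, Show)

data Token = Token AlexPosn Foo Bar deriving (Eq, Show)

-- Hypothetical AST node that keeps the position.
data Ast = Ast AlexPosn Foo deriving (Eq, Show)

-- Hand-written projections: one pattern match per field of interest.
get_pos :: Token -> AlexPosn
get_pos (Token p _ _) = p

get_foo :: Token -> Foo
get_foo (Token _ f _) = f

-- Equivalent of the action `{ Ast (get_pos $1) (get_foo $1) }`,
-- where $1 is the whole token because no $$ was used.
mkAst :: Token -> Ast
mkAst t = Ast (get_pos t) (get_foo t)
```

These projections are total only because Token has a single constructor here. With several constructors, each function would need a case for every one of them.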

I think the first one is the more elegant. The second one can be quite heavy in terms of lines of code if you are carrying a lot of information: you will have to build the "projections" by hand (pattern matching and so on), and doing so safely can be tricky if your token type is quite big.
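
One possible way to reduce the cost of hand-written projections, not part of the original answer, is Haskell record syntax, which derives the accessor functions automatically. The sketch below assumes a single-constructor token type. The field names tokPos, tokFoo and tokBar, and the types Foo, Bar and Ast, are hypothetical.

```haskell
-- Stand-in for the type generated by Alex's "posn" wrapper.
data AlexPosn = AlexPn !Int !Int !Int deriving (Eq, Show)

-- Hypothetical payload types, invented for this sketch.
newtype Foo = Foo String deriving (Eq, Show)
data Bar = SomeBar | OtherBar deriving (Eq, Show)

-- Record syntax generates tokPos, tokFoo and tokBar as accessor functions.
-- Positional patterns such as `Token _ _ SomeBar` keep working unchanged.
data Token = Token
  { tokPos :: AlexPosn
  , tokFoo :: Foo
  , tokBar :: Bar
  } deriving (Eq, Show)

-- Hypothetical AST node that keeps the position.
data Ast = Ast AlexPosn Foo deriving (Eq, Show)

-- Equivalent of an action `{ Ast (tokPos $1) (tokFoo $1) }`.
mkAst :: Token -> Ast
mkAst t = Ast (tokPos t) (tokFoo t)
```

If Token had several constructors and a field were missing from some of them, its accessor would be partial and fail at runtime on those constructors, so this shortcut is only safe when every constructor carries the field.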
