As is explained in Removing left recursion , there are two ways to remove the left recursion.
- Modify the original grammar to remove the left recursion using some procedure
- Write the grammar originally not to have the left recursion
What people normally use for removing (not having) the left recursion with ANTLR? I've used flex/bison for parser, but I need to use ANTLR. The only thing I'm concerned about using ANTLR (or LL parser in genearal) is left recursion removal.
- In practical sense, how serious of removing left recursion in ANTLR? Is this a showstopper in using ANTLR? Or, nobody cares about it in ANTLR community?
- I like the idea of AST generation of ANTLR. In terms of getting AST quick and easy way, which method (out of the 2 removing left recursion methods) is preferable?
Added
I did some experiment with the following grammar.
E -> E + T|T T -> T * F|F F -> INT | ( E )
After left recursion removal, I get the following one
E -> TE' E' -> null | + TE' T -> FT' T' -> null | * FT'
I could come up with the following ANTLR representation. Even though, It's relatively pretty simple and straightforward, it seems the grammar that doesn't have the left recursion should be the better way to go.
grammar T; options { language=Python; } start returns [value] : e {$value = $e.value}; e returns [value] : t ep { $value = $t.value if $ep.value != None: $value += $ep.value } ; ep returns [value] : {$value = None} | '+' t r = ep { $value = $t.value if $r.value != None: $value += $r.value } ; t returns [value] : f tp { $value = $f.value if $tp.value != None: $value *= $tp.value } ; tp returns [value] : {$value = None} | '*' f r = tp { $value = $f.value; if $r.value != None: $value *= $r.value } ; f returns [int value] : INT {$value = int($INT.text)} | '(' e ')' {$value = $e.value} ; INT : '0'..'9'+ ; WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;