5
votes

I'm trying to write a compiler for C (simpler grammar though).

There is something that I've been stuck on for a while. If I understated correctly, all binary operations are left associative. So if we have we "x+y+z", x+y occurs first and then followed by plus z.

However, doesn't enforcing left associativity causes infinite left recursion?

So far all solutions that I've checked are either left associative or don't have left recursion, but not both. Is it possible to have a grammar that have both of these properties. Is it even possible?

Example:

Left Associative:

Expr = Term | Expr + Term
Term = Element | Term ∗ Element
Element = x|y|z|(Expr)

Left Recursion Eliminated:

Expr = Term ExprTail
ExprTail = epsilon | + Term ExprTail

Term = Element TermTail
TermTail = epsilon | * Element TermTail

Element = x|y|z|(Expr)

Any ideas?

1
You can eliminate left recusrion without losing left-associativity, and it is a standard procedure in LL parsers. The exponentiation operator, if you have one, is binary and right-associative.user207421
In C the assignment operators are right-associative, but that's it.sepp2k
@EJP Are you saying there's a way to rewrite the grammar in the question without left-recursion in such a way that the generated parse tree will be left-associative without having to do any post-processing on the tree? If so: how?sepp2k

1 Answers

5
votes

If an operator is left-associative, then the corresponding production will be left recursive.

If you use an LR parser generator, then there is no problem. The LR algorithm has no problem coping with left recursion (and little problem with any other kind of recursion although it might require a bit more stack space).

You could also use other bottom-up techniques, such as the classic operator-precedence algorithm (viz. shunting yard), but LR parsing is strictly more expressive and the parser generator makes implementation relatively simple.

If you insist on recursive descent parsing, that is possible because you can parse a repeated pattern using a loop (theoretically right recursive) but combine elements left-to-right. In some theoretic sense, this is a tree-rewrite of the AST, but I suspect that lots of programmers have coded that without noticing the tree fix-up.