3
votes

My knowledge on formal language and compiler theory is limited. So I am learning by the book The Definitive ANTLR 4 Reference by Terence Parr.

In the book, it says:

Now, with v4, you can match expressions with rules that look like this:

expr : expr '*' expr // match subexpressions joined with '*' operator
     | expr '+' expr // match subexpressions joined with '+' operator
     | INT           // matches simple integer atom
     ;

Self-referential rules like expr are recursive and, in particular, left recursive because at least one of its alternatives immediately refers to itself.

But I think the bold part is incorrect. Accroding to here, the production is left-recursive if the leftmost symbol on the right side is the same as the non terminal on the left side. For example,

    expr → expr + term.

So back to the Terence's book's sample, it is neither left recursive nor right recursive. It's merely recursive.

Am I right? Or did I misunderstand the author?

ADD 1

Thanks for the replies so far. Sorry I didn't make myself clear enough.

My understanding is: being left-recursive means the left-most part of the right-side is the same as the non-terminal on the left side. AND the remaining part of the right-side is ONLY suffix which is NOT the left-side.

In the Parr's book sample, the reamining part of the right-side is not a suffix, it's another same expr, so I don't think it strictly fits the rule of being left-recursive.

2
Pardon me if I don't see something obvious, but you wrote "But I think the bold part is incorrect. (..) if the leftmost symbol on the right side is the same as the non terminal on the left side.". Nonerminal on left side is clearly expr, and (the leftmost) on right side is expr so what's your rationale behind claiming that the rule stated in "according to here" is not satisfied? Is it the nonrecursive third part of the alternative (| INT) that makes you think so?quetzalcoatl
Ah so.. other answers are great, so just let me sum it up briefly: expr → expr + term is left-recursive, expr → term + expr is right-recursive, and expr → expr + expr is simply both left- and right- recursive at the same time.. <strike>The term means "anything"</strike> (uh oh.. strike-through doesnt work in comments..)quetzalcoatl
After re-reading artile you pointed to I think that the choice of words used there could be better. The line you referred to, The production is left-recursive if the leftmost symbol on the right side is the same as the non terminal on the left side. For example, expr → expr + term. uses a phrase "for example", which, actually, doesn't say that this is the only valid example. This example doesn't indicate that expr -> expr*expr is invalid. I'd consider this article an overview rather than strict definitions.quetzalcoatl
You added a condition "AND the remaining part of the right-side..." that's not in the definition. A definition is left-recursive if, in order to produce an X, you must start with an X. Whether you also need a Y, Z, or another X, or something else entirely is irrelevant.Willis Blackburn

2 Answers

1
votes

Let's compare what the book gives as an example of a left-recursive rule:

expr → expr + term.

with the first part of the compound rule you are examining:

expr : expr '*' expr // match subexpressions joined with '*' operator

Here, the left side is expr; the right side is expr '*' expr, and the left-most symbol on the right side is expr. Since the leftmost symbol on the right side, expr, is the same as the symbol on the left side, also expr, then this part of the rule is left recursive.

The second part of the compound rule would also be left-recursive, for the same reason:

expr : expr '+' expr

This third part of the compound rule is not left recursive, since INT is an atom:

expr : INT           // matches simple integer atom

When you combine these three different expression for expr into one using the | operator, you get the original expression in your question:

expr : expr '*' expr // match subexpressions joined with '*' operator
     | expr '+' expr // match subexpressions joined with '+' operator
     | INT           // matches simple integer atom
     ;

Which, since at least one of its three choices is left-recursive, is itself left-recursive.

1
votes

"The production is left-recursive if the leftmost symbol on the right side is the same as the non terminal on the left side."

In your example "expr -> expr + term" the leftmost symbol on the right side is "expr" and the non terminal on the left side is "expr" so they're the same. The production is left-recursive. That your alternatives are "expr * expr" and "expr + expr" make no difference; your definition is still left-recursive.

If you imagine you were creating a parser to parse this language, you'd have a function called "parseExpr" that would parse an "expr." According to the definition, the first thing it would have to do in order to evaluate "expr + (anything)" is to parse an "expr." So it should call the function to parse an "expr," which is of course itself. That's basically what left-recursive means.