1
votes

I am learning lexer and parser, so I am reading this classical book : flex & bison (By John Levine, Publisher: O'Reilly Media). An example is given that could not be parsed by bison :

phrase : cart_animal AND CART | work_animal AND PLOW
cart_animal-> HORSE | GOAT
work_animal -> HORSE | OX

I understand very well why it could not. Indeed, it requires TWO symbols of lookahead.

But, with a simple modification, it could be parsed :

phrase : cart_animal CART | work_animal PLOW
cart_animal-> HORSE AND | GOAT AND
work_animal -> HORSE AND | OX AND

I wonder why bison is not able to translate automatically grammar in simple cases like that ?

1

1 Answers

2
votes

Because simple cases like that are pretty well all artificial, and in the case of real-life examples, it is difficult or impossible.

To be clear, if you have an LR(k) grammar with k>1 and you know the value of k, there is a mechanical transformation with which you can make an equivalent LR(1) grammar, and moreover you can, with some juggling, fix the reduction actions so that they have the same effect (at least, as long as they don't contain side effects). I don't know any parser generator which does that, in part because correctly translating the reduction actions will be tricky, and in part because the resulting LR(1) grammar is typically quite large, even for small values of k.

But, as I mentioned above, you need to know the value of k to perform this transformation, and it turns out that there is no algorithm which can take a grammar and tell you whether it is LR(k). So all you could do is try successively larger values of k until you find one which works, or you decide to give up.