4
votes

When GLR parser reduces some text to the same non-terminal in two ore more ways it merges parse subtrees. Rekers uses 'symbol nodes' for this.

I this not each non-terminal could cause a merge. Knowing in advance what non-terminals never merge will greatly simplify parse tree construction.

For instance in Elkhound Technical Report the author implemented C++ grammar for GLR parser. He describes it:

The grammar currently has 37 shift/reduce conflicts, 47 reduce/reduce conflicts and 8 ambiguous nonterminal.

How can I separate ambiguous and unambiguous nonterminal for a given CFG? Where can I read about this?

1

1 Answers

0
votes

How does knowing which non-terminals can produce an ambiguity simplify parse tree construction in a practical way?

If you have NO such nonterminals in a grammar, then you can leave out the symbol node construction and subtree sharing machinery, true. It isn't that much code, so this win is pretty minor.

But if any nonterminal can be ambiguous, then you need the machinery. Sharing that machinery across all ambiguities is pretty easy (I've build GLR parsers). So what exactly do you gain?

Lastly, you can't in general (theorem) determine if a grammar is ambiguous. Since a nonterminal represents a sub-grammar, you can't in general determine if any specific nonterminal is ambiguous. You can do this for lots of special cases. (My GLR parser generator in fact will find some ambiguities automatically basically by enumerating possible expansions of nonterminals).