0
votes

I'm really stuck with a Shift-Reduce conflict in Goldparser.

I wrote a PHP-like grammar that theoretically should be able to parse the following script:

public $Test = null;
protected $bDemo = true;

function Main()
{
}

private function Run()
{
}

At the top I want to assign the global variables and after that come the function definitions.

To narrow down the problem I reduced my large grammar to the following lines which can reproduce the error. Obviously this is incomplete. The functions have no parameters, no return value and no statements...

"Start Symbol"   = <Script>

! ------------------------------------------------- Sets

{Name} = {Letter} + {Alphanumeric} + [_]

! ------------------------------------------------- Terminals

VarName        = '$'  {Name}*
FuncName       = {Name}*

! ------------------------------------------------- Rules

<Variable>       ::= VarName

<Value>          ::= 'null'
                   | 'true'
                   | 'false'

<Modifier>       ::= 'private'
                   | 'protected'
                   | 'public'

<ModifierOpt>    ::= <Modifier>
                   |

! ---------------

<GlbAssignVar>   ::= <ModifierOpt> <Variable> '=' <Value> ';'

<GlobalList>     ::= <GlobalList> <GlbAssignVar>
                   | <GlbAssignVar>

<GlobalListOpt>  ::= <GlobalList>
                   | 

! ---------------

<FuncDef>        ::= <ModifierOpt> 'function'  FuncName '('  ')' '{'  '}'

<FuncList>       ::= <FuncList> <FuncDef>
                   | <FuncDef>

<FuncListOpt>    ::= <FuncList>
                   |

! ---------------

<Script>         ::=  <GlobalListOpt> <FuncListOpt>

When building the LALR Table Goldparser tells me:

"A Shift-Reduce Conflict was fixed 'private', 'protected', 'public' can follow a completed rule and also be shifted. The conflict was resolved by selecting the 'shift' action over the 'reduce'. Be careful, some parts grammar may not be accessible. It is recommended that you attempt to remove all conflicts."

But the fix it applies makes that the grammar does not work correctly. In the above example I get a syntax error at function Main() where it expects a 'private', 'protected' or 'public' although I declared them as optional.

The error disappears when I remove the <ModifierOpt> from either the <FuncDef> definition or from the <GlbAssignVar> definition.

I have no idea how to solve this. Please help!

1

1 Answers

1
votes

Remember that the parser generator is building an LALR(1) parser, which means that the parser needs to be able to decide while scanning the input left-to-right (LR) whether to reduce an already-completed production or to shift a token which might form part of a not-already-completed production, looking only at the next (1) tokens (which is the token which might be shifted).

So, let's suppose we can see only the first token, private. Now, there are a couple of possibilities:

  1. There is a non-empty GlobalList, and the first declaration in it is private.

  2. The GlobalList is empty but the first function declaration in the FuncList is private.

These correspond to the inputs:

private $a = null;

private function a() {}

But we can only see the token private.

The production we're working on is:

<Script>         ::=  <GlobalListOpt> <FuncListOpt>

So in the first case, we want to shift private, since it will form part of a GlbAssignVar which will start a GlobalList. In the second case, though, we need to reduce an empty GlobalListOpt before shifting the private, which will be part of a FuncDef. In short, we need to decide whether or not to reduce GlobalListOpt ::= <empty> while looking only at the token private. And we can't know.

If we could generate an LALR(2) grammar, there would be no problem, because the token following private will definitely tell us whether it's a private global or a private function. But since that's not an option with this tool, we need to avoid making the decision until we know.

A simple solution is to get rid of GlobalListOpt and fold the optionality into the definition of Script:

<Script>         ::=  <GlobalList> <FuncListOpt>
                   |  <FuncListOpt>

This works because now the parser can always shift the private, even though it doesn't know which production will eventually be reduced. (That's the magic of LR parsing: the parser can keep several productions active at the same time; it doesn't need to commit to one or the other until the production is actually reduced.)

Of course, a simpler solution would be to just allow the user to interleave global definitions with functions:

<Script>         ::= 
                   | <Script> <FuncDef>
                   | <Script> <GlbAssignVar>

which gets rid of FuncList, FuncListOpt, GlobalList and GlobalListOpt, but possibly provides the user with what you consider to be too much flexibility.