2
votes

I inherited an ANTLR grammar, and now I need to write a good, old, YACC/BISON like parser (to be concrete, I use PLY for python). There are many strange rules, and I now struggle with the following:

factor : ('(' expr ')' | procedure | variable) ('.' variable | call)* '!'?

Currently, I just have the rules for the first part, which are

factor :  '(' expr ')' 
       | procedure 
       | variable

And the PLY rules:

def p_factor_1(self, p):                                                                                                                                                                                       
    """ factor : LPAREN expr RPAREN """                                                                                                                                                                        
    p[0] = p[2]                                                                                                                                                                                                

def p_factor_2(self, p):                                                                                                                                                                                       
    """ factor : variable                                                                                                                                                                                      
               | procedure_definition                                                                                                                                                                          
    """                                                                                                                                                                                                        
    p[0] = p[1]  

How can I transform that top piece into something actual suitable for PLY? I need reasonable rules, so that I can construct an AST node for the single expr, procedure, variable, but then also some node for chaining of variable access and chained calls. What makes it worse is that there is also the '!' . The creator of the original grammar did that for giving factorizing the highest precedence, but for the transformation, it is a total pain.

1
Grako has an ANTLR to Grako translator, and provides for simple parser construction in Python.Apalala

1 Answers

2
votes

Just do it methodically :)

           V → ω X             V → V X                V → ω 
V → ω X? ⇒            V → X* ⇒            V → ω | ζ ⇒ 
           V → ω               V →                    V → ζ

Of course, if you have to do more than one of the above in a single production, you'll need to introduce new non-terminals.

So:

factor : ('(' expr ')' | procedure | variable) ('.' variable | call)* '!'?

A. Introduce new non-terminals:

factor          : factor-prefix factor-suffix '!'?
factor-prefix   : '(' expr ')' | procedure | variable
factor-suffix   : factor-continue*
factor-continue : '.' variable | call

B. Substitute according to above rules

  factor          : factor-prefix factor-suffix '!'?
⇒
  factor          : factor-prefix factor-suffix '!'
  factor          : factor-prefix factor-suffix

  factor-prefix   : '(' expr ')' | procedure | variable
⇒
  factor-prefix   : '(' expr ')'
  factor-prefix   : procedure
  factor-prefix   : variable

  factor-suffix   : factor-continue*
⇒
  factor-suffix   : factor-suffix factor-continue
  factor-suffix   :

  factor-continue : '.' variable | call
⇒
  factor-continue : '.' variable
  factor-continue : call