Parsing implied versus explicit times operator

Question

I've been writing an LALR parser using PLY and have come across an inconsistency when parsing multiplication.

The full parser is several thousand lines long, so I won't include it here, but I've created a simple demonstration:

import ply.lex as lex
import ply.yacc as yacc

tokens = (
    'int',
    'times',
    'plus',
)

precedence = (
    ('left', 'plus'),
    ('left', 'times'),
)

t_ignore = ' \t\n '
t_int = r' \d+ '
t_plus = r' \+ '
t_times = r' \* '

def p_int(args):
    'expr : int'
    args[0] = int(args[1])

def p_times(args):
    '''expr : expr times expr
            | expr expr %prec times'''
    if len(args) == 3:
        args[0] = args[1] * args[2]
    elif len(args) == 4:
        args[0] = args[1] * args[3]

def p_plus(args):
    'expr : expr plus expr'
    args[0] = args[1] + args[3]

lex.lex()
parser = yacc.yacc()

while True:
    s = raw_input('>> ')
    print " = ", parser.parse(s)

PLY reports no shift/reduce or reduce/reduce conflicts, yet I get the following inconsistency:

    >>  1 + 2 3
     =  9
    >>  1 + 2 * 3
     =  7

This seems odd to me, since the explicit and implicit times rules have the same precedence. I suspect it happens because PLY assigns a precedence to the 'times' token, and so shifts it onto the stack rather than reducing the expression with the p_plus rule. How can I fix this?

Edit: Simpler demonstration.

can you just add open to your precedence association? I havent done grammars in a while — Joran Beasley
That might work in this case, but there are other cases to consider. For example '1 + 2 3' => 9 versus '1 + 2 * 3' => 7. — sn6uv
@JoranBeasley I've edited the question to make the example simpler. — sn6uv
can you add expr to your precedences? ... not sure that might break other stuff — Joran Beasley
Yeah, that doesn't work either. I think because expr is a non-terminal and PLY only lets you assign precedence to terminals. Edit: Something similar does work - adding int to precedence since that's the token that is shifted. — sn6uv

sn6uv · Accepted Answer · 2013-04-22T23:51:50

A quick hack: add the int token to the precedence specification, at the same level as times. The int token is then shifted onto the symbol stack instead of triggering a reduction by the plus rule. Applied to the example in the question:

precedence = (
    ('left', 'plus'),
    ('left', 'times', 'int'),
)

This works but is messy when dealing with a potentially large number of tokens (open brackets, symbols, floats, etc.).

>>  1 + 2 3
 =  7
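
To illustrate why the extra entry changes the result, here is a rough model of yacc-style shift/reduce resolution by precedence. It is not PLY's code: `build_levels` and `resolve` are names made up for this sketch, the nonassoc case is left out, and it assumes that a token missing from the precedence table is treated as level 0 and right-associative. That default is my understanding of PLY's behaviour, not something stated in its output.

```python
# Simplified model of precedence-based shift/reduce resolution.
# Hypothetical helpers for illustration only; this is not PLY's implementation.

def build_levels(precedence):
    """Map each token name to (assoc, level); later rows bind tighter."""
    levels = {}
    for level, row in enumerate(precedence, start=1):
        assoc, names = row[0], row[1:]
        for name in names:
            levels[name] = (assoc, level)
    return levels

def resolve(levels, rule_token, lookahead):
    """Decide between reducing a rule and shifting the lookahead token.

    rule_token is the token that gives the rule its precedence
    (for 'expr : expr plus expr' that is 'plus').
    """
    # Assumption: an undeclared token counts as ('right', 0).
    rule_assoc, rule_level = levels.get(rule_token, ('right', 0))
    _, tok_level = levels.get(lookahead, ('right', 0))
    if tok_level < rule_level:
        return 'reduce'
    if tok_level == rule_level and rule_assoc == 'left':
        return 'reduce'
    return 'shift'

original = build_levels((('left', 'plus'), ('left', 'times')))
patched = build_levels((('left', 'plus'), ('left', 'times', 'int')))

# The parser has seen "1 + 2" and is looking at the next token.
print(resolve(original, 'plus', 'times'))  # explicit times, original table
print(resolve(original, 'plus', 'int'))    # implied times, original table
print(resolve(patched, 'plus', 'int'))     # implied times, patched table
```

Under these assumptions the original table shifts on times but reduces on int, which would group the input as (1 + 2) 3, while the patched table shifts on int as well, giving 1 + (2 3).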

I'd still like to know if there's a more elegant solution to this.
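
In the meantime, the bookkeeping can at least be kept in one place. The sketch below builds the same precedence tuple from a single list of tokens that can start an expression. `EXPR_START_TOKENS` is a hypothetical name, and the extra token names mentioned in the comment are placeholders rather than tokens from the real parser.

```python
# Hypothetical list of every token that can begin an expression, and so can
# directly follow another expression in an implied multiplication.
# A larger grammar might extend it with names such as 'float' or 'open'.
EXPR_START_TOKENS = ('int',)

precedence = (
    ('left', 'plus'),
    # Give every expression-starting token the same level as 'times'.
    ('left', 'times') + EXPR_START_TOKENS,
)

print(precedence)
```

This does not remove the hack; it only means a new expression-starting token is added in one spot.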
