2
votes

I have a grammar that I've written using Yacc. The relevant parts of the grammar are excerpted here:

postfix
    : primary
    | postfix '[' expr ']'
    | postfix '[' expr ':' expr ']'
    | postfix "." STRING
    | postfix '(' ')'
    | postfix '(' args ')'
    ;

unary
    : postfix
    | '!' unary
    | '-' unary
    | '+' unary
    ;

If you look at the postfix definition, you'll notice that I have double quotes around the period in the fourth rule. I had to put this in because I got a shift/reduce conflict without it. I'm a bit confused about why the shift/reduce conflict goes away when I change the type of quotes used, and I suspect there is something going on here that I've missed. If anyone can explain the difference between these quotes and which one I ought to use, I'd appreciate it.

1 Answer

4
votes

In bison, literals with single quotes and double quotes are different in that they name different tokens -- so '.' and "." are two distinct tokens. Using both in your grammar is considered bad form, as it is very confusing.

Note that only single-character tokens written with single quotes have any real relation to what is between the quotes. Such a token gets a token code equal to the character code of that single character. Every other token gets a unique token code chosen by bison, picked solely so that all distinct tokens get different token numbers, unless they are declared to be aliases.

So while the token '.' will get the token code 46 (assuming ASCII), the token "." will get some other code (some number greater than 256). Unless you declare "." as an alias for some named token, there's no easy way for the lexer to know what the token code for "." is and return it.
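As a rough sketch of what such an alias declaration could look like in bison: the name DOT is hypothetical (it does not appear in the question's grammar), and the `primary : STRING` rule is only a placeholder so the fragment stands on its own.

```yacc
%token STRING
%token DOT "."    /* DOT is a hypothetical name; "." becomes an alias for it */

%%

postfix
    : primary
    | postfix "." STRING    /* "." here names the same token as DOT */
    ;

primary
    : STRING                /* placeholder rule, not from the original grammar */
    ;

%%

/* In the lexer (details assumed), the period would then be returned by name:
 *     if (c == '.') return DOT;
 */
```

With a declaration like this the lexer has a symbolic name to return, but DOT is still a different token from '.', so any rule that uses '.' would not match it.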

All of the above applies only to bison; Berkeley yacc and AT&T yacc are different (from each other and from bison).


So the shift/reduce conflict went away when you changed to "." because there is some other use of '.' in your grammar, and the two uses of '.' conflict with each other. Changing one to "." makes the conflict go away because they are now two distinct tokens, which the lexer would have to tell apart. Of course, as your lexer probably never returns the "." token, this is probably NOT what you want.