0
votes

I am trying to write a yacc grammer to recognize the following pattern:

AAAA  -> Multiple As
BBBB  -> Multiple Bs
AAAABB -> 2 As followed by (AABB)
AAABBBBB  -> (AAABBB) followed by 2Bs

In general, I would like to group equal blocks of contiguous As and Bs together, in precedence over runs of just As or Bs. A straightforward grammar shows a bunch of conflicts.

I need a way to give precedence to this production.

T -> | AB | ATB

over

U -> | AU

(where T and U are yacc productions, A and B are tokens)

How is this done?

1

1 Answers

2
votes

In general, yacc cannot deal with ambiguous grammars. If you give it an ambiguous grammar (or any non-LALR(1) grammar), it will parse a subset of the language of that grammar, which may or may not be what you want.

With bison, you can use %glr-parser to create a glr parser that can parse an ambiguous grammar. However, you need to add %dprec and %merge directives to the appropriate spots in the grammar to resolve ambiguities, or you'll end up with a runtime error.

However, the language you've described isn't particularly ambiguous. In fact, its fairly trivially LALR(1), so can be easily handled by yacc:

%token A B
%%
input: multiA | multiB | twoA | twoB ;
multiA: A | A A | A A A | A A A moreAs ;
moreAs: A | moreAs A ;
multiB: B | multiB B ;
twoA: A A A A B B ;
twoB: A A A B B B B B ;