I have an EBNF grammar describing a language. I would like to know if there is an equivalent LL(1) BNF grammar, so that I can build a recursive-descent parser for the language without "steamrollering over ambiguities".
The problem I'm facing is that my EBNF contains rules like:
StatementList: Statement (',' Statement)* ','?
If I convert this to BNF using "method 1" - these transformations but using right- rather than left-recursion, I obtain the following. Unfortunately this conversion of the StatementList
EBNF rule can never be part of an LL(1) grammar because ','
is in both the first and follow sets of StatementList1
:
BNF1
----
StatementList: Statement StatementList1 StatementList2
StatementList1: ',' Statement StatementList1 | ε
StatementList2: ',' | ε
So, I came up with "method 2" - an alternative approach to EBNF-to-BNF conversion, but I'm unsure if it is valid. First, pretend that all of the symbols on the right-hand-side of the rule are terminals, and give everything a one-character name. For example, StatementList
becomes S
, Statement
becomes s
and ','
becomes c
. Then we have:
S: s(cs)*c?
Next, treat the RHS as though it's a regular expression. We can convert it to an equivalent right-linear grammar:
S: sT
T: cU | ε
U: sT | ε
Now revert to the original names:
BNF2
----
StatementList: Statement StatementList1
StatementList1: ',' StatementList2 | ε
StatementList2: Statement StatementList1 | ε
AFAICT BNF2 is equivalent to BNF1, but it can form part of an LL(1) grammar.
Is method 2 a valid way to convert EBNF to BNF? If I use it to convert my entire EBNF grammar, will that indeed give me an equivalent grammar in BNF form?
Is it guaranteed that converting any single EBNF rule using method 2 will produce a BNF grammar fragment that could form part of an LL(1) grammar?
Hence is it possible (but of course not guaranteed) that converting my entire EBNF grammar to BNF using method 2 will produce a grammar that is LL(1)? Even though that was not possible using method 1?