Is this a valid method to convert EBNF to BNF?

Question

I have an EBNF grammar describing a language. I would like to know if there is an equivalent LL(1) BNF grammar, so that I can build a recursive-descent parser for the language without "steamrollering over ambiguities".

The problem I'm facing is that my EBNF contains rules like:

StatementList: Statement (',' Statement)* ','?

If I convert this to BNF using "method 1" - these transformations but using right- rather than left-recursion, I obtain the following. Unfortunately this conversion of the StatementList EBNF rule can never be part of an LL(1) grammar because ',' is in both the first and follow sets of StatementList1:

BNF1
----

StatementList:  Statement StatementList1 StatementList2
StatementList1: ',' Statement StatementList1 | ε
StatementList2: ',' | ε

So, I came up with "method 2" - an alternative approach to EBNF-to-BNF conversion, but I'm unsure if it is valid. First, pretend that all of the symbols on the right-hand-side of the rule are terminals, and give everything a one-character name. For example, StatementList becomes S, Statement becomes s and ',' becomes c. Then we have:

S: s(cs)*c?

Next, treat the RHS as though it's a regular expression. We can convert it to an equivalent right-linear grammar:

S: sT
T: cU | ε
U: sT | ε

Now revert to the original names:

BNF2
----

StatementList:  Statement StatementList1
StatementList1: ',' StatementList2 | ε
StatementList2: Statement StatementList1 | ε

AFAICT BNF2 is equivalent to BNF1, but it can form part of an LL(1) grammar.

Is method 2 a valid way to convert EBNF to BNF? If I use it to convert my entire EBNF grammar, will that indeed give me an equivalent grammar in BNF form?
Is it guaranteed that converting any single EBNF rule using method 2 will produce a BNF grammar fragment that could form part of an LL(1) grammar?
Hence is it possible (but of course not guaranteed) that converting my entire EBNF grammar to BNF using method 2 will produce a grammar that is LL(1)? Even though that was not possible using method 1?

Have you considered accepting Laurence's advice of doing an LR parse instead? :-) Just asking... — rici

rici rici · Accepted Answer · 2020-09-22T17:10:21

Converting your EBNF right-hand sides to DFAs and then rewriting the DFAs as right-regular grammars is indeed a valid procedure for converting EBNF to BNF. I believe you will find a precise description of the algorithm in Stephan Heilbrunner's 1979 paper On the definition of ELR(k) and ELL(k) grammars. (Paywalled. Hopefully you have access through your library or institution.)
There is no guarantee that the resulting BNF will be LL(1). There is no guarantee even that the resulting BNF will be LL(k) for any k. There isn't even a guarantee that the resulting BNF will be LR(k), but the odds are better, as Heilbrunner shows.
It's possible that you can produce an LL(1) grammar, of course. But it frequently won't.

Is this a valid method to convert EBNF to BNF?

1 Answers