20
votes

I know there are some vaguely similar questions already relating to BNF (Backus-Naur Form) grammars in Python, but none of them help me much in terms of my application.

I have multiple BNFs that I need to write code for. The code should be able to both generate and recognize legal strings using the BNF grammar.

The first BNF I'm working with is for all real numbers in Python. It is as follows:

<real number>    ::= <sign><natural number> |
                     <sign><natural number>'.'<digit sequence> |
                     <sign>'.'<digit><digit sequence> |
                     <sign><real number>'e'<natural number>
<sign>           ::= '' | '+' | '-'
<natural number> ::= '0' | <nonzero digit><digit sequence>
<nonzero digit>  ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<digit sequence> ::= '' | <digit><digit sequence>
<digit>          ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Any BNF parsers I've found for Python seem extraordinarily complex, or use outside libraries. Is there a simpler way to check strings against a BNF grammar, and generate strings from it, in Python?

3
BNF == Backus Normal Form? For those of us who don't play around with grammar parsers every day. - Ben
@Ben yes, you're correct. Sorry for not clarifying, I'll edit the post. - Jakemmarsh
Are you looking for something that will parse a BNF file to generate a grammar/lexer, or for something that lets you describe an equivalent of the BNF in Python itself? - Jon Clements♦
Okay - the friendliest and most versatile library I've used that is still in active development and uses Python-based objects to describe grammars is pyparsing (pyparsing.wikispaces.com). - Jon Clements♦
What libraries and tools have you looked at, and what is wrong with them? - Marcin

3 Answers

9
votes

This post contains an example of a lexical scanner which doesn't need third-party libraries. It may not do all you want, but you should be able to use it as a basis for something that fits your needs.

I don't know if your applications all relate to lexical scanning - but if not, ply is a fairly easy-to-use parser (provided you know broadly how parsers work).
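For the grammar in the question specifically, a standard-library-only approach might look like the sketch below. It is not taken from the linked post; the names GRAMMAR, recognizes and generate are made up for illustration. The grammar is stored as a dict of alternatives, strings are generated by random expansion, and recognition is done by repeatedly marking which spans of the input each nonterminal can derive until nothing new is found. That bottom-up style is used because the fourth <real number> rule is left-recursive (once the empty sign is chosen), which a naive top-down matcher would loop on. It assumes every nonterminal's shortest alternative is non-recursive, which holds here, and it is only practical for short strings.

```python
import random

DIGITS = "0123456789"

# The BNF from the question. Keys are nonterminals; any symbol that is not a
# key is treated as a literal terminal. Short key names are illustrative.
GRAMMAR = {
    "real": [("sign", "natural"),
             ("sign", "natural", ".", "digits"),
             ("sign", ".", "digit", "digits"),
             ("sign", "real", "e", "natural")],
    "sign": [(), ("+",), ("-",)],
    "natural": [("0",), ("nonzero", "digits")],
    "nonzero": [(d,) for d in DIGITS[1:]],
    "digits": [(), ("digit", "digits")],
    "digit": [(d,) for d in DIGITS],
}


def generate(symbol="real", rng=random, depth=0, max_depth=6):
    """Return a random string derived from `symbol`."""
    if symbol not in GRAMMAR:          # terminal: emit it as-is
        return symbol
    alts = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget, force the shortest alternative so the
        # expansion stops (assumes that alternative is non-recursive).
        alts = [min(alts, key=len)]
    alt = rng.choice(alts)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in alt)


def recognizes(text, start="real"):
    """Return True if `start` derives exactly `text`."""
    n = len(text)
    known = set()  # facts (symbol, i, j): symbol derives text[i:j]

    def seq_ok(seq, i, j):
        # Can the symbol sequence `seq` derive text[i:j], given `known`?
        if not seq:
            return i == j
        head, rest = seq[0], seq[1:]
        if head not in GRAMMAR:        # terminal must match literally
            k = i + len(head)
            return k <= j and text[i:k] == head and seq_ok(rest, k, j)
        return any((head, i, k) in known and seq_ok(rest, k, j)
                   for k in range(i, j + 1))

    changed = True
    while changed:                     # iterate until no new fact appears
        changed = False
        for sym, alts in GRAMMAR.items():
            for i in range(n + 1):
                for j in range(i, n + 1):
                    if (sym, i, j) in known:
                        continue
                    if any(seq_ok(alt, i, j) for alt in alts):
                        known.add((sym, i, j))
                        changed = True
    return (start, 0, n) in known
```

With this, recognizes("3.14") and recognizes("1e5") are True, while recognizes("01") and recognizes("1e-5") are False, because the grammar as written allows neither leading zeros nor a signed exponent. Calling generate() returns a random legal string, and anything it produces should pass recognizes().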

6
votes

Have a look at https://github.com/erikrose/parsimonious

Parsimonious aims to be the fastest arbitrary-lookahead parser written in pure Python—and the most usable. It's based on parsing expression grammars (PEGs), which means you feed it a simplified sort of EBNF notation.

4
votes

I had good experiences with grako.

I used it for parseWKT.

It takes an EBNF grammar as input and generates a PEG parser from it.

I think it would be reasonably simple to write a BNF-to-EBNF parser in grako, which would then generate a parser from the resulting EBNF.