2
votes

I'm currently implementing an Ada 2005 parser using Pyparsing and the reference manual grammar rules. We need this in order to analyze and transform parts of our aging Ada-codebase to C/C++.

Most things work.

However, one little annoying problem remains:

The grammar rule name when parsing scoped identifiers (rule selected_component) such as the expression "Global_Types.Integer2" fails because it is part of a left-associative grammar rule cycle.

I believe this rule is incorrectly written: the sub-rule direct_name should be placed after the sub-rule direct_name. In fact it should be placed last in the list of alternatives. Otherwise direct_name and in turn name matches and "Global_Types" only and then expects the string to end after that. Not what I want.

Therefore I now move the rule direct_name to the end of name-alternatives...but then I instead get an Pyparsing infinite recursion and Python spits out maximum recursion depth exceeded.

I believe the problem is caused either by the fact that

  • the associativity of the grammar rule selected_component is right-to-left. I've search the reference manual of Pyparsing but haven't found anything relevant. Should we treat dot (.) as an operator with right-to-left associativity or can we solve it throught extensions and restructurings of the grammar rules?

  • or by the fact that there is no check in Pyparsing infinite recursions. I believe this wouldn't be too hard to implement. Use a map from currently active rules (functions) to source position/offset (getTokensEndLoc()) and always fail a rule if the current source input position/offset equals the position related to the rule just entered.

Recursive expressions with pyparsing may be related to my problem.

The problem also seems closely related to Need help in parsing part of python grammar which unfortunately doesn't have answers yet.

Here's the Ada 2005 grammar rule cycle that causes infinite recursion:

Note that this problem is not an Ada-specific issue but is related to all grammars containing left-recursive rules.

1
Ada language lawyers hang out in comp.lang.ada, I'd give them a shot if I were you.Marc C
It would help if you copied the original grammar rules in the question.Apalala
@Apalala: I've added links to the grammar rules.Nordlöw
You might have more luck with a new question that strips your problem down to the bones? Looks like your problem is: how do you handle these three rules? (1) A => B | C (2) C => D "." E (3) D => A | F References to Ada and pyparsing could be putting off some grammar experts who don't know those specific technologies.John Fouhy
As @JohnFouhy says, you could post your PyParsing code for the problematic rules. Note that PyParsing doesn't seem to state it's parsing capabilities (LL(k), LALR(k), PEG?).Apalala

1 Answers

1
votes

For reference, as noted in GNAT: The GNU Ada Compiler, §2.2 The Parser, "The Ada grammar given [in] the ARM is ambiguous, and a table-driven parser would be forced to modify the grammar to make it acceptable to LL (1) or LALR (1) techniques."