I'm currently implementing an Ada 2005 parser using Pyparsing and the grammar rules from the reference manual. We need this in order to analyze and transform parts of our aging Ada codebase to C/C++.
Most things work.
However, one little annoying problem remains:
The grammar rule name fails when parsing scoped identifiers (rule selected_component) such as the expression "Global_Types.Integer2", because it is part of a left-recursive grammar rule cycle.
I believe the alternatives of this rule are listed in the wrong order: the sub-rule direct_name is tried too early. In fact, it should be placed last in the list of alternatives. Otherwise direct_name, and in turn name, matches "Global_Types" only and then expects the string to end after that. Not what I want.
Therefore I moved the rule direct_name to the end of the name alternatives... but then I instead get infinite recursion in Pyparsing, and Python reports "maximum recursion depth exceeded".
I believe the problem is caused by one of the following:
- The associativity of the grammar rule selected_component is right-to-left. I've searched the Pyparsing reference manual but haven't found anything relevant. Should we treat the dot (.) as an operator with right-to-left associativity, or can we solve it through extensions and restructurings of the grammar rules?
- There is no check for infinite recursion in Pyparsing. I believe this wouldn't be too hard to implement: use a map from currently active rules (functions) to source position/offset (getTokensEndLoc()) and always fail a rule if the current source input position/offset equals the position recorded for the rule just entered.
The question "Recursive expressions with pyparsing" may be related to my problem.
The problem also seems closely related to "Need help in parsing part of python grammar", which unfortunately doesn't have answers yet.
Here's the Ada 2005 grammar rule cycle that causes infinite recursion:
- name =>
- selected_component =>
- prefix =>
- name
Note that this problem is not an Ada-specific issue but is related to all grammars containing left-recursive rules.
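One common way to break such a cycle is to replace the left recursion with iteration, that is, name ::= direct_name { "." selector_name }. The sketch below is library-independent plain Python under simplifying assumptions: parse_name and the tuple layout are hypothetical, selector_name is reduced to an identifier, and the other alternatives of name in the Ada grammar are left out.

```python
import re

IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def parse_name(text):
    """Parse  name ::= direct_name { "." selector_name }  without recursion.

    The result is folded into a left-nested tree, so the dot still
    groups left to right as in the original left-recursive rule.
    """
    m = IDENT.match(text)
    if not m:
        raise ValueError("expected an identifier at offset 0")
    tree, pos = m.group(), m.end()

    # Each iteration consumes one '.identifier' suffix, so the loop
    # always makes progress and cannot run forever.
    while text.startswith(".", pos):
        m = IDENT.match(text, pos + 1)
        if not m:
            raise ValueError(f"expected an identifier at offset {pos + 1}")
        tree = ("selected_component", tree, m.group())
        pos = m.end()

    if pos != len(text):
        raise ValueError(f"unexpected input at offset {pos}")
    return tree

print(parse_name("Global_Types.Integer2"))
# ('selected_component', 'Global_Types', 'Integer2')
print(parse_name("A.B.C"))
# ('selected_component', ('selected_component', 'A', 'B'), 'C')
```

In Pyparsing terms the same shape would be roughly direct_name + ZeroOrMore(Suppress(".") + selector_name), which never re-enters name without consuming input. The other alternatives that go through prefix (indexed_component, slice, attribute_reference and so on) could presumably be handled as further suffix cases inside the same loop.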