I have been working with ANTLR, trying to parse ANTLR grammar (.g4) files and store them in a data structure so that I can mutate the rules and then run ANTLR on the grammars with the mutated rules. I am using the ANTLRv4Parser grammar and writing a listener that walks down the parse tree storing the tokens. That works, except that for rules with alternatives the pipe symbol "|" ends up in the wrong position. It comes from the following rule in the ANTLRv4Parser grammar: ruleAltList : labeledAlt (OR labeledAlt)* . In enterRuleAltList in my listener, I am struggling to get the tokens from the child nodes of the alternative before the pipe, then the pipe, and then the tokens after it. It seems ANTLR does a preorder traversal, so it gets the pipe before heading down into the alternatives.

So what I want is perhaps to use the same listener pattern in ANTLR with some sort of inorder traversal.

Here's a snippet from the ANTLRv4Parser grammar:

ruleAltList: labeledAlt (OR labeledAlt)* ;

The ANTLRv4Parser grammar and other grammars can be found at https://github.com/antlr/grammars-v4/tree/master/antlr4

For example, if I have the following grammar:

grammar c; A : B | C;

I want to be able to store it in a data structure as ["A", ":", "B", "|", "C", ";"].

What I get is ["A", ":", "|", "B", "C", ";"].

So, any ideas on how to override the enterRuleAltList method in my listener so that the tokens from the alternative child node come before the OR token, which is "|"?

1 Answer

Reduced representation of the grammar:

parserRuleSpec
   : RULE_REF COLON ruleBlock SEMI 
   ;

ruleBlock
   : ruleAltList
   ;

ruleAltList
   : labeledAlt (OR labeledAlt)*
   ;

labeledAlt
   : terminal
   ;

Collecting each node's terminals as that node is encountered during the walk should result in the ordering ["A", ":", ";", "|", "B", "C"]. (Post the actual full listener code if the ordering originally given was not a typo.)

enterParserRuleSpec -> A : ;
    enterRuleBlock
       enterRuleAltList -> |
           enterLabeledAlt 
               enterTerminal -> B
           enterLabeledAlt
               enterTerminal -> C

When collecting the terminals, pay attention to their position in the context's list of children relative to their sibling non-terminals.
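
The point about child order can be sketched without the ANTLR runtime. The snippet below is only a minimal model, not the asker's listener: `Rule` and `Term` are hypothetical stand-ins for ANTLR's rule contexts and terminal nodes, and the tree is built by hand for `A : B | C ;` following the reduced grammar above. It contrasts taking a node's own terminals on entry with emitting each terminal where it sits among the children. In a real listener, the second behaviour roughly corresponds to collecting tokens in `visitTerminal` rather than in the `enter...` methods.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Term:
    """Hypothetical stand-in for a terminal node (one token)."""
    text: str


@dataclass
class Rule:
    """Hypothetical stand-in for a rule context with ordered children."""
    name: str
    children: List[Union["Rule", Term]] = field(default_factory=list)


def build_tree() -> Rule:
    # Hand-built tree for `A : B | C ;` using the reduced grammar.
    alt_b = Rule("labeledAlt", [Rule("terminal", [Term("B")])])
    alt_c = Rule("labeledAlt", [Rule("terminal", [Term("C")])])
    alt_list = Rule("ruleAltList", [alt_b, Term("|"), alt_c])
    block = Rule("ruleBlock", [alt_list])
    return Rule("parserRuleSpec", [Term("A"), Term(":"), block, Term(";")])


def collect_on_enter(node: Rule, out: List[str]) -> List[str]:
    # On entering a rule, take all of its own terminals at once,
    # then descend into the child rules: this scrambles the order.
    out.extend(c.text for c in node.children if isinstance(c, Term))
    for child in node.children:
        if isinstance(child, Rule):
            collect_on_enter(child, out)
    return out


def collect_in_child_order(node: Rule, out: List[str]) -> List[str]:
    # Walk the children left to right, emitting each terminal exactly
    # where it sits relative to its sibling rules.
    for child in node.children:
        if isinstance(child, Term):
            out.append(child.text)
        else:
            collect_in_child_order(child, out)
    return out


tree = build_tree()
scrambled = collect_on_enter(tree, [])
ordered = collect_in_child_order(tree, [])
print(scrambled)  # ['A', ':', ';', '|', 'B', 'C']
print(ordered)    # ['A', ':', 'B', '|', 'C', ';']
```

The first function reproduces the ordering from the trace above; the second restores source order without needing a different traversal, because the only change is when each terminal is recorded.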

Or, possibly, just collect the terminals into a list and sort it by token index.
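
A rough sketch of the token-index idea, again without the ANTLR runtime: `Tok` is a hypothetical stand-in for an ANTLR token that carries only its text and its position in the token stream (the Java runtime exposes this as `getTokenIndex()`, the Python runtime as the `tokenIndex` attribute). The indices below are made up for `A : B | C ;` and assume no hidden whitespace tokens in between; with real tokens the indices may have gaps, which does not affect the sort.

```python
from typing import List, NamedTuple


class Tok(NamedTuple):
    """Hypothetical stand-in for a token: its text and stream position."""
    text: str
    index: int


# Tokens in the order an enter-only listener would record them.
collected: List[Tok] = [
    Tok("A", 0), Tok(":", 1), Tok(";", 5),
    Tok("|", 3), Tok("B", 2), Tok("C", 4),
]

# Sorting by stream position restores the order of the source text.
restored = [t.text for t in sorted(collected, key=lambda t: t.index)]
print(restored)  # ['A', ':', 'B', '|', 'C', ';']
```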