The lexer grammar below contains two sets of rules: (1) rules for tokenizing CSV-formatted input, and (2) rules for tokenizing key/value-formatted input. For (1) I put the tokens on channel(0). For (2) I put the tokens on channel(1). Do you see any problems with my lexer grammar?
Also below is a parser grammar and it also contains two sets of rules: (1) rules for structuring CSV tokens into a parse tree, and (2) rules for for structuring key/value tokens into a parse tree. Do you see any problems with my parser grammar?
When I apply ANTLR to the grammar files, compile, and then run the test rig (with the -gui flag) using this CSV input:
FirstName, LastName, Street, City, State, ZipCode
Mark,, 4460 Stuart Street, Marion Center, PA, 15759
the parse tree is completely wrong - the tree contains no data. I have no idea why the parse tree is wrong. Any suggestions? I have tested each part separately (removed the key/value rules from the lexer and parser grammars and ran it with CSV input, removed the CSV rules from the lexer and parser grammars and ran it with key/value input) and it works fine.
Lexer Grammar
lexer grammar MyLexer;
COMMA : ',' -> channel(0) ;
NL : ('\r')?'\n' -> channel(0) ;
WS : [ \t\r\n]+ -> skip, channel(0) ;
STRING : (~[,\r\n])+ -> channel(0) ;
KEY : ('FirstName' | 'LastName') -> channel(1) ;
EQ : '=' -> channel(1) ;
NL2 : ('\r')?'\n' -> channel(1) ;
WS2 : [ \t\r\n]+ -> skip, channel(1) ;
VALUE : (~[=\r\n])+ -> channel(1) ;
Parser Grammar
parser grammar MyParser;
options { tokenVocab=MyLexer; }
csv : (header rows)+ EOF ;
header : field (COMMA field)* NL ;
rows : (row)* ;
row : field (COMMA field)* NL ;
field : STRING | ;
keyValue : pairs EOF ;
pairs : (pair)+ ;
pair : key EQ value NL2;
key : KEY ;
value : VALUE ;