4
votes

I have the following grammar for parsing a person's name after normalizing it.

exp : fullName EOF;
fullName : title? f=name m=name? l=name;

title: TITLE;
name : NAME;

TITLE : 'mr'| 'mrs' | 'ms';
NAME : ('a'..'z')+;

WHITESPACE : ('\t' | ' ' | '\r' | '\n'| '\u0020' | '\u000C' )+ -> skip ;

When I parse a name like "mr john me smith" it works correctly, but when one of the title tokens appears as a name, as in "mr john mr smith", I get the following errors:

line 1:8 extraneous input 'mr' expecting NAME
line 1:16 missing NAME at '<EOF>'
(exp (fullName (title mr) (name john) (name mr smith) (name <missing NAME>)) <EOF>)

Is there a way to treat the token as a title only when it appears in the title position of the rule, and to treat it as an ordinary name when it appears anywhere else?

2
The lexer operates independently of the parser, so mr will always be tokenized as a TITLE. See @SaraVF's solution. - Adrian Leonhard

2 Answers

3
votes

Since the lexer cannot ignore the title keywords based on their position, the parser rule should be changed to:

name : NAME | TITLE;

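Modifying the lexer rule will not fix the problem and will generate another error. The lexer assigns token types before the parser runs, and when two lexer rules match the same text with equal length, ANTLR picks the rule defined first, so 'mr' is still emitted as TITLE.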
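
Applied to the grammar from the question, the change might look like the sketch below. The `grammar PersonName;` header is an assumption (the question does not show one); the remaining rules are copied from the question, with only the `name` rule changed.

```antlr
grammar PersonName;   // hypothetical name; the question omits the header

exp      : fullName EOF ;
fullName : title? f=name m=name? l=name ;

title : TITLE ;
// Accept a TITLE token wherever a name is expected, so that
// "mr john mr smith" can parse with "mr" as the middle name.
name  : NAME | TITLE ;

// TITLE stays above NAME: for equal-length matches the lexer
// picks the rule defined first, so 'mr' is emitted as TITLE.
TITLE : 'mr' | 'mrs' | 'ms' ;
NAME  : ('a'..'z')+ ;

WHITESPACE : ('\t' | ' ' | '\r' | '\n' | '\u0020' | '\u000C')+ -> skip ;
```

With this rule, the token stream for "mr john mr smith" is TITLE NAME TITLE NAME, which should match as title `mr`, first name `john`, middle name `mr`, and last name `smith`. One consequence worth keeping in mind is that a leading `mr` can now match either `title` or `name`; because `title?` is greedy, it should still be taken as the title when enough name tokens follow it.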

1
votes

It's been a while since I used ANTLR, but try using

NAME : TITLE | ('a'..'z')+;

I don't think you can ignore it. ANTLR sees that the token is a TITLE and therefore stops looking. By saying that titles are also NAMEs, you have a workaround for this case.