I am trying to match a very basic ANTLR grammar, but ANTLR keeps telling me that it got the input '.' while expecting '.'.

The full error is:

line 1:0 extraneous input '.' expecting '.'
line 1:2 missing '*' at '<EOF>'

With the grammar:

grammar regex;

@parser::header
{
    package antlr;
}

@lexer::header
{
    package antlr;
}


WHITESPACE : (' ' | '\t' | '\n' | '\r') -> channel(HIDDEN);
COMP       : '.';
KLEENE     : '*';

start : COMP KLEENE;

And input:

.*

Both files have the same charset:

regex.g: text/plain; charset=us-ascii
test.grammar: text/plain; charset=us-ascii

There should be no lexer rule mix-up. Why does this not work as expected?

Have you checked that the dot is the same character in both the grammar and the input, i.e. that this is not yet another mix-up of different Unicode characters that look the same? - Jiri Tousek
I copied the dot from the grammar into the input file, so they should be the same. - lauw

1 Answer

Given your example grammar and this test class:

import org.antlr.v4.runtime.*;

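public class Main {
  public static void main(String[] args) {
    String source = ".*";
    // regexLexer and regexParser are the classes ANTLR generates from regex.g
    regexLexer lexer = new regexLexer(CharStreams.fromString(source));
    regexParser parser = new regexParser(new CommonTokenStream(lexer));
    // Parse with the start rule and print the resulting tree in LISP style
    System.out.println(parser.start().toStringTree(parser));
  }
}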

the following is printed to my console:

(start . *)

My guess is that you have either simplified the grammar so much that the error in your original grammar disappeared, or you haven't regenerated the lexer/parser classes since changing the grammar.
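
If you also want to rule out the look-alike character possibility raised in the comments, one option is to dump the code points of the input. The following is only a minimal sketch using the standard library: the class name CodePointDump and the method describe are hypothetical helpers of mine, not part of ANTLR, and the string is hard-coded here, whereas you would pass in the contents of your input file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of ANTLR): lists each code point of a string
// together with its Unicode name, so look-alike characters become visible.
final class CodePointDump {
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        // codePoints() iterates full code points, not UTF-16 code units
        text.codePoints().forEach(cp ->
            out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    public static void main(String[] args) {
        // Assumption: the input is exactly the two characters from the question
        describe(".*").forEach(System.out::println);
    }
}
```

For a plain ASCII ".*" this prints U+002E FULL STOP followed by U+002A ASTERISK. Anything else on the first line (for example a byte order mark or a different kind of dot) would suggest the input is not what the grammar expects.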