The import statement or the tokenVocab option can be put in a parser grammar to reuse a lexer grammar.

Sam Harwell advises always using tokenVocab rather than import [1].

Is there any difference between import and tokenVocab? If there's no difference (and Sam says to use tokenVocab), then why have the import statement?

[1] I actually recommend avoiding the import statement altogether in ANTLR. Use the tokenVocab feature instead. [Sam Harwell]

See ANTLR4: Unrecognized constant value in a lexer command

1 Answer

First, let's talk about import.

import works much like #include in C/C++: it copies the source of the imported grammar into the importing grammar. If the two grammars conflict, ANTLR 4 tries to merge them.
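A minimal sketch of that merge, assuming two hypothetical files Base.g4 and Derived.g4 (the names and rules are made up for illustration). As I understand the ANTLR 4 documentation, when both grammars define a rule with the same name, the definition in the importing grammar is the one that is kept:

```antlr
// Base.g4 (hypothetical file name)
grammar Base;
stat : expr ';' ;
expr : INT ;
INT  : [0-9]+ ;
WS   : [ \t\r\n]+ -> skip ;

// Derived.g4 (hypothetical file name)
grammar Derived;
import Base;        // pulls in the rules of Base
expr : INT | ID ;   // same name as Base's expr, so this definition is kept
ID   : [a-z]+ ;
```

Here Derived effectively ends up with stat, INT and WS from Base, plus its own expr and ID.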

Using import is kind of frustrating because there are so many constraints:

  1. Not every kind of grammar can import every other kind of grammar.

    • Lexer grammars can import lexer grammars.
    • Parser grammars can import parser grammars.
    • Combined grammars can import lexer or parser grammars.
  2. Options in an imported grammar are ignored.

  3. An imported lexer grammar may not contain modes.

So you actually can't import a lexer grammar in a parser grammar, because they are not the same kind. But you can import a lexer in a combined grammar.
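For instance, here is a sketch of the combination that does work: a combined grammar importing a lexer grammar. CommonLexer and Expr are hypothetical names, and the rules are only there to make the files complete:

```antlr
// CommonLexer.g4 (hypothetical file name)
lexer grammar CommonLexer;
ID  : [a-zA-Z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;

// Expr.g4 (hypothetical file name)
grammar Expr;          // combined grammar, so importing a lexer grammar is allowed
import CommonLexer;
expr : ID '=' INT ;
```

Changing the first line of Expr.g4 to "parser grammar Expr;" would run into the first constraint above, since a parser grammar cannot import a lexer grammar.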

Those constraints narrow the usefulness of import. I think import is best used to split a large lexer or parser grammar into several parts so that it is easier to manage.

Remember that a parser grammar can't import a lexer grammar? That's why we need tokenVocab, which is designed for using a separate lexer in a parser or combined grammar.
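A minimal sketch of that setup, assuming hypothetical files CalcLexer.g4 and CalcParser.g4. The parser grammar only names the lexer's token vocabulary; ANTLR then looks for the CalcLexer.tokens file that is generated when the lexer grammar is compiled:

```antlr
// CalcLexer.g4 (hypothetical file name)
lexer grammar CalcLexer;
ID     : [a-zA-Z]+ ;
INT    : [0-9]+ ;
ASSIGN : '=' ;
WS     : [ \t\r\n]+ -> skip ;

// CalcParser.g4 (hypothetical file name)
parser grammar CalcParser;
options { tokenVocab=CalcLexer; }   // use the token types defined by CalcLexer
stat : ID ASSIGN INT ;
```

The parser rules refer to tokens by name (ID, ASSIGN, INT), so nothing from the lexer source is copied into the parser grammar.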

To summarize:

  • In a lexer grammar, you can only use import.
  • In a parser grammar, you can only use import to import another parser grammar. You can only use tokenVocab to use another lexer grammar.
  • In a combined grammar, you can use both import and tokenVocab.

So in the third case, what's the difference?

The difference is that tokenVocab requires the other grammar to be compiled first, because tokenVocab is only an option that declares a dependency on that grammar's generated .tokens file. import has no such requirement, because it copies the source into the current grammar.

For example, there are three grammar files:

G1.g4

grammar G1;
r: B;

G2.g4

grammar G2;
import G1;

G3.g4

grammar G3;
options { tokenVocab=G1; }
t: A;

If we compile G2 directly, it works. But if we try to compile G3, we get this error:

error(160): G3.g4:3:21: cannot find tokens file ./G1.tokens

However, if we compile G1 first, ANTLR generates G1.tokens, and compiling G3 then succeeds.