I'm writing an Eclipse/Xtext plugin for CoffeeScript, and I realized I'll probably need to write a lexer for it by hand. The CoffeeScript parser also uses a hand-written lexer to handle indentation and other tricks in the grammar.
Xtext generates a class that extends org.eclipse.xtext.parser.antlr.Lexer, which in turn extends org.antlr.runtime.Lexer. So I suppose I'll have to extend it. I can see two ways to do that:
- Override mTokens(). This is done by the generated code, changing the internal state.
- Override nextToken(), which seems a natural approach, but then I'll have to keep track of the internal state.
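To make the second option more concrete, here is a minimal sketch of the state a nextToken()-based approach has to carry. It deliberately does not use the ANTLR runtime: Tok, TokSource, ListSource and RewritingSource are hypothetical stand-ins I made up for illustration, with ListSource playing the role of the generated lexer. The only state kept is a queue of pending tokens and the type of the last token emitted, which is enough to drop blank lines and to inject a final NEWLINE before EOF.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token; NOT the org.antlr.runtime API.
final class Tok {
    final String type;
    final String text;
    Tok(String type, String text) { this.type = type; this.text = text; }
}

// Hypothetical stand-in for a token source with a single nextToken() method.
interface TokSource {
    Tok nextToken(); // keeps returning EOF once the input is exhausted
}

// Plays the role of the generated lexer: replays a fixed list of tokens.
final class ListSource implements TokSource {
    private final Iterator<Tok> it;
    ListSource(List<Tok> toks) { this.it = toks.iterator(); }
    public Tok nextToken() { return it.hasNext() ? it.next() : new Tok("EOF", ""); }
}

// Wraps another source; one delegate token may become zero, one or two tokens.
final class RewritingSource implements TokSource {
    private final TokSource delegate;
    private final Deque<Tok> pending = new ArrayDeque<>();
    private String lastType = "NEWLINE"; // start of input counts as "after a line break"

    RewritingSource(TokSource delegate) { this.delegate = delegate; }

    public Tok nextToken() {
        while (pending.isEmpty()) {
            fill(); // terminates because the delegate eventually returns EOF
        }
        return pending.poll();
    }

    private void fill() {
        Tok t = delegate.nextToken();
        if (t.type.equals("NEWLINE") && lastType.equals("NEWLINE")) {
            return; // drop blank lines
        }
        boolean lineOpen = !lastType.equals("NEWLINE") && !lastType.equals("EOF");
        if (t.type.equals("EOF") && lineOpen) {
            emit(new Tok("NEWLINE", "\n")); // inject a terminator for the last line
        }
        emit(t);
    }

    private void emit(Tok t) {
        pending.add(t);
        lastType = t.type;
    }
}

final class RewritingDemo {
    // Collects token types up to and including the first EOF.
    static List<String> drainTypes(TokSource src) {
        List<String> out = new ArrayList<>();
        Tok t;
        do {
            t = src.nextToken();
            out.add(t.type);
        } while (!t.type.equals("EOF"));
        return out;
    }
}
```

With a real ANTLR lexer the same queue would presumably live in a subclass that calls super.nextToken() instead of a delegate, but I have not verified how that interacts with the lexer state Xtext expects.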
I couldn't find any example of how to write even a simple lexer for ANTLR without a grammar file. So the easiest answer would be a pointer to one.
An answer to "Xtext: grammar for language with significant/semantic whitespace" refers to todotext, which handles the problem of indentation by changing the tokens in the underlying input stream. I don't want to go that way, because it would be difficult to handle other tricks of the CoffeeScript grammar.
UPDATE:
I realized in the meantime that my question was partly Xtext specific.
ITokenSource - and do whatever you need to do in the nextToken method.
Have you checked out stackoverflow.com/questions/4414166/… There are examples on handling indentation (in Python, for instance) in the Definitive Antlr Reference. – Jimmy
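For what it's worth, the Python-style indentation handling mentioned above can be sketched as a tiny hand-written lexer whose nextToken() keeps an indent stack and a queue of pending tokens. This is only an illustration under simplifying assumptions: IndentLexer is a hypothetical class that does not implement any ANTLR or Xtext interface, tokens are plain strings, only leading spaces count as indentation, and every whitespace-separated word becomes one WORD token.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical hand-written lexer; tokens are plain strings for brevity.
final class IndentLexer {
    private final String[] lines;
    private int lineNo = 0;
    private final Deque<Integer> indents = new ArrayDeque<>(); // open indentation levels
    private final Deque<String> pending = new ArrayDeque<>();  // tokens not yet returned

    IndentLexer(String input) {
        this.lines = input.split("\n");
        indents.push(0); // column 0 is always open
    }

    String nextToken() {
        while (pending.isEmpty()) {
            if (lineNo >= lines.length) {
                // End of input: close every block that is still open, then EOF.
                while (indents.peek() > 0) {
                    indents.pop();
                    pending.add("DEDENT");
                }
                pending.add("EOF");
            } else {
                lexLine(lines[lineNo++]);
            }
        }
        return pending.poll();
    }

    private void lexLine(String line) {
        String body = line.trim();
        if (body.isEmpty()) {
            return; // blank lines do not affect indentation
        }
        int width = 0;
        while (line.charAt(width) == ' ') {
            width++; // assumption: only spaces are counted, tabs are ignored
        }
        if (width > indents.peek()) {
            indents.push(width);
            pending.add("INDENT");
        } else {
            while (width < indents.peek()) {
                indents.pop();
                pending.add("DEDENT");
            }
            if (width != indents.peek()) {
                throw new IllegalStateException("inconsistent dedent on line " + lineNo);
            }
        }
        for (String word : body.split("\\s+")) {
            pending.add("WORD(" + word + ")");
        }
        pending.add("NEWLINE");
    }

    // Convenience for experiments: all tokens up to and including the first EOF.
    List<String> allTokens() {
        List<String> out = new ArrayList<>();
        String t;
        do {
            t = nextToken();
            out.add(t);
        } while (!t.equals("EOF"));
        return out;
    }
}
```

Calling new IndentLexer("a\n  b").allTokens() yields WORD(a), NEWLINE, INDENT, WORD(b), NEWLINE, DEDENT, EOF. The point of the queue is that a single line can produce several tokens (for example two DEDENTs followed by words), while nextToken() must hand them out one at a time.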