Why my antlr lexer java class is "code too large"?

Question

This is the lexer in Antlr (sorry for a long file):

lexer grammar SqlServerDialectLexer;
/* T-SQL words */
AND: 'AND';
BIGINT: 'BIGINT';
BIT: 'BIT';
CASE: 'CASE';
CHAR: 'CHAR';
COUNT: 'COUNT';
CREATE: 'CREATE';
CURRENT_TIMESTAMP: 'CURRENT_TIMESTAMP';
DATETIME: 'DATETIME';
DECLARE: 'DECLARE';
ELSE: 'ELSE';
END: 'END';
FLOAT: 'FLOAT';
FROM: 'FROM';
GO: 'GO';
IMAGE: 'IMAGE';
INNER: 'INNER';
INSERT: 'INSERT';
INT: 'INT';
INTO: 'INTO';
IS: 'IS';
JOIN: 'JOIN';
NOT: 'NOT';
NULL: 'NULL';
NUMERIC: 'NUMERIC';
NVARCHAR: 'NVARCHAR';
ON: 'ON';
OR: 'OR';
SELECT: 'SELECT';
SET: 'SET';
SMALLINT: 'SMALLINT';
TABLE: 'TABLE';
THEN: 'THEN';
TINYINT: 'TINYINT';
UPDATE: 'UPDATE';
USE: 'USE';
VALUES: 'VALUES';
VARCHAR: 'VARCHAR';
WHEN: 'WHEN';
WHERE: 'WHERE';

QUOTE: '\'' { textMode = !textMode; };
QUOTED: {textMode}?=> ~('\'')*;

EQUALS: '=';
NOT_EQUALS: '!=';
SEMICOLON: ';';
COMMA: ',';
OPEN: '(';
CLOSE: ')';
VARIABLE: '@' NAME;
NAME:
    ( LETTER | '#' | '_' ) ( LETTER | NUMBER | '#' | '_' | '.' )*
    ;
NUMBER: DIGIT+;

fragment LETTER: 'a'..'z' | 'A'..'Z';
fragment DIGIT: '0'..'9';
SPACE
    :
    ( ' ' | '\t' | '\n' | '\r' )+
    { skip(); }
    ;

JDK 1.6 says code too large and can't compile it. Why and how to solve the problem?

Gunther Gunther · Accepted Answer · 2011-06-09T11:27:39

Actually I wouldn't say this is a big grammar, and there must be a reason why it doesn't produce reasonably sized code.

I think the problem is directly related to this rule:

QUOTED: {textMode}?=> ~('\'')*;

Is there any particular reason why you want the QUOTED part as a separate token, rather than leaving it combined with the quote, as Bart also put it in his grammar? This would also make the textMode variable obsolete.

Dropping the QUOTE and replacing QUOTED with

QUOTED: '\'' (~'\'')* '\'';

most probably will solve the problem, even without splitting the grammar.

Why my antlr lexer java class is "code too large"?

3 Answers

A.g

SqlServerDialectLexer.g

EDIT