I am using ANTLR 3.1.3 with the Python target. My lexer and parser have to handle very large files. Depending on command-line options or other parameters set at run time, I would like to capture only a portion of the recognized input and then stop parsing early. For example, suppose my language consists of a header and a body, the body may contain gigabytes of tokens, and I am only interested in the header. I would like a rule that stops both the lexer and the parser without raising an exception. For performance reasons, I don't want to read the entire body. A simplified grammar:
grammar Example;

options {
    language=Python;
    k=2;
}

language:
    header
    body
    EOF
    ;

header:
    HEAD
    (STRING)*
    ;

body:
    BODY { if stopearly: help() }
    (STRING)*
    ;

// string literals
STRING: '"'
    (
        '"' '"'
    |   NEWLINE
    |   ~('"'|'\n'|'\r')
    )*
    '"'
    ;

// Whitespace -- ignored
WS:
    (   ' '
    |   '\t'
    |   '\f'
    |   NEWLINE
    )+ { $channel=HIDDEN }
    ;

HEAD: 'head';
BODY: 'body';

fragment NEWLINE: '\r' '\n' | '\r' | '\n';
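
To make the behaviour I am after concrete, here is a minimal plain-Python sketch of the same head/body language. It does not use the ANTLR runtime at all: `tokens`, `read_header`, and `stop_early` are hypothetical names made up for this illustration, and the hand-written tokenizer only approximates the grammar above. The point is that tokens are produced lazily, so once the consumer stops asking for them, the rest of the input is never read.

```python
import io


def tokens(stream):
    """Lazily yield (type, text) pairs, reading one character at a time."""
    ch = stream.read(1)
    while ch:
        if ch in ' \t\f\r\n':              # whitespace is skipped
            ch = stream.read(1)
        elif ch == '"':                    # string literal; "" is an escaped quote
            text = ch
            while True:
                ch = stream.read(1)
                if not ch:
                    raise ValueError("unterminated string")
                text += ch
                if ch == '"':
                    nxt = stream.read(1)
                    if nxt == '"':         # doubled quote stays inside the string
                        text += nxt
                        continue
                    ch = nxt               # first character after the string
                    break
            yield ('STRING', text)
        else:                              # keyword: head or body
            word = ''
            while ch and ch.isalpha():
                word += ch
                ch = stream.read(1)
            if word not in ('head', 'body'):
                raise ValueError("unexpected input: %r" % (word or ch))
            yield (word.upper(), word)


def read_header(stream, stop_early=True):
    """Collect the header strings; optionally stop as soon as BODY is seen."""
    it = tokens(stream)
    kind, _ = next(it, (None, None))
    if kind != 'HEAD':
        raise ValueError("expected 'head'")
    header, in_body = [], False
    for kind, text in it:
        if kind == 'BODY':
            if stop_early:
                break                      # remaining input is never pulled
            in_body = True
        elif kind == 'STRING' and not in_body:
            header.append(text)            # sketch only: no further validation
    return header


src = io.StringIO('head "a" "b" body "x" "y" "z"')
print(read_header(src))                    # ['"a"', '"b"']
```

With `stop_early=True` the loop exits right after the `body` keyword, and because `tokens` is a generator, none of the body strings are ever read from the stream. I am looking for the equivalent effect inside an ANTLR-generated lexer and parser.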