Exit ANTLR 3 parser early without raising an exception

Question

I am using ANTLR 3.1.3 with the Python target. My lexer and parser accept very large files. Depending on command-line options or other parameters set at run time, I would like to capture only a portion of the recognized input and stop parsing early. For example, suppose my language consists of a header and a body, the body may contain gigabytes of tokens, and I am only interested in the header. I would like a rule that stops the lexer and parser at that point without raising an exception. For performance reasons, I don't want to read the entire body.

grammar Example;

options {
  language=Python;
  k=2;
}

language:
    header
    body
    EOF
    ;

header:
    HEAD
    (STRING)*
    ;

body:
    BODY { if stopearly: help() }
    (STRING)*
    ;

// string literals
STRING: '"'
    (   
        '"' '"'
    |   NEWLINE
    |   ~('"'|'\n'|'\r')
    )*
    '"'
    ;

// Whitespace -- ignored
WS:
    (   ' '
    |   '\t'
    |   '\f'
    |   NEWLINE
    )+ { $channel=HIDDEN }
    ;

HEAD: 'head';
BODY: 'body';
fragment NEWLINE: '\r' '\n' | '\r' | '\n';

Mike Lischke · Accepted Answer · 2014-12-18T08:47:53

What about:

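body:
    // Python target: the predicate text is copied verbatim into the generated code, so use `not` rather than `!`
    BODY {not stopearly}?=> (STRING)*
;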

?

That uses a gated semantic predicate (the `{...}?=>` form) to enable or disable parts of the language. I often use this to toggle language features depending on a version number. I'm not 100% certain it works as written, though: you may have to move the predicate and the code following it into a rule of its own.
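
If the predicate does need to move, one possible arrangement is sketched below. This is an untested sketch, not a verified fix: `bodyStrings` is a hypothetical helper rule name, and `stopearly` is assumed to be visible to the generated Python parser exactly as in the question's action. The idea is to put the gated predicate at the left edge of the loop alternative, so that the decision to enter the loop depends on it.

```antlr
body
    :   BODY bodyStrings
    ;

// Hypothetical helper rule (not in the original grammar).
// The gated predicate sits inside the loop, so the STRING alternative
// is only considered viable while stopearly is false; otherwise the
// loop exits immediately without consuming any STRING tokens.
bodyStrings
    :   ( {not stopearly}?=> STRING )*
    ;
```

Note that the `language` rule in the question still expects `EOF` right after `body`, so that reference would presumably also need to be relaxed or gated when `stopearly` is set. Otherwise the parser would likely report a mismatched token at the first unread `STRING`.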
