I want to take the .g
files from Apache Hive and build a parser (targeting JavaScript) -- initially, as just a way to validate user-input Hive queries. The files I'm using come from apache-hive-1.0.0-src\ql\src\java\org\apache\hadoop\hive\ql\parse
from the Hive tgz: HiveLexer.g
, HiveParser.g
, FromClauseParser.g
, IdentifiersParser.g
, SelectClauseParser.g
.
I see no indication within the grammar files which version of ANTLR to use, so I've tried running antlr (from apt-get pccts
), antlr3 and antlr4. they all throw errors of some sort, so I have no clue which one to run or if I can (or need to) convert the .g files between versions.
The errors I'm getting are as follows:
antlr -Dlanguage=JavaScript HiveParser.g
(looks like it doesn't support JS anyway):
warning: invalid option: '-Dlanguage=JavaScript'
HiveParser.g, line 17: syntax error at "grammar" missing { QuotedTerm PassAction ! \< \> : }
HiveParser.g, line 17: syntax error at "HiveParser" missing { QuotedTerm PassAction ! \< \> : }
HiveParser.g, line 17: syntax error at ";" missing Eof
HiveParser.g, line 28: lexical error: invalid token (text was ',')
antlr3 -Dlanguage=JavaScript HiveParser.g
:
error(10): internal error: Exception FromClauseParser.g:302:85: unexpected char: '-'@org.antlr.grammar.v2.ANTLRLexer.nextToken(ANTLRLexer.java:347): unexpected stream error from parsing FromClauseParser.g
error(150): grammar file FromClauseParser.g has no rules
error(100): FromClauseParser.g:0:0: syntax error: assign.types: <AST>:299:68: unexpected AST node: ->
error(100): FromClauseParser.g:0:0: syntax error: define: <AST>:299:68: unexpected AST node: ->
error(106): SelectClauseParser.g:151:18: reference to undefined rule: tableAllColumns
antlr4 -Dlanguage=JavaScript HiveParser.g
:
warning(202): HiveParser.g:30:0: tokens {A; B;} syntax is now tokens {A, B} in ANTLR 4
error(50): HiveParser.g:636:34: syntax error: '->' came as a complete surprise to me while looking for rule element
error(50): HiveParser.g:636:37: syntax error: '^' came as a complete surprise to me
error(50): HiveParser.g:638:50: syntax error: '->' came as a complete surprise to me while looking for rule element
error(50): HiveParser.g:638:53: syntax error: '^' came as a complete surprise to me
The antlr3 error referencing @org.antlr.grammar.v2.ANTLRLexer.nextToken
seems suspect. Is it using the v2 lexer instead of v3? If so, maybe v3 is what I should target, but it's somehow not hitting it?
Or is this not an issue with versioning and instead with invocation? Or is Hive built in a way that provides additional files needed?
^
or->
(in the parser) are indicative of ANTLR v3. – Lucas Trzesniewski