8
votes

I am trying to compile the ISO-SQL 2003 grammar from here http://www.antlr3.org/grammar/1304304798093/SQL2003_Grammar.zip. All three versions of it can be found here http://www.antlr3.org/grammar/list.html.

These are the steps I followed:

  1. java -jar antlr-3.3-complete.jar -Xmx8G -Xwatchconversion sql2003Lexer.g
  2. java -jar antlr-3.3-complete.jar -Xmx8G -Xwatchconversion sql2003Parser.g
  3. javac ANTLRDemo.java

ANTLRDemo.java file:

import org.antlr.runtime.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ANTLRDemo {
   static String readFile(String path) throws IOException 
   {
       byte[] encoded = Files.readAllBytes(Paths.get(path));
       return new String(encoded, "UTF-8");
   }

   public static void main(String[] args) throws Exception {
       ANTLRStringStream in = new ANTLRStringStream( readFile(args[0]) );
       sql2003Lexer lexer = new sql2003Lexer(in);
       CommonTokenStream tokens = new CommonTokenStream(lexer);
       sql2003Parser parser = new sql2003Parser(tokens);
       parser.eval();
   }
}

The first two steps work fine, but while compiling my main class I get a lot of Java syntax errors like these:

./sql2003Parser.java:96985: error: not a statement
$UnsignedInteger.text == '1'
./sql2003Parser.java:96985: error: ';' expected
$UnsignedInteger.text == '1'
./sql2003Parser.java:102659: error: unclosed character literal
if ( !(((Unsigned_Integer3887!=null?Unsigned_Integer3887.getText():null) == '01')) ) {

Please let me know if I am doing something wrong in setting up the parser.
It would be helpful if someone could show me exactly how to set up this grammar using ANTLR.

Edit: After a little more fiddling, I think these errors are caused by the actions present in the lexer and parser rules. Is there a safe way to overcome this?

2 Answers

1
votes

You are not doing anything wrong. ANTLR has never been able to generate a working Java parser from these grammar files.

According to a post by Douglas Godfrey to antlr-interest in Oct 2011:

I generated a C parser and lexer. they both generate and compile successfully on my machine with 8GB heap allocated to Antlr.

...

I don't believe that it will ever be possible to get a working parser in Java. A C language parser on the other hand is quite possible.

1
votes

Yes, basically you're right: the grammar is broken. But there is also an error in your ANTLRDemo.java, because the generated parser class has no eval() method. You should call the method named after one of the rules of the parser grammar, e.g. query_specification(). The grammar itself contained some errors that look like typos, calls to an undefined Java error() method, and skip() calls in the parser that are only valid in a lexer. You can see all the fixes in this commit. I've published my research in this GitHub repository.
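As a side note on the compiler errors quoted in the question: a Java character literal holds exactly one character, so '01' cannot compile, and Strings should be compared with equals() rather than ==. Below is a minimal sketch of a Java-friendly comparison that an action or predicate could call. The class and method names are hypothetical; they are not part of the grammar or of the ANTLR runtime.

```java
// Hypothetical helper for illustration only; not part of the SQL 2003 grammar.
final class UnsignedIntegerPredicate {
    private UnsignedIntegerPredicate() {}

    // Null-safe comparison of a token's text with an expected string.
    // Calling equals() on the expected value avoids a NullPointerException
    // when the token text is null, and compares contents, not references.
    static boolean textIs(String tokenText, String expected) {
        return expected.equals(tokenText);
    }

    public static void main(String[] args) {
        System.out.println(textIs("01", "01"));  // contents match
        System.out.println(textIs("1", "01"));   // contents differ
        System.out.println(textIs(null, "01"));  // missing token text
    }
}
```

Inside the grammar, a predicate could then presumably be written as { UnsignedIntegerPredicate.textIs($Unsigned_Integer.text, "01") }? instead of comparing with == '01'. That rewrite is an assumption based on the generated code shown in the question, not a tested change to the grammar.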

I started by fixing the obvious errors in the grammar that led to the compilation errors in the generated Java code; I had the same errors that you posted. Eventually I fixed all the Java syntax errors, but then I faced another error that is impossible to fix directly because it originates from a limitation of the JVM: the compilation error "code too large". On the ANTLR mailing list there was a hint to extract some static members of the huge classes into separate interfaces and "implement" them, to get a sort of multiple inheritance. With trial and error I ended up with 6 interfaces "implemented" by the parser in sql2003Parser.java.
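The interface trick described above can be sketched roughly as follows. The JVM limits the bytecode of a single method, including a class's static initializer, to 64 KB, so moving static tables into several interfaces gives each group its own initializer. The names and values here are made up for illustration; the real generated parser holds FOLLOW_* bit sets and DFA tables instead.

```java
// Each interface gets its own static initializer, so the constants
// no longer all count against the parser class's 64 KB limit.
// Interface fields are implicitly public static final.
interface FollowSetsPart1 {
    long[] FOLLOW_A = {0x1L, 0x2L};
    long[] FOLLOW_B = {0x4L};
}

interface FollowSetsPart2 {
    long[] FOLLOW_C = {0x8L, 0x10L};
}

// Stand-in for the generated parser: by "implementing" the interfaces it
// inherits their constants, so unqualified references keep compiling.
class SplitParserSketch implements FollowSetsPart1, FollowSetsPart2 {
    static int totalWords() {
        return FOLLOW_A.length + FOLLOW_B.length + FOLLOW_C.length;
    }

    public static void main(String[] args) {
        System.out.println("words in all follow sets: " + totalWords());
    }
}
```

The appeal of this approach is that the rest of the generated code does not have to change: the field names stay the same, only their declarations move.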

But still there are 2 problems:

  • Wrong start rule. Douglas Godfrey wrote the grammar so that it starts with the sql2003Parser rule. Unfortunately, if you invoke the parser through this start rule, it won't correctly parse even the simplest select a from b. So I invoke the parser through the query_specification rule to parse the SELECT clause only.
  • Some other errors in the grammar. I didn't dig too deep into the grammar, but query_specification fails to parse some random complex SQL statements.