I am new to ANTLR and have defined a basic grammar using ANTLR 3. The grammar compiles, and ANTLRWorks generates the parser and lexer code without any problems.

The grammar can be seen below:

grammar i;

@header {
package i;
}

module      : 'Module1'| 'Module2';
object      : 'I';
objectType  : 'Name';
filters         : EMPTY | 'WHERE' module;
table       : module object objectType;
STRING      : ('a'..'z'|'A'..'Z')+;
EMPTY           : ' ';

The problem is that when I interpret the table rule I get a MismatchedSetException. This seems to be caused by the EMPTY rule: as soon as I remove EMPTY from the grammar, the interpretation works. From the ANTLR website and some other examples, I understand that an empty space is written as ' '. I am not sure what to do, because I need this EMPTY rule.

When it interprets, I get the following Exception:

Interpreting...
[11:02:14] problem matching token at 1:4 NoViableAltException(' '@[1:1: Tokens : ( T__4 | T__5 | T__6 | T__7 | T__8 | T__9 | T__10 | T__11 | T__12 | T__13 | T__14 | T__15 );])
[11:02:14] problem matching token at 1:9 NoViableAltException(' '@[1:1: Tokens : ( T__4 | T__5 | T__6 | T__7 | T__8 | T__9 | T__10 | T__11 | T__12 | T__13 | T__14 | T__15 );])

As soon as I change the EMPTY to be the following:

EMPTY : '';

instead of:

EMPTY : ' ';

It actually interprets it. However, I am getting the following Exception:

Interpreting...
[10:57:23] problem matching token at 1:4 NoViableAltException(' '@[1:1: Tokens : ( T__4 | T__5 | T__6 | T__7 | T__8 | T__9 | T__10 | T__11 | T__12 | T__13 | T__14 | T__15 | T__16 );])
[10:57:23] problem matching token at 1:9 NoViableAltException(' '@[1:1: Tokens : ( T__4 | T__5 | T__6 | T__7 | T__8 | T__9 | T__10 | T__11 | T__12 | T__13 | T__14 | T__15 | T__16 );])

However, ANTLRWorks still generates the Lexer and Parser code.

I hope you can help.

EDIT:

grammar i;

@header {
package i;
}

select      : 'SELECT *' 'FROM' table filters';';
filters : EMPTY | 'WHERE' conditions;
conditions  : STRING operator value;
operator    : '=' | '!=';
true            : 'true';
value           : true;
STRING  : ('a'..'z'|'A'..'Z')+;
EMPTY           : ' ';
I doubt that you need to explicitly capture whitespace, but I can't be sure without a better understanding of the usage. Could you give an example of how you plan to use the filters rule? E.g., what would the input look like if you wanted to use the WHERE keyword? Right now it isn't wired to another rule so an example would help. - user1201210
Well, I am unable to give the full details. The above is just an example. I am developing a grammar for my company and the requirement is big. I am unable to provide any information. Are you able to help, assuming the whitespace is needed? - user1646481
As stated above, this NoViableAltException still happens without the EMPTY. - user1646481
When I think of a rule named EMPTY, I think of filters: | 'WHERE' module; (the rule filters is satisfied by "empty" input in the first alternative) and not filters: ' ' | 'WHERE' module; (a single space for the first alt). As soon as you match whitespace explicitly (i.e., you're not skipping it), it has to be accounted for between keywords and everywhere else, so I'm scowling menacingly at that EMPTY rule. ;) - user1201210
EMPTY is just an empty space. So what I would like to do is: when a user enters nothing, the grammar accepts it. - user1646481

1 Answer

I'm still a bit unsure about usage, but I think we're talking about the same thing when we say "empty input". Here's an answer to get the ball rolling, starting with a modified grammar.

grammar i;

@header {
package i;
}

module      : 'Module1'| 'Module2';
object      : 'I';
objectType  : 'Name';
filters     : | 'WHERE' module;
table       : module object objectType filters;
STRING      : ('a'..'z'|'A'..'Z')+;
WS          : (' '|'\t'|'\f'|'\n'|'\r')+ {skip();}; //ignore whitespace

Note that I tacked filters onto the end of the table rule to explain what I'm talking about.

This grammar accepts the following input (starting with rule table) as it did before:

Module1 I Name

It works because filters matches even though nothing follows the text Name: it matches on empty input using the first alternative.

The grammar also accepts this:

Module1 I Name WHERE Module2

The filters rule is satisfied with the text WHERE Module2 matching the second alternative (defined as 'WHERE' module in the grammar).
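
To make the "matches on empty" idea concrete, here is a minimal hand-written sketch in plain Java. It is not the code ANTLR generates, and the class and method names (EmptyAlternativeSketch, accepts, lookingAt) are made up for illustration. It assumes tokens are separated by whitespace, which is thrown away much like the WS rule does, and unlike the grammar's table rule it also insists that all input is consumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; ANTLR's generated parser looks different.
class EmptyAlternativeSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    EmptyAlternativeSketch(String input) {
        // Like the WS rule: whitespace separates tokens and is then discarded.
        for (String t : input.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    private boolean lookingAt(String text) {
        return pos < tokens.size() && tokens.get(pos).equals(text);
    }

    private void match(String text) {
        if (!lookingAt(text)) {
            throw new IllegalStateException("expected '" + text + "' at token " + pos);
        }
        pos++;
    }

    // module : 'Module1' | 'Module2';
    private void module() {
        if (lookingAt("Module1")) {
            match("Module1");
        } else {
            match("Module2");
        }
    }

    // filters : | 'WHERE' module;
    private void filters() {
        if (lookingAt("WHERE")) {
            match("WHERE");
            module();
        }
        // Otherwise the first (empty) alternative applies: consume nothing and succeed.
    }

    // table : module object objectType filters;
    private void table() {
        module();
        match("I");
        match("Name");
        filters(); // always called; it may match nothing at all
        if (pos != tokens.size()) {
            throw new IllegalStateException("unexpected trailing input");
        }
    }

    static boolean accepts(String input) {
        try {
            new EmptyAlternativeSketch(input).table();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("Module1 I Name"));               // true
        System.out.println(accepts("Module1 I Name WHERE Module2")); // true
        System.out.println(accepts("Module1 I Name WHERE"));         // false
    }
}
```

Note that nothing in the sketch ever looks for a space token: an absent filter is simply "no WHERE comes next", which is what the empty alternative expresses.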

A cleaner approach would be to change filters and table to the following rules (recognizing, of course, that I changed table in the first place).

filters     : 'WHERE' module; //no more '|' 
table       : module object objectType filters?; //added '?'

The grammar matches the same input as before, but the terms are a little clearer: instead of saying "filters is required in table and filters matches on empty", we now say "filters is optional in table and filters doesn't match on empty".

It amounts to the same thing in this case. Matching on empty (foo: | etc;) is perfectly valid, but I've run into more problems using it than I have with matching optional (foo?) rules.
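
As a rough illustration of the optional form, here is a small hand-written Java sketch. It is not ANTLR output, and the names (OptionalRuleSketch, filterOf, lookingAt) are invented for this example. It assumes whitespace-separated tokens and checks that all input is consumed, which the grammar's table rule does not do by itself. The point is where the decision lives: filters always requires WHERE, and table decides whether to call it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; ANTLR's generated parser looks different.
class OptionalRuleSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    OptionalRuleSketch(String input) {
        // Whitespace only separates tokens; it never reaches the parser.
        for (String t : input.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    private boolean lookingAt(String text) {
        return pos < tokens.size() && tokens.get(pos).equals(text);
    }

    private String match(String text) {
        if (!lookingAt(text)) {
            throw new IllegalStateException("expected '" + text + "' at token " + pos);
        }
        return tokens.get(pos++);
    }

    // module : 'Module1' | 'Module2';
    private String module() {
        return lookingAt("Module1") ? match("Module1") : match("Module2");
    }

    // filters : 'WHERE' module;   (no empty alternative: WHERE is mandatory here)
    private String filters() {
        match("WHERE");
        return module();
    }

    // table : module object objectType filters?;
    // Returns the filter module, or null when the optional part is absent.
    private String table() {
        module();
        match("I");
        match("Name");
        String filter = lookingAt("WHERE") ? filters() : null; // the '?' lives in the caller
        if (pos != tokens.size()) {
            throw new IllegalStateException("unexpected trailing input");
        }
        return filter;
    }

    static String filterOf(String input) {
        return new OptionalRuleSketch(input).table();
    }

    public static void main(String[] args) {
        System.out.println(filterOf("Module1 I Name"));               // null
        System.out.println(filterOf("Module1 I Name WHERE Module2")); // Module2
    }
}
```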


Update following your update.

I'm taking a step back here to get us out of the theoretical and into the practical. Here is an updated grammar, Java test code that invokes it, test input, and test output. Please give it a run.

Grammar: altered for testing, but it follows the same idea as before.

grammar i;

@header {
 package i;
}


selects     : ( //test rule to allow processing multiple select calls. Don't worry about the details.
                {System.out.println(">>select");}
                select
                {System.out.println("<<select");}
               )+ 
            ; 

select      : 'SELECT *' 'FROM' table filters? ';'
              {System.out.println("\tFinished select.");}       //test output
            ;

module      : 'Module1'| 'Module2';
object      : 'I';
objectType  : 'Name';
filters     : 'WHERE' conditions
              {System.out.println("\tFinished filters.");}      //test output
            ;

table       : module object objectType
              {System.out.println("\tFinished table.");}        //test output
            ;

conditions  : STRING operator value
              {System.out.println("\tCondition test on " + $STRING.text);}
            ;
operator    : '=' | '!=';
true_       : 'true';       //changed so that Java code could be generated
value       : true_;
STRING      : ('a'..'z'|'A'..'Z')+;
WS          : (' '|'\t'|'\f'|'\n'|'\r')+ {skip();}; //ignore whitespace

TestiGrammar.java

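package i;

import java.io.FileNotFoundException;
import java.io.InputStream;

import org.antlr.runtime.ANTLRInputStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;

public class TestiGrammar {
  public static void main(String[] args) throws Exception {
    // itest.txt must be on the classpath next to this class (package i).
    InputStream resource = TestiGrammar.class.getResourceAsStream("itest.txt");
    if (resource == null) {
      throw new FileNotFoundException("itest.txt not found on the classpath");
    }

    // The constructor loads the entire stream into memory, so closing right after is fine.
    CharStream input = new ANTLRInputStream(resource);

    resource.close();

    // iLexer and iParser are the classes ANTLR generates from grammar i.
    iLexer lexer = new iLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lexer);

    iParser parser = new iParser(tokens);
    parser.selects(); // start at the test rule that accepts one or more selects
  }
}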

itest.txt (test input file)

SELECT * FROM Module2 I Name;
SELECT * FROM Module2 I Name WHERE foobar = true;
SELECT * FROM Module2 I Name WHERE dingdong != true;

Test output

>>select
    Finished table.
    Finished select.
<<select
>>select
    Finished table.
    Condition test on foobar
    Finished filters.
    Finished select.
<<select
>>select
    Finished table.
    Condition test on dingdong
    Finished filters.
    Finished select.
<<select