I am working on an ANTLR grammar to parse polynomial expressions in multiple variables. So far I have written the following grammar:

grammar Function;

parseFunction returns [java.util.List<java.util.List<Object>> list]  :   { list = new java.util.ArrayList(); } (p=polypart { list.add($p.list); })+
;

polypart returns [java.util.List<Object> list]: 
  m=NUMBER { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add("0"); }
| s=SIGN m=NUMBER {list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); }
| v=VARIABLE {list = new java.util.ArrayList(); list.add("+"); list.add($v.text); }
| s=SIGN v=VARIABLE {list = new java.util.ArrayList(); list.add($s.text); list.add($v.text); }
| v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add($v.text); list.add($e.value); } 
| s=SIGN v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add($v.text); list.add($e.value); }
| m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); $list.add($v.text); }
| s=SIGN m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); $list.add($v.text); }
| m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); list.add($e.value); }
| s=SIGN m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); list.add($e.value); }
;

exponent returns [int value]: ('^' n=NUMBER) { $value = 1; if ( $n != null && $n.text.length() > 0) $value = Integer.parseInt($n.text); }
;

VARIABLE    : ('a'..'z'|'A'..'Z')+
;

NUMBER  : ('0'..'9')+
;

SIGN    :   ('+'|'-')
;

WS  :   (' '|'\t')+ {skip();} ;

Apparently, this does not work. Compiling it with ANTLR 3.4 produces the following warnings:

"SIGN NUMBER VARIABLE" uses alternatives 2,8
"SIGN NUMBER VARIABLE '^' NUMBER" uses alternatives 2,10
"NUMBER VARIABLE" uses alternatives 1,7
"NUMBER VARIABLE '^' NUMBER" uses alternatives 1,9

I could live with these warnings (although I would very much like to know why they appear), but the real problem is the following error:

error(201): Function.g:6:47: The following alternatives can never be matched: 7,8,9,10

This happens because those alternatives were disabled as a result of the warnings, so I guess I have to resolve the warnings first.


EDIT:

After thinking about the problem for a while, I reordered some of the alternatives, and now I at least no longer get any errors. However, I have not tested it yet, and I would love to get rid of the last two warnings, too. The new code is:

grammar Function;

parseFunction returns [java.util.List<java.util.List<Object>> list]  :   { list = new java.util.ArrayList(); } (p=polypart { list.add($p.list); })+
;

polypart returns [java.util.List<Object> list]: 
s=SIGN m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); list.add($e.value); }
| m=NUMBER v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); list.add($e.value); }
| s=SIGN m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); $list.add($v.text); }
| m=NUMBER v=VARIABLE { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); $list.add($v.text); }
| s=SIGN v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add($s.text); list.add($v.text); list.add($e.value); }
| v=VARIABLE e=exponent { list = new java.util.ArrayList(); list.add("+"); list.add($v.text); list.add($e.value); } 
| s=SIGN v=VARIABLE {list = new java.util.ArrayList(); list.add($s.text); list.add(1); list.add($v.text); }
| v=VARIABLE {list = new java.util.ArrayList(); list.add("+"); list.add(1); list.add($v.text); }
| s=SIGN m=NUMBER {list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); }
| m=NUMBER { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); }
;

exponent returns [int value]: ('^' n=INTEGER) { $value = 1; if ( $n != null && $n.text.length() > 0) $value = Integer.parseInt($n.text); }
;

VARIABLE    : ('a'..'z'|'A'..'Z')+
;

INTEGER : ('0'..'9')+
;

NUMBER  : ('0'..'9')+(','('0'..'9')+)?
;

SIGN    :   ('+'|'-')
;

WS  :    (' ' | '\t' | '\r'| '\n')+ {skip();} 
;

And now I get the following warnings:

"NUMBER VARIABLE" uses multiple alternatives: 4,10 (10 is disabled).
"SIGN NUMBER VARIABLE" uses multiple alternatives: 3,9 (9 is disabled).

I would be grateful if anybody could explain to me how to get rid of these last two warnings.


After testing the parser, I can say that it accepts:

X; +X; -X; X^42; +X^42; -X^42

And it does not accept:

42; +42; -42; 42X^42; +42X^42; -42X^42

1 Answer

If a NUMBER and a VARIABLE are each accepted on their own without a leading sign, and an input of the form NUMBER VARIABLE is also accepted as a single term, then two interpretations of that input are possible:

NUMBER VARIABLE -> polypart polypart -> NUMBER polypart -> NUMBER VARIABLE

or

NUMBER VARIABLE -> polypart -> NUMBER VARIABLE

Hence the cases with a leading SIGN have to be factored out into their own rule, so that a term without a sign can only appear at the very beginning of the input. Then it works!
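
The two interpretations can be checked mechanically. The following is only a rough sketch, not ANTLR code: it counts in how many ways a sequence of token types can be split into terms, once for the ten polypart alternatives of the question and once for a begin/part split like the one in the grammar below. The class and method names and the token name EXP (standing in for the '^' literal) are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

final class AmbiguityDemo {
    // Token-type sequences of the ten polypart alternatives from the question.
    // "EXP" is an invented name for the '^' literal.
    static final List<List<String>> POLYPART = Arrays.asList(
        Arrays.asList("NUMBER"),
        Arrays.asList("SIGN", "NUMBER"),
        Arrays.asList("VARIABLE"),
        Arrays.asList("SIGN", "VARIABLE"),
        Arrays.asList("VARIABLE", "EXP", "NUMBER"),
        Arrays.asList("SIGN", "VARIABLE", "EXP", "NUMBER"),
        Arrays.asList("NUMBER", "VARIABLE"),
        Arrays.asList("SIGN", "NUMBER", "VARIABLE"),
        Arrays.asList("NUMBER", "VARIABLE", "EXP", "NUMBER"),
        Arrays.asList("SIGN", "NUMBER", "VARIABLE", "EXP", "NUMBER"));

    // Unsigned terms, allowed only at the start (functionBegin in the answer).
    static final List<List<String>> BEGIN = Arrays.asList(
        Arrays.asList("NUMBER", "VARIABLE", "EXP", "INTEGER"),
        Arrays.asList("NUMBER", "VARIABLE"),
        Arrays.asList("VARIABLE", "EXP", "INTEGER"),
        Arrays.asList("VARIABLE"),
        Arrays.asList("NUMBER"));

    // Signed terms (functionPart in the answer).
    static final List<List<String>> PART = Arrays.asList(
        Arrays.asList("SIGN", "NUMBER", "VARIABLE", "EXP", "INTEGER"),
        Arrays.asList("SIGN", "NUMBER", "VARIABLE"),
        Arrays.asList("SIGN", "VARIABLE", "EXP", "INTEGER"),
        Arrays.asList("SIGN", "VARIABLE"),
        Arrays.asList("SIGN", "NUMBER"));

    // Counts the ways tokens[from..] can be split into a sequence of terms from alts.
    static int countSplits(List<String> tokens, int from, List<List<String>> alts) {
        if (from == tokens.size()) {
            return 1; // everything consumed: one complete split found
        }
        int total = 0;
        for (List<String> alt : alts) {
            int end = from + alt.size();
            if (end <= tokens.size() && tokens.subList(from, end).equals(alt)) {
                total += countSplits(tokens, end, alts);
            }
        }
        return total;
    }

    // Shape of the original rule: polypart+
    static int countOriginal(List<String> tokens) {
        return tokens.isEmpty() ? 0 : countSplits(tokens, 0, POLYPART);
    }

    // Shape of the fixed rule: functionPart+ | functionBegin functionPart*
    static int countFixed(List<String> tokens) {
        if (tokens.isEmpty()) {
            return 0;
        }
        int total = countSplits(tokens, 0, PART);
        for (List<String> alt : BEGIN) {
            int end = alt.size();
            if (end <= tokens.size() && tokens.subList(0, end).equals(alt)) {
                total += countSplits(tokens, end, PART);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("NUMBER", "VARIABLE");
        System.out.println(countOriginal(input)); // prints 2
        System.out.println(countFixed(input));    // prints 1
    }
}
```

An input with more than one split is exactly the kind of input the warnings complain about. Once unsigned terms are restricted to the first position, every input in this model has at most one split.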

The following code compiles without warnings or errors:

grammar Function;

parseFunction returns [java.util.List<java.util.List<Object>> list] :   
    { list = new java.util.ArrayList(); }                                              ( f=functionPart { list.add($f.list); } )+
|   { list = new java.util.ArrayList(); } ( fb=functionBegin ) { list.add($fb.list); } ( f=functionPart { list.add($f.list); } )*
;

functionBegin returns [java.util.List<Object> list]:
m=NUMBER v=VARIABLE e=exponent  { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); list.add($e.value); }
| m=NUMBER v=VARIABLE           { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); list.add($v.text); }
| v=VARIABLE e=exponent         { list = new java.util.ArrayList(); list.add("+"); list.add("1");     list.add($v.text); list.add($e.value); }  
| v=VARIABLE                    { list = new java.util.ArrayList(); list.add("+"); list.add("1");     list.add($v.text); }
| m=NUMBER                      { list = new java.util.ArrayList(); list.add("+"); list.add($m.text); }
;

functionPart returns [java.util.List<Object> list] :    
s=SIGN m=NUMBER v=VARIABLE e=exponent   { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); list.add($e.value); }
| s=SIGN m=NUMBER v=VARIABLE            { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); list.add($v.text); }
| s=SIGN v=VARIABLE e=exponent          { list = new java.util.ArrayList(); list.add($s.text); list.add("1");     list.add($v.text); list.add($e.value); }
| s=SIGN v=VARIABLE                     { list = new java.util.ArrayList(); list.add($s.text); list.add("1");     list.add($v.text); }
| s=SIGN m=NUMBER                       { list = new java.util.ArrayList(); list.add($s.text); list.add($m.text); }
;

exponent returns [int value]: ('^' n=INTEGER) { $value = 1; if ( $n != null && $n.text.length() > 0) $value = Integer.parseInt($n.text); }
;

VARIABLE    : ('a'..'z'|'A'..'Z')+
;

INTEGER : ('0'..'9')+
;

NUMBER  : ('0'..'9')+ (','('0'..'9')+)?
;

SIGN    :   ('+'|'-')
;

WS  :    (' ' | '\t' | '\r'| '\n')+ {skip();} 
;

When compiled and used from Java, this grammar accepts most inputs, but apparently not all valid ones. As soon as a number without a comma appears, as in +42, an error is reported:

line 1:1 no viable alternative at input '+'

However, this is a separate problem from the original question and will be clarified here.
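
One plausible cause, which the post does not confirm, is that INTEGER and NUMBER both match a plain digit string such as 42, and that the lexer then hands out the token type whose rule is listed first. The sketch below models that idea in plain Java with regular expressions. It is a simplification and not how ANTLR 3 builds its lexer. The class name, the method name and the token name EXP (for the '^' literal) are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LexerOverlapDemo {
    // Lexer rules in the order they appear in the grammar; insertion order is kept.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("EXP", Pattern.compile("\\^"));                 // stand-in for the '^' literal
        RULES.put("VARIABLE", Pattern.compile("[a-zA-Z]+"));
        RULES.put("INTEGER", Pattern.compile("[0-9]+"));
        RULES.put("NUMBER", Pattern.compile("[0-9]+(,[0-9]+)?"));
        RULES.put("SIGN", Pattern.compile("[+-]"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    // Assumed model: the longest match wins, and on a tie the rule listed first wins.
    static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestType = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strict '>' keeps the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestType = rule.getKey();
                    bestLen = m.end() - pos;
                }
            }
            if (bestType == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            if (!bestType.equals("WS")) {
                types.add(bestType); // whitespace is skipped, as in the grammar
            }
            pos += bestLen;
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(tokenTypes("+42"));   // prints [SIGN, INTEGER]
        System.out.println(tokenTypes("+42,5")); // prints [SIGN, NUMBER]
    }
}
```

Under this model, +42 reaches the parser as SIGN INTEGER, while every alternative of functionPart expects NUMBER after the SIGN, which would explain the "no viable alternative" message. If that is indeed the cause, one possible remedy would be to let the parser accept either token type wherever a coefficient is expected, or to keep only one of the two lexer rules.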