I'm having some trouble with a semantic predicate on an ANTLR parser rule. Here's my grammar, intended to recognize a couple of different date formats:
grammar sample ;
options { language=Python3; }
@parser::header {
from datetime import datetime
}
month_number returns [val] : INTEGER { 1 <= int($INTEGER.text) <= 12 }? {$val = int($INTEGER.text)} ;
day_number returns [val] : INTEGER { 1 <= int($INTEGER.text) <= 31 }? {$val = int($INTEGER.text)} ;
year_4digit returns [val] : INTEGER { 1900 <= int($INTEGER.text) <= 2100 }? {$val = int($INTEGER.text)} ;
year_2digit returns [val] : '\''? INTEGER {(int($INTEGER.text) >= 65 or int($INTEGER.text) < 40)}?
{$val = (1900 + int($INTEGER.text)) if (int($INTEGER.text) >= 65) else (2000 + int($INTEGER.text))} ;
year_digits returns [val]
: year_4digit {$val = $year_4digit.val}
| year_2digit {$val = $year_2digit.val}
;
mdy returns [val]
: month_number '-' day_number '-' year_digits {$val = datetime($year_digits.val, $month_number.val, $day_number.val)}
| month_number '/' day_number '/' year_digits {$val = datetime($year_digits.val, $month_number.val, $day_number.val)}
;
ymd returns [val]
: year_4digit '-' month_number '-' day_number {$val = datetime($year_4digit.val, $month_number.val, $day_number.val)}
| year_4digit '/' month_number '/' day_number {$val = datetime($year_4digit.val, $month_number.val, $day_number.val)}
;
date_as_numbers returns [val]
: ymd {$val = $ymd.val}
| mdy {$val = $mdy.val}
;
INTEGER: '0'..'9'+ ;
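For reference, here is a minimal plain-Python sketch of the checks those predicates and actions are meant to encode, so they can be exercised outside the parser. The helper names (`in_range`, `expand_two_digit_year`) are hypothetical and do not appear in the grammar; the ranges and the 65/40 pivot are copied from the rules above.

```python
def in_range(text: str, low: int, high: int) -> bool:
    """Same test as the month/day/year_4digit predicates: low <= int(text) <= high."""
    return low <= int(text) <= high


def expand_two_digit_year(text: str) -> int:
    """Mirror the year_2digit rule: values >= 65 map to 19xx, values < 40 to 20xx.

    Raises ValueError where the grammar's predicate would fail (40..64).
    """
    value = int(text)
    if not (value >= 65 or value < 40):
        raise ValueError(f"two-digit year out of range: {text!r}")
    return 1900 + value if value >= 65 else 2000 + value


# The leading "2" of "2/12/2017" is a valid month but not a valid 4-digit year.
print(in_range("2", 1, 12))         # month_number check
print(in_range("2", 1900, 2100))    # year_4digit check
print(expand_two_digit_year("17"))  # year_2digit expansion
```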
I test that with the following program:
from myPackage.sampleParser import sampleParser
from myPackage.sampleLexer import sampleLexer
from antlr4 import CommonTokenStream
from antlr4 import InputStream
date_input = InputStream("2/12/2017".lower())
lexer = sampleLexer(date_input)
stream = CommonTokenStream(lexer)
parser = sampleParser(stream)
result = parser.date_as_numbers()
print(result.val)
This results in the following error:
line 1:1 rule year_4digit failed predicate: { 1900 <= int($INTEGER.text) <= 2100 }?
line 1:9 rule day_number failed predicate: { 1 <= int($INTEGER.text) <= 31 }?
Traceback (most recent call last):
  File "/Users/kwilliams/Library/Preferences/IntelliJIdea2017.3/scratches/scratch_1.py", line 11, in <module>
    result = parser.date_as_numbers()
  File "/Users/kwilliams/git/myPackage/sampleParser.py", line 482, in date_as_numbers
    localctx._ymd = self.ymd()
  File "/Users/kwilliams/git/myPackage/sampleParser.py", line 436, in ymd
    localctx.val = datetime(localctx._year_4digit.val, localctx._month_number.val, localctx._day_number.val)
TypeError: an integer is required (got type NoneType)
So what I believe is happening is that the predicate in year_4digit throws an exception because the number 2 isn't in its range, but it returns a year_4digit match anyway, which hasn't had its val attribute populated, causing a downstream error about NoneType. Is that correct?
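To make that hypothesis concrete, here is a plain-Python model of the suspected behavior. It is not the generated parser code: `FailedPredicate`, `Context`, and this `year_4digit` function are stand-ins I made up, and the sketch assumes the rule method catches the predicate failure, reports it, and still returns its context.

```python
class FailedPredicate(Exception):
    """Stand-in for the parser's predicate-failure exception."""


class Context:
    """Stand-in for a rule context; the return value starts out unset."""

    def __init__(self):
        self.val = None
        self.exception = None


def year_4digit(text, errors):
    ctx = Context()
    try:
        if not (1900 <= int(text) <= 2100):
            raise FailedPredicate("rule year_4digit failed predicate")
        ctx.val = int(text)  # the action only runs if the predicate passed
    except FailedPredicate as exc:
        ctx.exception = exc
        errors.append(str(exc))  # reported, then the caller carries on
    return ctx  # returned either way, possibly with val still None


errors = []
ctx = year_4digit("2", errors)
print(ctx.val, errors)
```

If the real parser behaves like this model, the caller receives a context whose val was never assigned, which would match the TypeError in the traceback.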
If so - what's a good solution? Do I need to put the semantic predicates earlier in the rules or something? How would I do a lookahead to the INTEGER token if that's the right solution?
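To illustrate what I mean by a lookahead, here is a plain-Python sketch that tests the upcoming token before it is consumed. `Token`, `TokenWindow`, and `looks_like_year_4digit` are hypothetical stand-ins, not ANTLR classes. My assumption is that in the grammar the equivalent check would sit before the INTEGER reference and read the next token through something like `self._input.LT(1)`, but I have not confirmed that this is the right approach.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    text: str


class TokenWindow:
    """Minimal stand-in for a token stream; LT(1) is the next unconsumed token."""

    def __init__(self, texts: List[str]):
        self.tokens = [Token(t) for t in texts]
        self.pos = 0

    def LT(self, k: int) -> Optional[Token]:
        index = self.pos + k - 1
        if 0 <= index < len(self.tokens):
            return self.tokens[index]
        return None


def looks_like_year_4digit(stream: TokenWindow) -> bool:
    """Check the upcoming token against the year_4digit range without consuming it."""
    token = stream.LT(1)
    return (
        token is not None
        and token.text.isdecimal()
        and 1900 <= int(token.text) <= 2100
    )


mdy_tokens = TokenWindow(["2", "/", "12", "/", "2017"])
ymd_tokens = TokenWindow(["2017", "/", "2", "/", "12"])
print(looks_like_year_4digit(mdy_tokens), looks_like_year_4digit(ymd_tokens))
```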
(Also - I expected to be able to do $INTEGER.int instead of int($INTEGER.text), but maybe that's not available in the Python target? Tangential and minor issue.)
BTW, the above grammar is a smallish excerpt from my real grammar. I'm hoping there's a solution that doesn't require major changes to this part, since those could cause ripple effects that might take a while to sort out.
Thanks.
ymd directly instead of the date_as_numbers rule. – Ken Williams