Can an element contain attributes as parsed by a parser generated by ANTLR? If so, how?

Question

I am following this tutorial and have successfully replicated its behavior, except that I am using ANTLR 4.7 instead of the 4.5 that the tutorial used.

I am trying to build a DSL for an expense tracker.

I was wondering whether each element can have attributes.

For example, this is what it looks like now.

This is the code for todo.g4, as seen in https://github.com/simkimsia/learn-antlr-web-js/blob/master/todo.g4

grammar todo;

elements
    : (element|emptyLine)* EOF
    ;

element
    : '*' ( ' ' | '\t' )* CONTENT NL+
    ;

emptyLine
    : NL
    ;

NL
    : '\r' | '\n' 
    ;

CONTENT
    : [a-zA-Z0-9_][a-zA-Z0-9_ \t]*
    ;

That is, each element will also have two attributes, such as amount and payee. To keep it simple, I will use the same sentence structure throughout so that parsing is easier.

The format will be pay [payee] [amount].

For example: pay Acme Corp 123,789.45

Here the payee is Acme Corp and the amount is 12378945, an integer expressing the amount in cents.

Another example: pay Banana Inc 700

Here the payee is Banana Inc and the amount is 70000, again an integer expressing the amount in cents.
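To make the target output concrete, here is a minimal sketch in plain Java of the mapping described above, from one line to a payee and an integer amount in cents. It does not use ANTLR: the class name ExpenseLineSketch, its method names, and the regular expression are all assumptions made for illustration, standing in for whatever the generated parser would eventually provide.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR or the tutorial: splits one
// "pay [payee] [amount]" line and converts the amount to integer cents.
class ExpenseLineSketch {
    // "pay", then the payee (matched lazily), then an amount made of digits,
    // optional commas, and an optional fractional part.
    private static final Pattern LINE =
        Pattern.compile("^pay\\s+(.+?)\\s+([0-9][0-9,]*(?:\\.[0-9]+)?)$");

    static String payee(String line) {
        return match(line).group(1);
    }

    static long amountInCents(String line) {
        // Drop the thousands separators, then shift the decimal point two places.
        BigDecimal amount = new BigDecimal(match(line).group(2).replace(",", ""));
        // longValueExact throws ArithmeticException if more than two decimals were given.
        return amount.movePointRight(2).longValueExact();
    }

    private static Matcher match(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a payment line: " + line);
        }
        return m;
    }

    public static void main(String[] args) {
        String line = "pay Acme Corp 123,789.45";
        // prints "Acme Corp -> 12378945"
        System.out.println(payee(line) + " -> " + amountInCents(line));
    }
}
```

BigDecimal is used rather than double so that an amount such as 123,789.45 converts to cents without rounding error.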

I am guessing I need to change todo.g4 and then regenerate the parser.

Can an element have other attributes? If so, how do I get started?

UPDATE

These are my attempts, most recent first:

I just figured out how to use grun and TestRig. Thanks @Raven for that tip.

Latest attempt: my latest expense.g4 (the only difference from the earlier attempt is the regex for the payment amount, i.e. the NUMBER rule)

grammar expense;

payments: (payment NL)* ;  
payment: PAY receiver amount=NUMBER ;  
receiver: surname=ID (lastname=ID)? ;  

PAY: 'pay' ;
NUMBER: ([0-9]+(','[0-9]+)*)('.'[0-9]*)?;
ID: [a-zA-Z0-9_]+ ;
NL: '\n' | '\r\n' ;  
WS: [\t ]+ -> skip ;

Earlier attempt: This is my expense.g4

grammar expense;

payments: (payment NL)* ;  
payment: PAY receiver amount=NUMBER ;  
receiver: surname=ID (lastname=ID)? ;  

PAY: 'pay' ;
NUMBER: [0-9]+ (',' [0-9]+)+ ('.' [0-9]+)? ;  
ID: [a-zA-Z0-9_]+ ;
NL: '\n' | '\r\n' ;  
WS: [\t ]+ -> skip ;
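To see what the change to NUMBER makes possible, the two rules can be approximated as Java regular expressions and tried against a few amounts. This is only a sketch: the class name NumberRuleCheck and the hand-translated patterns are assumptions made for illustration, and whole-string matching does not reproduce how the ANTLR lexer actually tokenizes input.

```java
import java.util.regex.Pattern;

// Hand-translated approximations of the two NUMBER lexer rules as Java regexes.
// Only whole-string matches are checked; this is not the ANTLR lexer.
class NumberRuleCheck {
    // Earlier rule: [0-9]+ (',' [0-9]+)+ ('.' [0-9]+)?
    static final Pattern EARLIER = Pattern.compile("[0-9]+(,[0-9]+)+(\\.[0-9]+)?");
    // Latest rule: ([0-9]+(','[0-9]+)*)('.'[0-9]*)?
    static final Pattern LATEST = Pattern.compile("([0-9]+(,[0-9]+)*)(\\.[0-9]*)?");

    static boolean earlierMatches(String s) {
        return EARLIER.matcher(s).matches();
    }

    static boolean latestMatches(String s) {
        return LATEST.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123,789.45", "700", "456789.00", "98."}) {
            System.out.println(s + "  earlier=" + earlierMatches(s)
                + "  latest=" + latestMatches(s));
        }
    }
}
```

Under this approximation the earlier rule rejects 700 and 456789.00 because it requires at least one comma group, while the latest rule accepts both. The latest rule also accepts a trailing dot with no digits, such as 98., which may or may not be intended. In the real lexer a run of digits such as 700 also matches ID; ANTLR resolves matches of equal length in favor of the rule defined first, which is NUMBER in this grammar.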

Earlier attempt: https://github.com/simkimsia/learn-antlr-web-js/commit/728813ac275a3f2ad16d7f51ce15fcc27d40045b#commitcomment-25127606

Earlier attempt: https://github.com/simkimsia/learn-antlr-web-js/commit/0c32aec6ffb4b4275db86d54e9788058a2ce8759#commitcomment-25125695

I don't understand all the code, but have an eagle eye for typos. Line 56 : var tokens = new new antlr4.CommonTokenStream(expenseLexer); --> two times new, probable cause of error. — BernardK
@BernardK Thanks. I have removed the extra new and used the payments function as suggested by Raven. I still see empty array when I tried to console.log — Kim Stacks

BernardK · Accepted Answer · 2017-10-24T17:19:38

Situation as of October 24, 2017, 19:00 UTC+1.

Your grammar works perfectly. I ran a full test in Java.

File Expense.g4 :

grammar Expense;

payments
@init {System.out.println("Expense last update 1853");}
    : (payment NL)*
    ;

payment
    : PAY receiver amount=NUMBER
      {System.out.println("Payement found " + $amount.text + " to " + $receiver.text);}
    ;

receiver
    : surname=ID (lastname=ID)?
    ; 

PAY    : 'pay' ;
NUMBER : ([0-9]+(','[0-9]+)*)('.'[0-9]*)? ;
ID     : [a-zA-Z0-9_]+ ;
NL     : '\n' | '\r\n' ;  
WS     : [\t ]+ -> channel(HIDDEN) ; // keep the spaces on the hidden channel (without spaces ==> paydeltaco98)

File ExpenseMyListener.java :

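// Listener that prints the payments while the parse tree is walked.
public class ExpenseMyListener extends ExpenseBaseListener {
    ExpenseParser parser;
    public ExpenseMyListener(ExpenseParser parser) { this.parser = parser; }

    // Called when the walker leaves the payments rule, after every payment has been visited.
    @Override
    public void exitPayments(ExpenseParser.PaymentsContext ctx) {
        System.out.println(">>> in ExpenseMyListener for paymentsss");
        System.out.println(">>> there are " + ctx.payment().size() + " elements in the list of payments");
        for (int i = 0; i < ctx.payment().size(); i++) {
            // getText() concatenates only the tokens in the tree, so the hidden-channel spaces are dropped.
            System.out.println(ctx.payment(i).getText());
        }
    }

    // Called each time the walker leaves a payment rule.
    @Override
    public void exitPayment(ExpenseParser.PaymentContext ctx) {
        System.out.println(">>> in ExpenseMyListener for payment");
        // Reading from the token stream includes hidden-channel tokens, so the spaces are kept.
        System.out.println(parser.getTokenStream().getText(ctx));
    }
}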

File test_expense.java :

import org.antlr.v4.runtime.ANTLRFileStream;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.*;
import java.io.IOException;

public class test_expense {
    public static void main(String[] args) throws IOException {
        // Read the input file named on the command line.
        // ANTLRFileStream matches the 4.6 jar used here; ANTLR 4.7 deprecates it in favor of CharStreams.fromFileName.
        ANTLRInputStream input = new ANTLRFileStream(args[0]);
        ExpenseLexer lexer = new ExpenseLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ExpenseParser parser = new ExpenseParser(tokens);
        // Parse starting from the top-level rule; the grammar actions print as parsing proceeds.
        ParseTree tree = parser.payments();
        System.out.println("---parsing ended");
        // Walk the finished tree so the listener callbacks fire.
        ParseTreeWalker walker = new ParseTreeWalker();
        ExpenseMyListener my_listener = new ExpenseMyListener(parser);
        System.out.println(">>>> about to walk");
        walker.walk(my_listener, tree);
    }
}

Input file top.text :

pay Acme Corp 123,456
pay Banana Inc 456789.00
pay charlie pte 123,456.89
pay delta co 98

Execution :

$ export CLASSPATH=".:/usr/local/lib/antlr-4.6-complete.jar"
$ alias
alias a4='java -jar /usr/local/lib/antlr-4.6-complete.jar'
alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Expense.g4 
$ javac Ex*.java
$ javac test_expense.java 
$ grun Expense payments -tokens -diagnostics top.text
[@0,0:2='pay',<'pay'>,1:0]
[@1,3:3=' ',<WS>,channel=1,1:3]
[@2,4:7='Acme',<ID>,1:4]
[@3,8:8=' ',<WS>,channel=1,1:8]
[@4,9:12='Corp',<ID>,1:9]
...
[@32,90:89='<EOF>',<EOF>,5:0]
Expense last update 1853
Payement found 123,456 to Acme Corp
Payement found 456789.00 to Banana Inc
Payement found 123,456.89 to charlie pte
Payement found 98 to delta co

$ java test_expense top.text 
Expense last update 1853
Payement found 123,456 to Acme Corp
Payement found 456789.00 to Banana Inc
Payement found 123,456.89 to charlie pte
Payement found 98 to delta co
---parsing ended
>>>> about to walk
>>> in ExpenseMyListener for payment
pay Acme Corp 123,456
>>> in ExpenseMyListener for payment
pay Banana Inc 456789.00
>>> in ExpenseMyListener for payment
pay charlie pte 123,456.89
>>> in ExpenseMyListener for payment
pay delta co 98
>>> in ExpenseMyListener for paymentsss
>>> there are 4 elements in the list of payments
payAcmeCorp123,456
payBananaInc456789.00
paycharliepte123,456.89
paydeltaco98
