1
votes

I'm missing some basic knowledge. I started playing around with ANTLR today and could not find any source that explains how to do the following:

I'd like to parse a configuration file that a program of mine currently reads in a very ugly way. Basically, it looks like this:

A [Data] [Data]
B [Data] [Data] [Data]

where A/B/... are objects, each followed by its associated data (a variable number of items, simple digits only). Writing a grammar should not be that hard, but how do I use ANTLR from there?

  • lexer only: A/B are tokens and I ask for the tokens it read. How do I ask for them, and how do I detect malformed input?
  • lexer & parser: A/B are parser rules and... how do I know that the parser successfully processed A/B? The same object could appear multiple times in the file, and I need to consider every single occurrence. It's more like listing instances in the config file.

Edit: My problem is not the grammar but how the parser/lexer can tell me what they actually found/parsed. Ideally, a function would be invoked whenever a rule is recognized, as in a recursive descent parser.

3 Answers

2
votes

ANTLR production rules can have return value(s) you can use to get the contents of your configuration file.

Here's a quick demo:

grammar T;

parse returns [java.util.Map<String, List<Integer>> map]
@init{$map = new java.util.HashMap<String, List<Integer>>();}
 : (line {$map.put($line.key, $line.values);} )+ EOF
 ;

line returns [String key, List<Integer> values]
 : Id numbers (NL | EOF)
   {
     $key = $Id.text;
     $values = $numbers.list;
   }
 ;

numbers returns [List<Integer> list]
@init{$list = new ArrayList<Integer>();}
 : (Num {$list.add(Integer.parseInt($Num.text));} )+
 ;

Num   : '0'..'9'+;
Id    : ('a'..'z' | 'A'..'Z')+;
NL    : '\r'? '\n' | '\r';
Space : (' ' | '\t')+ {skip();};

If you run the class below:

import org.antlr.runtime.*;
import java.util.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String input = "A 12 34\n" +
                   "B 5 6 7 8\n" +
                   "C 9";
    TLexer lexer = new TLexer(new ANTLRStringStream(input));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    Map<String, List<Integer>> values = parser.parse();
    System.out.println(values);
  }
}

the following will be printed to the console:

{A=[12, 34], B=[5, 6, 7, 8], C=[9]}
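
One caveat, since the question says the same object can appear several times in the file: a plain Map.put keeps only the last line seen for each key. A minimal sketch of one way around that, in plain Java without ANTLR, is to store one list of values per occurrence. The class name MultiMapSketch and the add helper are hypothetical names used only for illustration; the action in the parse rule would call something like add instead of $map.put(...).

```java
import java.util.*;

public class MultiMapSketch {
    // Hypothetical helper: append one line's values under its key
    // instead of overwriting what an earlier line stored there.
    static void add(Map<String, List<List<Integer>>> map, String key, List<Integer> values) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(values);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps keys in the order they first appeared.
        Map<String, List<List<Integer>>> map = new LinkedHashMap<>();
        add(map, "A", Arrays.asList(12, 34));
        add(map, "B", Arrays.asList(5, 6, 7, 8));
        add(map, "A", Arrays.asList(9));
        System.out.println(map); // prints {A=[[12, 34], [9]], B=[[5, 6, 7, 8]]}
    }
}
```

If the order of lines across different keys matters as well, a flat list of (key, values) entries may be a better fit than a map.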
1
votes

The grammar should be something like this (it's pseudocode, not ANTLR):

FILE ::= STATEMENT ('\n' STATEMENT)*
STATEMENT ::= NAME ITEM*
ITEM ::= '[' \d+ ']'
NAME ::= \w+
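
For a format this small, the pseudocode above can also be checked with java.util.regex alone, which gives a simple way to detect malformed lines. The following is only a sketch under assumptions: the brackets are taken to be literal characters, tokens may be separated by whitespace, and the class and method names (StatementSketch, parseStatement) are made up for illustration.

```java
import java.util.*;
import java.util.regex.*;

public class StatementSketch {
    // STATEMENT ::= NAME ITEM*  (optional whitespace between tokens is an assumption)
    private static final Pattern STATEMENT =
            Pattern.compile("\\s*(\\w+)((?:\\s*\\[\\d+\\])*)\\s*");
    // ITEM ::= '[' \d+ ']'
    private static final Pattern ITEM = Pattern.compile("\\[(\\d+)\\]");

    /** Parses one line into (name, numbers); throws IllegalArgumentException if malformed. */
    static Map.Entry<String, List<Integer>> parseStatement(String line) {
        Matcher m = STATEMENT.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed line: " + line);
        }
        List<Integer> items = new ArrayList<>();
        Matcher item = ITEM.matcher(m.group(2));
        while (item.find()) {
            // Integer.parseInt throws NumberFormatException for values beyond int range.
            items.add(Integer.parseInt(item.group(1)));
        }
        return new AbstractMap.SimpleEntry<>(m.group(1), items);
    }

    public static void main(String[] args) {
        System.out.println(parseStatement("B [5] [6] [7]")); // prints B=[5, 6, 7]
    }
}
```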
1
votes

If you are looking for a way to execute code when something is parsed, you should use either actions or an AST (look them up in the documentation).
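
To illustrate the idea behind actions without depending on ANTLR: an action is just code that runs at the moment a rule has matched. The hand-written sketch below mimics that with a callback that is invoked once per recognized line, and it also collects the numbers of lines that did not match. It assumes the "name followed by one or more numbers" format; ActionSketch, parse and onLine are hypothetical names, not part of any ANTLR API. In an ANTLR grammar the same code would sit in a { ... } block inside the rule.

```java
import java.util.*;
import java.util.function.BiConsumer;

public class ActionSketch {
    // Recognizes lines of the form "Id Num+" and calls onLine once per recognized line.
    // Returns the 1-based numbers of the lines that did not match.
    static List<Integer> parse(String input, BiConsumer<String, List<Integer>> onLine) {
        List<Integer> badLines = new ArrayList<>();
        String[] lines = input.split("\\r?\\n|\\r");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) continue; // blank lines are skipped (an assumption)
            String[] tokens = line.split("[ \\t]+");
            List<Integer> values = new ArrayList<>();
            // A valid line needs a letters-only name and at least one number.
            boolean ok = tokens[0].matches("[A-Za-z]+") && tokens.length > 1;
            for (int t = 1; ok && t < tokens.length; t++) {
                if (tokens[t].matches("[0-9]+")) {
                    // Values beyond int range would throw NumberFormatException here.
                    values.add(Integer.parseInt(tokens[t]));
                } else {
                    ok = false;
                }
            }
            if (ok) onLine.accept(tokens[0], values); // the "action"
            else badLines.add(i + 1);
        }
        return badLines;
    }

    public static void main(String[] args) {
        List<Integer> bad = parse("A 12 34\nB 5 x\nA 9",
                (key, values) -> System.out.println(key + " -> " + values));
        System.out.println("malformed lines: " + bad);
        // prints:
        // A -> [12, 34]
        // A -> [9]
        // malformed lines: [2]
    }
}
```

Because the callback fires for every line, repeated objects such as the two A lines are each reported separately.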