I am trying to write a config file grammar and get ANTLR4 to handle it. I am quite new to ANTLR (this is my first project with it).

I think I understand what needs to be done for most of the config file grammar, but the files I will be reading contain arbitrary C code inside curly braces. Something like:

@DEVICE: servo "servos are great"
@ACTION: turnRight "turning right is fun"
{
arbitrary C source code goes here;
some more arbitrary C source code;
}
@ACTION: secondAction "this is another action"
{
some more code;
}

A file can contain many of those. I can't get ANTLR to understand that I want to ignore the source code without skipping it. Here is my grammar so far:

/**
ANTLR4 grammar for practicing
*/
grammar practice;


file:       (devconfig)*
    ;

devconfig:  devid (action)+
    ;

devid:      DEV_HDR (COMMENT)?
    ;

action:     ACTN_HDR '{' C_BLOCK '}'
    ;



DEV_HDR:    '@DEVICE: ' ALPHA+(IDCHAR)*
    ;

fragment
ALPHA:      [a-zA-Z]
    ;

fragment
IDCHAR:     ALPHA
    |       [0-9]
    |       '_'
    ;

COMMENT:    '"' .*? '"'
    ;

ACTN_HDR:   '@ACTION: ' ACTION_ID
    ;
fragment
ACTION_ID:  ALPHA+(IDCHAR)*
    ;

C_BLOCK:    WHAT DO I PUT HERE?? -> channel(HIDDEN)
    ;

WS:     [ \t\n\r]+ -> skip
    ;

The problem is that whatever I put in the C_BLOCK lexer rule seems to break the whole grammar. For example, if I put .*? -> channel(HIDDEN), it doesn't work at all (and ANTLR reports an error on the grammar along the lines of ".*? can match the empty string"). What should I put there instead, so that the C code is ignored but I can still access it later (i.e., it is not skipped)?

1 Answer

Your C_BLOCK rule can be defined just like the usual multi-line comment rule found in many language grammars. Make the curly braces part of the rule too:

C_BLOCK: '{' .*? '}' -> channel(HIDDEN);
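
One consequence is worth spelling out: once the braces are part of C_BLOCK and the token goes to the HIDDEN channel, the parser no longer sees '{', C_BLOCK or '}', so the action parser rule from the question would need to change too. A minimal sketch of how the two rules might fit together, writing the braces as literals and assuming the rest of the question's grammar stays as it is (untested):

```antlr
// Parser side: '{', the C code and '}' now form one hidden token, so the
// parser rule can no longer mention them. The optional COMMENT is an
// assumption, added because the sample input has a quoted string after
// each @ACTION header.
action:     ACTN_HDR (COMMENT)?
    ;

// Lexer side: the braces are literals inside the token itself.
C_BLOCK:    '{' .*? '}' -> channel(HIDDEN)
    ;
```

Tokens on the hidden channel stay in the token stream, so the C text should still be retrievable afterwards, for example through getHiddenTokensToRight on the CommonTokenStream in the Java runtime. If it is more convenient to have the block in the parse tree, dropping -> channel(HIDDEN) and writing action: ACTN_HDR (COMMENT)? C_BLOCK is another option.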

If blocks can nest, you can write something like:

C_BLOCK: '{' .*? C_BLOCK? .*? '}' -> channel(HIDDEN);

or maybe:

C_BLOCK:
    '{' (
      C_BLOCK
      | .
    )*?
    '}'
;

(untested).
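
Another way to write the nested version, along the lines of the recursive lexer rule idiom shown in the ANTLR 4 lexer rules documentation, is to exclude braces from the catch-all alternative instead of relying on a non-greedy loop. A sketch (also untested; it assumes braces inside C string literals, character literals and comments are always balanced, since the rule knows nothing about them):

```antlr
// Recursive lexer rule: a block is '{', then any mix of nested blocks
// and non-brace characters, then the matching '}'.
C_BLOCK
    :   '{' ( C_BLOCK | ~[{}] )* '}' -> channel(HIDDEN)
    ;
```

Because ~[{}] can never consume a brace, every '{' has to start a nested C_BLOCK and every '}' has to close one, so the loop can stay greedy.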

Update: changed the code to use the non-greedy Kleene operator, as suggested in a comment.