
Currently, the grammar for my vector treats it as a collection of numbers, strings, vectors and identifiers.

vector:
    '[' elements+=vector_members? (vector_delimiters elements+=vector_members)* ']'
;

vector_delimiters
:
    ','
;

vector_members:
    NUMBER
    | STRING 
    | vector
    | ID
;

Now, is there a way to enforce through the grammar that a vector can contain only elements of one particular type, such as only numbers or only strings?

1 Answer


Sure, there is a way, but that doesn't mean it's a good idea:

vector
    : '[' ']'
    | '[' elements+=NUMBER (vector_delimiters elements+=NUMBER)*  ']'
    | '[' elements+=STRING (vector_delimiters elements+=STRING )* ']'
    | '[' elements+=ID     (vector_delimiters elements+=ID)*      ']'
    | '[' elements+=vector (vector_delimiters elements+=vector)*  ']'
    ;

See, that's pretty ugly.

This kind of validation should not be part of the grammar. Build a visitor to check your consistency rules. The code will be simpler, more maintainable, and will respect the separation of concerns principle. Let the parser do the parsing, and do the validation in a later stage. As a bonus, you'll be able to provide better error messages than a bare "unexpected token".
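To give an idea of what such a later-stage check could look like, here is a minimal sketch in Python. It does not use the ANTLR runtime: `Atom`, `Vector`, `element_kind` and `check_homogeneous` are hypothetical stand-ins for whatever your visitor would see in the generated parse tree, and the rule it enforces (every element must have the same kind as the first one) is just one assumed reading of "only elements of a particular type".

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical stand-ins for parse-tree nodes; these are NOT ANTLR classes.
@dataclass
class Atom:
    kind: str  # "NUMBER", "STRING" or "ID"
    text: str


@dataclass
class Vector:
    elements: List[Union["Atom", "Vector"]]


def element_kind(node):
    """Name the kind of an element, treating nested vectors as "VECTOR"."""
    return "VECTOR" if isinstance(node, Vector) else node.kind


def check_homogeneous(vec, path="vector"):
    """Return a list of error messages; an empty list means the vector is valid."""
    errors = []
    kinds = [element_kind(e) for e in vec.elements]
    if kinds:
        expected = kinds[0]  # assumed rule: the first element fixes the type
        for i, kind in enumerate(kinds):
            if kind != expected:
                errors.append(f"{path}[{i}]: expected {expected} but found {kind}")
    # Nested vectors are checked on their own, with a longer path in messages.
    for i, e in enumerate(vec.elements):
        if isinstance(e, Vector):
            errors.extend(check_homogeneous(e, f"{path}[{i}]"))
    return errors
```

With a real ANTLR-generated parser, the same logic would sit in a visitor or listener over the `vector` context and inspect its `elements` list. The messages can name the offending position and both types, which the grammar-only approach cannot do.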

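As a side note, your initial grammar will accept constructs like this: [ , 42 ]. The first element is optional, but the delimiter-prefixed repetition after it is still allowed, so a vector can start with a comma. Your vector rule should rather be: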

vector
    : '[' ']'
    | '[' elements+=vector_members (vector_delimiters elements+=vector_members)* ']'
    ;
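
The difference between the two rules can be checked with a rough model. The sketch below is only an approximation and does not involve ANTLR: it encodes a flat token sequence as a string and tests it against a regular expression shaped like each rule. The names `SYMBOLS`, `ORIGINAL`, `CORRECTED` and `accepts` are made up for this illustration, and nested vectors are left out because a regular expression cannot express the recursion.

```python
import re

# Encode each token kind as one character; every member kind becomes "m".
SYMBOLS = {"[": "[", "]": "]", ",": ",", "NUMBER": "m", "STRING": "m", "ID": "m"}

# '[' member? (',' member)* ']'
ORIGINAL = re.compile(r"\[m?(,m)*\]")
# '[' ']'  |  '[' member (',' member)* ']'
CORRECTED = re.compile(r"\[\]|\[m(,m)*\]")


def accepts(pattern, tokens):
    """Return True if the whole token sequence matches the given rule shape."""
    encoded = "".join(SYMBOLS[t] for t in tokens)
    return pattern.fullmatch(encoded) is not None


leading_comma = ["[", ",", "NUMBER", "]"]  # models [ , 42 ]
print(accepts(ORIGINAL, leading_comma))
print(accepts(CORRECTED, leading_comma))
```

Under this model the original shape matches the leading-comma input and the corrected shape rejects it, while both still match `[ ]` and ordinary comma-separated lists.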