I came across the following problem while trying to write a grammar for a certain assembly language.
The example grammar file looks like this:
grammar test;
stat: operation+;
operation : (add | addi);
add : 'ADD' datatype xd ',' xn;
addi : 'ADD.s64' xd ',' '#' imm;
datatype : '.s64'| '.f32';
xd : 'X0' | 'X1';
xn : 'X0' | 'X1';
imm : '0' | '1' | '2' | '3' | '4';
The grammar should be able to parse two assembly instructions:
ADD, e.g. ADD.s64 X1, X2 or ADD.f32 X1, X2
ADD(imm), e.g. ADD.s64 X1, # X3
The problem is that ADD(imm) can only have .s64 as its datatype, and I would prefer not to make a separate datatype rule just for ADD(imm).
However, when I enter ADD.s64 X1, X3, the parser always matches addi and reports the error "fail to match the #".
I guess this is because the parser looks for the longest match of the text (which is 'ADD.s64').
Is there a way to do error recovery so that it then tries to match the correct add rule?
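The longest-match guess above can be made concrete. ANTLR turns every quoted literal in a combined grammar into an implicit token, and the lexer splits the input into tokens before any parser rule is tried, preferring the longest literal that matches at each position. The sketch below spells out those tokens as explicit lexer rules. The grammar name, the token names, and the WS rule are assumptions added for illustration and are not part of the grammar in the question.

```antlr
// Sketch only: the implicit tokens from the question's grammar, written out
// as explicit lexer rules. All names here are hypothetical.
lexer grammar TestTokensSketch;

ADD     : 'ADD' ;
ADD_S64 : 'ADD.s64' ;   // comes from the literal in the addi rule
DOT_S64 : '.s64' ;
DOT_F32 : '.f32' ;
X0      : 'X0' ;
X1      : 'X1' ;
COMMA   : ',' ;
HASH    : '#' ;
DIGIT   : '0' | '1' | '2' | '3' | '4' ;   // the question uses five separate literals
WS      : [ \t\r\n]+ -> skip ;            // assumption: whitespace is ignored

// "ADD.f32 X1, X0" lexes as: ADD DOT_F32 X1 COMMA X0
// "ADD.s64 X1, X0" lexes as: ADD_S64 X1 COMMA X0
//   The lexer prefers the 7-character match 'ADD.s64' over the 3-character
//   match 'ADD', so the parser never sees ADD followed by DOT_S64.
```

If this reading is right, the choice is made in the lexer, so parser-level error recovery cannot turn ADD_S64 back into ADD plus DOT_S64. Only addi can start with that token, which would explain the complaint about the missing '#'.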
add32 : 'ADD.f32' xd ',' xn; and add64 : 'ADD.s64' xd ',' (xn | '#' imm); ? – user2956272
'.s64', '.f32' is invalid, and imm : [0-9]+; is not a valid parser rule (the [...] is only valid inside lexer rules). Perhaps you posted a scaled down version of your grammar and introduced some errors that are not present in your real grammar? It's always a good idea to post a grammar/code that can be copy+pasted so that others can reproduce the problem you're describing (which is not possible now). – Bart Kiers
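One possible way to avoid the problem without error recovery is to keep 'ADD' and the datatype suffix as separate tokens, so that the lexer never produces a combined 'ADD.s64' token. The following is a minimal sketch under that assumption, not a tested drop-in replacement. The grammar name, the EOF in stat, and the WS rule are additions that do not appear in the question.

```antlr
grammar TestSketch;   // hypothetical name; the file would be TestSketch.g4

stat      : operation+ EOF ;
operation : add | addi ;

// Both forms start with the plain 'ADD' token, so the parser can choose.
add       : 'ADD' datatype xd ',' xn ;
// The immediate form names '.s64' directly and needs no extra datatype rule.
addi      : 'ADD' '.s64' xd ',' '#' imm ;

datatype  : '.s64' | '.f32' ;
xd        : 'X0' | 'X1' ;
xn        : 'X0' | 'X1' ;
imm       : '0' | '1' | '2' | '3' | '4' ;

// Assumption: whitespace between tokens is not significant.
WS        : [ \t\r\n]+ -> skip ;
```

With this layout, ADD.s64 X1, X0 and ADD.s64 X1, #3 share the prefix 'ADD' '.s64' xd ',', and the parser can pick add or addi by looking ahead to whether a register or '#' comes next. One caveat of the sketch: because the mnemonic and the suffix are now separate tokens and whitespace is skipped, an input such as "ADD .s64 X1, X0" would also be accepted.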