I am using ANTLR 3 for the task described below.
Assume I have an SQL query. I know that, in general, its WHERE, GROUP BY and ORDER BY clauses are optional. In terms of ANTLR's grammar, I would describe it like this:
    query : select_clause from_clause where_clause? group_by_clause? order_by_clause? ;
The rule for each clause will obviously start with the respective keyword.
What I actually need is to extract each clause's contents as a string without dealing with its internal structure.
To do this I started with the following grammar:
    query : select_clause from_clause where_clause? group_by_clause? order_by_clause? EOF;

    select_clause : SELECT_CLAUSE ;
    from_clause : FROM_CLAUSE ;
    where_clause : WHERE_CLAUSE ;
    group_by_clause : GROUP_BY_CLAUSE ;
    order_by_clause : ORDER_BY_CLAUSE ;

    SELECT_CLAUSE : 'select' ANY_CHAR*;
    FROM_CLAUSE : 'from' ANY_CHAR*;
    WHERE_CLAUSE : 'where' ANY_CHAR*;
    GROUP_BY_CLAUSE : 'group by' ANY_CHAR*;
    ORDER_BY_CLAUSE : 'order by' ANY_CHAR*;

    ANY_CHAR : .;
    WS : ' '+ {skip();};
This grammar didn't work. I have made further attempts at composing a correct grammar, with no success. I suspect this task is doable with ANTLR 3, but I am missing something.
More generally, I would like to collect characters from the input stream into a single token until a specific keyword is met that marks the beginning of a new token. The keyword itself should be part of the new token.
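To make the desired splitting concrete, here is a rough sketch in plain Java rather than ANTLR. The class name `ClauseSplitter` and the method `split` are made up for this illustration, and the regex approach is only a stand-in for what I would like the lexer to do. It naively assumes the keywords never occur inside string literals, identifiers with spaces, or subqueries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: cuts a query into chunks,
// each chunk starting with the keyword that introduces it.
class ClauseSplitter {
    // Keywords that start a new chunk; multi-word keywords may be
    // separated by any amount of whitespace.
    private static final Pattern KEYWORD = Pattern.compile(
        "\\b(select|from|where|group\\s+by|order\\s+by)\\b",
        Pattern.CASE_INSENSITIVE);

    static List<String> split(String query) {
        List<String> clauses = new ArrayList<>();
        Matcher m = KEYWORD.matcher(query);
        int start = -1; // start index of the chunk being collected, -1 if none yet
        while (m.find()) {
            if (start >= 0) {
                // A new keyword closes the previous chunk.
                clauses.add(query.substring(start, m.start()).trim());
            }
            start = m.start();
        }
        if (start >= 0) {
            // The last chunk runs to the end of the input.
            clauses.add(query.substring(start).trim());
        }
        return clauses;
    }

    public static void main(String[] args) {
        String query = "select a, b from t where a > 1 group by a order by b";
        for (String clause : split(query)) {
            System.out.println(clause);
        }
    }
}
```

For the sample query in `main`, this prints `select a, b`, `from t`, `where a > 1`, `group by a` and `order by b`, one per line. Those are the token texts I would like to get from the lexer.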
Can you help me please?