1
votes

I'd like to write down a formal grammar for describing the command line usage of some GNU/Linux tools.

First, I would like to define a grammar:

Start -> COMMAND AXIS 

AXIS -> EMPTY | INTER

INTER -> VALUE | -OPT

VALUE -> any characters for files 

OPT -> OPTION AXIS

OPTION -> WORD

WORD -> out | in | ... | LETTERS

LETTERS -> aLETTER |bLETTER | ... | zLETTER

LETTER -> a| b | c | ... | EMPTY | LETTERS

EMPTY -> 

COMMAND -> ls | tar | touch | openssl | vi | ... | cat 

I'll use this grammar with lex and yacc to parse commands. How should I write the .l and .c files?

1
It might not be doable, because the shell language is quite context sensitive. - Basile Starynkevitch
Note that whitespace (particularly newline) is significant on the command line, so you can't just ignore it in your formal grammar as you can in many languages where it is not significant. - Chris Dodd

1 Answer

2
votes

I had trouble following your grammar, but here is a basic simplified version to get you started.

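Note: the lexer copies each token's text with strdup(), so every string that reaches the parser is heap-allocated. They should really be free()d once the parser actions are done with them.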

Here's cl.l

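%{
/* string.h declares strdup(), which the rules below use. */
#include <string.h>
/* Every token carries its text as a string. */
#define YYSTYPE char*
#include "y.tab.h"
%}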

%%

ls|tar|touch|openssl|vi|cat     { yylval = strdup(yytext); return COMMAND; }

[A-Za-z0-9]+    { yylval = strdup(yytext); return VALUE; }

-[A-Za-z0-9]+   { yylval = strdup(yytext); return OPTION; }

[ \t]   /* ignore whitespace */ ;

\n { return EOL; }

%%

and here's cl.y

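%{
#include <stdio.h>
#include <string.h>
#define YYSTYPE char *

/* Declared up front so the generated parser does not rely on implicit declarations. */
int yylex(void);
int yyerror(char *msg);
%}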

%token COMMAND VALUE OPTION EOL
%%

start: command EOL  { return 0; }

command: COMMAND  axis {printf("Command %s\n", $1);}
      | COMMAND {printf("Command %s\n", $1);}

axis: inter | axis inter ;

inter: VALUE  {printf("Inter value %s\n", $1);}
       | OPTION {printf("Inter option %s\n", $1);}
%%
int main (void) {
    return yyparse();
}

int yyerror (char *msg) {
    return fprintf (stderr, "Error: %s\n", msg);
}

To build it using yacc:

flex cl.l
yacc -d cl.y
gcc -o cl y.tab.c lex.yy.c -lfl

To build it using bison:

Change #include "y.tab.h" to #include "cl.tab.h" in cl.l

flex cl.l
bison -d cl.y
gcc -o cl cl.tab.c lex.yy.c -lfl
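
If it helps to see what the lexer and parser above accept without running flex or yacc, here is a rough plain-C sketch of the same idea. The names `enum token`, `classify` and `accepts` are made up for this illustration and are not part of cl.l or cl.y. It also assumes the input has already been split on whitespace, so it judges whole words, whereas flex scans character by character; a word such as `file.txt` is simply rejected here.

```c
#include <ctype.h>
#include <string.h>

/* Token classes mirroring the %token names in cl.y (TOK_INVALID is extra). */
enum token { TOK_COMMAND, TOK_OPTION, TOK_VALUE, TOK_INVALID };

/* Hypothetical helper: classify one whitespace-delimited word using the
   same patterns as the rules in cl.l. */
enum token classify(const char *word)
{
    static const char *commands[] = { "ls", "tar", "touch", "openssl", "vi", "cat" };
    const char *p = word;
    const char *q;
    size_t i;

    /* An exact command name wins, like the first rule in cl.l. */
    for (i = 0; i < sizeof commands / sizeof commands[0]; i++)
        if (strcmp(word, commands[i]) == 0)
            return TOK_COMMAND;

    if (*p == '-')
        p++;                      /* one leading dash marks an option */
    if (*p == '\0')
        return TOK_INVALID;       /* empty word or a lone "-" */

    /* The rest must be letters and digits only: [A-Za-z0-9]+ */
    for (q = p; *q != '\0'; q++)
        if (!isalnum((unsigned char)*q))
            return TOK_INVALID;

    return (p != word) ? TOK_OPTION : TOK_VALUE;
}

/* Hypothetical helper: returns 1 if the line has the shape
   COMMAND (VALUE | OPTION)*, which is what the grammar in cl.y accepts,
   and 0 otherwise. */
int accepts(const char *line)
{
    char buf[256];
    char *word;
    int first = 1;

    if (strlen(line) >= sizeof buf)
        return 0;                 /* too long for this sketch */
    strcpy(buf, line);            /* strtok() modifies its input */

    for (word = strtok(buf, " \t\n"); word != NULL; word = strtok(NULL, " \t\n")) {
        enum token t = classify(word);
        if (first ? (t != TOK_COMMAND) : (t != TOK_OPTION && t != TOK_VALUE))
            return 0;
        first = 0;
    }
    return !first;                /* at least the command must be present */
}
```

With these helpers, `accepts("ls -l foo")` returns 1, while `accepts("cat ls")` returns 0: the second word is classified as a command, and the grammar only allows values and options after the first token.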