
I'm trying to create a LALR(1) parser in Yacc/Bison that can accept commands with a flexible grammar. An example is controlling the environment in a house by adjusting room temperatures (floats), window positions (ints), and ceiling fans (enum). An example of the desired command syntax:

set kitchen 22.5, den 0 24.0, bathroom fast 25, kitchen slow

I'm stuck on how to handle the stacked data for what I've called a "room_arg" in the rules. I could add a room_arg member named arg to the %union, declare %type <arg> room_arg, and write the "room_arg" rule as:

room_arg:  TOK_CIELING_FAN { $$.fan = $1; }
         | TOK_WINDOW      { $$.window_position = $1; }
         | TOK_THERMOSTAT  { $$.temperature = $1; }

But when the parser reduces "room_arg" into "room_args", I have to figure out which stack items are which. On top of that, the "room_cmd" entries for a particular room ("kitchen" in the example) don't have to be grouped together in the command, which adds to the challenge.

Changing the format of the command would be lovely but, alas, is not possible.

Could someone suggest a strategy for implementation, or perhaps improvement to the parser logic to make things simpler?

The grammar to criticize is below.

%{
typedef enum fan_setting_tag { OFF, SLOW, MEDIUM, FAST } fan_setting;

typedef struct room_arg_tag {
   int          window_position;
   fan_setting  fan;
   float        temperature;
   char        *room_name;
} room_arg;
%}
%token TOK_ROOM
%token TOK_CIELING_FAN
%token TOK_WINDOW
%token TOK_THERMOSTAT
%token TOK_SET

%type <fan>     TOK_CIELING_FAN
%type <window>  TOK_WINDOW
%type <tempr>   TOK_THERMOSTAT
%type <str>     TOK_ROOM

%union {
    int         window;
    float       tempr;
    fan_setting fan;
    char        *str;   /* needed by %type <str> TOK_ROOM above */
}

%%

command: /* empty */ | cmd command

cmd: TOK_SET room_cmds { execute_command( ... ); }

room_cmds: room_cmd
         | room_cmd ',' room_cmds

room_cmd:  TOK_ROOM room_args

room_args: room_arg
         | room_arg room_args

room_arg:  TOK_CIELING_FAN
         | TOK_WINDOW
         | TOK_THERMOSTAT

1 Answer


This is a case where you really want inherited attributes. You can simulate them with extra actions, but unfortunately yacc and bison don't give you a way to express them more directly (btyacc provides syntactic sugar that makes this much easier).

Basically, you use embedded actions to set up the inherited attributes and use $0 to access them within the lower-level rules.
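
To make the role of $0 concrete, here is a toy model of a parser value stack in C. It is only a sketch: value_stack, push_value and dollar are hypothetical names, and this is not the code yacc or bison actually generates. It only illustrates that $1..$n are the values of the rule being reduced, while $0 is whatever was pushed immediately before them (for example, by an embedded action).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a yacc/bison value stack. All names are hypothetical. */
enum { STACK_MAX = 16 };
static void *value_stack[STACK_MAX];
static int   top = -1;                 /* index of the topmost value */

/* Push one semantic value, as a shift or an embedded action would. */
void push_value(void *v)
{
    assert(top + 1 < STACK_MAX);
    value_stack[++top] = v;
}

/* $k as seen by an action that runs while the n right-hand-side values of
 * its rule are on top of the stack: $1..$n are the rule's own symbols,
 * and $0 is the value sitting just beneath them. */
void *dollar(int k, int n)
{
    assert(k >= 0 && k <= n && top - n + k >= 0);
    return value_stack[top - n + k];
}
```

If an embedded action pushes a room pointer and the lexer then supplies a TOK_THERMOSTAT value, a one-symbol rule such as room_arg sees the temperature as dollar(1, 1) and the room as dollar(0, 1).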

With btyacc you would do:

%union {
    int         window;
    float       tempr;
    fan_setting fan;
    char        *str;   /* value type of TOK_ROOM */
    room_arg    *room;
}

%token<fan>     TOK_CIELING_FAN
%token<window>  TOK_WINDOW
%token<tempr>   TOK_THERMOSTAT
%token<str>     TOK_ROOM

%type<room> room_cmds(<room>)
%type<>     room_cmd(<room>), room_args(<room>), room_arg(<room>)

%%

command: /* empty */ | cmd command ;

cmd: TOK_SET room_cmds(new_room_arg()) { execute_command( ... ); } ;

room_cmds($room):
           room_cmd($room) { $$ = $room; }
         | room_cmd($room) ',' room_cmds($room) { $$ = $room; }
;

room_cmd($room):  TOK_ROOM room_args($room) { $room->room_name = $1; } ;

room_args($room): 
           room_arg($room)
         | room_arg($room) room_args($room)
;

room_arg($room):
           TOK_CIELING_FAN { $room->fan = $1; }
         | TOK_WINDOW      { $room->window_position = $1; }
         | TOK_THERMOSTAT  { $room->temperature = $1; }
;
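
The grammar above calls new_room_arg() without showing it. A minimal sketch of such a helper might look like the following; the initial values and the abort-on-failure policy are assumptions, and the type definitions are repeated from the question (with a typedef added) so the block is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum fan_setting_tag { OFF, SLOW, MEDIUM, FAST } fan_setting;

typedef struct room_arg_tag {
    int          window_position;
    fan_setting  fan;
    float        temperature;
    char        *room_name;
} room_arg;

/* Hypothetical helper: allocate a room_arg with every field in a known
 * state so the grammar actions can fill in only what the command names. */
room_arg *new_room_arg(void)
{
    room_arg *r = malloc(sizeof *r);
    if (r == NULL)
        abort();                 /* assumption: out-of-memory is fatal */
    r->window_position = 0;
    r->fan             = OFF;
    r->temperature     = 0.0f;
    r->room_name       = NULL;
    return r;
}
```

Whoever consumes the result (execute_command in the grammar) would be responsible for calling free() on it.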

In yacc or bison, this becomes something like:

cmd: TOK_SET { $<room>$ = new_room_arg(); } room_cmds { execute_command(...); } ;
room_cmds:
      room_cmd { $$ = $<room>0; }
    | room_cmd ',' { $<room>$ = $<room>0; } room_cmds { $$ = $<room>0; }
;
room_cmd: TOK_ROOM { $<room>$ = $<room>0; } room_args { $<room>0->room_name = $1; } ;

and so forth, which is tricky to get right (you have to get all the right types in the right places) and somewhat error-prone.

In this specific case, you could just use a global variable (as there's no recursion involved), but that doesn't work too well in more complex cases.
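
As a rough sketch of the global-variable approach, the C side could look like this. The helper names (set_fan, set_window, set_temperature, finish_room) and the reset-to-zero policy are assumptions for illustration; the comments show which grammar action would call each one.

```c
#include <assert.h>
#include <stddef.h>

typedef enum fan_setting_tag { OFF, SLOW, MEDIUM, FAST } fan_setting;

typedef struct room_arg_tag {
    int          window_position;
    fan_setting  fan;
    float        temperature;
    char        *room_name;
} room_arg;

/* File-scope accumulator: the actions write here instead of passing a
 * pointer down the value stack through $0. */
static room_arg current_room;

/* Called from the room_arg actions, e.g.
 *     TOK_THERMOSTAT { set_temperature($1); }                          */
void set_fan(fan_setting f)   { current_room.fan = f; }
void set_window(int pos)      { current_room.window_position = pos; }
void set_temperature(float t) { current_room.temperature = t; }

/* Called from the room_cmd action, e.g.
 *     room_cmd: TOK_ROOM room_args { ... finish_room($1) ... }
 * Returns the completed settings and clears the accumulator so the next
 * room starts from a clean state. */
room_arg finish_room(char *name)
{
    room_arg done = current_room;
    done.room_name = name;
    current_room = (room_arg){0};
    return done;
}
```

This works here because a room_cmd never nests inside another room_cmd, so there is only ever one room being built at a time.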