Bison and flex calculator not assigning values properly

Question

I'm coding a Bison-Flex calculator following the Bison documentation example. When I try to assign a value to a variable it works properly when the value doesn't include other variables. For example, the calculator assigns to the variable a the correct value after this operation: a = 2 + 3 + 5 . However, if then I try to assign the value of a to another variable b, like b = a + 1, the calculator stores the value of a+1 in the variable a. If I try to print the name of $1 in my assignment rule in the Bison code, I get that in last example, the program is getting the variable a as $1, not b. I also tried enabling the debug option and it looks like Bison is applying the rules and the reductions properly.

The Bison file, calculadora_bison.y:

%{
#include <math.h>
#include <stdio.h>
#include <string.h>
#include "TablaSimbolos.h"
int yylex (void);
int yyerror (char *s);
%}

%union {
  float valor;
  NodoArbol * tptr;
};

%token <valor> NUM
%token <tptr> VAR FNCT 
%type <valor> exp

%right '='
%left '-' '+'
%left '*' '/'
%left NEG
%right '^'

%%
input: /* vac´ıo */
    | input line
;
line: '\n'
    | exp '\n' { printf ("\t%.10g\n", $1); }
    | exp ';' '\n' {  }
    | error '\n' { yyerrok; }
;
exp: NUM { $$ = $1;}
    | VAR { $$ = $1->valor.val;}
    | VAR '=' exp { $$ = $3; ActualizaNodoFloat($1->nombre, $3);}
    | FNCT '(' exp ')' { $$ = (*($1->valor.fnctptr))($3);}
    | exp '+' exp { $$ = $1 + $3; }
    | exp '-' exp { $$ = $1 - $3; }
    | exp '*' exp { $$ = $1 * $3; }
    | exp '/' exp { $$ = $1 / $3; }
    | '-' exp %prec NEG { $$ = -$2; }
    | exp '^' exp { $$ = pow ($1, $3); }
    | '(' exp ')' { $$ = $2; }
;
%%

struct init{
    char *fname;
    double (*fnct)();
};

struct init arith_fncts[] =
{
  { "atan", atan },
  { "cos",  cos  },
  { "exp",  exp  },
  { "ln",   log  },
  { "sin",  sin  },
  { "sqrt", sqrt },
  { 0, 0 },
};

int init_table(){
    int i;

    CrearArbol();

    for (i = 0; arith_fncts[i].fname != 0; i++){
        InsertarNodoFuncion(FNCT, arith_fncts[i].fname, arith_fncts[i].fnct);
    }

    return 0;
}

int main (){
    init_table ();
    yyparse ();
    return 0;
}

int yyerror (char *s){
    printf ("%s\n", s);
    return 0;
}

My Flex code, calculadora_flex.l:

%{
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "TablaSimbolos.h"
#include "calculadora_bison.tab.h"    

%}

ALNUM  [a-zA-Z][a-zA-Z0-9]*
DIGITO [0-9]*
FLOAT {DIGITO}*\.{DIGITO}*
OPERADORES [-+()=*^/\n;]

%%

{ALNUM} {
        NodoArbol s;
        s = buscaTS(yytext);                
        yylval.tptr = &s;
        return s.tipo;
        }

{FLOAT} {
        yylval.valor = atof(yytext);
        return NUM;
        }

{DIGITO} {      
        yylval.valor = atoi(yytext);
        return NUM;
        }

{OPERADORES} {          
            return (int)yytext[0];
            }


. { };

%%

For the symbol table, I'm using a BST in C. The header file is this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct NodoArbolType{
        int tipo; //Componente léxico
        char * nombre; // Lexema
        union {
            float val; 
            double (*fnctptr)();
        } valor;
        struct NodoArbolType *izda; //Nodos hijo
        struct NodoArbolType *dcha;
    }NodoArbol;

void ActualizaNodoFloat(char * nombrenodo, float valor);

void CrearArbol();

NodoArbol BuscarArbol(char * nombre);

NodoArbol buscaTS(char * nombre);

int InsertarNodoEntero(NodoArbol *newNode);

NodoArbol InsertarNodo(int Key, char * cA);

int InsertarNodoFloat(int Key, char * cA, float val);

int InsertarNodoFuncion(int Key, char * cA, double (*fnct)());

void EliminarArbol();

void ImprimirArbol90grados();

void ImprimirArbolConValores();

int EliminarNodo(char * nombre);

The .c file:

#include "TablaSimbolos.h"
#include "calculadora_bison.tab.h"
//Declaramos el nodo como estático para que solo pueda ser accedido desde aquí
static NodoArbol *raiz;

/**
 * Función que limpia recursivamente a partir de un nodo
 */
static void LimpiarArbol(NodoArbol *T){
    if(T==NULL) return; 
    if(T->izda != NULL) LimpiarArbol(T->izda); //Si sus hijos no son NULL, también
                                               //se limpian
    if(T->dcha != NULL) LimpiarArbol(T->dcha); 
    free(T); //Libera la memoria del puntero al nodo
    return;
}


/*
 * Función que crea un árbol a partir del nodo raíz
 */
void CrearArbol(){
    LimpiarArbol(raiz); //Primero libera el nodo raíz por si existía un árbol previo
    raiz = NULL;
    return;
}

/**
 * Función que determina si un árbol no tiene ningún nodo 
 */
int esVacio(){
    return(raiz==NULL);
}

/*
 * Función que busca en un árbol un nodo a partir de un string que correspondería
 * al campo nombre de los nodos
 */
NodoArbol BuscarArbol(char * nombre){
    NodoArbol *temp;

    temp = raiz; //Puntero auxiliar

    //Mientras temp no sea NULL (haya llegado al final del árbol) o no se 
    //encuentre el string nombre
    while((temp != NULL) && (strcmp(nombre, temp->nombre)!=0)){        
        if(strcmp(nombre, temp->nombre)<0) //Si el string es "menor" que el 
                                        //nombre del nodo a examinar (por ejemplo
                                        // function<while), sigue examinando por
                                        //el nodo hijo izquierdo
            temp = temp->izda;  
        else if(strcmp(nombre, temp->nombre)>0) //Análogamente al caso anterior
                                        //pero cuando sea mayor
            temp = temp->dcha;        
    }
    if(temp == NULL){ //Si el nombre no se ha encontrado, devuelve error
        temp = (NodoArbol *)malloc(sizeof(NodoArbol));
        temp->tipo = -1;
    }
    return(*temp);
}

/**
 * Función que inserta un NodoArbol en el árbol
 */
int InsertarNodoEntero(NodoArbol *nuevoNodo){
    NodoArbol *temp;
    NodoArbol *anterior;

    temp = raiz;
    anterior = NULL;

    //Busca la posición en la que se insertará el nodo
    while((temp != NULL) && (strcmp(nuevoNodo->nombre, temp->nombre)!=0)){
        anterior = temp;
        if(strcmp(nuevoNodo->nombre, temp->nombre)<0)
            temp = temp->izda;
        else if(strcmp(nuevoNodo->nombre, temp->nombre)>0)
            temp = temp->dcha;
    }

    //Añadimos el nodo al árbol
    if(anterior == NULL) //Si anterior==NULL, el nodo es el primero en introducirse
        raiz = nuevoNodo;

    else {
        //Si el nombre de nuevoNodo < anterior->nombre, lo introducimos a la izquierda
        if(strcmp(nuevoNodo->nombre, anterior->nombre)<0)
            anterior->izda = nuevoNodo;
        //En el caso contrario, lo introducimos a la izquierda
        else if(strcmp(nuevoNodo->nombre, anterior->nombre)>0)
            anterior->dcha = nuevoNodo;
    }
   return(nuevoNodo->tipo);
}

/**
 * Función que inserta un nodo en el árbol a partir de su valor y su nombre
 */
NodoArbol InsertarNodo(int valor, char *cA){
    NodoArbol *nuevoNodo;

    nuevoNodo = (NodoArbol *)malloc(sizeof(NodoArbol));
    nuevoNodo->tipo = valor;    
    nuevoNodo->nombre = (char*)malloc(sizeof(char) * strlen(cA));    
    strcpy(nuevoNodo->nombre, cA);
    nuevoNodo->valor.val=0;
    nuevoNodo->izda = nuevoNodo->dcha = NULL;
    InsertarNodoEntero(nuevoNodo);
    return(*nuevoNodo);
}

int InsertarNodoFloat(int Key, char * cA, float val){
    NodoArbol *nuevoNodo;

    nuevoNodo = (NodoArbol *)malloc(sizeof(NodoArbol));
    nuevoNodo->tipo = Key;
    nuevoNodo->valor.val = val;
    nuevoNodo->nombre = (char*)malloc(sizeof(char) * strlen(cA));    
    strcpy(nuevoNodo->nombre, cA);
    nuevoNodo->izda = nuevoNodo->dcha = NULL;    
    return(InsertarNodoEntero(nuevoNodo));
}

int InsertarNodoFuncion(int Key, char * cA, double (*fnct)()){
    NodoArbol *nuevoNodo;

    nuevoNodo = (NodoArbol *)malloc(sizeof(NodoArbol));
    nuevoNodo->tipo = Key;
    nuevoNodo->valor.fnctptr = fnct;
    nuevoNodo->nombre = (char*)malloc(sizeof(char) * strlen(cA));    
    strcpy(nuevoNodo->nombre, cA);
    nuevoNodo->izda = nuevoNodo->dcha = NULL;    
    return(InsertarNodoEntero(nuevoNodo));    
}

/*
 * Función que busca un nodo a partir del nombre y devuelve su valor. Si no lo
 * encuentra, inserta el nodo y devuelve el valor del nuevo nodo.
 */

NodoArbol buscaTS(char * nombre){
    NodoArbol id;   
    id=BuscarArbol(nombre);   
    if(id.tipo==-1){
        id=InsertarNodo(VAR, nombre);
    }    
    return id;
}

void ActualizaNodoFloat(char * nombreNodo, float valor){
    //printf("en actualizarNodoFloat, nombreNodo = %s \n", nombreNodo);

    EliminarNodo(nombreNodo);
    InsertarNodoFloat(VAR, nombreNodo, valor);

    //NodoArbol nodo = BuscarArbol(nombreNodo);
    //printf("en buscar dentro de actualizaNodo%f", nodo.valor.val);
}


/*
 * Función que elimina un árbol desde la raíz, liberando la memoria ocupada por él
 */
void EliminarArbol(){
    LimpiarArbol(raiz);
}

void ImprimirNodoConValores(NodoArbol * nodo){
    if(nodo->izda!=NULL){
        ImprimirNodoConValores(nodo->izda);
    }
    printf("%d - %s", nodo->tipo, nodo->nombre);
    if(nodo->tipo==VAR){
        printf(" --> %f \n", nodo->valor.val);
    }else{
        printf("\n");
    }
    if(nodo->dcha!=NULL){
        ImprimirNodoConValores(nodo->dcha);
    }
}

void ImprimirArbolConValores(){
    ImprimirNodoConValores(raiz);
}

int EliminarNodo(char * nombre){
    NodoArbol * anterior;
    NodoArbol *temp;
    NodoArbol *delParent;    // Parent of node to delete
    NodoArbol *delNode;      // Node to delete

    temp = raiz;
    anterior = NULL;

    //Encontramos el nodo a eliminar
    while((temp != NULL) && (strcmp(temp->nombre, nombre)!=0)){
        anterior = temp;
        if(strcmp(nombre, temp->nombre)<0)
            temp = temp->izda;
        else if(strcmp(nombre, temp->nombre)>0)
            temp = temp->dcha;
    }

    if(temp == NULL){ // No se encontró el nodo    
        printf("Key not found. Nothing deleted. El nombre es %s\n", nombre);
        return 0;
    }else{
        if(temp == raiz){ //Eliminamos la raíz
            delNode = raiz;
            delParent = NULL; 
        }else{
            delNode = temp;
            delParent = anterior;
        }
    }

    // Case 1: Deleting node with no children or one child 
    if(delNode->dcha == NULL){
        if(delParent == NULL)    // If deleting the root    
        {
            raiz = delNode->izda;
            free(delNode);
            return 1;
        }
        else
        {
            if(delParent->izda == delNode)
                delParent->izda = delNode->izda;
            else
                delParent->dcha = delNode->izda;
            free(delNode);
            return 1;
        }
    }
    else // There is at least one child 
    {
        if(delNode->izda == NULL)    // Only 1 child and it is on the right
        {
            if(delParent == NULL)    // If deleting the root    
            {
                raiz = delNode->dcha;
                free(delNode);
                return 1;
            }
            else
            {
                if(delParent->izda == delNode)
                    delParent->izda = delNode->dcha;
                else
                    delParent->dcha = delNode->dcha;
                free(delNode);
                return 1;
            }
        }
        else // Case 2: Deleting node with two children 
        {                          
            temp = delNode->izda;
            anterior = delNode;
            while(temp->dcha != NULL)
            {
                anterior = temp;
                temp = temp->dcha;
            }
            // Copy the replacement values into the node to be deleted 
            delNode->tipo = temp->tipo;
            strcpy(delNode->nombre, temp->nombre);

            // Remove the replacement node from the tree 
            if(anterior == delNode)
                anterior->izda = temp->izda;
            else
                anterior->dcha = temp->izda;
            free(temp);
            return 1;
        }
    }
}

rici rici · Accepted Answer · 2014-12-04T21:18:43

This really has very little to do with bison or flex.

The final problem is here, in your flex action for {ALNUM}; specifically the highlighted line:

{ALNUM} {
        NodoArbol s;
        s = buscaTS(yytext);                
        yylval.tptr = &s;
        return s.tipo;
        }

At that point, s is a local variable (local to the code block), which means that its lifetime ends with the return statement. Consequently, the address assigned to yylval.tptr is a dangling pointer, and what it points to is completely unpredictable. (This is undefined behaviour, so the consequences could be even more dramatic.)

But, really, it is a little weird that buscaTS returns a copy of the entire NodoArbol. Normally, a binary tree lookup would return a pointer to the tree node, since there is no reason for binary search tree nodes to move around.

Unfortunately, in your design the nodes do move around because when you modify the value associated with a symbol, you delete the symbol's node and then recreate it, rather than just changing the associated value. IMO, that's completely unnecessary and highly inefficient, and it also makes the symbol table design more difficult. If you change that, then you can just return the address of a node from BuscarArbol (and consequently from BuscaTS), and then you can use the returned pointer to update the value of the symbol in cases where that is necessary.

A couple of other things I noticed looking around your code:

When BuscarArbol fails to find a symbol, it mallocs a new node and then returns a copy of that node. The newly allocated node is never entered into the tree, and is never freed. If you change BuscarArbol to return a pointer, it could return the pointer to the newly-allocated node, leaving the caller the responsibility of inserting the node into the tree or freeing it. Or it could just return NULL instead of faking a node with type -1. (That would be my preference, FWIW.)
There is a lot of code duplication in the various ways of inserting a new node. I would suggest having a single routine which takes a name and inserts a new node with that name, and returns the pointer to the new node. The caller can then just fix the symbol type and value.
Also, when you do create a new node with a name, you copy the name like this:
```
nuevoNodo->nombre = (char*)malloc(sizeof(char) * strlen(cA));    
strcpy(nuevoNodo->nombre, cA);
```
That's a buffer overrun, because character strings in C are NUL-terminated and consequently require one more byte than the string length. So strcpy will overwrite the byte after the end of the allocation, which is undefined behaviour.

That's just what I noticed. I didn't examine the code in detail.

Bison and flex calculator not assigning values properly

1 Answers