0
votes

Hello(my english isn't very well I hope you will understand) , I have a misson to make a compiler, already made the language in the lex and yacc but i'm pretty stuck, our teacher asked from us to build AST tree from the language and print it with pre-order. he gave us example for the text:

function void foo(int x, y, z; real f) {
     if (x>y) {
        x = x + f;
     }
     else {
        y = x + y + z;
        x = f * 2;
        z = f;
     }
}

the output of the AST tree with pre-order should be:

(CODE
     (FUNCTION
     foo
     (ARGS
          (INT x y z)
          (REAL f)
     )
     (TYPE VOID)
     (BODY 
          (IF-ELSE
          (> x y)
          (BLOCK
                (= x 
                     (+ x f)
                )
           )
          (BLOCK
                (= y
                     (+
                         (+ x y) 
                         z
                     )
                )
                (
                     (= x
                         (* f 2)

                     )
                     (= z f)
                )
          )
    )
)

my question is how should I build the tree? I mean which token will go to left which will go to right so I can get the same output ? like

makeTree($1,$2,$3);

         node,left,right

Help please :)

1
Please describe what you've done so far to solve the problem.ik1ne
searched all of the internet, tried to build the tree again by myself with this example, but without success , already 4 days trying to configure thatLiran.B

1 Answers

2
votes

Stephen Johnson wrote a technical paper to accompany yacc about 42 years ago. Have you read it, and do you understand it?

If yes, a syntax rule like:

expr   :   expr '+' expr      { $$ = node( '+', $1, $3 ); } 

node is effectively creating an abstract syntax tree node; and each reduction performed by yacc is the opportunity to build this tree from the bottom up. That is the most important thing to know about yacc; it builds from the bottom up, and you need to construct your data structures likewise.

When the parse is complete ( for whatever version of complete your grammar yields ), the resultant value ($$) is the root of your syntax tree.

followup You might want to devise a node data structure something like this:

typedef struct Node Node;
typedef struct NodeList NodeList;
struct NodeList {
     int Num;
     Node *List;
};
struct Node {
      int Type;
      union {
          unsigned u; unsigned long ul; char *s, ... ;
          Variable *var;
          Node *Expression;
          NodeList *List;
      } Operands[3];
};

With this you could devise a node of type '+', which defined 2 Operands, corresponding to the Left and Right sides of the '+' opcode. You could also have a node of type IF, which had three operands:

  1. a conditional Expression
  2. a List of Nodes to perform if the conditional was true
  3. a List of Nodes to perform if the conditional was false.

You could have a node of type Func, which had three operands:

  1. A Type of return value.
  2. A List of arguments.
  3. A List of Nodes comprising the body of the function

I would give more examples but formatting lists with this UI is as much fun as kicking a whale down a beach.