Out of interest, I want to learn how to write a parser for a simple language, so that I can eventually write an interpreter for my own little code-golfing language once I understand how such things work in general.
So I started reading Douglas Crockford's article Top Down Operator Precedence.
Note: You should probably read the article if you want a deeper understanding of the context of the code snippets below.
I have trouble understanding how the var statement and the assignment operator = should work together.
D.C. defines an assignment operator like this:
var assignment = function (id) {
    return infixr(id, 10, function (left) {
        if (left.id !== "." && left.id !== "[" &&
                left.arity !== "name") {
            left.error("Bad lvalue.");
        }
        this.first = left;
        this.second = expression(9);
        this.assignment = true;
        this.arity = "binary";
        return this;
    });
};
assignment("=");
Note: [[value]] refers to a token, simplified to its value.
Now if the expression function reaches e.g. [[t]], [[=]], [[2]], the result of [[=]].led is something like this:
{
    "arity": "binary",
    "value": "=",
    "assignment": true, //<-
    "first": {
        "arity": "name",
        "value": "t"
    },
    "second": {
        "arity": "literal",
        "value": "2"
    }
}
D.C. makes the assignment function because
we want it to do two extra bits of business: examine the left operand to make sure that it is a proper lvalue, and set an assignment member so that we can later quickly identify assignment statements.
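To make the first of those two bits concrete for myself, here is a minimal sketch of the lvalue check on its own. The helper isLvalue and the three token objects are hypothetical (they are not part of the article); they only carry the fields that the condition in the led function reads.

```javascript
// Hypothetical helper, not from the article: the same condition that the
// led function above uses, inverted so that it returns true for a valid lvalue.
function isLvalue(node) {
    return node.id === "." || node.id === "[" || node.arity === "name";
}

// Hypothetical token objects, reduced to the fields the check looks at.
var nameToken = { id: "(name)", arity: "name", value: "t" };
var memberToken = { id: ".", arity: "binary", value: "." };
var literalToken = { id: "(literal)", arity: "literal", value: "2" };

console.log(isLvalue(nameToken));    // true
console.log(isLvalue(memberToken));  // true
console.log(isLvalue(literalToken)); // false
```

So a plain name such as t passes the check, which is why I would expect the led function to accept the name that follows var.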
This makes sense to me up to the point where he introduces the var statement, which is defined as follows.
The var statement defines one or more variables in the current block. Each name can optionally be followed by = and an initializing expression.
stmt("var", function () {
    var a = [], n, t;
    while (true) {
        n = token;
        if (n.arity !== "name") {
            n.error("Expected a new variable name.");
        }
        scope.define(n);
        advance();
        if (token.id === "=") {
            t = token;
            advance("=");
            t.first = n;
            t.second = expression(0);
            t.arity = "binary";
            a.push(t);
        }
        if (token.id !== ",") {
            break;
        }
        advance(",");
    }
    advance(";");
    return a.length === 0 ? null : a.length === 1 ? a[0] : a;
});
Now if the parser reaches a sequence of tokens like [[var]], [[t]], [[=]], [[1]], the generated tree would look something like this:
{
    "arity": "binary",
    "value": "=",
    "first": {
        "arity": "name",
        "value": "t"
    },
    "second": {
        "arity": "literal",
        "value": "1"
    }
}
The key part of my question is the if (token.id === "=") {...} part.
I don't understand why we call
t = token;
advance("=");
t.first = n;
t.second = expression(0);
t.arity = "binary";
a.push(t);
rather than
t = token;
advance("=");
t.led(n);
a.push(t);
inside that if block.
The second version would call the led function of the [[=]] operator (the one defined by assignment), which would
make sure that it is a proper lvalue, and set an assignment member so that we can later quickly identify assignment statements, e.g.
{
    "arity": "binary",
    "value": "=",
    "assignment": true,
    "first": {
        "arity": "name",
        "value": "t"
    },
    "second": {
        "arity": "literal",
        "value": "1"
    }
}
Since there is no operator with an lbp strictly between 0 and 10, calling expression(0) vs. expression(9) makes no difference: (!(0<0) && !(9<0) && 0<10 && 9<10).
The token.id === "=" condition also rules out assignment to an object member: in that case token.id would be either '[' or '.', so t.led wouldn't be called.
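To check my reasoning about expression(0) vs. expression(9), here is a minimal sketch of the precedence loop. It is not the article's parser: the function parse, the LBP table, and the space-separated input format are my own assumptions, and only "=" (lbp 10), "+" (lbp 50), and an end marker (lbp 0) exist.

```javascript
// Made-up subset of binding powers; "(end)" marks the end of the input.
var LBP = { "=": 10, "+": 50, "(end)": 0 };

function parse(source, startRbp) {
    var tokens = source.split(" ").concat("(end)");
    var pos = 0;

    function lbpOf(tok) {
        // Names and literals never act as infix operators, so they bind with 0.
        return Object.prototype.hasOwnProperty.call(LBP, tok) ? LBP[tok] : 0;
    }

    function expression(rbp) {
        var left = tokens[pos++];            // nud: a name or literal stands for itself
        while (rbp < lbpOf(tokens[pos])) {   // same loop condition as in the article
            var op = tokens[pos++];
            // "=" is right associative (lbp - 1); "+" is left associative (lbp).
            var right = expression(op === "=" ? LBP[op] - 1 : LBP[op]);
            left = { value: op, first: left, second: right };
        }
        return left;
    }

    return expression(startRbp);
}

var viaZero = JSON.stringify(parse("a = b + 1", 0));
var viaNine = JSON.stringify(parse("a = b + 1", 9));
var viaTen = JSON.stringify(parse("a = b + 1", 10));

console.log(viaZero === viaNine); // true
console.log(viaTen);              // "a"
```

With this table, starting at 0 or at 9 yields the same tree, while starting at 10 stops before the = because 10 < 10 is false. That matches my expectation that the two calls only differ if some operator has an lbp from 1 to 9.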
My question in short is:
Why do we not call the led function of the assignment operator that may optionally follow a variable declaration, and instead set the first and second members of the statement manually, without setting the assignment member?
Here are two fiddles parsing a simple string: one using the original code and one using the assignment operator's led.