48
votes

The design of GHC is based on something called STG, which stands for "spineless, tagless G-machine".

Now G-machine is apparently short for "graph reduction machine", which defines how laziness is implemented. Unevaluated thunks are stored as an expression tree, and executing the program involves reducing these down to normal form. (A tree is an acyclic graph, but Haskell's pervasive recursion means that Haskell expressions form general graphs, hence graph-reduction and not tree-reduction.)

What is less clear are the terms "spineless" and "tagless".

  1. I think that "spineless" refers to the fact that function applications do not have a "spine" of function application nodes. Instead, you have an object that names the function called and points to all of its arguments. Is that correct?

  2. I thought that "tagless" referred to constructor nodes not being "tagged" with a constructor ID, and instead case-expressions are resolved using a jump instruction. But now I'm not sure that's correct. Instead, it seems to refer to the fact that nodes aren't tagged with their evaluation state. Can anyone clarify which (if any) of these interpretations is correct?

4
Excuse me, but did you read this fundamental article about it? citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.53.3729permeakra

4 Answers

30
votes

GHC wiki contains an introductory article about STG written by Max Bolingbroke:

The STG machine is an essential part of GHC, the world's leading Haskell compiler. It defines how the Haskell evaluation model should be efficiently implemented on standard hardware. Despite this key role, it is generally poorly understood amongst GHC users. This document aims to provide an overview of the STG machine in its modern, eval/apply-based, pointer-tagged incarnation by a series of simple examples showing how Haskell source code is compiled.

16
votes

You are right about the "Spineless", that is it, if I'm correct. It is basically described on the 1988 article by Burn, Peyton-Jones and Robson, "The Spineless G-Machine". I've read it, but it is not so fresh in my mind. Basically, on the G-Machine, all stack entries point to an application node except the one on the top, which points to the head of the expression. Those application nodes make access to the arguments indirect, and in some G-Machine descriptions, before applying a function the stack is rearranged, so that the last n nodes on the stack are made to point to the argument instead of the application node. If I am not mistaken, the "Spineless" part is about avoiding having these application nodes (which are called the spine of the graph) on the stack altogether, thus avoiding that re-arrangement before each reduction.

As to the "Tagless" part, you are more correct now that you used to be, but... Using tags on nodes is a very, very old thing. Can you think on how a dynamically-typed language such as LISP was implemented? Every cell must have its value and a tag which says the type. If you want something you must examine the tag and act accordingly. In the case of Haskell, the evaluation state is more important than type, Haskell is statically typed.

In the STG machine, tags are not used. Tags were replaced, maybe through inspiration of OO lanaguages, by a set of function pointers. When you want the value of a node which has not been computed, the function will compute it. When it is already computed, the function returns it. This allows for a lot of creativity in what this function can do without making client code any more complex.

This "Tagless" part yes, is described in the "implementation of functional languages on stock hardware" article by SPJ.

There is also objection to this "tagless" thing. Basically, this involves function pointers, which is an indirect jump in computer architecture terms. And indirect jumps are an obstacle to branch prediction and hence to pipelining in general. Because either the architecture considers there is a data dependency on the jump argument, halting the pipeline, or the architecture assumes it does not know the destination and halts the pipeline.

3
votes
3
votes

The answer by migle is exactly what spinlessness and taglessness of the STGM mean. Today it does not worth trying to understand the names of the two features because the names stem from the history of graph reduction technologies: from G-machine, Spineless G-machine, and Spineless and Tagless G-machine.

The G-machine uses both spine and tags. A spine is a list of edges from the root node of a function application to the node of the function. For example, a function application of "f e1 e2 ... en" is represented as

root = AP left_n en
left_n = AP left_n-1 en-1 ...
left_2 = AP left_1 e1
left_1 = FUN f

in G-machine, and so a spine is a list of edges consisting of left_n -> left_n-1 -> ... -> left_2 -> left_1. It is literally a spine of the function application!

In the same function application, there are tags AP and FUN.

In the next advanced G-machine so called the Spineless G-machine, there is no such spine by representing such a function application in a contiguous block whose first slot points to f, the second slot points to e1, ..., and the n+1-th slot points to en. In this representation, we do not need a spine. But the block starts a special tag designating the number of slots and so on.

In the most advanced G-machine so called the Spineless Tagless G-machine, such a tag is replaced with a function pointer. To evaluate a function application is to jump to the code by the function pointer.

It is unfortunate to find that Simone Peyton Jones's STGM paper does not give compilation/evaluation rules in some abstract level, and so it is very natural for people not to be easy to understand the essence of the STGM.