0
votes

This quote is from SICP that I think is talking about pointers/references in programming languages.

As we have seen, pairs provide a primitive “glue” that we can use to construct compound data objects. Figure 2.2 shows a standard way to visualize a pair—in this case, the pair formed by (cons 1 2). In this representation, which is called box-and-pointer notation, each object is shown as a pointer to a box. The box for a primitive object contains a representation of the object. For example, the box for a number contains a numeral. The box for a pair is actually a double box, the left part containing (a pointer to) the car of the pair and the right part containing the cdr.

In this case the book is talking about pairs (E.G. (cons 1 2)) and how they are represented. However we can also use pairs to construct a list like this:

(cons 1 (cons 2 '()))

Though box-and-pointer-notation is just a notation and useless, I think this looks a lot like a linked list. As I understand it, a linked list is a data structure that contains a value and a pointer to another linked list. Having said that I think cons can be constructed like a linked list. I'm confused by:

The box for a pair is actually a double box, the left part containing (a pointer to) the car of the pair and the right part containing the cdr.

I originally thought the pointer should be on cdr because that would be the next list if we were constructing a list through pairs.

I think this might be a different kind of pointer all together. What exactly does a pointer mean in this case? The only pointer I know is pointers used in c. Does SICP even mention anything about c pointers?

2

2 Answers

7
votes

I originally thought the pointer should be on cdr because that would be the next list if we were constructing a list through pairs.

A cons cell has two values, a car value and a cdr value. What you put into those is not constrained. You can put anything into it.

You can build different data structures out of cons cells. A singly linked list is just one. It could be a binary tree, an assoc list of cons cells providing access via key and value, a circular data structure, and more.

If we do

(cons
      (cons :foo 10)
      (cons :bar 5))

... then how the reference from the cons cell to its car value is done is mostly hidden from the programmer. Most implementations will underneath have some kind of data structure with pointers from a cons cell to its car and cdr components. There usually will be optimizations for small objects like characters and small integers (fixnums) - those can also be directly stored in the car and cdr, instead of using pointers to character objects.

Summary:

A cons cell has two values: a car and a cdr. Both are fully unconstrained: you can reference any other object/value.

Most of the implementation is hidden. In Lisp, all you get is following interface with the basic functions:

  • (consp thing) : the predicate returns T if thing is a cons cell.
  • (car cons-cell) : the car value of a cons cell
  • (cdr cons-cell) : the cdr value of a cons cell
  • (cons thing-0 thing-1) : creates a cons cell, thing-0 and thing-1 can be anything.

Lists are made out of cons cells. But there are other data structures which can be made out of cons cells.

3
votes

Yes, you are correct that cons cells are used to build linked lists in Lisps.

The left part contains the value of the cons. It's a pointer because the value may be a number, an object, another cons (in case of a tree for example) or anything else.

You are also correct that the right part, cdr, contains a pointer to the next cons in the list.

So a list (54 "foobar" 3) would look like this:

            (car cdr)
             /     \
            54      (car cdr)
                     /     \
                  "foobar"  (car nil)
                             / 
                            3