12
votes

There is something I can't understand about Common Lisp.

Assume I'm writing a macro similar to this:

(defmacro test-macro () 
   (let ((result (gensym))) 
      `(let ((,result 1))
         (print (incf ,result))))) 

Then I can do:

> (test-macro)
2
2

Now I want to see how it expands

> (macroexpand-1 '(test-macro))
(LET ((#:G4315 1)) (PRINT (INCF #:G4315))) ;
T

Ok. There are unique symbols generated with gensym that were printed as uninterned.

So as far as I know, uninterned symbols are symbols for which the evaluator doesn't create a symbol-data binding internally.

So, if we macro-expand to that form there should be an error on (incf #:G4315). To test this we can just evaluate that form in the REPL:

> (LET ((#:G4315 1)) (PRINT (INCF #:G4315)))
*** - SETQ: variable #:G4315 has no value

So why does the macro that expands to this string work perfectly, while the form itself does not?


2 Answers

21
votes

Symbols can be interned in a package or not. A symbol interned in a package can be looked up and found. An uninterned symbol can't be looked up in a package. Only one symbol of a certain name can be in a package. There is only one symbol CL-USER::FRED.

You write:

So as far as I know, uninterned symbols are symbols for which the evaluator doesn't create a symbol-data binding internally.

That's wrong. Uninterned symbols are symbols which are not interned in any package. Otherwise they are perfectly fine. Interned means registered in the package's registry for its symbols.
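As a minimal sketch of that point (the function name uninterned-demo is made up for illustration), an uninterned symbol created with make-symbol has no home package and cannot be found by name, yet it can hold a value like any other symbol:

```lisp
;; UNINTERNED-DEMO is a hypothetical helper name, not part of the question.
(defun uninterned-demo ()
  (let ((interned (intern "FRED" "CL-USER"))   ; registered in CL-USER
        (fresh    (make-symbol "FRED")))       ; same name, no package
    ;; An uninterned symbol is a normal symbol object: it can hold a value.
    (setf (symbol-value fresh) 42)
    (list (symbol-package fresh)                        ; no home package
          (eq interned (find-symbol "FRED" "CL-USER"))  ; lookup finds the interned one
          (eq fresh    (find-symbol "FRED" "CL-USER"))  ; never the uninterned one
          (symbol-value fresh))))

(uninterned-demo)   ; => (NIL T NIL 42)
```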

The s-expression reader uses the symbol name and the package to identify symbols during reading. If there is no such symbol in the package, a new one is created and interned. If there is one, then this one is returned.

The reader does look up symbols by their name, here in the current package:

 (read-from-string "FOO") -> symbol `FOO`

a second time:

 (read-from-string "FOO") -> symbol `FOO`

it is always the same symbol FOO.

 (eq (read-from-string "FOO") (read-from-string "FOO"))  -> T

#:FOO is the syntax for an uninterned symbol with the name FOO. It is not interned in any package. If the reader sees this syntax, it creates a new uninterned symbol.

 (read-from-string "#:FOO") -> new symbol `FOO`

a second time:

 (read-from-string "#:FOO") -> new symbol `FOO`

Both symbols are different. They have the same name, but they are different data objects. There is no registry for symbols other than the packages.

 (eq (read-from-string "#:FOO") (read-from-string "#:FOO"))  -> NIL

Thus in your case (LET ((#:G4315 1)) (PRINT (INCF #:G4315))), the two occurrences of #:G4315 are read as two different symbol objects. The second one therefore names a different variable, and that variable has no value.
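This can be checked directly. The following sketch (read-twice-demo is a name chosen here for illustration) reads the form from the question and compares the symbol in the binding with the symbol inside INCF:

```lisp
;; READ-TWICE-DEMO is a hypothetical helper name.
(defun read-twice-demo ()
  (let* ((form  (read-from-string
                 "(LET ((#:G4315 1)) (PRINT (INCF #:G4315)))"))
         ;; form = (LET ((sym 1)) (PRINT (INCF sym)))
         (bound (first (first (second form))))    ; symbol in the LET binding
         (used  (second (second (third form)))))  ; symbol inside INCF
    (list (string= (symbol-name bound) (symbol-name used)) ; same name?
          (eq bound used))))                               ; same object?

(read-twice-demo)   ; => (T NIL)
```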

Common Lisp has a way to print data, so that the identity is preserved during printing/reading:

CL-USER 59 > (macroexpand-1 '(test-macro))
(LET ((#:G1996 1)) (PRINT (INCF #:G1996)))
T

CL-USER 60 > (setf *print-circle* t)
T

CL-USER 61 > (macroexpand-1 '(test-macro))
(LET ((#1=#:G1998 1)) (PRINT (INCF #1#)))
T

Now you see that the printed s-expression has a label #1= for the first occurrence of the symbol. The later #1# refers back to that same symbol. This can be read back and the symbol identities are preserved, even though the reader can't identify the symbol by looking at the package.

Thus the macro creates a form, where there is only one symbol generated. When we print that form and want to read it back, we need to make sure that the identity of uninterned symbols is preserved. Printing with *print-circle* set to T helps to do that.
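A rough sketch of that round trip, using the test-macro from the question (round-trip-demo is a made-up helper name). It prints the expansion to a string with and without *print-circle*, reads each string back, and evaluates the result. The failing case is caught with a general error handler, on the assumption that the implementation signals an error for the unbound variable, as the REPL in the question does:

```lisp
(defmacro test-macro ()
  (let ((result (gensym)))
    `(let ((,result 1))
       (print (incf ,result)))))

;; ROUND-TRIP-DEMO is a hypothetical helper name.
(defun round-trip-demo (circle)
  (let* ((expansion (macroexpand-1 '(test-macro)))
         ;; Print to a string; CIRCLE controls the #1= / #1# labels.
         (text (let ((*print-circle* circle)
                     (*print-gensym* t))
                 (prin1-to-string expansion)))
         (form (read-from-string text)))
    ;; Evaluate what the reader gave back.
    (handler-case (eval form)
      (error () :error))))

(round-trip-demo t)     ; prints 2, returns 2
(round-trip-demo nil)   ; two distinct symbols are read, so this returns :ERROR
```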

Q: Why do we use uninterned generated symbols in macros by using GENSYM (generate symbol)?

That way we can have unique new symbols which do not clash with other symbols in the code. They get a name from the function gensym, usually with a counter number at the end. Since they are fresh new symbols not interned in any package, there can't be any naming conflict.

CL-USER 66 > (gensym)
#:G1999

CL-USER 67 > (gensym)
#:G2000

CL-USER 68 > (gensym "VAR")
#:VAR2001

CL-USER 69 > (gensym "PERSON")
#:PERSON2002

CL-USER 70 > (gensym)
#:G2003

CL-USER 71 > (describe *)

#:G2003 is a SYMBOL
NAME          "G2003"
VALUE         #<unbound value>
FUNCTION      #<unbound function>
PLIST         NIL
PACKAGE       NIL                      <------- no package
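To illustrate the kind of naming conflict that GENSYM avoids, here is a small sketch with two hypothetical macros, bad-or and good-or (neither appears in the question). The first uses the interned symbol TMP in its expansion and so shadows a user variable of the same name. The second uses a fresh uninterned symbol:

```lisp
;; BAD-OR, GOOD-OR and CAPTURE-DEMO are hypothetical names for illustration.
(defmacro bad-or (a b)
  ;; TMP is an ordinary interned symbol: it can clash with user code.
  `(let ((tmp ,a))
     (if tmp tmp ,b)))

(defmacro good-or (a b)
  ;; A fresh uninterned symbol cannot clash with anything the user wrote.
  (let ((tmp (gensym "TMP")))
    `(let ((,tmp ,a))
       (if ,tmp ,tmp ,b))))

(defun capture-demo ()
  (let ((tmp 5))
    (list (bad-or nil tmp)     ; the macro's TMP shadows ours
          (good-or nil tmp)))) ; our TMP is seen as intended

(capture-demo)   ; => (NIL 5)
```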
0
votes

gensym generates a symbol, and when you print it you get the "string" representation of that symbol, which isn't the same thing as the "reader" representation, i.e. the code representation of the symbol.