Tcl bytecode and procs

Question

I'm using Tcl from C, and generally speaking we create a new namespace for each execution of a Tcl script. If I use proc in the script, will the procedure be under that same namespace? How is a proc converted into bytecode? Is it compiled under the namespace of the caller, or will it be in a different namespace, which might invalidate the bytecode?

Can you please also give me a brief explanation of what exactly Tcl bytecode is, how variables (both local and global) are fetched from the hash tables, and how upvar affects that?

Thanks very much!!!

Donal Fellows · Accepted Answer · 2017-03-14T19:59:13

Each procedure in Tcl knows what its namespace is; it's the same as the namespace containing its name (there's always one; the global namespace is called ::). As such, it probably doesn't make much sense to make a new namespace for every execution.
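To make that concrete, here is a minimal sketch. The namespaces ::lib and ::caller and the procedure names are made up for illustration; they are not from the question. The procedure is called from another namespace, but it still resolves names in the namespace that contains it:

```tcl
# Hypothetical namespaces and procedures, for illustration only.
namespace eval ::lib {
    variable greeting "hello from lib"

    # Created as ::lib::whereami because proc runs inside namespace eval.
    proc whereami {} {
        variable greeting   ;# links to ::lib::greeting, not the caller's
        return [list [namespace current] $greeting]
    }
}

namespace eval ::caller {
    proc run {} {
        # Calling from ::caller does not change the callee's namespace.
        return [::lib::whereami]
    }
}

puts [::caller::run]   ;# prints: ::lib {hello from lib}
```

So a proc defined inside a per-execution namespace lives in that namespace, and it goes away with it if the namespace is deleted.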

The bytecode itself (which executes on a sort of stack machine defined in tclExecute.c in the Tcl source code) is created when it is needed, which is usually when you are about to execute the procedure the first time. You can print out the bytecode using the tcl::unsupported::disassemble command:

% proc example {x} {
    return [expr {$x * 2 + 3}]
}
% puts [tcl::unsupported::disassemble proc example]
ByteCode 0x0x1008d1b10, refCt 1, epoch 15, interp 0x0x100829a10 (epoch 15)
  Source "\n    return [expr {$x * 2 + 3}]"...
  Cmds 2, src 32, inst 9, litObjs 2, aux 0, stkDepth 2, code/src 0.00
  Proc 0x0x103028010, refCt 1, args 1, compiled locals 1
      slot 0, scalar, arg, "x"
  Commands 2:
      1: pc 0-8, src 5-30        2: pc 0-7, src 13-29
  Command 1: "return [expr {$x * 2 + 3}]"...
  Command 2: "expr {$x * 2 + 3}"...
    (0) loadScalar1 %v0     # var "x"
    (2) push1 0     # "2"
    (4) mult 
    (5) push1 1     # "3"
    (7) add 
    (8) done

New-enough versions of Tcl 8.6 also support tcl::unsupported::getbytecode, which provides machine-readable access to the same sorts of information. You don't really want to parse the output of disassemble.
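A hedged sketch of using it follows. The helper name bytecodeKeys is made up, and I'm assuming the result can be treated as a dictionary; since the command lives in tcl::unsupported, its presence and the exact layout may vary between builds, so the helper checks for it first:

```tcl
proc example {x} {
    return [expr {$x * 2 + 3}]
}

# Hypothetical helper: returns the top-level keys of the bytecode
# description, or an empty list when this Tcl build does not provide
# tcl::unsupported::getbytecode.
proc bytecodeKeys {procName} {
    if {[llength [info commands ::tcl::unsupported::getbytecode]] == 0} {
        return {}
    }
    # Assumption: the description is a dictionary.
    return [dict keys [::tcl::unsupported::getbytecode proc $procName]]
}

puts [bytecodeKeys example]
```

The output is deliberately not shown here, because it depends on the Tcl version you run it on.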

Calling a procedure from a different namespace doesn't invalidate the bytecode for that procedure. (Why would it? That'd be desperately inefficient for library code!) But there are operations that know how to access the outside world. Let's do an example with upvar:

% proc example2 {xvar} {
    upvar 1 $xvar x
    return [expr {[incr x] * 2 + 3}]
}
% puts [tcl::unsupported::disassemble proc example2]
ByteCode 0x0x10300e610, refCt 1, epoch 15, interp 0x0x100829a10 (epoch 15)
  Source "\n    upvar 1 $xvar x\n    return [expr {[incr x] * 2 +"...
  Cmds 4, src 58, inst 33, litObjs 4, aux 0, stkDepth 2, code/src 0.00
  Proc 0x0x103028390, refCt 1, args 1, compiled locals 2
      slot 0, scalar, arg, "xvar"
      slot 1, scalar, "x"
  Commands 4:
      1: pc 0-12, src 5-19        2: pc 13-31, src 25-56
      3: pc 22-30, src 33-55        4: pc 22-24, src 40-45
  Command 1: "upvar 1 $xvar x"...
    (0) push1 0     # "1"
    (2) loadScalar1 %v0     # var "xvar"
    (4) upvar %v1   # var "x"
    (9) pop 
    (10) nop 
    (11) nop 
    (12) nop 
  Command 2: "return [expr {[incr x] * 2 + 3}]"...
    (13) startCommand +19 3     # next cmd at pc 32, 3 cmds start here
  Command 3: "expr {[incr x] * 2 + 3}"...
  Command 4: "incr x"...
    (22) incrScalar1Imm %v1 +1  # var "x"
    (25) push1 2    # "2"
    (27) mult 
    (28) push1 3    # "3"
    (30) add 
    (31) done 
    (32) done

The sequence of operations for upvar itself is to push the level parameter and the name of the remote variable onto the stack (in this case the name comes from a variable passed in as an argument), then execute upvar %v1, which binds the local variable table entry at index 1 to the variable in scope 1 (the caller) whose name came from xvar. The binding works by making the local variable a pointer to the other variable; once made, it is highly efficient. The global command uses a similar but slightly different mechanism (the nsupvar opcode binds to a variable in a named namespace).
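At the script level, both kinds of binding look like this. It's only a sketch of the visible behaviour, and the names bump, demo, addTotal and ::total are invented for the example:

```tcl
# upvar: local "v" becomes a link to the caller's variable named in varName.
proc bump {varName} {
    upvar 1 $varName v
    incr v
}

proc demo {} {
    set counter 10
    bump counter        ;# modifies demo's local variable through the link
    return $counter
}

# global: local "total" becomes a link to ::total in the global namespace.
set ::total 0
proc addTotal {n} {
    global total
    incr total $n
}

puts [demo]             ;# prints: 11
addTotal 5
addTotal 7
puts $::total           ;# prints: 12
```

In both cases every later read or write of the local name goes straight through the link to the target variable.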

There are a few operations that invalidate bytecode, but they're things like relocating a procedure or command that has a compilation function. If you're not using rename, you probably won't ever need to worry about it (and it is entirely automatic).

Doing an upvar into a procedure (typically from another procedure) is a bit different, since then variables are looked up by name. The name table for the local variable table is part of the procedure metadata, and any variable not in there is stored in a hash table. That hash table is used when an upvar into the procedure uses a name that the proc does not otherwise use as a variable; it isn't very common, but it can happen.
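A small sketch of that uncommon case, with made-up procedure names inject and host. The body of host never mentions a variable called extra literally, so (going by the description above) it has no slot in the compiled local table and has to be found by name at run time:

```tcl
# The callee reaches into its caller's frame and creates a variable there.
proc inject {} {
    upvar 1 extra e
    set e "created by callee"
}

proc host {} {
    inject
    # "extra" is only ever referred to through a computed name here, so
    # it is looked up by name rather than by a local variable table index.
    set name extra
    return [set $name]
}

puts [host]   ;# prints: created by callee
```

Where exactly the variable is stored is an implementation detail; all the script can observe is that the name-based lookup finds it.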

If you really want the details, there's no substitute for reading the Tcl source code. Bytecode is generated in quite a few places, but the core of it is tclCompile.c; the related execution engine is in tclExecute.c. Procedures are defined in tclProc.c, namespaces in tclNamesp.c and variables in tclVar.c. There are probably other relevant places to look, but those are the main ones.
