6
votes

Using the development version of SWI Prolog (Win x64), I wrote a DCG predicate for a deterministic lexer (hosted on github) (thus all external predicates leave no choice points):

read_token(parser(Grammar, Tables),
       lexer(dfa-DFAIndex, last_accept-LastAccept, chars-Chars0),
       Token) -->
(   [Input],
    {
     dfa:current(Tables, DFAIndex, DFA),
     char_and_code(Input, Char, Code),
     dfa:find_edge(Tables, DFA, Code, TargetIndex)
    }
->  { table:item(dfa_table, Tables, TargetIndex, TargetDFA),
      dfa:accept(TargetDFA, Accept),
      atom_concat(Chars0, Char, Chars),
      NewState = lexer(dfa-TargetIndex,
                       last_accept-Accept,
                       chars-Chars)
    },
    read_token(parser(Grammar, Tables), NewState, Token)
;   {
     (   LastAccept \= none
     ->  Token = LastAccept-Chars0
     ;   (   ground(Input)
         ->  once(symbol:by_type_name(Tables, error, Index, _)),
             try_restore_input(Input, FailedInput, InputR),
             Input = [FailedInput | InputR],
             format(atom(Error), '~w', [FailedInput]),
             Token = Index-Error
         ;   once(symbol:by_type_name(Tables, eof, Index, _)),
             Token = Index-''
         )
     )
    }
).

Now using (;) and -> a lot, I was wondering SWI-Prolog can optimize the recursive read_token(parser(Grammar, Tables), NewState, Token) using Tail-Call-Optimization, or if I have to split up the predicate into several clauses manually. I just don't know how to find out what the interpreter does, especially knowing that TCO is disabled when running the debugger.

1
Does the stack overflow when running the interpreter without debugging?hardmath
Looking at your code snippet, it seems the necessary condition for TCO is met (there is no choice point open when read_token/3 invokes itself). I realize you are asking whether SWI-Prolog really takes advantage of TCO in this specific case.hardmath
No, as I'm on x64 bit, the stack size is pretty high. I'm just asking for advice on the programming technique. I have no files big enough for the lexer to cause an Overflow I suppose (as I can debug the method just fine, where no TCO is applied)Sebastian
I could answer with command line options that reduce the stack size. That would be my rough-and-ready approach to testing for TCO.hardmath
What you are interested in is the local stack size threshold at which your code fails. See discussion at bottom of page How do I enlarge the stacks?.hardmath

1 Answers

6
votes

To answer your question, I first looked for "trivial" goals that might prevent last call optimization. If found some:

     ;   (   ground(Input)
         ->  once(symbol:by_type_name(Tables, error, Index, _)),
             try_restore_input(Input, FailedInput, InputR),
             Input = [FailedInput | InputR],
             format(atom(Error), '~w', [FailedInput]),
             Token = Index-Error
         ;   once(symbol:by_type_name(Tables, eof, Index, _)),
             Token = Index-''
         )

In these two cases, LCO is prevented by those goals alone.

Now, I compliled your rule and looked at the expansion with listing:

?- listing(read_token).
read_token(parser(O, B), lexer(dfa-C, last_accept-T, chars-J), Q, A, S) :-
        (   A=[D|G],
            dfa:current(B, C, E),
            char_and_code(D, K, F),
            dfa:find_edge(B, E, F, H),
            N=G
        ->  table:item(dfa_table, B, H, I),
            dfa:accept(I, L),
            atom_concat(J, K, M),
            P=lexer(dfa-H, last_accept-L, chars-M),
            R=N,
            read_token(parser(O, B),
                       P,
                       Q,
                       R,
                       S)        % 1: looks nice!
        ;   (   T\=none
            ->  Q=T-J
            ;   ground(D)
            ->  once(symbol:by_type_name(B, error, W, _)),
                try_restore_input(D, U, V),
                D=[U|V],
                format(atom(X), '~w', [U]),
                Q=W-X    % 2: prevents LCO
            ;   once(symbol:by_type_name(B, eof, W, _)),
                Q=W-''   % 3: prevents LCO
            ),
            S=A    % 4: prevents LCO
        ).

ad 1) This is the recursive case you most probably are looking for. Here, everything seems nice.

ad 2,3) Already discussed above, maybe you want to exchange goals

ad 4) This is an effect of the precise, steadfast way how {}//1 is handled in DCGs. As a rule of thumb: Implementers rather prefer to be steadfast than to strive for LCO-ness. Please refer to: DCG Expansion: Is Steadfastness ignored?

Please note also that there is a lot more to this than the simple reuse of the call frame. There is a lot of interaction with garbage collection. To overcome all those problems in SWI, an additional GC phase was necessary.

For more, refer to the tiny benchmarks in Precise Garbage Collection in Prolog

So to finally answer your question: Your rule might become optimized; provided there is no choicepoint left prior to the recursive goal.


There is also the real low level approach to this. I never use this for code development: vm_list. The listing shows you ultimately whether or not SWI might consider LCO (provided no choicepoint is there).

i_call and i_callm will never perform LCO. Only i_depart will do. At: 142 i_depart(read_token/5)

?- vm_list(read_token).
========================================================================
read_token/5
========================================================================
   0 s_virgin
   1 i_exit
----------------------------------------
clause 1 ((0x1cc4710)):
----------------------------------------
   0 h_functor(parser/2)
   2 h_firstvar(5)
   4 h_firstvar(6)
   6 h_pop
   7 h_functor(lexer/3)
   9 h_functor((-)/2)
  11 h_const(dfa)
  13 h_firstvar(7)
  15 h_pop
  16 h_functor((-)/2)
  18 h_const(last_accept)
  20 h_firstvar(8)
  22 h_pop
  23 h_rfunctor((-)/2)
  25 h_const(chars)
  27 h_firstvar(9)
  29 h_pop
  30 i_enter
  31 c_ifthenelse(26,118)
  34 b_unify_var(3)
  36 h_list_ff(10,11)
  39 b_unify_exit
  40 b_var(6)
  42 b_var(7)
  44 b_firstvar(12)
  46 i_callm(dfa,dfa:current/3)
  49 b_var(10)
  51 b_firstvar(13)
  53 b_firstvar(14)
  55 i_call(char_and_code/3)
  57 b_var(6)
  59 b_var(12)
  61 b_var(14)
  63 b_firstvar(15)
  65 i_callm(dfa,dfa:find_edge/4)
  68 b_unify_fv(16,11)
  71 c_cut(26)
  73 b_const(dfa_table)
  75 b_var(6)
  77 b_var(15)
  79 b_firstvar(17)
  81 i_callm(table,table:item/4)
  84 b_var(17)
  86 b_firstvar(18)
  88 i_callm(dfa,dfa:accept/2)
  91 b_var(9)
  93 b_var(13)
  95 b_firstvar(19)
  97 i_call(atom_concat/3)
  99 b_unify_firstvar(20)
 101 b_functor(lexer/3)
 103 b_functor((-)/2)
 105 b_const(dfa)
 107 b_argvar(15)
 109 b_pop
 110 b_functor((-)/2)
 112 b_const(last_accept)
 114 b_argvar(18)
 116 b_pop
 117 b_rfunctor((-)/2)
 119 b_const(chars)
 121 b_argvar(19)
 123 b_pop
 124 b_unify_exit
 125 b_unify_fv(21,16)
 128 b_functor(parser/2)
 130 b_argvar(5)
 132 b_argvar(6)
 134 b_pop
 135 b_var(20)
 137 b_var2
 138 b_var(21)
 140 b_var(4)
 142 i_depart(read_token/5)
 144 c_var_n(22,2)
 147 c_var_n(24,2)
 150 c_jmp(152)
 152 c_ifthenelse(27,28)
 155 b_var(8)
 157 b_const(none)
 159 i_call((\=)/2)
 161 c_cut(27)
 163 b_unify_var(2)
 165 h_functor((-)/2)
 167 h_var(8)
 169 h_var(9)
 171 h_pop
 172 b_unify_exit
 173 c_var(10)
 175 c_var_n(22,2)
 178 c_var_n(24,2)
 181 c_jmp(101)
 183 c_ifthenelse(28,65)
 186 b_firstvar(10)
 188 i_call(ground/1)
 190 c_cut(28)
 192 b_functor((:)/2)
 194 b_const(symbol)
 196 b_rfunctor(by_type_name/4)
 198 b_argvar(6)
 200 b_const(error)
 202 b_argfirstvar(22)
 204 b_void
 205 b_pop
 206 i_call(once/1)
 208 b_var(10)
 210 b_firstvar(23)
 212 b_firstvar(24)
 214 i_call(try_restore_input/3)
 216 b_unify_var(10)
 218 h_list
 219 h_var(23)
 221 h_var(24)
 223 h_pop
 224 b_unify_exit
 225 b_functor(atom/1)
 227 b_argfirstvar(25)
 229 b_pop
 230 b_const('~w')
 232 b_list
 233 b_argvar(23)
 235 b_nil
 236 b_pop
 237 i_call(format/3)
 239 b_unify_var(2)
 241 h_functor((-)/2)
 243 h_var(22)
 245 h_var(25)
 247 h_pop
 248 b_unify_exit
 249 c_jmp(33)
 251 b_functor((:)/2)
 253 b_const(symbol)
 255 b_rfunctor(by_type_name/4)
 257 b_argvar(6)
 259 b_const(eof)
 261 b_argfirstvar(22)
 263 b_void
 264 b_pop
 265 i_call(once/1)
 267 b_unify_var(2)
 269 h_functor((-)/2)
 271 h_var(22)
 273 h_const('')
 275 h_pop
 276 b_unify_exit
 277 c_var(10)
 279 c_var_n(23,2)
 282 c_var(25)
 284 b_unify_vv(4,3)
 287 c_var_n(11,2)
 290 c_var_n(13,2)
 293 c_var_n(15,2)
 296 c_var_n(17,2)
 299 c_var_n(19,2)
 302 c_var(21)
 304 i_exit