7
votes

Writing functional notation is often quite costly in terms of auxiliary space consumption. This is particularly critical for canonical writing of lists.

First consider the size of the output: Whereas the usual ignore_ops(false) writing requires at least 2n+1 characters for a list of length n as in [1,2,3], canonical writing requires at least 7n+2 as in '.'(1,'.'(2,'.'(3,[]))). (Added:) There is no way to change the output, as it is defined like that. However, there is ample room for improvement during writing:

And now consider the auxiliary space required to write the output: Writing lists with square brackets does not require any auxiliary space proportional to the length of the list. A naive canonical writing requires space proportional to the length of the list to represent the stack of indentations.

How to write a list canonically without that overhead?


Here is a simple test to check what happens in your system.

First, reduce the maximal virtual memory size to reduce your waiting time, some 180M-ish works for me.

$ ulimit -v -180000

With

nat_sx(N0, s(X)) :-
   N0 > 0,
   N1 is N0-1, nat_sx(N1,X).
nat_sx(0, 0).

?- open('/dev/null',write,Null),
   length(_,I), N is 10^I, nat_sx(N,SX),
   ( Res=unwritten ; write_canonical(Null,SX) ).

SICStus and SWI alike now abort within write_canonical(Null, SX). It would be expected that they rather abort at some point in nat_sx/2. A direct comparison for lists is not possible, as SWI's write_canonical/1 always uses square brackets.

2
Comments are not for extended discussion; this conversation has been moved to chat.Samuel Liew
It is advisible to use the variable name _SX instead of SX to hide the binding at the toplevel. SICStus does this by default. SWI can be instructed to do so by using ?- set_prolog_flag(toplevel_print_anon, false). upfront.repeat
Why wasn't this a problem for me? Did I use Lambdas? Maybe.false
@false thanks a lot for the bonus, really appreciate it. I'm feeling a bit overcompensated for this tbh, the main effort was in the asking. :)Will Ness
@Will: As long as systems have not adopted it, there is no need to stop advertising it.false

2 Answers

6
votes

Just treat '.'( as the opening bracket, ,'.'( as the comma, and increment the integer count of closing parens to be output in the end as the closing bracket.

Thus O(1) auxiliary space.

-1
votes

One might think that implementing write_canonical/1 in Prolog itself would solve the native stack problem. But there is a little twist, if I use this code, then writec/1 cannot use last call optimization because there is a write(')') at the end:

writec(X) :- 
    X =.. [F,Y|L], !,
    writeq(F),
    write('('),
    writec(Y),
    writec_list(L),
    write(')').              /* prevents LCO */
writec(X) :-
    writeq(X).

writec_list([]).
writec_list([X|L]) :-
    write(','),
    writec(X),
    writec_list(L).

For lists in SWI-Prolog would need to add a translation from '[|]' to '.'. But otherwise the code works as expected:

?- writec(f(h(- 1),g(2+3))).
f(h(-(1)),g(+(2,3)))
true.
?- writec([a,b,c]).
'[|]'(a,'[|]'(b,'[|]'(c,[])))
true.

But following the idea of Will Ness we can transform the code as follows. We use as suggested an integer count of the closing parenthesis and handle the closing parenthesis inside the atomic case of writec:

writec2(X) :-
    writec2(0, X).

writec2(N, X) :- 
    X =.. [F,Y], !,
    writeq(F),
    write('('),
    M is N+1,
    writec2(M, Y).
writec2(N, X) :- 
    X =.. [F,Y,Z|L], !,
    writeq(F),
    write('('),
    writec2(Y),
    writec2_list([Z|L], N).
writec2(N, X) :- 
    writeq(X),
    writec2_closing(N).

writec2_closing(0) :- !.
writec2_closing(N) :-
    M is N-1,
    write(')'),
    writec2_closing(M).
    
writec2_list([X], N) :- !,
    write(','),
    M is N+1,
    writec2(M, X).
writec2_list([X,Y|L], N) :-
    write(','),
    writec2(X),
    writec2_list([Y|L], N).

The above code is now fully ammenable to last call optimization. One might try to port it back to a native stack routine using a loop for the last compound argument. A final sanity check, it does indeed the right thing:

?- writec2(f(h(- 1),g(2+3))).
f(h(-(1)),g(+(2,3)))
true.
?- writec2([a,b,c]).
'[|]'(a,'[|]'(b,'[|]'(c,[])))
true.