To apply the same set of tests to different implementations of the same predicates, or, more generally, to different implementations of the same interface/protocol, the tests must take the implementation as a runtime parameter. Ideally, we should also be able to test the different algorithm implementations with different datasets.
A separate concern is how to organize the data and the algorithms that we want to run on the data. There are two sensible approaches. The first option is to view the data as importing or inheriting the algorithm implementations. In this case, the queries (e.g. reachable/2) would be sent to the data. A downside of this solution is that we may need to update the datasets every time we want to apply a different set of algorithms (e.g. by importing a different module).
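As a minimal sketch of this first option in Logtalk terms (ahead of the fuller solution below, which takes the second option instead; all names here are hypothetical), the data object could inherit the algorithms from a prototype, with edge/2 called via a message to self so that it is looked up in the data object:

:- object(graph_algorithms).

    :- public([edge/2, reachable/2, path/2]).

    reachable(X, Y) :-
        path(X, Y), !.
    reachable(X, Y) :-
        path(Y, X), !.

    path(X, X).
    path(X, Y) :-
        % message to self: edge/2 is defined in the descendant data object
        ::edge(X, Z),
        path(Z, Y).

:- end_object.

:- object(demo_graph,
    extends(graph_algorithms)).

    edge(a, b).
    edge(b, c).

:- end_object.

Queries would then be sent to the data, e.g. demo_graph::reachable(a, c).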
The second option is to view the data as a parameter of the algorithms. An easy implementation of this solution would be to add an extra argument to the predicates (e.g. the path and reachable predicates) that would be used to pass a reference to the data (e.g. user in the simple case mentioned in the question). A downside of this solution is that all algorithm-related predicates would need the extra parameter (e.g. reachable/2 only calls path/2, and it is only this predicate that actually calls edge/2).
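A minimal sketch of this second option in plain Prolog, assuming the data reference is a module name (such as user) and that the Prolog system supports explicitly-qualified calls:

% the extra first argument threads the dataset reference through all predicates
reachable(Data, X, Y) :-
    path(Data, X, Y), !.
reachable(Data, X, Y) :-
    path(Data, Y, X), !.

path(_Data, X, X).
path(Data, X, Y) :-
    % explicitly-qualified call to the dataset's edge/2 facts
    Data:edge(X, Z),
    path(Data, Z, Y).

A query would then look like reachable(user, a, d).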
All the above questions and corresponding alternative solutions can be easily and cleanly expressed using Logtalk parametric objects instead of Prolog modules, together with the Logtalk unit test framework, lgtunit, which supports parameterized tests out of the box. An example solution follows (it is portable and can be used with most Prolog systems).
First, let's make data only about the data. We start by defining a protocol/interface for all data objects:
:- protocol(graph_protocol).

    :- public(edge/2).
    ...

:- end_protocol.
All data objects would implement this protocol. For example:
:- object(graph1,
    implements(graph_protocol)).

    edge(a,b).
    edge(b,c).
    edge(c,d).

:- end_object.
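Other datasets, such as the graph2 and graph3 objects used in the queries below, would be defined in the same way. For example (contents hypothetical):

:- object(graph2,
    implements(graph_protocol)).

    edge(a,c).
    edge(c,b).
    edge(b,d).

:- end_object.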
Next, let's define parametric objects holding the algorithms, with a single parameter used to pass the dataset object. These objects would likely also implement protocols specifying the predicates for which we want to provide alternative implementations. Those protocols are omitted here for brevity.
:- object(path(_Data_)).

    :- uses(_Data_, [edge/2]).

    :- public(path/2).

    path(X,X).
    path(X,Y) :-
        edge(X,Z),
        path(Z,Y).

:- end_object.
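For example, a quick sanity check at the top level, passing graph1 as the parameter:

?- path(graph1)::path(a,d).
...

The second parametric object, providing reachable/2, parameterizes path/1 in turn: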
:- object(reachable(_Data_)).

    :- uses(path(_Data_), [path/2]).

    :- public(reachable/2).

    reachable(X,Y) :-
        path(X,Y), !.
    reachable(X,Y) :-
        path(Y,X), !.

:- end_object.
Note that these objects use your predicate definitions as-is (the uses/2 directive in the reachable/1 parametric object requires Logtalk 3.28.0 or a later version).
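If you're using an older Logtalk 3.x version, a possible workaround (sketched here for the path/1 object; the same idea applies to reachable/1) is explicit message sending to the dataset object passed as the parameter, instead of the uses/2 directive:

:- object(path(_Data_)).

    :- public(path/2).

    path(X,X).
    path(X,Y) :-
        % explicit message to the dataset object passed as parameter
        _Data_::edge(X,Z),
        path(Z,Y).

:- end_object.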
The default case where the dataset is loaded into user can be simplified by defining:
:- object(reachable,
    extends(reachable(user))).

:- end_object.
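With this shorthand defined (and the edge/2 facts loaded into user), the parameter can be omitted:

?- reachable::reachable(a,d).
...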
A typical query would be:
?- reachable(graph1)::reachable(a,d).
...
So far, we're only parameterizing the datasets, not the algorithms. We will get there. The tests can also be defined as a parametric object. For example:
:- object(tests(_Data_),
    extends(lgtunit)).

    :- uses(reachable(_Data_), [reachable/2]).

    test(r1) :-
        reachable(a,a).
    test(r2) :-
        reachable(a,d).
    test(r3) :-
        reachable(d,a).

:- end_object.
Testing over multiple datasets would use a goal such as:
lgtunit::run_test_sets([
    tests(graph1),
    tests(graph2),
    tests(graph3)
])
The original question focused on testing alternative, interchangeable implementations of algorithms. But the solution is the same: we just need to modify the parametric tests object to instead take as its parameter the object implementing the algorithm being tested:
:- object(tests(_Algorithm_),
    extends(lgtunit)).

    :- uses(_Algorithm_, [reachable/2]).

    cover(reachable(_)).
    cover(path(_)).

    test(r1) :-
        reachable(a,a).
    test(r2) :-
        reachable(a,d).
    test(r3) :-
        reachable(d,a).

:- end_object.
And then, in the query that runs the tests, use whatever combination of datasets and algorithms we want. For example:
lgtunit::run_test_sets([
    tests(reachable1(graph1)), tests(reachable2(graph1)),
    tests(reachable1(graph2)), tests(reachable2(graph2)),
    ...
])
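Here, reachable1/1 and reachable2/1 stand for the alternative parametric objects providing the reachable/2 predicate. As a purely hypothetical example of a second implementation, a variant that keeps a list of visited nodes (and thus also terminates on cyclic graphs) might be:

:- object(reachable2(_Data_)).

    :- uses(_Data_, [edge/2]).
    % list is the Logtalk library object providing member/2
    :- uses(list, [member/2]).

    :- public(reachable/2).

    reachable(X, Y) :-
        walk(X, Y, [X]), !.
    reachable(X, Y) :-
        walk(Y, X, [Y]), !.

    walk(X, X, _).
    walk(X, Y, Visited) :-
        edge(X, Z),
        \+ member(Z, Visited),
        walk(Z, Y, [Z| Visited]).

:- end_object.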
The list argument of the lgtunit::run_test_sets/1 predicate can also be dynamically created. For example, assuming that all alternative implementations of the reachable/2 predicate implement a reachable_protocol protocol, the test goal could be:
datasets(Datasets),
findall(
    tests(Algorithm),
    (   implements_protocol(Algorithm, reachable_protocol),
        member(Dataset, Datasets),
        arg(1, Algorithm, Dataset)
    ),
    TestObjects
),
lgtunit::run_test_sets(TestObjects)
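For this goal to work, two more pieces are assumed: the protocol itself and a datasets/1 predicate enumerating the dataset objects. A minimal sketch:

% protocol implemented by all alternative implementations of reachable/2
:- protocol(reachable_protocol).

    :- public(reachable/2).

:- end_protocol.

% plain predicate (e.g. defined in user) listing the available datasets
datasets([graph1, graph2, graph3]).

Note that implements_protocol/2 is a Logtalk built-in predicate that can enumerate, by backtracking, the entities implementing a given protocol.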
One noteworthy aspect of running these tests using lgtunit is that, besides reporting the passed and failed tests, it's also trivial to report predicate code coverage at the clause level. This means that we not only test the algorithms but also check that all clauses used to implement the algorithms are exercised. For this example, using only the graph1 dataset, the test output at the top-level interpreter is:
?- {tester}.
%
% tests started at 2019/8/5, 7:17:46
%
% running tests from object tests(graph1)
% file: /Users/pmoura/Desktop/plu/tests.lgt
%
% r1: success
% r2: success
% r3: success
%
% 3 tests: 0 skipped, 3 passed, 0 failed
% completed tests from object tests(graph1)
%
%
% clause coverage ratio and covered clauses per entity predicate
%
% path(A): path/2 - 2/2 - (all)
% path(A): 2 out of 2 clauses covered, 100.000000% coverage
%
% reachable(A): reachable/2 - 2/2 - (all)
% reachable(A): 2 out of 2 clauses covered, 100.000000% coverage
%
% 2 entities declared as covered containing 4 clauses
% 2 out of 2 entities covered, 100.000000% entity coverage
% 4 out of 4 clauses covered, 100.000000% clause coverage
%
% tests ended at 2019/8/5, 7:17:46
%
true.
If you're automating tests (e.g. using a CI server), you can instead use the logtalk_tester script.
What if we want to keep using modules for the datasets and/or the algorithms? For the tests object, it's just a question of writing instead:
:- object(tests(_Algorithm_),
    extends(lgtunit)).

    :- use_module(_Algorithm_, [reachable/2]).

    ...

:- end_object.
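The parameter is then bound to a module name instead of an object identifier. For example, assuming hypothetical reachable_v1 and reachable_v2 modules exporting reachable/2, the tests could be run with:

lgtunit::run_test_sets([
    tests(reachable_v1),
    tests(reachable_v2)
])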
Logtalk's lgtunit supports testing plain Prolog code and Prolog module code, besides Logtalk code (indeed, the Logtalk distribution includes a Prolog standards conformance test suite). For a tool overview, see e.g.
https://logtalk.org/tools.html#testing
At the above URL, you'll also find a code coverage report example. For a full code example using the solution sketched above, see e.g.
https://github.com/LogtalkDotOrg/logtalk3/tree/master/library/dictionaries
This library provides three alternative implementations of a dictionary API and a single set of tests (using a parametric object) to test all of them.
Last, but not least, you can use this testing solution not only with SWI-Prolog but also with more than ten other Prolog systems.