1
votes

I want to count the number of successes AND failures of a predicate that takes no arguments. For simplicity, the predicate I want to test is test_arith/0, which has 3 tests for is/2. (I really want to test more complicated predicates that I'm writing, but those details would be a distraction for this question.)

P.S. I saw the other question about counting the number of times a predicate is true. I want to count both the successes and the failures in a single pass. I don't want to run the real predicates more than once per test case, because some of them take a long time to execute. aggregate/3 and aggregate_all seem to be single-minded as well, counting only the successes.

test_arith(Passes, Failures) :-
    findall(P-F, (test_arith->(P=1,F=0);(P=0,F=1)), Scores),
    summarize_scores(Scores, 0, 0, Passes, Failures).

test_arith :- 5 is 3 +2.  % Test #1: Should pass
test_arith :- 5 is 2 +2.  % Test #2: Should fail
test_arith :- 4 is 2 +2.  % Test #3: Should pass

summarize_scores([], Passes, Failures, Passes, Failures).
summarize_scores([P-F|Scores], Passes_SF, Failures_SF, Passes, Failures) :-
    Next_Passes is P + Passes_SF,
    Next_Failures is F + Failures_SF,
    summarize_scores(Scores, Next_Passes, Next_Failures, Passes, Failures).

When I run

test_arith(P,F).

I get

P = 1,
F = 0.

because test_arith seems to be called only once. I should get

P = 2,
F = 1.

Thanks for any help you can give.

I tried:

test_arith(Passes, Failures) :-
    bagof(P-F, A^(test_arith(A)->(P=1,F=0);(P=0,F=1)), Scores),
    summarize_scores(Scores, 0, 0, Passes, Failures).

test_arith(_) :- 5 is 3 +2.
test_arith(_) :- 5 is 2 +2.
test_arith(_) :- 4 is 2 +2.

test_arith2(Passes) :-
    aggregate(count, A^test_arith(A), Passes).

test_arith(P,F) yields P = 1, F = 0, and test_arith2(P) yields P = 2. (It's good that the count of successes works, but that is only part of what I'm looking for. I also need a count of failures, and each test should be run only once per test run: 3 calls in this case.)

Then I tried adding a number for each test case:

test_arith(Passes, Failures) :-
    bagof(P-F, A^(test_arith(A)->(P=1,F=0);(P=0,F=1)), Scores),
    summarize_scores(Scores, 0, 0, Passes, Failures).

test_arith(1) :- 5 is 3 +2.
test_arith(2) :- 5 is 2 +2.
test_arith(3) :- 4 is 2 +2.

and got:

test_arith(P,F).
    P = 1,
    F = 0.
2
You need to parameterize test_arith/0. Otherwise it just succeeds twice. - false
You need a number for each test case! Then you can count them. - false
Tried it... see above. - Chelmite
Almost: currently you get only the successes, and you can imagine that the other numbers fail. Otherwise you could enumerate them explicitly: between(1,3,A), ( test_arith(A) -> .... (It is not a good idea to remove A...) - false
For the general solution to this you might look into plunit (not PL/Unit). - false
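
The explicit enumeration suggested in the comments could be sketched roughly as follows. This is a minimal sketch, assuming SWI-Prolog and the numbered test_arith/1 clauses from the question; count_tests/2 is a hypothetical name, and the upper bound 3 is hard-coded for this example. Each test case is called exactly once, and if-then-else turns its outcome into a value that findall/3 can collect.

```prolog
:- use_module(library(apply)).   % partition/4

% Numbered test cases, as in the question's last attempt.
test_arith(1) :- 5 is 3 + 2.   % should pass
test_arith(2) :- 5 is 2 + 2.   % should fail
test_arith(3) :- 4 is 2 + 2.   % should pass

% count_tests/2 (hypothetical name): run each numbered case once
% and record pass or fail for it.
count_tests(Passes, Failures) :-
    findall(R,
            ( between(1, 3, A),
              ( test_arith(A) -> R = pass ; R = fail )
            ),
            Results),
    % Split the outcomes into passes and failures, then count each group.
    partition(==(pass), Results, Passed, Failed),
    length(Passed, Passes),
    length(Failed, Failures).
```

Here between/3 supplies the test numbers, so a failing case still produces a result instead of silently disappearing from the findall/3 list.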

2 Answers

0
votes

It looks like there is a typo in your findall, where the "else" of your implication binds F to both 0 and 1, instead of P to 0 and F to 1. Is that copied directly from your code?

If so, that could explain why the aggregate methods only pick up the true cases; the false cases would never pass.

Edited to add:

Although I think it's great practice to take advantage of predicates like findall, sometimes you can't beat a good fail loop: memory usage is quite a bit lower, and I find the performance to be similar. In SICStus Prolog, my approach would almost certainly be something along these lines:

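% Counters are kept as dynamic facts success/1 and failure/1.
:- dynamic success/1, failure/1.

% The goal to test is passed in.
% call(Foo) succeeds.
evaluate(Foo) :-
    call(Foo),
    incrementSuccess,
    !.
% call(Foo) fails.
evaluate(_Foo) :-
    incrementFailure.

incrementSuccess :-
    success(N),
    N2 is N + 1,
    retract(success(N)),
    assert(success(N2)),
    !.
incrementSuccess :-
    assert(success(1)).

% incrementFailure is very similar.
incrementFailure :-
    failure(N),
    N2 is N + 1,
    retract(failure(N)),
    assert(failure(N2)),
    !.
incrementFailure :-
    assert(failure(1)).

% A fail loop around evaluate/1. Note that the cut in evaluate/1
% commits to the first solution of Foo.
tally(Foo, _Success, _Failure) :-
    evaluate(Foo),
    fail.
% The catch case that passes out the final tallies.
tally(_, Success, Failure) :-
    success(Success),
    failure(Failure).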
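
A failure-driven loop can also be run over an explicit enumeration of numbered test cases, so that each case is executed once and counted as either a pass or a failure. The following is only a sketch under stated assumptions: it targets SWI-Prolog, and case/1, counter/2, bump/1, count/2 and run_cases/2 are hypothetical names introduced for illustration, not part of the answer above.

```prolog
:- dynamic counter/2.   % counter(Outcome, Count), hypothetical bookkeeping fact

% Hypothetical numbered test cases mirroring the question's tests.
case(1) :- 5 is 3 + 2.   % should pass
case(2) :- 5 is 2 + 2.   % should fail
case(3) :- 4 is 2 + 2.   % should pass

% bump(+Outcome): add one to the counter for Outcome, creating it if needed.
bump(Outcome) :-
    ( retract(counter(Outcome, N)) -> true ; N = 0 ),
    N1 is N + 1,
    assertz(counter(Outcome, N1)).

% count(+Outcome, -N): read a counter, defaulting to 0 if it was never bumped.
count(Outcome, N) :-
    ( counter(Outcome, N0) -> N = N0 ; N = 0 ).

% run_cases(-Passes, -Failures): failure-driven loop over the case numbers.
run_cases(Passes, Failures) :-
    retractall(counter(_, _)),
    (   between(1, 3, A),
        ( case(A) -> bump(pass) ; bump(fail) ),
        fail                      % force backtracking into between/3
    ;   true                      % all cases done
    ),
    count(pass, Passes),
    count(fail, Failures).
```

No list of results is built; the only state kept between iterations is the pair of counter/2 facts.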
0
votes

I think the problem is the 'implicit cut' in (->)/2. Try

test_arith(Passes, Failures) :-
    findall(P-F, (test_arith, P=1,F=0 ; P=0,F=1), Scores),
    summarize_scores(Scores, 0, 0, Passes, Failures).

and you'll get

?- test_arith(P,F).
P = 2,
F = 1.
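
The effect of the implicit cut can be seen in isolation with a small sketch. The predicate names t/0, answers_with_ite/1 and answers_plain/1 are hypothetical and only serve this illustration; t/0 mirrors the three clauses of test_arith/0 from the question.

```prolog
% Three clauses: the first and third succeed, the second fails.
t :- 5 is 3 + 2.
t :- 5 is 2 + 2.
t :- 4 is 2 + 2.

% If-then-else commits to the first solution of its condition,
% so findall/3 sees a single answer.
answers_with_ite(Answers) :-
    findall(R, ( t -> R = pass ; R = fail ), Answers).

% Called directly, t is retried on backtracking, so every
% succeeding clause contributes one answer.
answers_plain(Answers) :-
    findall(pass, t, Answers).
```

With these definitions, answers_with_ite/1 collects one element and answers_plain/1 collects two, which matches the P = 1 result reported in the question.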

edit

OT, but I like it when I can make the logic more compact, with a little help from the library of course. Here is an equivalent definition:

test_arith(Passes, Failures) :-
    findall(R, (test_arith, R=1-0 ; R=0-1), Scores),
    aggregate(r(sum(A),sum(B)), member(A-B, Scores), r(Passes, Failures)).

And then, why build a list only to scan it immediately?

test_arith(Passes, Failures) :-
    aggregate(r(sum(A),sum(B)), (test_arith, A=1,B=0 ; A=0,B=1), r(Passes, Failures)).

edit: the above code is incorrect, as it is unable to count failures. I was fooled by the fact that it seemed to work with this specific test case.

With the help of @false, here is reify_call/3, a building block that could solve the OP's problem (tested in SWI-Prolog, where clause/2 is arguably extended with respect to ISO compatibility, given @false's comment on the question):

test_arith(Passes, Failures) :-
    aggregate(r(sum(T),sum(F)), reify_call(test_arith, T, F), r(Passes, Failures)).

:- meta_predicate reify_call(0, -, -).

reify_call(Pred, True, False) :-
    clause(Pred, Cl), (call(Cl) -> True = 1, False = 0 ; True = 0, False = 1).
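
To see what the clause/2 enumeration produces before any summing, the per-clause outcomes can be collected into a list. This is a sketch only, assuming SWI-Prolog; clause_outcomes/1 is a hypothetical name, and test_arith/0 is declared dynamic here as an assumption so that inspecting it with clause/2 does not depend on the system allowing access to static code.

```prolog
% Declared dynamic (an assumption of this sketch) so that clause/2
% may inspect the clauses on systems that restrict static code.
:- dynamic test_arith/0.

test_arith :- 5 is 3 + 2.   % Test #1: should pass
test_arith :- 5 is 2 + 2.   % Test #2: should fail
test_arith :- 4 is 2 + 2.   % Test #3: should pass

% clause_outcomes(-Outcomes): one entry per clause of test_arith/0,
% in clause order. Each clause body is called exactly once.
clause_outcomes(Outcomes) :-
    findall(O,
            ( clause(test_arith, Body),
              ( call(Body) -> O = pass ; O = fail )
            ),
            Outcomes).
```

Because clause/2 backtracks over the clauses themselves rather than over the solutions of test_arith, a clause whose body fails still yields an entry in the list.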