2
votes

I'm new to Prolog and am trying to do some programming with lists.
I want to do this:

?- count_occurrences([a,b,c,a,b,c,d], X).
X = [[d, 1], [c, 2], [b, 2], [a, 2]].

This is my code. I know it's not complete, but I'm trying:

count_occurrences([],[]).
count_occurrences([X|Y],A):-
   occurrences([X|Y],X,N).

occurrences([],_,0).    
occurrences([X|Y],X,N):- occurrences(Y,X,W), N is W + 1.
occurrences([X|Y],Z,N):- occurrences(Y,Z,N), X\=Z.

My code is wrong, so I need some hints or help, please.

6
Where's d in [a,b,c]? Why is it included? Why do a, b, and c get counts of 2? - Sergey Kalinichenko
Sorry, my fault, I was trying many possibilities. - Nik
See my other comment, but overall you need a logical plan of attack. If you can't describe the solution in words as logical implications, then you can't write the Prolog. For example, what does occurrences([X|Y],X,N) mean? There's a singleton N, and it unifies the second argument with the head of the first argument. But its semantic meaning is unclear. - lurker
Don't think "print". You just want to collect and Prolog will display the solution. If you sort first, then you count as long as the next one is the same. As soon as it's different, you know you can start a new count for the next one without concern that a prior one will recur. - lurker
See this answer for a pure solution. - false

6 Answers

3
votes

Note that so far all proposals have difficulties with lists that also contain variables. Think of the case:

?- count_occurrences([a,X], D).

There should be two different answers.

   X = a, D = [a-2] ;
   dif(X, a), D = [a-1,X-1].

The first answer means: the list [a,a] contains a twice, and thus D = [a-2]. The second answer covers all terms X that are different from a; for those, we have one occurrence of a and one occurrence of that other term. Note that this second answer includes an infinity of possible solutions, including X = b or X = c or whatever else you wish.

And if an implementation is unable to produce these answers, an instantiation error should protect the programmer from further damage. Something along these lines:

count_occurrences(Xs, D) :-
   ( ground(Xs) -> true ; throw(error(instantiation_error,_)) ),
   ... .

Ideally, a Prolog predicate is defined as a pure relation, like this one. But often, pure definitions are quite inefficient.

Here is a version that is pure and efficient. Efficient in the sense that it does not leave open any unnecessary choice points. I took @dasblinkenlight's definition as a source of inspiration.

Ideally, such definitions use some form of if-then-else. However, the traditional (;)/2 written

   ( If_0 -> Then_0 ; Else_0 )

is an inherently non-monotonic construct. I will use a monotonic counterpart

   if_( If_1, Then_0, Else_0)

instead. The major difference is the condition. The traditional control construct relies upon the success or failure of If_0, which destroys all purity. If you write ( X = Y -> Then_0 ; Else_0 ), the variables X and Y are unified, and at that very point in time the final decision is made whether to go for Then_0 or Else_0. What if the variables are not sufficiently instantiated? Well, then we have bad luck and get some random result by insisting on Then_0 only.
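
To make the loss of solutions concrete, here is a small sketch. The predicate name same_impure/3 is made up for this illustration and is not part of any library; the queries in the comments show what a standard Prolog system answers.

```prolog
% same_impure/3 is a hypothetical name used only for this illustration.
% The condition X = Y is tried once; as soon as it succeeds, the
% else-branch is discarded for good.
same_impure(X, Y, R) :-
    (   X = Y
    ->  R = same
    ;   R = different
    ).

% ?- same_impure(X, a, R).
% X = a, R = same.          % every answer with X different from a is lost
%
% ?- X = b, same_impure(X, a, R).
% X = b, R = different.     % yet adding the goal X = b yields a new answer
```

Adding a goal to a query should only ever remove answers. Here it produces one that the more general query did not report, which is what non-monotonic means in this context.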

Contrast this to if_( If_1, Then_0, Else_0). Here, the first argument must be some goal that will describe in its last argument whether Then_0 or Else_0 is the case. And should the goal be undecided, it can opt for both.

count_occurrences(Xs, D) :-
   foldl(el_dict, Xs, [], D).

el_dict(K, [], [K-1]).
el_dict(K, [KV0|KVs0], [KV|KVs]) :-
    KV0 = K0-V0,
    if_( K = K0,
         ( KV = K-V1, V1 is V0+1, KVs0 = KVs ),
         ( KV = KV0, el_dict(K, KVs0, KVs ) ) ).

=(X, Y, R) :-
   equal_truth(X, Y, R).

This definition requires the following auxiliary definitions: if_/3, equal_truth/3, foldl/4.
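
For readers who want to load everything in one file, the following is a sketch of those auxiliary definitions, assuming SWI-Prolog (which provides dif/2 and, in library(apply), foldl/4). The bodies of if_/3 and equal_truth/3 follow the form in which this technique is commonly circulated; treat them as an assumption rather than as the only possible definitions. The clauses for count_occurrences/2, el_dict/3 and =/3 are repeated from above so that the block loads on its own.

```prolog
:- use_module(library(apply)).   % foldl/4 in SWI-Prolog

% if_(If_1, Then_0, Else_0): call If_1 with one extra argument T that
% reports the truth value, then run the matching branch.
if_(If_1, Then_0, Else_0) :-
    call(If_1, T),
    (   T == true  -> call(Then_0)
    ;   T == false -> call(Else_0)
    ;   nonvar(T)  -> throw(error(type_error(boolean, T), _))
    ;   throw(error(instantiation_error, _))
    ).

% equal_truth(X, Y, T): T is true if X and Y are equal, false if they
% differ. If this cannot be decided yet, both cases are offered on
% backtracking, with X = Y or dif(X, Y) recording the assumption made.
equal_truth(X, Y, T) :-
    (   X == Y -> T = true
    ;   X \= Y -> T = false
    ;   T = true,  X = Y
    ;   T = false, dif(X, Y)
    ).

% Reified equality, so that if_(K = K0, ...) can call (=)/3.
=(X, Y, R) :-
    equal_truth(X, Y, R).

% Repeated from the answer above.
count_occurrences(Xs, D) :-
    foldl(el_dict, Xs, [], D).

el_dict(K, [], [K-1]).
el_dict(K, [KV0|KVs0], [KV|KVs]) :-
    KV0 = K0-V0,
    if_( K = K0,
         ( KV = K-V1, V1 is V0+1, KVs0 = KVs ),
         ( KV = KV0, el_dict(K, KVs0, KVs) ) ).
```

With these clauses loaded, count_occurrences([a,b,a], D) gives D = [a-2,b-1], and count_occurrences([a,X], D) produces the two answers described at the start of this answer.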

2
votes

Here's my solution using bagof/3 and findall/3:

count_occurrences(List, Occ):-
    findall([X,L], (bagof(true,member(X,List),Xs), length(Xs,L)), Occ).

An example

?- count_occurrences([a,b,c,b,e,d,a,b,a], Occ).
Occ = [[a, 3], [b, 3], [c, 1], [d, 1], [e, 1]].

How it works

bagof(true,member(X,List),Xs) is satisfied once for each distinct element X of List, with Xs being a list whose length equals the number of occurrences of X in List:

?- bagof(true,member(X,[a,b,c,b,e,d,a,b,a]),Xs).
X = a,
Xs = [true, true, true] ;
X = b,
Xs = [true, true, true] ;
X = c,
Xs = [true] ;
X = d,
Xs = [true] ;
X = e,
Xs = [true].

The outer findall/3 collects element X and the length of the associated list Xs in a list that represents the solution.

Edit I: the original answer was improved thanks to suggestions from CapelliC and Boris.

Edit II: setof/3 can be used instead of findall/3 if there are free variables in the given list. The problem with setof/3 is that for an empty list it will fail, hence a special clause must be introduced.

count_occurrences([],[]).
count_occurrences(List, Occ):-
    setof([X,L], Xs^(bagof(a,member(X,List),Xs), length(Xs,L)), Occ).

2
votes

If you use SWI-Prolog, you can do this:

:- use_module(library(lambda)).

count_occurrences(L, R) :-
    foldl(\X^Y^Z^(member([X,N], Y)
             ->  N1 is N+1,
             select([X,N], Y, [X,N1], Z)
             ;   Z = [[X,1] | Y]),
          L, [], R).
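
If library(lambda) is not installed, the same fold can be sketched with library(yall), which ships with SWI-Prolog. This is a swapped-in lambda notation, not the code from the answer above; the predicate name count_occurrences_yall/2 is made up for this example, and the body uses select/4 directly as both the test and the update.

```prolog
:- use_module(library(apply)).  % foldl/4
:- use_module(library(lists)).  % select/4
:- use_module(library(yall)).   % [X,Y,Z]>>Goal lambda expressions

% count_occurrences_yall/2 is a hypothetical name for this sketch.
% Acc0 is the list of [Element,Count] entries so far, Acc the updated one.
count_occurrences_yall(L, R) :-
    foldl([X,Acc0,Acc]>>(   select([X,N], Acc0, [X,N1], Acc)
                        ->  N1 is N+1            % seen before: bump count
                        ;   Acc = [[X,1]|Acc0]   % first occurrence
                        ),
          L, [], R).

% ?- count_occurrences_yall([a,b,a], R).
% R = [[b, 1], [a, 2]].
```

New elements are pushed onto the front of the accumulator, so the result lists elements in reverse order of first occurrence.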

1
votes

One thing that should make solving the problem easier would be to design a helper predicate to increment the count.

Imagine a predicate that takes a list of pairs [SomeAtom,Count] and an atom whose count needs to be incremented, and produces a list that has the incremented count, or [SomeAtom,1] for the first occurrence of the atom. This predicate is easy to design:

increment([], E, [[E,1]]).
increment([[H,C]|T], H, [[H,CplusOne]|T]) :-
    CplusOne is C + 1.
increment([[H,C]|T], E, [[H,C]|R]) :-
    H \= E,
    increment(T, E, R).

The first clause serves as the base case, when we add the first occurrence. The second clause serves as another base case when the head element matches the desired element. The last case is the recursive call for the situation when the head element does not match the desired element.

With this predicate in hand, writing count_occ becomes really easy:

count_occ([], []).
count_occ([H|T], R) :-
    count_occ(T, Temp),
    increment(Temp, H, R).

This is Prolog's run-of-the-mill recursive predicate, with a trivial base clause and a recursive call that processes the tail, and then uses increment to account for the head element of the list.
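
As a quick check, the two predicates can be loaded together and run on the list from the question. The clauses are repeated so that the block loads on its own; the query and its answer are given as comments.

```prolog
% increment(+Counts0, +E, -Counts): add one occurrence of E to a list of
% [Element,Count] entries.
increment([], E, [[E,1]]).                   % E not seen before
increment([[H,C]|T], H, [[H,CplusOne]|T]) :- % head entry is for E
    CplusOne is C + 1.
increment([[H,C]|T], E, [[H,C]|R]) :-        % keep looking in the tail
    H \= E,
    increment(T, E, R).

% count_occ(+List, -Counts): count the tail first, then add the head.
count_occ([], []).
count_occ([H|T], R) :-
    count_occ(T, Temp),
    increment(Temp, H, R).

% ?- count_occ([a,b,c,a,b,c,d], R).
% R = [[d, 1], [c, 2], [b, 2], [a, 2]]
```

Because the tail is processed before the head, elements appear in the result in order of their last occurrence, counted from the end of the list. That is why d comes first here, matching the output requested in the question.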


1
votes

You have gotten answers. Prolog is a language that often offers multiple "correct" ways to approach a problem. It is not clear from your question whether you insist on any particular order in your results. So, ignoring order, one way to do it would be:

  1. Sort the list using a sort that keeps duplicates (msort/2 rather than sort/2, which removes them)
  2. Apply a run-length encoding on the sorted list

The main virtue of this approach is that it deconstructs your problem to two well-defined (and solved) sub-problems.

The first is easy: msort(List, Sorted)
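
The difference between the two sorting predicates matters here, so a small sketch may help. It assumes SWI-Prolog, where msort/2 is built in; the predicate name sort_demo/2 is made up for this example.

```prolog
% sort_demo/2 is a hypothetical helper that runs both sorts on one list.
sort_demo(Set, Bag) :-
    List = [b,a,c,a,b],
    sort(List, Set),    % standard order, duplicates removed
    msort(List, Bag).   % standard order, duplicates kept

% ?- sort_demo(Set, Bag).
% Set = [a, b, c],
% Bag = [a, a, b, b, c].
```

Only the second result keeps the information needed to count occurrences.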

The second one is a bit more involved, but still straightforward if you want the predicate to only work one way, that is, List --> Encoding. One possibility (quite explicit):

list_to_rle([], []).
list_to_rle([X|Xs], RLE) :-
    list_to_rle_1(Xs, [[X, 1]], RLE).

list_to_rle_1([], RLE, RLE).
list_to_rle_1([X|Xs], [[Y, N]|Rest], RLE) :-
    (    dif(X, Y)
    ->   list_to_rle_1(Xs, [[X, 1],[Y, N]|Rest], RLE)
    ;    succ(N, N1),
         list_to_rle_1(Xs, [[X, N1]|Rest], RLE)
    ).

So now, from the top level:

?- msort([a,b,c,a,b,c,d], Sorted), list_to_rle(Sorted, RLE).
Sorted = [a, a, b, b, c, c, d],
RLE = [[d, 1], [c, 2], [b, 2], [a, 2]].

On a side note, it is almost always better to prefer "pairs", as in X-N, instead of lists with two elements exactly, as in [X, N]. Furthermore, you should keep the original order of the elements in the list, if you want to be correct. From this answer:

rle([], []).
rle([First|Rest],Encoded):- 
    rle_1(Rest, First, 1, Encoded).               

rle_1([], Last, N, [Last-N]).
rle_1([H|T], Prev, N, Encoded) :-
    (   dif(H, Prev) 
    ->  Encoded = [Prev-N|Rest],
        rle_1(T, H, 1, Rest)
    ;   succ(N, N1),
        rle_1(T, H, N1, Encoded)
    ).

Why is it better?

  • we got rid of 4 pairs of unnecessary brackets in the code

  • we got rid of clutter in the reported solution

  • we got rid of a whole lot of unnecessary nested terms: compare .(a, .(1, [])) to -(a, 1)

  • we made the intention of the program clearer to the reader (this is the conventional way to represent pairs in Prolog)
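
One practical consequence of using Key-Value pairs is that the predicates in library(pairs) apply directly. The following sketch assumes SWI-Prolog; keys_and_total/3 is a name made up for this example.

```prolog
:- use_module(library(pairs)).  % pairs_keys_values/3
:- use_module(library(lists)).  % sum_list/2

% keys_and_total/3 is a hypothetical helper: split Element-Count pairs
% into the list of elements and the total number of occurrences.
keys_and_total(Pairs, Keys, Total) :-
    pairs_keys_values(Pairs, Keys, Counts),
    sum_list(Counts, Total).

% ?- keys_and_total([a-2,b-2,c-2,d-1], Keys, Total).
% Keys = [a, b, c, d],
% Total = 7.
```

With two-element lists instead of pairs, the same split would need a hand-written helper.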

From the top level:

?- msort([a,b,c,a,b,c,d], Sorted), rle(Sorted, RLE).
Sorted = [a, a, b, b, c, c, d],
RLE = [a-2, b-2, c-2, d-1].

The presented run-length encoder is very explicit in its definition, which has of course its pros and cons. See this answer for a much more succinct way of doing it.
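
One possibility for a shorter version, assuming a recent SWI-Prolog where library(lists) provides clumped/2, is sketched below. This is not necessarily what the linked answer does, and the predicate name count_occurrences_sorted/2 is made up for this example.

```prolog
:- use_module(library(lists)).  % clumped/2 (SWI-Prolog)

% count_occurrences_sorted/2 is a hypothetical name for this sketch.
% msort/2 brings equal elements together without dropping duplicates;
% clumped/2 then run-length encodes the sorted list as Element-Count pairs.
count_occurrences_sorted(List, Pairs) :-
    msort(List, Sorted),
    clumped(Sorted, Pairs).

% ?- count_occurrences_sorted([a,b,c,a,b,c,d], Pairs).
% Pairs = [a-2, b-2, c-2, d-1].
```

Like the explicit encoder, this works one way only, from a list to its encoding.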

1
votes

Refining joel76's answer:

:- use_module(library(lambda)).

count_occurrences(L, R) :-
    foldl(\X^Y^Z^(select([X,N], Y, [X,N1], Z)
             ->  N1 is N+1
             ;   Z = [[X,1] | Y]),
          L, [], R).