4
votes

I am learning Prolog using SWI Prolog and I have some doubts about how work this implementation of the 2-3 dictionary in Prolog.

I know the theory of the 2-3 dictionary that are trees whose internal nodes can generate 2 or 3 subtrees having the following properties:

1) All the items are stored in the leaves and are ordered from the smaller to the bigger

2) All the leaves are in the same level

3) The internal nodes do not contains the inserted item but contains label that specify the minimal elements of the subtrees in the following way:

  • If an internal node have 2 subtrees, the label in this internal node contains the MINIMAL ELEMENT of its RIGHT SUBTREE (so if I am searching an item X that is smaller then the label I am sure that it is in the left subtree)
  • If an internal node have 3 subtrees, the label in this internal node contains 2 value: M2 and M3 where M1 is the MINIMAL VALUE that is present in the CENTER SUBTREE and M3 is the MINIMAL VALUE that is the the right subtree

So search if an item exist in this kind of dictionary is pretty simple.

The insertion is more complicated, I have understand the theory of it but I have just some problem with the Prolog interpretation.

This is my program (take from Ivan Bratko book: Programming for artificial intelligence):

/* in(Item, Tree) predicate is TRUE if the searched Item is in the specified Tree */

% BASE CASE: Item found in a leaf, so end
in(Item, l(Item)).  

/* CASE 1: I am searching Item in an internal node having a single label value
           so this internal node have 2 subtrees
*/
in(Item, n2(T1,M,T2)) :- gt(M,Item), % IF M label value lexicographically follows the searched Item value
                         !,
                         in(Item,T1)    % THEN search Item in the left subtree
                         ;      % (; is an OR)
                         in(Item,T2).   % Otherwise search Item in the right subtree

/* CASE 2: I am searching Intem in an internal node having 2 label values so this
           internal node have 3 subtrees
*/

in(Item, n3(T1,M2,T2,M3,T3)) :- gt(M2,Item), % IF M2 label value lexicographically follows the searched Item value
                                !,
                                in(Item,T1) % THEN search Item in the left subtree
                                ;       % (; is an OR)
                                /* IF M3 label value lexicographically follows the searched Item value 
                                   BUT this is NOT TRUE that M2>Item
                                */
                                gt(M3,Item),
                                !,
                                in(Item,T2) % THEN search Item in the central subtree
                                ;       % (; is an OR)
                                in(Item,T3).    % ELSE (it is TRUE that Item>M3) search in the right subtree 


/* 
*/

% Insertion in the 2-3 dictionary

/* Add X to Tree giving Tree1
   CASE 1: After the insertion of X into Tree, Tree1 does not grow upwards (so it means that an internal nodes 
           having 2 subtrees child, now have 3 subtrees child, so the original tree has grown in width)
*/
add23(Tree, X, Tree1) :- ins(Tree, X, Tree1).                    

/* CASE 2: Tree grows upwards: It means that  if after the insertion of X the height of the new tree is 
           increased so the ins/5 predicate determines the two subtrees T1 and T2 which are then combined
           into a bigger tree    
*/
add23(Tree, X, n2( T1, M2, T2)) :- ins(Tree, X, T1, M2, T2).

del23(Tree, X, Tree1) :- add23(Tree1, X, Tree).    % Delete X from Tree giving Tree1


/* BASE CASE: Inserting the X item into a voil (nil) tree means to obtain a tree 
              consisting of the single new leaf l(X)
*/
ins(nil, X, l(X)).

/* BASE CASES: related to inserting a new item X into a tree composed by 
               a single leaf
*/
ins(l(A), X, l(A), X, l(X)) :- gt(X, A).

ins(l(A), X, l(X), A, l(A)) :- gt(A, X).


/* Tree = n2(T1, M , T2) so Tree is a tree having 2 subtrees T1 and T2
   M: is the MINIMAL ELEMENT OF T2

   IF it is TRUE that M>X (the minimal element in T2 is > the new X item)
   I have to insert X in the LEFT subtrees (into T1 that becomes NT1 and
   now have 2 leaves)
*/
ins(n2(T1, M , T2), X, n2(NT1, M, T2)) :- gt(M, X),
                                          ins(T1, X, NT1).


/* Tree = n2(T1, M , T2) so Tree is a tree having 2 subtrees T1 and T2
   M: is the MINIMAL ELEMENT OF T2.

   IF it is TRUE that M>X (the minimal element in T2 is > the new X item) and
   IF I can insert 
*/ 
ins(n2(T1, M, T2), X, n3(NT1a, Mb, NT1b, M, T2)) :- gt(M, X),
                                                    ins(T1, X, NT1a, Mb, NT1b).


ins(n2(T1, M, T2), X, n2(T1, M, NT2)) :- gt(X, M),
                                         ins(T2, X, NT2).

ins( n2( T1, M, T2), X, n3( T1, M, NT2a, Mb, NT2b))  :-
   gt( X, M),
   ins( T2, X, NT2a, Mb, NT2b).


ins( n3( T1, M2, T2, M3, T3), X, n3( NT1, M2, T2, M3, T3))  :-
   gt( M2, X),
   ins( T1, X, NT1).

/* Tree = n3(T1, M2, T2, M3, T3) so Tree is a tree having 3 subtree: T1,T2 and T2
   and the ROOT of Tree is the node (M2,M3)

   M2: MINIMAL ELEMENT of T2 subtree
   M3: MINIMAL ELEMENT of T3 subtree

   If I had the item X then Tree have to grow in height

   IF it is TRUE that M2 > X I could try to insert the X item into T1 subtree
   IF it is TRUE that X is added in T1 and T1 is splitted in NT1a and NT1b having root Mb

   so the new tree is: n2(NT1a, Mb, NT1b), M2, n2(T2, M3, T3)


*/
ins(n3(T1, M2, T2, M3, T3), X, n2(NT1a, Mb, NT1b), M2, n2(T2, M3, T3)) :-
                    gt(M2, X),
                    ins(T1, X, NT1a, Mb, NT1b).


ins(n3(T1, M2, T2, M3, T3), X, n3(T1, M2, NT2, M3, T3)) :-
   gt(X, M2), gt(M3, X),
   ins(T2, X, NT2).

ins( n3( T1, M2, T2, M3, T3), X, n2( T1, M2, NT2a), Mb, n2( NT2b, M3, T3)) :-
   gt( X, M2), gt( M3, X),
   ins( T2, X, NT2a, Mb, NT2b).


ins( n3( T1, M2, T2, M3, T3), X, n3( T1, M2, T2, M3, NT3))  :-
   gt( X, M3),
   ins( T3, X, NT3).

ins( n3( T1, M2, T2, M3, T3), X, n2( T1, M2, T2), M3, n2( NT3a, Mb, NT3b))  :-
   gt( X, M3),
   ins( T3, X, NT3a, Mb, NT3b).

In this implementation I have that:

  • l(X) rappresent an item present in my tree
  • n2(T1,M,T2) rappresents a tree with 2 subtress T1 and T2 where M is the minimal element in T2
  • n3(T1,M2,T2,M3,T3)** rappresents a tree with 3 subtrees: T1,T2,T3 where M2 is the minimal element in T2 and M3 is the minimal element in T3

The first doubt is related to the relation that exist between the add23 and the ins predicate, these:

/* Add X to Tree giving Tree1
   CASE 1: After the insertion of X into Tree, Tree1 does not grow upwoards (so it means that an internal nodes 
           having 2 subtrees child, now have 3 subtrees child, so the original tree has grown in width)
*/
add23(Tree, X, Tree1) :- ins(Tree, X, Tree1).                    

/* CASE 2: Tree grows upwards: It meaans that  if after the insertion of X the height of the new tree is 
           increased so the ins/5 predicate determines the two subtrees T1 and T2 wich are then combined
           into a bigger tree    
*/
add23(Tree, X, n2( T1, M2, T2)) :- ins(Tree, X, T1, M2, T2).

As write in the comments I thinkt that the first one is related to the case in wich the new tree don't grow upwoards (so for example I had a tree that have 2 subtrees and inserting a new item I am creating a new subtree that have its leaves at the same level all others leaves (and so an internal node will now have 2 labels) Is it right?

So in logic I can read it as: if ins(Tree, X, Tree1) predicate is TRUE then I have add X to Tree generating the new Tree1 that have the same height of the old Tree but that contains X item (so it must have an internal node having 3 child)

The second case is related at the case in wicht I have a tree and I have to insert a new item in a node that which has already 3 subtrees so I have to split the old tree in 2 trees that as root have both as label a single label take from the 2 labels of the old original node. So I can insert the new item as leaf and as new root of the new tree.

So in logic I can read it as:

If the predicate ins(Tree, X, T1, M2, T2) is TRUE then it means that it is TRUE the predicate: add23(Tree, X, n2( T1, M2, T2))

This is the case in wich I have split the original tree Tree, that have 3 subtrees, into two different tree: T1 and T2 that as root have a node with a single minimal label take from the old root of the original tree Tree.

So it happens that I will surely have one of these trees will have two subtrees and the other one that have a single subtree, so I can add the new inserted item to this subtree and also use it as new root of the newtree that is increased in height by one level.

Is it right? I am not sure about this...

On the book I have found the explaination of three specific case of ins predicate and I am having many doubts about how it work:

CASE 1:

ins(n2(T1, M , T2), X, n2(NT1, M, T2)) :- gt(M, X),
                                          ins(T1, X, NT1).

This case say me that the original tree Tree is: n2(T1, M , T2) and I insert X into Tree that is a tree having only 2 subtrees: T1 and T2.

M is the MINIMUM ELEMENT OF THE RIGHT SUBTREE

So if gt(M, X) is TRUE it means that M>X so it means that X could be the LEFT SUBTREE that became NT1 (why? it maybe depends by the fact that the old T1 have only 1 leaf in one of its subtree?) and so, at the end, the original tree became n2(NT1, M, T2)

I think that this case it is simple

CASE 2:

ins(n2(T1, M, T2), X, n3(NT1a, Mb, NT1b, M, T2)) :- gt(M, X),
                                                    ins(T1, X, NT1a, Mb, NT1b).

Also this case say me that the original tree Tree is: n2(T1, M , T2) and I insert X into Tree that is a tree having only 2 subtrees: T1 and T2.

M is the MINIMUM ELEMENT OF THE RIGHT SUBTREE

IF it is TRUE that M>X and this is TRUE the predicate: ins(T1, X, NT1a, Mb, NT1b)

it means that I split T1 into 2 subtrees NT1a and NT1b adding a third child to the origincal tree that became n3(NT1a, Mb, NT1b, M, T2)

Ok this is pretty clear but my problem is: in the complete code what is this predicate that have be to satisfy? I am doing confusion...

CASE 3:

ins(n3(T1, M2, T2, M3, T3), X, n2(NT1a, Mb, NT1b), M2, n2(T2, M3, T3)) :-
                    gt(M2, X),
                    ins(T1, X, NT1a, Mb, NT1b).

This case is the case in which the original tree Tree have 3 subtrees and when I have to insert it I have to split the original tree into two tree (n3(T1, M2, T2, M3, T3) and n2(NT1a, Mb, NT1b)) , insert the new item X into one of these subtrees and use the minimal elements of the second subtree as root of the left subtree as root of the new tree (that now is higher than a level)

My doubt is: in the previous case the NewTree is n2(NT1a, Mb, NT1b), M2, n2(T2, M3, T3)

so M2 (the minimum element of the right subtree) is the new root because it is TRUE that M2>X?

Can you give me some hints to understand better these thing?

1

1 Answers

1
votes

First, a couple of style points. You don't need to have separate constructors for n2 and n3, as the arities will deal with that for you. Second, you should be suspicious of any predicate that has cuts in it, especially the red cuts you're using in in/2. It's generally better (and faster) to do explicit case analysis. Also, if you can, put what you're case comparing on in the first argument as that's indexed by default.

I'd rewrite in/2 as

in(leaf(Item), Item).
in(node(Left, M, Right), Item):-
    compare(Order, Item, M),
    in_2(Order, Item, node(Left, M, Right).
in(node(Left, M1, Mid, M2, Right), Item):-
    compare(Order, Item, M1),
    in_3(Order, Item, node(Left, M1, Mid, M2, Right)).

in_2(<, Item, node(Left, _, _)):-
    in(Left, Item).
in_2(=, Item, node(_, _, Right)):-
    in(Right, Item).
in_2(>, Item, node(_, _, Right)):-
    in(Right, Item).

in_3(<, Item, node(Left, _, _, _, _)):-
    in(Left, Item).
in_3(=, Item, node(_, _, Mid, _, _)):-
    in(Mid, Item).
in_3(>, Item, node(Left, M1, Mid, M2, Right)):-
    compare(Order, Item, M2),
    in_3a(Order, Item, node(Left, M1, Mid, M2, Right)).

in_3a(<, Item, node(_, _, Mid, _, _)):-
    in(Mid, Item).
in_3a(=, Item, node(_, _, _, _, Right)):-
    in(Right, Item).
in_3a(>, Item, node(_, _, _, _, Right)):-
    in(Right, Item).

As for tree insertion, it's a bit more complicated than I think you've realised.

One key feature of 2-3 trees is that all the leaves are at the same depth. That means, when you're inserting an item, you have to traverse down the tree to find where among the leaves you need to insert it, and then propagate changes back up the tree to ensure it remains balanced. These slides from a data structures lecture outline the algorithm. Try just translating them into Prolog. The slides lay out clearly what the different cases are. The example code I gave above should help you with the first phase of the insertion algorithm (finding the correct bottom-level node to insert the new leaf). Take the same approach when you're working back up the tree.