0
votes

I have been playing with code to implement insert into an AVL balanced binary tree which will be used in a largish project. I selected AVL instead of red black and other balancing methods because of its simplicity as I need to have confidence that it is correct. My test code works fine - however I had been led to believe that it would be possible to use a variable balance limit/tolerance. I haven't seen this implemented as part of AVL and yeah I don't really require it, but as part of debugging it was something I tested which failed. It looks like higher balance limits might need a propagation up the tree for tree reduction height after some rotations.

Has anyone else successfully tried variable balance limits? Note: I don't want to walk the tree to recalculate balance factors if I can help it. At this stage I will probably just stick with a balance limit of 1 unless someone has some useful thoughts.

In the test code below all seems to work if BALANCELIMIT is 1 - but not if higher

#include <stdio.h>

#define BALANCELIMIT 1      // 1 is good - anything else seems bad
#define MINI(a, b) (a<=b ? a : b)
#define MAXI(a, b) (a>=b ? a : b)

struct TREE { // Standard binary tree structure with a SIGNED balance factor (normally -1, 0, or 1)
    struct TREE *left;
    struct TREE *right;
    signed char balance;
    int key;
} *root = NULL;
int count = 0; // Keep a count of created nodes for debugging purposes

struct TREE *rotateLeft(struct TREE *r) { // Rotate left routine to replace node with node->left
    struct TREE *n = r->left;
    r->left = n->right;
    n->right = r;
    r->balance += 1 - MINI(0, n->balance); // Balance calculation sourced from:-
    n->balance += 1 + MAXI(0, r->balance); // http://oopweb.com/Algorithms/Documents/AvlTrees/Volume/AvlTrees.htm
    return n;
}

struct TREE *rotateRight(struct TREE *r) { // Rotate right routine to replace node with node->right
    struct TREE *n = r->right;
    r->right = n->left;
    n->left = r;
    r->balance += -1 - MAXI(0, n->balance); // Balance calculation sourced from:-
    n->balance += -1 + MINI(0, r->balance); // http://oopweb.com/Algorithms/Documents/AvlTrees/Volume/AvlTrees.htm
    return n;
}

int insert(struct TREE **pnode, int key) {
    struct TREE *node = *pnode;
    if (node == NULL) { // If no node then create one and initialise it to contain new value
        node = malloc(sizeof(struct TREE));
        if (node == NULL) exit(0);
        node->left = NULL;
        node->right = NULL;
        node->balance = 0;
        node->key = key;
        *pnode = node;
        count++;
        return 1;   // Recursion exit - signal tree height change (for new node)
    }

    int cmp = key - node->key;
    if (cmp == 0) return 0;
    if (cmp < 0) {  // Decide if we need to recurse to the left or to the right
        if (insert(&node->left, key)) { // For smaller item recurse left - finish unless a height change was signalled
            if (--node->balance < 0) {  // Adjust node balance - if it got worse then we need to do something...
                if (node->balance >= -BALANCELIMIT) return 1; // If height change is within limit just propagate it upward
                if (node->left->balance > 0) {              // If left path heavy to right then do a double rotation
                    printf("rotateRightLeft node %d to replace %d around %d\n", node->left->right->key, node->key, node->left->key);
                    node->left = rotateRight(node->left);   // Double rotate by first rotating left right node up one level
                    *pnode = rotateLeft(node);              // replacing left - and then rotate it again to replace current node
                } else {
                    printf("rotateLeft node %d to replace %d\n", node->left->key, node->key);
                    *pnode = rotateLeft(node);              // Left path heavy to left so just rotate left node up one level
                }
            }
        }
    } else {
        if (insert(&node->right, key)) { // For larger item recurse right - finish unless a height change was signalled
            if (++node->balance > 0) {  // Adjust node balance - if it got worse then we need to do something...
                if (node->balance <= BALANCELIMIT) return 1; // If height change is within limit just propagate it upward
                if (node->right->balance < 0) {             // If right path heavy to left then do a double rotation
                    printf("rotateLeftRight node %d to replace %d around %d\n", node->right->left->key, node->key, node->right->key);
                    node->right = rotateLeft(node->right);  // Double rotate by first rotating right left node up one level
                    *pnode = rotateRight(node);             // replacing right - and then rotate it again to replace current node
                } else {
                    printf("rotateRight node %d to replace %d\n", node->right->key, node->key);
                    *pnode = rotateRight(node);             // Right path heavy to right so just rotate right node up one level
                }
            }
        }
    }
    return 0;   // Recursion exit - signal no further action required
}

void tree_print(struct TREE *node, int depth) { // Recursive tree print routine
    if (node != NULL) {
        tree_print(node->left, depth+1);
        printf("%*.s%d (%d)\n", depth * 5, "", node->key, node->balance);
        tree_print(node->right, depth+1);
    }
}

int check_count; // node count while checking tree
int check(struct TREE *node) { // Recursive tree check routine - check count, balance factor and key order
    int lh = 0, rh = 0;
    if (node != NULL) {
        check_count++;
        if (node->left != NULL) {
            if (node->key < node->left->key) {
                printf("ERROR node key %d smaller than left node\n", node->key);
                exit(0);
            }
            lh = check(node->left); // check left subtree and get its height
        }
        if (node->right != NULL) {
            if (node->key > node->right->key) {
                printf("ERROR node key %d bigger than right node\n", node->key);
                exit(0);
            }
            rh = check(node->right); // check right subtree and get its height
        }
        if (node->balance != rh - lh) {
            printf("ERROR balance %d for %d should be %d\n", node->balance, node->key, rh-lh);
            exit(0);
        }
    }
    return MAXI(lh, rh) + 1; // Return tree height
}

void tree_check(struct TREE *r) { // wrapper function for tree check routine
    check_count = 0;
    check(r);
    if (check_count != count) {
        printf("ERROR Check count is %d not %d \n", check_count, count);
        exit(0);
    }
}

main() {
    int i;
    for (i=0; i<10; i++) insert(&root, rand() % 30000);
    for (i=0; i<10; i++) {
        int r = rand() % 30000;
        insert(&root, r);
        insert(&root, r+1);
        insert(&root, r+2);
        insert(&root, r-1);
        insert(&root, r-2);
    }
    tree_print(root, 0);
    tree_check(root);
    printf("-- Done ------- %d\n\n\n", count);
}
1

1 Answers

0
votes

Yes, I've successfully tried variable balance limits on AVL trees. Mine did walk up the tree after the insert, adjusting balance factors. I also kept track of the height of each node the whole time too (more out of laziness than anything).

While the formulae you are using (for adjusting the balance factors on the demoted and promoted nodes after a rotation) are correct, they're not enough! (when BALANCELIMIT is more than 1). A rotation (applied to the top of a subtree) reduces the height of the subtree only if it makes it more balanced.

In the case of a double rotation, when BALANCELIMIT is 1, the first rotation is always applied to the top of a subtree with a balance factor of +/- 1, and it switches the sign of the imbalance (it does not change the height of the subtree, so it does not invalidate the balance factor of the node above the subtree).

But when BALANCELIMIT is 2 or more, the first rotation (in a double rotation scenario) may actually reduce the height of the subtree it is applied to. The simplest example is shown here.

-               first rotate       second rotate
- input         output             output
-  1               1                    3
-   \               \                  / \
-    4               3                1   4  
-   /               / \                \
-  3               2   4                2
- /
-2

The problem occurs as a result of the first rotation (on the 2,3,4 subtree). It doesn't only change the balance factors of the nodes numbered 3 and 4. Because it brings the (2,3,4) subtree into better balance, it improves the balance factor of node 1 (from +3 to +2). Your code doesn't allow for this. That's why it breaks. To demonstrate, start with an empty tree, and insert the inputs 1,4,3,2, in that order, into your tree. The first rotation should change the balance factor of node 1, but it doesn't. After the second rotation, node 1 should have a balance factor of 1, but will have an incorrect balance factor of 2.

Adding code to correct the balance factors when rotations "unexpectedly" improve subtree heights (which, as shown above, can occur when BALANCELIMIT is more than 1) is probably possible, but I suspect that it would be rather more complicated than tracking and adjusting heights, and recalculating the balance factors after each rotation.