6
votes

I am trying to write a program that can detect and print two nodes in BST that have been swapped.

In a three level tree, I reached near to the solution using this approach.

If (!AllSubTreeAreValid())
{
//Nodes swapped on same side of main root node
}
else
{
  int max = getMax(root->left);
  int min = getMin(root->right);
  if(max > root->data || min < root->data )
  {
     //Nodes swapped on different sides of main root node
     //Print max and min values

  }
else 
{
 //No node swappped
}
}

//Helper functions
int GetMaxEle(TreeNode* tree)
{
    while(tree->right!=NULL)
    {
        tree=tree->right;
    }
    return tree->info;
}

int GetMinEle(TreeNode* tree)
{
    while(tree->left!=NULL)
    {
        tree=tree->left;
    }
    return tree->info;
}

but the above approach failed when I tried to test with four level tree.

             20

      15            30

   10    17       25     33

9  16  12  18  22  26  31  34

Being in root node 15's right subtree, 12 is still greater (violation).

Being in root node 15's left subtree, 16 is still greater (violation).

So, 16, 12 are the swapped elements in the above BST. How do I find them through the program?

3
what does getmax and getmin do ? it is not clear to me what you are asking.phoxis
@phoxis: getMax and getMin find the max and min elements in the main root's left and right subtree.Cipher
You mean: you had a BST, two nodes have been swapped "illegally" (so your structure is not a BST anymore) and you want to know wich ones?Nicolas Grebille
could you please post your complete code? your approach seems to be valid, so I suspect an error in getMin() and getMax().Lars
Do you want to detect any two swapped nodes or only swapped leaves in the tree?dcn

3 Answers

8
votes

One way to think about this problem is to use the fact that an inorder walk of the tree will produce all of the elements in sorted order. If you can detect deviations from the sorted order during this walk, you can try to locate the two elements that are in the wrong place.

Let's see how to do this for a simple sorted array first, then will use our algorithm to build something that works on trees. Intuitively, if we start off with a sorted array and then swap two (non-equal!) elements, we will end up with some number of elements in the array being out of place. For example, given the array

1 2 3 4 5

If we swap 2 and 4, we end up with this array:

1 4 3 2 5

How would we detect that 2 and 4 were swapped here? Well, since 4 is the greater of the two elements and was swapped downward, it should be greater than both of the elements around it. Similarly, because 2 was swapped up, it should be smaller than both of the elements around it. From this, we could conclude that 2 and 4 were swapped.

However, this doesn't always work correctly. For example, suppose that we swap 1 and 4:

4 2 3 1 5

Here, both 2 and 1 are smaller than their neighboring elements, and both 4 and 3 are larger than theirs. From this we can tell that two of these four somehow were swapped, but it's not clear which ones we should interchange. However, if we take the largest and the smallest of these values (1 and 4, respectively), we end up getting the pair that was swapped.

More generally, to find the elements that were swapped in the sequence, you want to find

  • The greatest local maximum in the array.
  • The smallest local minimum in the array.

These two elements are out of place and should be swapped.

Now, let's think about how to apply this to trees. Since an inorder walk of the tree will produce the sorted sequence with the two elements out of order, one option would be to walk the tree, recording the inorder sequence of elements we've found, then using the above algorithm. For example, consider your original BST:

              20
         /         \
      15             30
     /   \         /   \ 
   10    17      25     33
  / |   /  \    /  \    |  \
9  16  12  18  22  26  31  34

If we linearize this into an array, we get

9 10 16 15 12 17 18 20 22 25 26 30 31 33 34

Notice that 16 is greater than its surrounding elements and that 12 is less than its. This immediately tells us that 12 and 16 were swapped.

A simple algorithm for solving this problem, therefore, would be to do an inorder walk of the tree to linearize it into a sequence like a vector or deque, then to scan that sequence to find the largest local maximum and the smallest local minimum. This would run in O(n) time, using O(n) space. A trickier but more space-efficient algorithm would be to only keep track of three nodes at a time - the current node, its predecessor, and its successor - which reduces the memory usage to O(1).

Hope this helps!

0
votes

I guess your getMin et getMax works witht the hypotheses that the tree is a BST, so

T getMax(tree) {
  return tree -> right == null 
    ? tree -> value 
    : getMax(tree -> right);
}

(or the equivalent code with a loop). If so, your code examines at most three values in the tree. Even if getMax and getMin traversed the full tree to get the actual max/min, you would still base your test on just two comparisons. If you want to check that your tree satisfy the BST property, it is obvious that you have to examine all the values. Its enough to compare each node with its parent.

void CheckIsBst(Tree *tree) {
  if (tree -> left != null) {
    if (tree -> left -> value > tree -> value) {
      // print violation
    }
    CheckIsBst(tree -> left);   
  }
  // same with -> right, reversing < to > in the test
}

Edit : that was wrong, see comment. I believe this one is ok.

void checkIsBst(Tree *Tree, Tree *lowerBound, Tree *upperBound) {
  if(lowerBound!= null && lowerBound -> value > tree -> Value) {
    //violation
  }
  // same for upper bound, check with <
  if (tree -> left != null) {
    if (tree -> left -> value > tree -> value) {
      // print violation
     }
     CheckIsBst(tree -> left, lowerBound, tree);   
  }
  // same for right, reversing comparison 
  // setting lowerBound to tree instead of upperBound
}

Calling from root with null bounds

0
votes

the tree traversal done by templatetypedef works if you are sure there is only one swap. Otherwise I suggest a solution based on your initial code:

int GetMax(TreeNode* tree) {
    int max_right, max_left, ret;

    ret = tree->data;
    if (tree->left != NULL) {
        max_left = GetMax(tree->left);
        if (max_left > ret)
            ret = max_left;
    }
    if (tree->right != NULL) {
        max_right = GetMax(tree->right);
        if (max_right > ret)
            ret = max_right;
    }

    return ret;
}

int GetMin(TreeNode* tree) {
    int min_right, min_left, ret;

    ret = tree->data;
    if (tree->left != NULL) {
        min_left = GetMin(tree->left);
        if (min_left < ret)
            ret = min_left;
    }
    if (tree->right != NULL) {
        min_right = GetMin(tree->right);
        if (min_right < ret)
            ret = min_right;
    }

    return ret;
}

void print_violations(TreeNode* tree) {
    if ((tree->left != NULL) && (tree->right != NULL)) {
        int max_left = GetMax(tree->left);
        int min_right = GetMin(tree->right);
        if (max_left > tree->data && min_right < tree->data) {
            printf("Need to swap %d with %d\n", max_left, min_right);
        }
    }
    if (tree->left != NULL)
        print_violations(tree->left);
    if (tree->right != NULL)
        print_violations(tree->right);
}

It is slower, but it prints all the swaps it identifies. Can be changed to print all violations (e.g. if (max_left > tree->data) print violation). You can improve the performance if you can add two fields to the TreeNode with the max and min precomputed for that subtree.