Consider a binary search tree, where all keys are unique. Nodes haven't parent pointers.
We have up to n/2 marked nodes.
I can delete all of them at O(n2) time ( using postorder traversal and when encounter a marked node delete each at O(n)). But it is inappropriate.
I need an algorithm to delete all marked nodes at O(n) time.
Thanks.
EDIT After deletion I need to have nodes order unchanged.
EDIT-2 So it should look like I have deleted each marked node using the typical deletion (finding the rightmost node at the left subtree and exchanging it with the node to delete).
3 Answers
There are many ways, but here is one that should be easy to get right, and give you a perfectly balanced tree as a side effect. It requires linear extra space, however.
- Create an array of size N minus the number of marked nodes (or N, if you don't know how many are marked and don't want to check it).
- Place the elements in order in the array, by inorder traversal, skipping the marked elements. This takes linear time, and stack space proportional to the height of the tree.
- Rebuild the tree top-down using the array. The mid element in the array becomes the new root, the mid element of the left subarray its new left child, and the mid element of the right subarray its new right subchild. Repeat recursively. This takes linear time and logarithmic stack space.
Update: forgot to say skipping the marked elements, but that was obvious, right? ;)
I have found how to do it!
- We use an inorder traversal.
- We check if node need to be deleted before recursive calls of the function.
- When we find a node to delete we stand flag toFindMax and searching the rightmost node in the left subtree.
- If we encounter another node to delete we push all our variables to the stack and pop them when the node was deleted.
- When we find the maximum in the left subtree we save reference to it and to it's parent.
And then, when we recursively return to the initial node to delete, we delete it (exchange it with the maximum).
static class LinearDeletion {
public static Node MIN_VALUE = new Node(Integer.MIN_VALUE);;
boolean toFindMax = false;
Node parentOfMax = null;
Node max = MIN_VALUE;
Stack<Object> stack = new Stack<>();
public void perform(Node x, Node parent) {
if (x.isToDelete) {
stack.push(toFindMax);
stack.push(parentOfMax);
stack.push(max);
toFindMax = true;
parentOfMax = null;
max = MIN_VALUE;
if (x.left != null) {
perform(x.left, x);
}
if (x.left == null) { //deletion of the node
if (parent.left == x) {
parent.left = x.right;
} else {
parent.right = x.right;
}
} else {
if (x.right == null) {
if (parent.left == x) {
parent.left = x.left;
} else {
parent.right = x.left;
}
} else {
x.key = max.key;
x.isToDelete = max.isToDelete;
if (parentOfMax != x) {
parentOfMax.right = max.left;
} else {
x.left = max.left;
}
}
} // end of deletion
max = (Node) stack.pop();
parentOfMax = (Node) stack.pop();
toFindMax = (boolean) stack.pop();
if (toFindMax) { // check if the current node is the maximum
if (x.key > max.key) {
max = x;
parentOfMax = parent;
}
}
if (x.right != null) {
perform(x.right, x);
}
} else {
if (x.left != null) {
perform(x.left, x);
}
if (toFindMax) {
if (x.key > max.key) {
max = x;
parentOfMax = parent;
}
}
if (x.right != null) {
perform(x.right, x);
}
}
}
}
I don't see why a postorder traversal would be O(n2). The reason that deleting a single node is O(n) is that you need to traverse the tree to find the node, which is an O(n) operation. But once you find a node, it can be deleted in O(1) time.* Thus, you can delete all O(n) marked nodes in a single traversal in O(n) time.
* Unless you need to maintain a balanced tree. However, you don't list that as a requirement.
EDIT As @njlarsson correctly points out in his comment, a delete operation is not normally O(1) even after the node is found. However, since the left and right subtrees are being traversed before visiting a node to be deleted, the minimum (or maximum) elements of the subtrees can be obtained at no additional cost during the subtree traversals. This enables an O(1) deletion.