4
votes

I've seen a thread or two on this already, but I still don't think it's been fully cleared up so...

I've been looking at hash tables with chaining in Introduction to Algorithms (Chapter 11). They say that if a hash table's linked lists are doubly linked then deletion can be performed in O(1) time. But I can't see how.

T is a chained hash table. I want to delete element X from it. 
h(X.key) returns the hash of X's key.
So, to delete X, we pass X to the delete routine.
Then we can find X's slot:
j = T[ h(X.key) ]
Here j is really a pointer to the head of a linked list, since more than one key can hash to that slot.
So we still need to search the linked list at j to find X before we can delete it,
regardless of whether or not the list is doubly linked.
And this search is O(n). So deletion is O(n)???

My point is: don't we still need to search the linked list for X? X is an arbitrary object that we want to store in a hash table; it does not contain any pointers. The element in the linked list that holds X will contain pointers to the next and previous elements in the list, but X itself will not.


4 Answers

3
votes

After digging deeper around here I've stumbled across the answer.

The implicit assumption in the book is that the links in the linked list are part of the element X, rather than there being a separate container-node object that implements the links and points to the element. That is, the object X that we wish to remove contains the pointers to the next and previous elements in the linked list. The authors really should have made this explicit.
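To make the distinction concrete, here is a minimal Python sketch of the two designs (the class and field names are illustrative, not from the book):

```python
# Wrapper-node design (what the question assumes): a separate Node
# object holds the links and points to the stored element, so deleting
# an element means searching its chain for the Node that wraps it.
class Node:
    def __init__(self, elem):
        self.elem = elem      # the stored object; it carries no links
        self.prev = None
        self.next = None

# Intrusive design (what CLRS implicitly assumes): the prev/next links
# live inside the stored element itself, so whoever holds a reference
# to the element already holds everything needed to unlink it in O(1).
class Element:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None
```

With the intrusive design there is no separate container node to hunt for, which is exactly why no search is needed.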

Thanks to RBarryYoung for the answer!

1
votes

The cost of a hash table operation is that of scanning the entries of the selected bucket for the desired key. If the keys are uniformly distributed, then the average cost of an operation depends only on the average number of keys per bucket — in other words, on the load factor.

A hash table with a low collision rate and a high number of slots will always offer near O(1) operations.

Chained hash tables remain effective even when the number of table entries n is much higher than the number of slots. Their performance degrades more gracefully (linearly) with the load factor. For example, a chained hash table with 1000 slots and 10,000 stored keys (load factor 10) is five to ten times slower than a 10,000-slot table (load factor 1); but still 1000 times faster than a plain sequential list.

From this we say that the expected cost of an operation is O(1) when the hash table has sufficient space for the load, i.e., when the load factor is bounded by a constant.
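This degradation is easy to check empirically. The sketch below (my own illustration, not from the answer) simulates uniform hashing and measures the average number of comparisons for a successful search, which theory predicts to be roughly 1 + alpha/2 for load factor alpha:

```python
import random

def avg_successful_search_cost(num_keys, num_slots, trials=5):
    """Average comparisons to find a stored key under uniform hashing.

    Each key lands in a uniformly random slot; finding the i-th key in a
    chain of length L costs i comparisons, so a chain contributes
    L*(L+1)/2 total. Theory predicts about 1 + alpha/2 on average,
    where alpha = num_keys / num_slots.
    """
    total = 0.0
    for _ in range(trials):
        chains = [0] * num_slots
        for _ in range(num_keys):
            chains[random.randrange(num_slots)] += 1
        comparisons = sum(L * (L + 1) // 2 for L in chains)
        total += comparisons / num_keys
    return total / trials

# Load factor 10 vs. load factor 1, matching the example above:
# ~6 comparisons per search vs. ~1.5 -- a few times slower, not 10x the
# slot count slower, and still vastly better than a sequential scan.
print(avg_successful_search_cost(10_000, 1_000))
print(avg_successful_search_cost(10_000, 10_000))
```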

1
votes

Why is deletion O(1) in your case?
The following content is quoted from P258 of CLRS 3rd edition:

Note that CHAINED-HASH-DELETE takes as input an element x and not its key k, so that we don't have to search for x first. If the hash table supports deletion, then its linked lists should be doubly linked so that we can delete an item quickly. If the lists were only singly linked, then to delete element x, we would first have to find x in the list T[h(x.key)] so that we could update the next attribute of x's predecessor.

We have to supply the pointer of the object to be deleted. Let the object be X. So, to delete from the doubly linked list that X belongs to, we simply do the following:

X.left.right = X.right
X.right.left = X.left

Done! Deletion is constant time. The previous/next pointers are in X itself, as is the key. (In a full implementation we also need nil checks for when X is at either end of its chain, and a head-pointer update in the table when X is the first element of its slot.)
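A runnable sketch of the whole scheme, following the answer's X.left/X.right naming (the table of chain heads, the `h` helper, and the insert routine are my additions for completeness):

```python
class Element:
    """The chain links live in the element itself (CLRS's assumption)."""
    def __init__(self, key):
        self.key = key
        self.left = None    # predecessor in the chain
        self.right = None   # successor in the chain

def h(key, m):
    return hash(key) % m   # stand-in hash function

def chained_hash_insert(T, x):
    """Insert x at the head of its chain: O(1)."""
    i = h(x.key, len(T))
    x.right = T[i]
    x.left = None
    if T[i] is not None:
        T[i].left = x
    T[i] = x

def chained_hash_delete(T, x):
    """Unlink x from its chain without any search: O(1)."""
    if x.left is not None:
        x.left.right = x.right
    else:
        T[h(x.key, len(T))] = x.right   # x was the head of its chain
    if x.right is not None:
        x.right.left = x.left
```

No loop appears anywhere in `chained_hash_delete` — the only work is a constant number of pointer updates, regardless of how long the chain is.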

0
votes

Both insertion and deletion on a hash table are O(1). You are correct that the linked list must be That means there may be more than one operation required for either operation. The key is that this chaining only happens when there are collisions of the hash function. If collisions were routine, deleting would be O(n). In algorithm textbooks, the illustrations of how this works may show a relatively small number of buckets, but the average number of collisions does not grow with n, only the worst case.