7
votes

It seems to me that the only advantage of a heap over a binary tree is that finding the smallest item takes O(1) in a heap instead of O(log n) in a binary tree.

When implementing a priority queue, you need to delete the smallest item from the data structure each time. Deleting the smallest item takes O(log n) in both a tree and a heap. Although deleting an item from a tree can be more complex in general, deleting an item with no children is actually very simple.

My question is: why use a heap instead of a binary tree (which is simpler in this case) when implementing a priority queue?

5
If you have implemented a heap using a node structure instead of array, then I'll believe what you're saying :). - Luiggi Mendoza
A Heap is a pretty simple data structure to implement... - trognanders

5 Answers

11
votes

The worst-case complexity for a binary tree is O(n), when the tree degenerates into a linked list, while for a heap it remains O(log n). You can use balanced binary trees such as red-black or AVL trees, but then the implementation becomes more complex and requires more memory.
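To make the worst case concrete, here is a minimal sketch that inserts already-sorted keys into a plain, unbalanced binary search tree and compares the resulting number of levels with that of a binary heap holding the same number of items. The names `Node`, `bst_insert`, `bst_height` and `heap_height` are made up for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    """Insert key into an unbalanced BST (iteratively); return the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right

def bst_height(root):
    """Count levels breadth-first, so deep trees don't hit the recursion limit."""
    height = 0
    level = [root] if root else []
    while level:
        height += 1
        level = [c for n in level for c in (n.left, n.right) if c]
    return height

def heap_height(n):
    """Levels in a binary heap of n items; a heap is always a complete tree."""
    return n.bit_length()  # equals floor(log2(n)) + 1 for n >= 1

n = 1000
root = None
for key in range(n):  # sorted input: the worst case for a plain BST
    root = bst_insert(root, key)

degenerate_height = bst_height(root)  # one node per level
complete_height = heap_height(n)
```

With sorted input every node becomes a right child, so the tree has as many levels as it has items, while the heap's shape never depends on insertion order.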

5
votes

Heaps are usually simpler to implement than properly balanced binary trees. Additionally, they have less memory overhead (elements can be stored directly in an array, without having to allocate tree nodes and pointers and everything) and potentially faster performance (largely due to the memory locality of using a single contiguous array)... why wouldn't you use them?
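As a rough illustration of how little code that takes, here is a minimal sketch of an array-backed binary min-heap. The class name `ArrayMinHeap` and its method names are made up for this example; the only storage is one list, and parent/child links are computed from indices rather than stored.

```python
class ArrayMinHeap:
    """Binary min-heap stored in a single list; no node objects or pointers."""

    def __init__(self):
        self.items = []

    def push(self, value):
        items = self.items
        items.append(value)
        i = len(items) - 1
        # Sift up: swap with the parent while the parent is larger.
        while i > 0:
            parent = (i - 1) // 2
            if items[parent] <= items[i]:
                break
            items[parent], items[i] = items[i], items[parent]
            i = parent

    def peek(self):
        return self.items[0]  # smallest item, O(1)

    def pop(self):
        items = self.items
        smallest = items[0]
        last = items.pop()
        if items:
            # Move the last item to the root, then sift it down.
            items[0] = last
            i, n = 0, len(items)
            while True:
                left, right = 2 * i + 1, 2 * i + 2
                child = i
                if left < n and items[left] < items[child]:
                    child = left
                if right < n and items[right] < items[child]:
                    child = right
                if child == i:
                    break
                items[i], items[child] = items[child], items[i]
                i = child
        return smallest

heap = ArrayMinHeap()
for value in [5, 3, 8, 1, 9, 2]:
    heap.push(value)

smallest = heap.peek()
drained = [heap.pop() for _ in range(6)]
```

Popping repeatedly yields the items in ascending order, which is all a priority queue needs.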

5
votes

Your first choice should depend on anticipated access patterns, and how much data you're likely to be storing:

  • if there's never much data (n less than 30, say), an unsorted array will be fine;
  • if you almost never add, delete, or update, a sorted array will be fine;
  • if n is less than, say, 1 million, and you're only ever searching for the top element (the one ranked first, or last), heaps will do well (particularly if you are frequently updating elements chosen at random, as you do in an LRU (least-recently-used) queue for a cache, say, because on average such an update is O(1), rather than O(log(n)))
  • if n is less than, say, 1 million, and you're not sure what you'll be searching for, a balanced tree (say, red-black or AVL) will be fine;
  • if n is large (1 million and up, say), you're probably better off with a b-tree or a trie (the performance of balanced binary trees tends to "fall off a cliff" once n is big enough: memory accesses tend to be too scattered, and cache misses really start to hurt)

...but I recommend leaving the option as open as you can, so that you can benchmark at least one of the alternatives and switch to it, if it performs better.

Over the last twenty years, I've only worked on two applications where heaps were the best choice for anything (once for an LRU, and once in a nasty operations-research application, restoring additivity to randomly perturbed k-dimensional hypercubes, where most cells in the hypercube appeared in k different heaps and memory was at a premium). However, on those two occasions, they performed vastly better than the alternatives: literally dozens of times faster than balanced trees or b-trees.

For the hypercube problem that I mentioned in the last paragraph, my team-lead thought red-black trees would perform better than heaps, but benchmarking showed that red-black trees were slower by far (as I recall, they were about twenty times slower), and although b-trees were significantly faster, heaps beat them comfortably too.

The important feature of the heap, in both the cases I mentioned above, was not the O(1) look-up of the minimum value, but rather the O(1) average update time for an element chosen at random.
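Updating an arbitrary element requires knowing where it sits in the array. One common way to get that, sketched below, is to keep a dictionary from key to array index alongside the heap. This is only an illustration of the idea, not the code from either application; `IndexedMinHeap` and its methods are hypothetical names. An update costs at most one pass along the heap's height, and often much less, since the element stops moving as soon as the heap order is restored.

```python
class IndexedMinHeap:
    """Min-heap of [priority, key] entries plus a key -> index map for updates."""

    def __init__(self):
        self.heap = []  # list of [priority, key]
        self.pos = {}   # key -> current index in self.heap

    def _swap(self, i, j):
        h = self.heap
        h[i], h[j] = h[j], h[i]
        self.pos[h[i][1]] = i
        self.pos[h[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[parent][0] <= self.heap[i][0]:
                break
            self._swap(i, parent)
            i = parent
        return i  # final index of the entry

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child][0] < self.heap[smallest][0]:
                    smallest = child
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def push(self, key, priority):
        if key in self.pos:
            raise KeyError(f"duplicate key: {key!r}")
        self.heap.append([priority, key])
        self.pos[key] = len(self.heap) - 1
        self._sift_up(len(self.heap) - 1)

    def update(self, key, priority):
        """Change the priority of an existing key and restore heap order."""
        i = self.pos[key]
        self.heap[i][0] = priority
        self._sift_down(self._sift_up(i))

    def pop(self):
        """Remove and return (key, priority) for the smallest priority."""
        priority, key = self.heap[0]
        self._swap(0, len(self.heap) - 1)
        self.heap.pop()
        del self.pos[key]
        if self.heap:
            self._sift_down(0)
        return key, priority

# LRU-style usage: the priority is the time of last access.
cache = IndexedMinHeap()
for tick, key in enumerate(["a", "b", "c", "d"]):
    cache.push(key, tick)
cache.update("a", 4)   # "a" was just touched, so it is now the newest
evicted = cache.pop()  # least recently used entry
```

Here `"b"` is evicted first, because touching `"a"` moved it from the oldest position to the newest.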

-James Barbetti (Well, I thought I was. But captcha keeps telling me I'm not human)

0
votes

If you use a find or search operation a lot, then a balanced binary tree is preferred. Line-segment intersection code uses balanced trees instead of heaps for this one reason.
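A small sketch of the difference: a heap only orders each parent relative to its children, so finding an arbitrary value means scanning the array, whereas a fully ordered structure can be searched in O(log n). Python has no balanced tree in its standard library, so a sorted list with `bisect` stands in for one here; the function names are made up for the example.

```python
import bisect
import heapq

values = [42, 7, 19, 3, 88, 61, 25]

heap = list(values)
heapq.heapify(heap)       # only the parent <= child relation is guaranteed
ordered = sorted(values)  # stand-in for a balanced tree's in-order sequence

def heap_contains(heap, target):
    """Linear scan: the heap order gives no way to rule out subtrees cheaply."""
    return any(item == target for item in heap)

def ordered_contains(ordered, target):
    """Binary search, O(log n), as an ordered tree lookup would be."""
    i = bisect.bisect_left(ordered, target)
    return i < len(ordered) and ordered[i] == target

def ordered_successor(ordered, target):
    """Smallest value strictly greater than target, or None if there is none."""
    i = bisect.bisect_right(ordered, target)
    return ordered[i] if i < len(ordered) else None

found_in_heap = heap_contains(heap, 61)
found_in_ordered = ordered_contains(ordered, 61)
next_after_25 = ordered_successor(ordered, 25)
```

Neighbour queries such as `ordered_successor` are the kind of search a heap cannot answer without examining every element.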

0
votes

First of all, there are different kinds of binary trees (some are quite difficult to implement, and some provide only average-case O(log n)), and a heap is one of them.

Second, while operations on most trees are O(log n), they are more complex, so the constant factor is larger.

A heap needs only constant additional memory, while trees usually need to store pointers in every node.

By the way, a heap is quite easy to implement and uses only an array (I'm not sure whether it's implemented this way in Java, but I think so).
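For a library-level illustration of the array-only approach (in Python rather than Java), the standard `heapq` module operates directly on an ordinary list. The task names below are made up for the example.

```python
import heapq

tasks = []  # a plain list is the entire data structure
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

is_plain_list = type(tasks) is list

# The heap property: every parent is <= its children, with the parent of
# index i located at index (i - 1) // 2.
heap_property_holds = all(
    tasks[(i - 1) // 2] <= tasks[i] for i in range(1, len(tasks))
)

first = heapq.heappop(tasks)  # smallest priority comes out first
```

No per-element node or pointer is allocated; the tree shape is implied entirely by the index arithmetic.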