Besides the obvious answer of a Priority Queue, when would a heap be useful in my programming adventures?
6 Answers
Use it whenever you need quick access to the largest (or smallest) item, because that item will always be the first element in the array or at the root of the tree.
However, the rest of the array is only partially ordered, so instant access is possible only to the largest (or smallest) item. Insertions are fast (O(log N)), so it's a good way to deal with incoming events or data and always have access to the earliest/biggest.
Useful for priority queues, schedulers (where the earliest item is desired), etc...
A heap (here, a max heap) is a tree where each parent node's value is at least as large as that of any of its descendant nodes. A min heap reverses the comparison.
If you think of a heap as a binary tree stored in linear order by depth, with the root node first (then the children of that node, then the children of those nodes, and so on), then the children of a node at index N are at 2N+1 and 2N+2. This property allows quick access-by-index. And since heaps are manipulated by swapping nodes, this allows for in-place sorting.
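To make the index arithmetic concrete, here is a minimal sketch of an in-place heapsort built on the 2N+1 / 2N+2 rule. The function names `sift_down` and `heapsort` are my own, not from any library, and this is one common way to write it rather than the only one.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree at `root`, looking only at a[:end]."""
    while True:
        child = 2 * root + 1              # left child of the node at index `root`
        if child >= end:                  # no children inside the heap: done
            return
        # The right child is at 2*root + 2; pick whichever child is larger.
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:           # parent already dominates its children
            return
        a[root], a[child] = a[child], a[root]   # swap and keep sinking
        root = child


def heapsort(a):
    """Sort the list `a` in place, ascending, using only swaps."""
    n = len(a)
    # Build a max heap: sift down every non-leaf node, starting from the last one.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Repeatedly move the current max to the end and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)


data = [5, 1, 27, 7, 2, 31, 23]
heapsort(data)
```

No second array is needed: the heap lives in the front of the list and the sorted part grows from the back.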
Heaps are structures meant to allow quick access to the min or the max.
But why would you want that? You could just check every entry as it is added to see if it's the smallest or the biggest. This way you always have the smallest or the biggest in constant time O(1).
The answer is that heaps let you pull out the smallest or the biggest and quickly know the NEXT smallest or biggest. That's why a heap is the usual structure behind a Priority Queue.
Real world example (not very fair world, though):
Suppose you have a hospital in which patients are attended to based on their ages. The oldest are always attended to first, no matter when they got in the queue.
You can't just keep track of the oldest one, because if you pull them out you don't know who the next oldest is. To solve this hospital problem, you implement a max heap. This heap is, by definition, partially ordered. This means the patients are not fully sorted by age, but the oldest one is always at the top, so you can look that patient up in constant time O(1), and removing them and re-balancing the heap takes log time O(log N).
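A rough sketch of the hospital queue in Python, assuming the standard `heapq` module. `heapq` only provides a min heap, so ages are stored negated to get max-heap behaviour. The class `TriageQueue` and its method names are made up for illustration.

```python
import heapq


class TriageQueue:
    """Hypothetical oldest-first queue backed by a binary heap."""

    def __init__(self):
        self._heap = []
        self._arrivals = 0   # tie-breaker: equal ages are served in arrival order

    def admit(self, name, age):
        # O(log N): negate the age so the oldest patient has the smallest key.
        heapq.heappush(self._heap, (-age, self._arrivals, name))
        self._arrivals += 1

    def peek_oldest(self):
        # O(1): the oldest patient is always at index 0.
        neg_age, _, name = self._heap[0]
        return name, -neg_age

    def next_patient(self):
        # O(log N): remove the root and let the heap re-balance itself.
        neg_age, _, name = heapq.heappop(self._heap)
        return name, -neg_age


queue = TriageQueue()
queue.admit("Ana", 34)
queue.admit("Bo", 81)
queue.admit("Cy", 67)
first = queue.next_patient()    # the oldest, regardless of arrival order
second = queue.next_patient()   # the NEXT oldest is immediately available
```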
More sophisticated example:
Suppose you have a sequence of integers and you want to keep track of the median. The median is the number that is in the middle of an ordered array.
Example:
[1, 2, 5, 7, 23, 27, 31]
In the above case, 7 is the median because the array containing the smaller numbers [1, 2, 5] is the same size as the one containing the bigger numbers [23, 27, 31]. Normally, if the array has an even number of elements, the median is the arithmetic average of the 2 elements in the middle, e.g. (5 + 7)/2 = 6 for [1, 2, 5, 7, 23, 27].
Now, how do you keep track of the median? By having 2 heaps: a max heap containing the numbers smaller than the current median and a min heap containing the numbers bigger than the current median. That way the root of each heap is the element closest to the middle. If these heaps are kept balanced, they will contain the same number of elements, or one will have at most 1 element more than the other.
When you add a new element to the sequence, if the number is smaller than the current median, you add it to the max heap; otherwise, you add it to the min heap. If the heaps are then unbalanced (one heap has more than 1 element more than the other), you pull the root from the bigger heap and add it to the smaller one. Now they're balanced again, and the median is the root of the bigger heap, or the average of the two roots when both heaps are the same size.
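Here is a minimal sketch of the two-heap median tracker, again assuming Python's `heapq`. The class name `RunningMedian` is hypothetical. Two details are my own choices rather than something stated above: the lower half is a max heap simulated with negated values, and new numbers are compared against the root of the lower half instead of a separately stored median, which keeps the same ordering between the halves.

```python
import heapq


class RunningMedian:
    """Hypothetical median tracker: max heap for the lower half, min heap for the upper."""

    def __init__(self):
        self._low = []    # max heap of the smaller numbers (values stored negated)
        self._high = []   # min heap of the bigger numbers

    def add(self, x):
        # Route the new number to the half it belongs to.
        if not self._low or x <= -self._low[0]:
            heapq.heappush(self._low, -x)
        else:
            heapq.heappush(self._high, x)
        # Re-balance: the sizes may differ by at most 1.
        if len(self._low) > len(self._high) + 1:
            heapq.heappush(self._high, -heapq.heappop(self._low))
        elif len(self._high) > len(self._low) + 1:
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low and not self._high:
            raise ValueError("no elements yet")
        if len(self._low) > len(self._high):
            return -self._low[0]
        if len(self._high) > len(self._low):
            return self._high[0]
        # Same size: average the two middle elements.
        return (-self._low[0] + self._high[0]) / 2


tracker = RunningMedian()
for number in [7, 1, 31, 5, 23, 2, 27]:   # the example array, in arbitrary order
    tracker.add(number)
odd_median = tracker.median()

even_tracker = RunningMedian()
for number in [1, 2, 5, 7, 23, 27]:
    even_tracker.add(number)
even_median = even_tracker.median()
```

Each `add` costs O(log N) and each `median` call is O(1), which is the point of using heaps instead of re-sorting the sequence.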
The characteristic of a heap is that it is a structure that keeps data semiordered; thus, it is a good tradeoff between the cost of maintaining a complete order and the cost of searching through random chaos. That characteristic is used in many algorithms, such as selection, sorting, or classification.
Another useful characteristic of a heap is that it can be created in-place from an array!
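As a small illustration in Python, the standard library's `heapq.heapify` rearranges an existing list into a min heap in place, in linear time. The helper `is_min_heap` below is my own, added only to check the result.

```python
import heapq


def is_min_heap(a):
    """Check that every element is >= its parent, which sits at index (i - 1) // 2."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


data = [23, 5, 31, 1, 27, 7, 2]
heapq.heapify(data)    # reorders the same list object; no copy is made
smallest = data[0]     # the minimum is now at the front
```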