I would like to explore my analysis regarding Bucket sort as below.
There are many ways that Bucket sort can be implemented. Some of them are as follows.
Type 1:
if we know the range of our elements to be sorted, we can set up
buckets for each possible element, and just toss elements into their corresponding buckets. We then
empty the buckets in order, and the result is a sorted list.
In implementing this algorithm, we can easily use an array to represent our buckets, where the value at
each array index will represent the number of elements in the corresponding bucket. If we have integers
on the range [0..max], then we set up an array of (max + 1) integers and initialize all the values to
zero. We then proceed sequentially through the unsorted array, reading the value of each element, going
to the corresponding index in the buckets array, and incrementing the value there.
Time : O(N)
Space: O(1)
Type 2:
Example:to sort an array of people by age
Age is somewhat different than arbitrary integers for sorting. Because of that it has a small range [0-150] (all people's ages are between 0 and 150). So the quickest way to sort it would be to allocate 151 linked lists (let's call them buckets) and put each person's data structure in the bucket according to his/her age:
Time : O(N+K)
Space: O(N+K)
Type 3 (Variation of type2 as shown in Wikepedia)
The function nextSort is a sorting function for sorting each bucket. if insertion sort used than than worst will be O(n^2) or Merge sort will be used so that I can keep stability than O(nlgn).
- Question:
1>How it is considered linear sort, is it because of Type 1 or Type 2?
2>If I use Type 3 like WIkepedia which sort is efficient sorting each bucket?
I know The reason insertion sort is used in practice is that we expect the buckets to be small, and for small lists, insertion sort is much faster than anything else. Even when implementing merge sort or quicksort, insertion sort is used when the list gets small enough (say below 20 items or so).
3>for type 3 on which basis I can decide range of buckets?
This is important because if you try doing a bucket sort with a huge number of buckets for example much greater than n, the runtime might be dominated by the time required to scan over all the buckets looking for the buckets that you actually used, even if most of them are empty.
I did Analysis based on:
Wikepedia
How could the complexity of bucket sort is O(n+k)?
Design and Analysis of Algorithms
Lecture notes for January 23, 1996
http://www1bpt.bridgeport.edu/~dichter/lilly/bucketsort.htm
http://cs.nyu.edu/courses/fall02/V22.0310-002/lectures/lecture-23.html
How is the complexity of bucket sort is O(n+k) if we implement buckets using linked lists?
What is the worst case complexity for bucket sort?