Why does Java's Arrays.sort method use two different sorting algorithms for different types?

138

votes

Java 6's Arrays.sort method uses Quicksort for arrays of primitives and merge sort for arrays of objects. I believe that most of the time Quicksort is faster than merge sort and uses less memory. My experiments support that, although both algorithms are O(n log(n)). So why are different algorithms used for different types?

java algorithm quicksort mergesort

Quicksort worst case is N^2 not NlogN. – codaddict

Wait, what happens if you have an array of Integers or something? – Tikhon Jelvis

Isn't this explained in the source you read? – Humphrey Bogart

This information is no longer current. Starting in Java SE 7, MergeSort has been replaced with TimSort and QuickSort has been replaced with Dual-Pivot QuickSort. See my answer below for links to the Java API docs. – Will Byrne

See also stackoverflow.com/questions/15154158/… and for JDK 7+ see stackoverflow.com/questions/32334319/… – rogerdpack

229

votes

The most likely reason: quicksort is not stable, i.e. equal entries can change their relative position during the sort; among other things, this means that if you sort an already sorted array, it may not stay unchanged.

Since primitive types have no identity (there is no way to distinguish two ints with the same value), this does not matter for them. But for reference types, it could cause problems for some applications. Therefore, a stable merge sort is used for those.
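To make the stability point concrete, here is a minimal sketch (the class and method names are made up for illustration). It sorts strings by length only, so strings of equal length compare as equal. Arrays.sort on an object array is documented as stable, so those ties keep their original relative order.

```java
import java.util.Arrays;
import java.util.Comparator;

class StabilityDemo {
    // Hypothetical helper: sorts a copy of the input by string length only.
    static String[] sortByLength(String[] words) {
        String[] copy = words.clone();
        // The comparator looks only at length, so "pear", "kiwi", "plum"
        // and "date" all compare as equal to each other.
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "kiwi", "plum", "date"};
        // A stable sort keeps the four-letter words in their input order.
        System.out.println(Arrays.toString(sortByLength(words)));
    }
}
```

With a stable sort the result is fig, pear, kiwi, plum, date. An unstable sort would be free to reorder the four-letter words, which matters as soon as the objects carry more than the sort key. For an int[] there is nothing comparable to observe: two equal ints are interchangeable.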

OTOH, a reason not to use the (guaranteed n*log(n)) stable merge sort for primitive types might be that it requires making a clone of the array. For reference types, where the referred objects usually take up far more memory than the array of references, this generally does not matter. But for primitive types, cloning the array outright doubles the memory usage.
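To show where the extra memory comes from, here is a textbook top-down merge sort sketch on an int[]. It is not the JDK's implementation and the class name is invented; the point is only the auxiliary buffer, which is as large as the input.

```java
import java.util.Arrays;

class MergeSortSketch {
    static void sort(int[] a) {
        int[] aux = new int[a.length]; // the extra O(n) space discussed above
        sort(a, aux, 0, a.length);
    }

    // Sorts the half-open range [lo, hi).
    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) >>> 1;
        sort(a, aux, lo, mid);
        sort(a, aux, mid, hi);
        merge(a, aux, lo, mid, hi);
    }

    // Merges the sorted runs [lo, mid) and [mid, hi) via the buffer.
    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo);
        int i = lo, j = mid;
        for (int k = lo; k < hi; k++) {
            if (i >= mid) a[k] = aux[j++];
            else if (j >= hi) a[k] = aux[i++];
            // Strict comparison: on ties the left element wins, which is
            // what makes merge sort stable.
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else a[k] = aux[i++];
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 5, 6};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

For an int[] the buffer doubles the footprint of the data. For an Object[] the buffer holds only references, which are usually small next to the objects they point to.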

34

votes

According to the Java 7 API docs cited in this answer, Arrays#sort() for object arrays now uses TimSort, a hybrid of merge sort and insertion sort. Arrays#sort() for primitive arrays, on the other hand, now uses Dual-Pivot Quicksort. These changes took effect in Java SE 7.
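Which algorithm runs is decided by the overload the compiler picks, based on the static type of the array. A small sketch (the class and helper names are hypothetical) that also covers the Integer[] case raised in the comments:

```java
import java.util.Arrays;

class OverloadDemo {
    // Resolves to Arrays.sort(int[]), the primitive overload.
    static int[] sortPrimitives(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Resolves to Arrays.sort(Object[]): Integer is a reference type,
    // so a boxed array takes the object path and gets the stable sort.
    static Integer[] sortBoxed(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortPrimitives(new int[]{3, 1, 2})));
        System.out.println(Arrays.toString(sortBoxed(new Integer[]{3, 1, 2})));
    }
}
```

Both calls produce the same ordering; only the algorithm behind them and its stability guarantee differ.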

12

votes

One reason I can think of is that quicksort has a worst-case time complexity of O(n^2), while mergesort retains a worst-case time of O(n log n). For object arrays it is fair to expect multiple duplicate object references, which is one case where a simple quicksort performs worst.

There is a decent visual comparison of various algorithms; pay particular attention to the right-most graph for each algorithm.
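The quadratic behaviour on duplicate-heavy input can be reproduced with a naive quicksort that counts comparisons. This is a textbook Lomuto partition, not the tuned partitioning the JDK uses, and the class name is made up for the example.

```java
import java.util.Random;

class NaiveQuicksort {
    static long comparisons;

    static void sort(int[] a) {
        comparisons = 0;
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Lomuto partition with the last element as the pivot.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            comparisons++;
            if (a[j] <= pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        return i;
    }

    public static void main(String[] args) {
        int n = 1000;

        int[] allEqual = new int[n]; // every element is 0
        sort(allEqual);
        System.out.println("all equal: " + comparisons + " comparisons");

        // Distinct values in random order, for contrast.
        int[] shuffled = new int[n];
        for (int i = 0; i < n; i++) shuffled[i] = i;
        Random rnd = new Random(42);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int t = shuffled[i]; shuffled[i] = shuffled[j]; shuffled[j] = t;
        }
        sort(shuffled);
        System.out.println("shuffled:  " + comparisons + " comparisons");
    }
}
```

With all elements equal, every partition puts everything on one side, so the sort makes n(n-1)/2 comparisons: 499,500 for n = 1000. A merge sort on the same input stays at O(n log n).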

10

votes

I was taking the Coursera class on Algorithms, and in one of the lectures Professor Bob Sedgewick gave this assessment of the Java system sort:

"If a programmer is using objects, maybe space is not a critically important consideration and the extra space used by a merge sort maybe not a problem. And if a programmer is using primitive types, maybe the performance is the most important thing so they use quick sort."

1

votes

java.util.Arrays uses quicksort for primitive types such as int and mergesort for objects that implement Comparable or are sorted with a Comparator. The idea behind using two different methods is that if a programmer is using objects, space is probably not a critical consideration, so the extra space used by mergesort is probably not a problem; if the programmer is using primitive types, performance is probably the most important thing, so quicksort is used.

For example: this is a case where sorting stability matters.

That’s why stable sorts make sense for object types, especially mutable object types and object types with more data than just the sort key, and mergesort is such a sort. But for primitive types stability is not just irrelevant; it’s meaningless.

Source: INFO

0

votes

Java's Arrays.sort method uses quicksort, insertion sort and mergesort. The OpenJDK code even contains both a single-pivot and a dual-pivot quicksort. The fastest sorting algorithm depends on the circumstances, and the winners are: insertion sort for small arrays (the threshold is currently 47 elements), mergesort for mostly sorted arrays, and quicksort for the remaining arrays. Java's Arrays.sort() therefore tries to choose the best algorithm to apply based on those criteria.
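The size-based part of that selection can be sketched as follows. This is a simplified illustration under stated assumptions, not the OpenJDK code: the class name is invented, the quicksort is a plain single-pivot version, and the mergesort path for mostly sorted input is left out. Only the threshold of 47 is taken from the text above.

```java
import java.util.Arrays;

class HybridSortSketch {
    // Cutoff quoted above; ranges shorter than this use insertion sort.
    static final int INSERTION_SORT_THRESHOLD = 47;

    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts the inclusive range [lo, hi].
    private static void sort(int[] a, int lo, int hi) {
        if (hi - lo + 1 < INSERTION_SORT_THRESHOLD) {
            insertionSort(a, lo, hi);
            return;
        }
        // Plain single-pivot partition around the middle element.
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
                j--;
            }
        }
        sort(a, lo, j);
        sort(a, i, hi);
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = {9, 4, 7, 1, 8, 2};
        sort(data); // 6 elements: handled entirely by insertion sort
        System.out.println(Arrays.toString(data));
    }
}
```

Insertion sort is quadratic in general but has very low overhead, which is why it tends to win on short ranges, including the small subarrays left over at the bottom of the quicksort recursion.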
