Quicksort killer
What is the time complexity of quicksort? The answer that first pops up in my head is O(N logN). That answer is only partly right: the worst case is in fact O(N2). However, since very few inputs take anywhere that long, a reasonable quicksort implementation will almost never encounter the quadratic case in real life.
I came across a very cool paper that describes how to easily defeat just about any quicksort implementation. The paper describes a simple comparer that decides ordering of elements lazily as the sort executes, and arranges the order so that the sort takes quadratic time. This works even if the quicksort is randomized! Furthermore, if the quicksort is deterministic (not randomized), this algorithm also reveals the input which reliably triggers quadratic behavior for this particular quicksort implementation.
The trick takes a dozen of code, works with nearly any quicksort routine, and only uses the quicksort via its interface! How cool is that? Here is a C# implementation of this idea:
class QuicksortKiller : IComparer<int> { Dictionary<int, int> keys = new Dictionary<int, int>(); int candidate = 0; public int Compare(int x, int y) { if (!keys.ContainsKey(x) && !keys.ContainsKey(y)) { if (x == candidate) keys[x] = keys.Count; else keys[y] = keys.Count; } if (!keys.ContainsKey(x)) { candidate = x; return 1; } if (!keys.ContainsKey(y)) { candidate = y; return -1; } return keys[x] - keys[y]; } }
This trick works well when applied to the .Net Array.Sort() method. The following chart displays the number of Compare() calls forced by QuicksortKiller when ran on an array of some size, as well as the number of Compare() calls that made by Array.Sort on a randomly-ordered sequence of the same length:
![]()
This chart clearly shows that the QuicksortKiller comparer triggers the quadratic behavior in Array.Sort.
How does it work?
QuicksortKiller’s trick is to ensure that the pivot will compare low against nearly all remaining elements. But, how can we detect which element is the pivot? We know quicksort will take the pivot and compare it against all other elements.
So, initially we consider all elements to be unsorted. The paper refers to them as "gas". When we compare two gas elements, we arbitrarily choose one of them, and freeze it to a value larger than any previously frozen values. We remember the other element as the pivot candidate. If the candidate is used in the next comparison with a gas element, we will make sure to freeze the pivot candidate, rather than the other element. That way, the pivot will be frozen within two comparisons against other elements.
Constructing a "Bad" Array
If the quicksort implementation is deterministic, and always takes the same steps on the same input, it is easy to generate an array that reliably triggers the quadratic behavior. The MakeBadArray method constructs an array that triggers the quadratic-time behavior of Array.Sort():
static int[] MakeBadArray(int length) { int[] arr = Enumerable.Range(0, length).ToArray(); Array.Sort(arr, new QuicksortKiller()); int[] ret = new int[length]; for (int i = 0; i < length; i++) { ret[arr[i]] = i; } return ret; }
How to defeat the Quicksort killer?
As the original paper explains, the adversary comparer works if the quicksort implementation checks an O(1) number of elements as pivot candidates. But, quicksort implementation that check more than O(1) elements are possible. For example, the quicksort might choose the median element as the pivot at each step, thus always geting a perfect split of the input sequence into two halves. Median can be found deterministically in O(N) running time, and so the total running time is always O(N logN). The median-based quicksort is rarely used in practice because it tends to have a larger constant than other quicksort implementations.
Another solution is to use a regular quicksort algorithm, and degrade to another sorting algorithm if quicksort is not working out. For example, if we reached a recursive depth of 10, and the size of our partition has not reduced by at least a half, we can just ditch quicksort and sort the partition using heapsort.
Sunday 04 May 2008 | Igor Ostrovsky | Programming

How to break quicksort…
You’ve been kicked (a good thing) - Trackback from DotNetKicks.com…
Ever heard of merge sort?
If you require GUARANTEED performance, i.e. that performance will not exceed an upper bound, heapsort is what you want.
Quicksort is great and all, but its inconsistency can be a killer where good performance is required, but bad performance is unacceptable.
And, on the upside, since heapsort requires no non-deterministic internal branching, in machine code compiled applications with a good optimising compiler you can be guaranteed that your pipelines stay saturated and that you get on average better performance than quicksort. The only real downside is that it has very bad cache locality.
The last few points aren’t really applicable to C#, but interesting to note, nonetheless.
Paco: of course I’ve heard of merge sort.
Merge sort guarantees O(N log N) running time, but a typical implementation also requires O(N) additional memory. There are some implementations that do in-place merging and thus only require O(1) extra memory, while still guaranteeing O(N log N) running time. Unfortunately, the constant-memory implementations have a large constant, and also are rather complicated.
Radix sort for the win!
Introsort?
http://www.en.wikipedia.org/wiki/Introsort