Randomization in Algorithm – QuickSort as Example

DinS          Written on 2017/12/29

Randomization is a key concept in many algorithm designs. At first it may seem odd: an algorithm guarantees its final result, but randomization introduces instability. Can you get a deterministic output by randomized methods? Well, the answer is yes, but only if you use randomization in the correct way.

Here we’ll introduce QuickSort as a demonstration of randomization.

1. An instance of QuickSort

Notice: fine details may vary each time you run QuickSort. We’re talking about randomization, after all.

2. What’s QuickSort doing?

QuickSort is in essence a recursive algorithm. It keeps splitting the original problem into smaller subproblems and applies QuickSort to them. But what makes it different is that it doesn’t have a combine step. This is very unusual for a recursive algorithm.

If we look deeper, we’ll find that there’re two major steps in QuickSort: choose a pivot and partition around the chosen pivot. We’ll talk about them in a minute.

The pseudo code for QuickSort may look something like this:

QuickSort(array A, length n)

{

    if n <= 1 return

    P = choosePivot(A, n)

    partition A around P        // A becomes [ elements < P | P | elements > P ]

    QuickSort(the part of A < P, its length)

    QuickSort(the part of A > P, its length)

}
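The pseudo code above can be fleshed out into a runnable version. Below is a Python sketch, offered as one possible implementation: the pivot choice and partition here are the randomized, in-place versions discussed in the following sections.

```python
import random

def quicksort(a):
    """Sort the list a in place using randomized QuickSort; returns a."""
    def partition(lo, hi):
        # choosePivot: pick an index uniformly at random and park it at the front
        k = random.randrange(lo, hi)
        a[lo], a[k] = a[k], a[lo]
        pivot = a[lo]
        i = lo + 1                     # a[lo+1:i] holds elements < pivot
        for j in range(lo + 1, hi):    # a[i:j] holds elements >= pivot
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[lo], a[i - 1] = a[i - 1], a[lo]   # drop the pivot into its final slot
        return i - 1

    def sort(lo, hi):                  # sort the slice a[lo:hi]
        if hi - lo <= 1:
            return
        p = partition(lo, hi)
        sort(lo, p)                    # the "< pivot" part
        sort(p + 1, hi)                # the "> pivot" part

    sort(0, len(a))
    return a
```

Note that, matching the pseudo code, there is no combine step: once both recursive calls return, the slice is already sorted.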

3. Partition around pivot

The idea is: given a pivot, rearrange the array so that everything to the left of the pivot is smaller than it, and everything to the right is bigger. In other words, we’ve put the pivot in its rightful position.

Cool facts about partition: O(n) time, no extra memory, just swaps. Besides, it reduces the problem size.

As for implementation, we only need a single scan through the array. The easy way out is to use a temporary array: every element smaller than the pivot is placed from the head forward, and every element bigger than the pivot is placed from the tail backward.
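A sketch of this temporary-array approach in Python (`partition_with_buffer` is a hypothetical name; distinct elements are assumed so that "smaller" and "bigger" cover everything except the pivot):

```python
def partition_with_buffer(a, pivot_index):
    """Partition a around a[pivot_index] using an O(n) temporary array.
    Smaller elements fill the buffer from the head, bigger ones from the tail.
    Returns (partitioned list, final index of the pivot)."""
    pivot = a[pivot_index]
    buf = [None] * len(a)
    head, tail = 0, len(a) - 1
    for idx, x in enumerate(a):
        if idx == pivot_index:
            continue
        if x < pivot:
            buf[head] = x      # smaller than pivot: place from the front
            head += 1
        else:
            buf[tail] = x      # bigger than pivot: place from the back
            tail -= 1
    buf[head] = pivot          # head == tail now: the pivot's final position
    return buf, head
```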

However, this uses O(n) extra memory. We can do better. The in-place implementation needs some deliberation, and we’ll not cover every detail here. The key is to use two indices i and j to mark the boundary between the already-examined elements and the not-yet-examined ones.
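One way to realize the two-index idea is a Lomuto-style scan, sketched below. This is an illustrative version, not the only in-place scheme; `partition_in_place` is a hypothetical name, and distinct elements are assumed.

```python
def partition_in_place(a, lo, hi, pivot_index):
    """Partition a[lo:hi] around a[pivot_index] in place; return the
    pivot's final index. Invariant during the scan:
    a[lo+1:i] < pivot <= a[i:j] (already looked), a[j:hi] not yet looked."""
    a[lo], a[pivot_index] = a[pivot_index], a[lo]   # park the pivot at the front
    pivot = a[lo]
    i = lo + 1
    for j in range(lo + 1, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]    # grow the "smaller than pivot" region
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]  # drop the pivot into its final slot
    return i - 1
```

Each element is examined once and only swaps are used, which is where the O(n) time and zero extra memory come from.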

In conclusion, after partition we get two subarrays and a rightfully-placed pivot: all elements in the left subarray are smaller than the pivot, and all elements in the right one are bigger. As long as we keep partitioning, the array will end up sorted.

4. Choose a pivot

Now comes the hard part: how to choose a pivot?

The running time of QuickSort depends on the quality of the pivot.

For an extreme example, suppose we always choose the first element as the pivot. For an already sorted array, this strategy takes O(n²) time, because the left subarray is always empty and each recursive call shrinks the problem by only one element.
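This worst case is easy to observe by counting comparisons. The sketch below (with the hypothetical helper `quicksort_first_pivot`) uses a simple out-of-place partition, which is enough to count pivot comparisons:

```python
def quicksort_first_pivot(a):
    """QuickSort that always picks the first element as the pivot.
    Returns (sorted list, number of pivot-vs-element comparisons)."""
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)   # the pivot is compared with every other element
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return sort(left) + [pivot] + sort(right)

    return sort(list(a)), comparisons

# On an already sorted array the left part is always empty, so the recursion
# shrinks by one element per call: (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons.
sorted_result, count = quicksort_first_pivot(range(20))
```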

For another extreme case, suppose we could always find the median in O(1) time and choose it as the pivot. Then we’d always split the problem in half. Now QuickSort looks just like MergeSort and takes O(nlogn).

So the question remains: how to choose? Big idea: choose randomly. Since we’re talking about randomization in algorithms, this should not be too shocking. To state it more formally: in every recursive call, choose the pivot uniformly at random, i.e. each element is equally likely to be picked. We’re hoping that a random pivot is a good enough choice.

5. Running time for QuickSort

Since we’re choosing the pivot randomly, ordinary algorithm analysis won’t work. We need tools from probability. Make sure to check the math review section if you don’t understand.

We’ll define Ω = all possible outcomes of the random choices made by QuickSort. Concretely, each outcome is a sequence of pivot choices.

Next we’ll define a key random variable. For σ∈Ω, let c(σ) = the number of comparisons between two input elements made by QuickSort. Put more plainly, c(σ) is the total number of comparisons made during that run of QuickSort. As we know, all QuickSort does is compare and swap, and comparisons far outnumber swaps, so the running time of QuickSort is dominated by comparisons. As a result, our remaining goal is to find E[c]. If we can prove that the expected number of comparisons is O(nlogn), we’re done.
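To make E[c] concrete before doing the math, we can estimate it empirically. The sketch below (an illustration, with the hypothetical helper `count_comparisons`; distinct elements assumed) counts pivot comparisons over several runs and compares the average against the 2n·ln(n) bound this section eventually derives:

```python
import math
import random

def count_comparisons(a):
    """Randomized QuickSort on a copy of a; returns the number of
    pivot-vs-element comparisons made during the run."""
    count = 0

    def sort(xs):
        nonlocal count
        if len(xs) <= 1:
            return xs
        pivot = xs[random.randrange(len(xs))]   # pivot chosen uniformly at random
        rest = [x for x in xs if x != pivot]    # assumes distinct elements
        count += len(rest)                      # pivot compared with every other element
        return (sort([x for x in rest if x < pivot])
                + [pivot]
                + sort([x for x in rest if x > pivot]))

    sort(list(a))
    return count

random.seed(42)
n = 1000
avg = sum(count_comparisons(range(n)) for _ in range(20)) / 20
bound = 2 * n * math.log(n)   # about 13816 for n = 1000
```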

How should we do this? The answer is the decomposition principle. It’s a trick with three steps:

(1) Identify the random variable Y that you really care about.

(2) Express Y as a sum of indicator random variables: Y = Σ_i x_i.

(3) Apply linearity of expectation: E[Y] = Σ_i E[x_i].

The key idea is to reduce a sophisticated random variable to something simpler.

Now we define a notation: zi = the ith smallest element of A.

For example, if A = [3, 1, 4, 2], then z1 = 1, z2 = 2, z3 = 3, z4 = 4.

As we can see, zi is just the number that ends up in position i after sorting.

We’ll define a new notation. For σ∈Ω and indices i<j, let xij(σ) = the number of times zi and zj get compared in QuickSort. Note that any two elements zi and zj get compared at most once during the whole run of QuickSort. Why? Comparisons only take place during partition, and only the pivot gets compared with other elements. So once zi and zj are compared, one of them must be the pivot, and the pivot is excluded from all further recursive calls. As a result, zi and zj can never be compared again. To express the above idea in math: xij(σ) is either 0 or 1, which makes it a good indicator variable.

Let’s summarize a bit. c(σ) = number of comparisons between input elements. xij(σ) = number of comparisons between zi and zj.

Thus for any σ,

c(σ) = Σ_{i<j} xij(σ)

The above expression just means we add up all pairwise comparisons, and that is exactly the definition of c(σ).

By linearity of expectation, we have

E[c] = Σ_{i<j} E[xij]

E[xij] = 0·Pr[xij=0] + 1·Pr[xij=1] = Pr[xij=1]

Thus we have

E[c] = Σ_{i<j} Pr[xij=1]

Where should we go next? We just need to find Pr[xij=1] and we’re done.

Key claim: for any i<j, Pr[xij=1] = 2/(j-i+1).

Proof: fix zi and zj with i<j, and consider the elements zi, zi+1, … , zj-1, zj, which is j-i+1 elements altogether.

As long as none of these elements has been picked as a pivot, every pivot chosen so far is either smaller than zi or bigger than zj, so all j-i+1 elements are passed to the same recursive call. This continues until one of them is picked as a pivot. What happens next? There are two possibilities.

First, zi or zj is picked as the pivot. If that’s the case, they get compared exactly once.

Second, one of zi+1, …, zj-1 is picked as the pivot. If that’s the case, zi and zj will never get compared, because they are passed to different recursive calls.

Since we’re picking the pivot uniformly at random, among these j-i+1 elements the probability that zi and zj get compared is just the probability that one of them is picked first, and that is 2/(j-i+1).
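The claim can be checked empirically by instrumenting a randomized QuickSort. Below is an illustrative sketch (`times_pair_compared` is a hypothetical name; distinct elements assumed) that also confirms the "at most once" observation:

```python
import random

def times_pair_compared(values, zi, zj):
    """Run randomized QuickSort on values and count how many times the
    elements zi and zj are directly compared with each other."""
    count = 0

    def sort(xs):
        nonlocal count
        if len(xs) <= 1:
            return
        pivot = xs[random.randrange(len(xs))]
        for x in xs:
            # each non-pivot element is compared with the pivot exactly once
            if x != pivot and {x, pivot} == {zi, zj}:
                count += 1
        sort([x for x in xs if x < pivot])
        sort([x for x in xs if x > pivot])

    sort(list(values))
    return count

# Estimate Pr[x_ij = 1] for z_2 and z_5 among 8 distinct elements;
# the claim predicts 2 / (5 - 2 + 1) = 0.5.
random.seed(0)
trials = [times_pair_compared(range(1, 9), 2, 5) for _ in range(4000)]
estimate = sum(trials) / len(trials)
```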

In the end we have

E[c] = Σ_{i<j} 2/(j-i+1)

The question is: how big is this sum?

We’ll pull the 2 outside: E[c] = 2·Σ_i Σ_{j>i} 1/(j-i+1). For each fixed i, the inner sum is

Σ_{j=i+1..n} 1/(j-i+1) = 1/2 + 1/3 + … + 1/(n-i+1) ≤ 1/2 + 1/3 + … + 1/n

How big is 1/2 + 1/3 + … + 1/n? Expressed in calculus,

1/2 + 1/3 + … + 1/n ≤ ∫ from 1 to n of (1/x) dx = ln(n)

Since there are fewer than n choices of i, the whole sum is at most 2·n·ln(n).

In conclusion, E[c] ≤ 2n·ln(n). In other words, the expected running time of QuickSort is O(nlogn).
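As a final sanity check, the double sum can be evaluated exactly for a modest n and compared against the 2n·ln(n) bound (a small Python sketch):

```python
import math

def expected_comparison_sum(n):
    """Evaluate E[c] = sum over all pairs 1 <= i < j <= n of 2/(j-i+1) exactly."""
    return sum(2.0 / (j - i + 1)
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))

n = 200
exact = expected_comparison_sum(n)
upper = 2 * n * math.log(n)   # the 2n·ln(n) bound derived above
```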