Big O Notation

DinS          Written on 2017/12/9

1. What is a good algorithm?

To put it plainly, an algorithm can be understood as a fixed procedure for solving a general type of problem. That is to say, regardless of the concrete setup, as long as the given problem belongs to type A, an algorithm for type A promises to solve it.

As the old saying goes, “all roads lead to Rome”. There can be many algorithms that solve one type of problem. Some roads are short and safe, while others are long and dangerous. It is the same with algorithms: some are good, while others are bad. We want to pick the good ones.

What is a good algorithm? There are many criteria, but the most commonly used one is time: how long it takes for an algorithm to solve the problem.

Why do we care about time most? Because space is generally sufficient, and disk space is cheap compared to time. Time is money. As human beings, we can’t wait too long. If one algorithm can solve the problem in one hour while another takes one year, we know which one to choose.

However, it’s not easy to judge how long an algorithm takes to solve a problem. First, we can’t just run experiments; that’s too costly. If an algorithm really takes one year to finish, we’re stuck. Second, machines vary from one another, so the running time on one particular machine doesn’t carry over to others. Third, even problems of the same type differ from each other; in fact, problem size is a determining factor. In conclusion, we need a criterion that is independent of concrete machines and problems but can still estimate the approximate time an algorithm takes to finish. This criterion should be coarse enough to overlook details, so that we are spared trivial matters, yet sharp enough to distinguish good algorithms from bad ones.

Luckily we have this criterion.

2. Counting the number of operations

If we think about it, the time an algorithm takes is directly related to the number of operations it performs. If algorithm A takes 100 operations while algorithm B takes 1000 operations, we can say for sure that A runs faster than B. However, there’s one assumption: each operation takes about the same time. This assumption roughly holds for modern computers, and we don’t want to be dragged into the mire of machine-level details. As a result, we will assume that basic operations, such as addition, subtraction, multiplication, division and addressing, all take roughly the same constant amount of time.

So now the problem becomes how to count the number of operations. Let’s look at an example: integer multiplication.

Question: how to multiply 1234 * 5678?

We can use the method we learned at school. Write the numbers one above the other and multiply row by row, like this:
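
(The written layout is not reproduced in this text, so the block below is reconstructed from the numbers above.)

   1234
 x 5678
 ------
   9872      (1234 * 8)
  8638       (1234 * 7, shifted one place left)
 7404        (1234 * 6, shifted two places left)
6170         (1234 * 5, shifted three places left)
-------
7006652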

This should be easy, I hope. But we’re not here to learn how to multiply numbers; we’re here to count the number of operations.

Let’s see how many basic operations it takes to get the first row. We actually did 4 multiplications and 3 additions (for the carries). What about the second row? The answer is the same, and this holds for the third and fourth rows too. In short, we did 16 multiplications and 12 additions to get the four rows. Since we regard multiplication and addition as basic operations, that’s 28 basic operations so far. But we’re not done yet: we still need to add the four rows to get the final result, which takes another 16 additions. All in all, we did 44 basic operations.

Let’s try a different example. See this code below.

for (int i = 0; i != 100; ++i)
{
    for (int j = 0; j != 100; ++j)
    {
        // empty body: we only count the loop operations themselves
    }
}

How many basic operations does it take to run this code?

If we don’t count the loop-exit checks, we perform 100 * 100 iterations, each with one “++” increment, which is roughly 10,000 basic operations. If we also count the “!=” comparisons, that’s roughly 20,000 basic operations.
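
To make the count concrete, here is a rough sketch (the counter is my own instrumentation, not part of the original snippet) that tallies the comparisons and increments as the loop runs:

#include <cstdio>

int main()
{
    long long ops = 0;
    for (int i = 0; i != 100; ++i)        // each outer step: one "!=" and one "++"
    {
        for (int j = 0; j != 100; ++j)    // each inner step: one "!=" and one "++"
        {
            ops += 2;                     // count the inner comparison and increment
        }
        ops += 2;                         // count the outer comparison and increment
    }
    printf("%lld\n", ops);                // prints 20200, close to the 20000 above
    return 0;
}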

3. Adding the n

In the last two examples, we counted the number of operations. However, those were only concrete problems. We need to generalize in order to capture the true efficiency of the algorithm. To achieve this, we consider a problem of size n.

First, the grade-school multiplication algorithm. To keep matters simple, let’s assume the two numbers being multiplied have the same number of digits, and define that number of digits as n. This is reasonable, because the more digits the numbers have, the harder the problem is. Now I claim that producing each row takes at most 2n operations. Why? Because you’re multiplying n digits by one digit: that’s n multiplications and at most n additions for the carries, 2n operations in total. Moreover, there will be n rows, since the second number also has n digits. In short, you’ll do at most 2n^2 operations to prepare all the rows.

Then how many operations does it take to add all the rows up? Each row has at most n+1 digits: 999 * 9 = 8991, and that’s the most you can get from multiplying a 3-digit number by a single digit. Now think of the block of rows as a rectangle of digits. There are n rows in total, so its height is n. What about the width? Each row is shifted one place to the left of the previous one, so the last row is shifted n-1 places. That gives a width of at most (n+1) + (n-1) = 2n. So the rectangle contains at most 2n^2 digits (some of the cells are in fact empty), and adding them all up can’t take more than 2n^2 operations.

The final conclusion: it takes at most 2n^2 + 2n^2 = 4n^2 operations to multiply two n-digit numbers with the grade-school method.
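
Here is a minimal sketch of the grade-school algorithm with operation counters, to make the estimate concrete. The code and counter names are my own illustration under the assumptions above (both factors have the same number of digits), not an implementation given in these notes:

#include <cstdio>
#include <vector>

int main()
{
    std::vector<int> a = {4, 3, 2, 1};        // 1234, least-significant digit first
    std::vector<int> b = {8, 7, 6, 5};        // 5678
    const int n = static_cast<int>(a.size()); // number of digits

    std::vector<int> result(2 * n, 0);        // the product has at most 2n digits
    long long mults = 0, adds = 0;            // operation counters

    for (int i = 0; i < n; ++i)               // one pass per row of the written layout
    {
        int carry = 0;
        for (int j = 0; j < n; ++j)
        {
            int t = result[i + j] + a[j] * b[i] + carry;  // 1 multiplication, 2 additions
            ++mults;
            adds += 2;
            result[i + j] = t % 10;
            carry = t / 10;
        }
        result[i + n] += carry;               // final carry of this row: 1 more addition
        ++adds;
    }

    // mults = n*n and adds = 2*n*n + n, so the total grows like n^2,
    // in line with the "at most 4n^2 operations" estimate above.
    printf("multiplications: %lld, additions: %lld\n", mults, adds);

    int k = 2 * n - 1;
    while (k > 0 && result[k] == 0) --k;      // skip leading zeros
    for (; k >= 0; --k) printf("%d", result[k]);
    printf("\n");                             // prints 7006652
    return 0;
}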

What about the loop example? This one is easy. Given input n, the inner loop runs n times for each of the n iterations of the outer loop, which means n * n operations. If we take the “!=” comparisons into account, that’s 2n^2 operations.
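
As a sketch (the function name loop_ops is my own, not from the text), the same loop written with a general problem size n:

// Counts one "!=" check and one "++" per inner iteration.
long long loop_ops(int n)
{
    long long ops = 0;
    for (int i = 0; i != n; ++i)
        for (int j = 0; j != n; ++j)
            ops += 2;
    return ops;          // returns 2 * n * n, matching the 2n^2 estimate
}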

4. Big O notation

We’ve counted the operations of the algorithms above, but the result is still too detailed. In fact, 4n^2 and 8n^2 don’t differ that much. To overcome this, we introduce big O notation. What is big O? In practice, you can think of it as a way to get rid of small constants and lower-order terms. For example:

O(4n^2) = O(n^2)

O(100) = O(1)

O(n^3 + 2n) = O(n^3)
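
Behind these simplifications there is a precise definition (standard, though not spelled out in these notes): f(n) = O(g(n)) means there exist constants c > 0 and n0 such that f(n) <= c * g(n) for all n >= n0. The constant c absorbs factors like the 4 in 4n^2, and for large enough n the highest-order term dominates everything else.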

What’s the point of this? Big O notation is a well-chosen ruler: it distinguishes algorithms just to our taste, rough enough and sharp enough. To be more specific, if an algorithm takes O(1) (constant time) or O(log n) (logarithmic time; the base of the logarithm doesn’t matter inside big O), we call it a fast algorithm. If an algorithm takes O(n^c) (polynomial time), we say it is workable. If an algorithm takes O(a^n) for some a > 1 (exponential time), we say it takes too much time.
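
To get a feel for why these classes are ranked this way, compare how the functions grow (rounded values, computed here for illustration):

n        n^2        n^3            2^n
10       100        1,000          1,024
20       400        8,000          about 1 million
50       2,500      125,000        about 10^15
100      10,000     1,000,000      about 10^30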

Looking back, our grade-school algorithm is an O(4n^2) = O(n^2) algorithm. That is acceptable, but there are faster multiplication algorithms, such as Karatsuba multiplication. As to how it works, I’ll leave that as homework for you.

 

Guiding principles for the analysis of algorithms:

(1) Worst-case analysis. This makes life easier, since it spares us the pain of examining many different input conditions. We only need to investigate one case, and that’s all.

(2) Asymptotic analysis, using big O notation.

(3) Ignore small constants and lower-order terms. Treat all basic operations as O(1).

With all this knowledge in hand, we can say what a fast algorithm is: a fast algorithm is one whose worst-case running time grows slowly with increasing input size.

5. Difficulty with recurrences

How do we analyze an algorithm? We count the number of operations in the worst case for a problem of size n, and use big O notation to simplify the result. This works fine for iterative code: we basically just multiply. However, we face a bigger obstacle when analyzing recursive algorithms, because the operations are hard to count directly.

One way to solve this is to draw a picture of the recursion: see how the problem shrinks and count the operations at each layer. The other way is to write down a recurrence equation and solve it mathematically. Either way needs a flash of insight; that’s actually the fun part of analyzing algorithms.
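
To see the difficulty, here is a small illustration of my own (not an example from these notes): a divide-and-conquer function whose cost is awkward to count directly.

#include <cstdio>

// Sums an array of size n by splitting it in half. The work done by one call
// depends on the whole tree of recursive calls below it, so we can't just
// multiply loop counts as we did for the nested loops; the cost satisfies a
// recurrence such as T(n) = 2T(n/2) + O(1).
long long sum(const int* a, int n)
{
    if (n <= 1)                                      // base case: constant work
        return n == 1 ? a[0] : 0;
    int half = n / 2;
    return sum(a, half) + sum(a + half, n - half);   // two subproblems of equal size
}

int main()
{
    int data[] = {1, 2, 3, 4, 5, 6, 7, 8};
    printf("%lld\n", sum(data, 8));                  // prints 36
    return 0;
}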

The good news is that, if the subproblems have equal size, you can use the Master method to get the answer right away.

We’ll come to these later.