DinS Written on 2017/4/24

1. Visually, what is a vector(向量)?

See featured image above.

Vector is good at static operations such as access and search. Because it is easy to understand and use, vector is recommended as the default data structure.

2. Access

Vector follows call-by-rank, which means we can access its element in O(1) time.

e.g. myVector[2] gives the data in rank 2.

3. Insert and delete

Before insertion, we need to move certain elements to make space for the insertion. After delete, we need to move the following elements to fill the gap. In the worst case we need to move nearly all elements. This can lead to O(n).

e.g. insert a new element at rank 2

First move myVector[2], myVector[3] and myVector[4] one block backward.

Second the vector looks like this. Rank 2 is now free.

Third put the new data at rank 2.

e.g. delete myVector[1]

First simply delete the target element.

Second move myVector[2], myVector[3] and myVector[4] one block forward.

Finally the vector looks like this

4. Expansion

The capacity of vector means how many elements it can store. The size of vector means how many elements it has actually stored. For this vector:

The capacity is 7; the size is 5.

When size meets capacity, the vector needs to grow. The most common practice is to build a new vector with its capacity twice as the old capacity, and move all data into the new vector. In this way the amortized running time(分摊运行时间) can be as low as O(1).

5. Search

Given a value, how can we find the corresponding element?

For unsorted vector, we can only traverse to find the target. This means O(n).

For sorted vector, we can do better. The key is binary search. First we compare the middle elements with target element. If the middle element is smaller than target, we go to the right side to do binary search. If the middle element is bigger than target, we go to the left side to do binary search.

Pseudo code:

| BinSearch(Vector, Target, lo, hi)

| {

| while(lo < hi)

| {

| mi = middle(lo, hi)

| if (Target < Vector[mi]) {hi = mi}

| else if (Target > Vector[mi]) {lo = mi + 1}

| else {return mi}

| }

| return -1

| }

*Notice: This is only a very preliminary version of binary search. For more refined version see a data structure textbook.

e.g. search for 7 in vector

Middle 5 is smaller than the target 7. Go to right side to do the same procedure.

Middle 7 equals the target 7. Hit!

If the interval is [lo, hi] at first, and since each iteration of binary search will reduce the interval to half, after at most log₂(lo – hi) iterations this algorithm will surely stop, either hit or fail. Each iteration only takes O(1). In conclusion binary search needs only O(logn) time.

I think there is another way to think about binary search, especially the log₂(lo – hi) part. How does it come? When the algorithm ends, we fix a single element. Since each iteration reduces the interval by half, if we think one step back, the interval should be 1 * 2. If we think two step back, the interval should be 1 * 2 * 2. In general if we think n steps back, the interval should be 2ⁿ. The algorithm takes only n iterations, after n iterations the interval should be the original interval. Thus we get 2ⁿ = hi – lo. In other words n = log₂(lo – hi). That’s the iteration number.

6. Sort

Sorting is a very deep area of study. There are many different sorting algorithms. You can check MergeSort or QuickSort on these pages.

7. Example – Sieve of Eratosthenes

This is implemented using <vector> in STL.

Notice that in order to avoid O(n) deletion operations, we just cross-out the number.

The result: