DinS Written 2017/12/30

Even the wisest man can’t see through the future. Our best bet is taking the action that seems good now.

Well, this is the main idea behind greedy algorithm. The problem is far too complicated for future planning. We’ll just do what seems good now and proceed to see the final result. Clearly, the main focus for greedy algorithm is its correctness, not running time. They generally run fast, but it’s hard to prove they are correct. Choosing the correct greedy criterion is an art.

Here we’ll introduce Dijkstra Shortest-Path algorithm as a demonstration.

1. An instance of Dijkstra Shortest-Path algorithm

Dijkstra Shortest-Path algorithm is an algorithm about graph. Given a directed graph G=(V,E) with nonnegative edge length, a source vertex s, we use this algorithm to compute L(v) = length of a shortest path from s to v in G, where v is any vertex in V.

See an example below.

Start from source s, L(t) = 6.

2. What is the algorithm doing?

From the example above, our first impression would be the algorithm just selects the shortest path from current vertex. However, that’s not the story. The algorithm is doing something a bit more complicated.

Before deciding which path to take, we’ll first initialize two data structures. X = {s}, which stores vertices processed so far. A[s] = 0, which is the computed shortest path distances. The pseudo codes look like this.

While (x != v) //iteration until we reach target

{

Among all crossing edges (v,w) for X, //all potential candidates

Pick the one that minimize A[v] + l_vw //Dijkstra greedy criterion

Call that edge (v*, w*)

Add w* to X

Set A[w*] = A[v*] + l_v*w*

}

You see, the algorithm is not merely picking the shortest path for current vertex. Rather, it’s calculating all previous crossing edges and finding the one that minimize the distance so far. Let’s look at the example again.

The first step is initialization. X={s} and A[s] = 0.

The second step is examining all crossing edges, which are (s,v) and (s,w), and pick the one that meet the greedy criterion. A[s] + l_sv = 1. A[s] + l_sw = 4. Clearly (s,v) is the one to pick. As a result we add v to X and X= {s, v}. We also update A[] so that A[v] = 1.

What about the next iteration? Same story. We examining all crossing edges, which are (s,w), (v,w) and (v,t). Notice the (s,w) edge here. We not only look edges from v, but also all other edges from vertices in X. That’s the key. A[s] + l_sw = 4. A[v] + l_vw = 3. A[v] + l_vt = 7. Surely the second one is the best choice. As a result we add w to X and X={s,v,w}. We also update A[] so that A[w] = 3.

The final iteration. There’re only two crossing edges this time since s,v,w are all in X. They are (v,t) and (w,t). A[v] + l_vt = 7. A[w] + l_wt = 6. Edge (w,t) is the shortest. Because t is the target, we’re done.

3. Is it correct?

The hard point for greedy algorithm is proving its correctness. Is the Dijkstra greedy criterion correct?

Inductive hypo: all previous iterations are correct. That is to say A[v] = L(v). For vertex v, we indeed have calculated the shortest path.

Current iteration: A[w*] = L(v*) + l_v*w*. We’ve picked the vertex w that we think would be the shortest path.

We need to show that every s->w* path has length >= A[w*]. In a more friendly expression, all other paths are same or longer than the path we choose.

Let P = any s->w* path. P must cross the frontier at least once. This is true because w* is not yet inside X. So to go from s, which is inside X, to w* must pass crossing edges at least once.

Now let’s explain this graph a little. Based on inductive hype, A[y] must be the shortest from s to y. Thus for any path from s to y, its length must be equal or greater than A[y]. This is the first section of P. Next we have explain P must cross the frontier at least once. We’ll mark that crossing edge as (y,z). The length will be L_yz. This is the second section of P. After crossing the frontier, the journey is not finished yet. We have to go from z to w*. Based on condition that there’re no nonnegative path in graph, the length of (z, w*) must be equal or greater than 0. This is the third path of P.

If we put it together, we have length of P >= A[y] + L_yz.

By greedy criterion A[v*] + l_v*w*<= A[y] + l_yz <= length of P.

The first part is actually A[w*], our chosen path. In other words, A[w*] is indeed the shortest path. The correctness of Dijkstra shortest-path algorithm is proved.

4. Running time

Straightforward implementation is O(mn), where n is the number of vertices and m is the number of edges.

We’ll need n-1 iteration of while loop. We need to examine all crossing edges per iteration, the worst case all edges.

Thus O(mn) running time. However, we can improve a bit, using suitable data structures like heap.