DinS Written on 2017/5/1

This article is the second part of Splay Tree. See first part here.

5. Potential analysis(势能分析法)

Intuitively, we know that splay tree can do a good job. To prove the O(logn), however, we need a more rigid analysis. R. E. Tarjan has proved this. We’ll just introduce the method here.

Borrowing the concept of potential in physics, we can think that each splay tree S has some potential Φ(S) > 0. After one operation S becomes S’. As a result, the change in potential can be expressed in this formula.

ΔΦ = Φ(S’) – Φ(S)

More general, if we perform m operations to splay tree S₀, the i^th operation gives us tree S_i. We have

ΔΦ_i = Φ(S_i) – Φ(S_i-1) for 1<= i <= m

From the whole process we get

This confirms the concept in physics. The change in potential is determined only by initial and terminal conditions.

Now let’s define the amortized running time as A, and real running time as T. We have

A = T_i + ΔΦ_i

The real time cost may be a bit bigger or less than amortized time cost. The difference is compensated by the change in potential. Overall, this is a good evaluation. If we sum up all the i^th operation we get

As a result, we can think the sum of A_i as a higher bound for the sum of T_i. This means that if we can evaluate A, we get a good comprehension on T.

Now comes the real proof. Tarjan defined Φ(S) as this:

|v| means the number of offspring node v has.

From this he arrived at a key lemma: during the splay of node v, the running time for each adjustment is less than triple of change in potential of v, i.e. 3[Φ’(v) – Φ(v)]

Let’s prove this.

First let’s discuss the case when v, p, g are in a zigzag position.

We have to do two rotations. This is O(2). Node v, p, g all changed after adjustment. We have

A = T + ΔΦ

= 2 + ΔΦ(v) + ΔΦ(p) + ΔΦ(g)

= 2 + Φ’(v) – Φ(v) + Φ’(p) – Φ(p) + Φ’(g) – Φ(g)

Note that the number of offspring for initial node g and terminal node v are the same. Node v assumes the position of node g after all. This means Φ’(v) = Φ(g). Thus we cancel those terms and have

= 2 – Φ(v) + Φ’(p) – Φ(p) + Φ’(g)

We can apply the same trick again. The number of offspring for initial node p is greater than that of node v. Node p is node v’s parent after all. This means Φ(p) > Φ(v). Thus we have

<= 2 + Φ’(p) + Φ’(g) – 2Φ(v)

Now we have to use the property of logarithm function. Since logarithm function is a concave function, we have

This means log₂a + log₂b <= 2log₂[(a+b)/2] = 2[log₂(a+b) – log₂2] < 2(log₂c – 1)

From this we can get Φ’(p) + Φ’(g) <= 2Φ’(v) – 2, because the sum of terminal p and g equals terminal v. all subtrees are covered. Thus we get

A <= 2[Φ’(v) – Φ(v)]

Second let’s discuss the case when v, p, g are in a straight line.

Similar to the first case, we have to do two rotations. This is O(2). Node v, p, g all changed after adjustment. We have

A = T + ΔΦ

= 2 + ΔΦ(v) + ΔΦ(p) + ΔΦ(g)

= 2 + Φ’(v) – Φ(v) + Φ’(p) – Φ(p) + Φ’(g) – Φ(g)

= 2 – Φ(v) + Φ’(p) – Φ(p) + Φ’(g) //Φ’(v) = Φ(g)

<= 2 + Φ’(p) + Φ’(g) – 2Φ(v) //Φ(p) > Φ(v)

<= 2 + Φ’(v) + Φ’(g) – 2Φ(v) //Φ’(v) > Φ’(p)

Here we can use the concave function property again. We get Φ(v) + Φ’(g) <= 2Φ’(v) – 2, because the sum of initial v and terminal g cover all the subtrees. This equals the terminal v.

<= 2 + Φ’(v) + Φ’(g) + Φ(v) – 3Φ(v)

A <= 3[ Φ’(v) – Φ(v)]

Now we have covered the main cases in splaying. I leave the smaller cases to the readers.

It’s time to give the finishing touch of the proof. Since we know that each splay operation takes at most 3[ Φ’(v) – Φ(v)] time, suppose at one time lowest node v becomes root, which is the worst case. We have

A <= 1 + 3[ Φ(r) – Φ(v)] <= 1 + 3 Φ(r) = O(1 + logn) = O(logn)

Finally we have proved that the amortized running time for splaying is O(logn)!

6. Search

We use the same method to pinpoint the node splay tree as in B.S.T., but the only difference is we need to lift the node to root. The details are covered in double layer rotation strategy. If the node doesn’t exist, we end up at a null. In this case we lift the parent of the null to root. This confirms data locality.

7. Insert

Insertion in splay tree is quite easy. First we apply search in splay tree, and this ends up at a null. Notice that after applying search, the parent of null has been lifted to root. Now all we have to do is to add the new node to root and splice the proper subtree. This process can be shown as below:

(Source: Deng Junhui, Data Structure (C++) 3rd edition, Beijing: Tsinghua University Press, 2013, p210.) （资料来源：邓俊辉：《数据结构（C++语言版）》，清华大学出版社2013年第3版，第210页。）

Let’s look at a concrete example.

We have this splay tree and want to insert node 6. We perform search(6) and ends up at this null position. This means we do splaying at the same time. The node to be lifted to root is 7. First we right rotate node 13 and then left rotate node 5.

After splaying we get this tree.

Node 7 is the root now, but don’t forget our goal. We want to insert node 6 to the tree, and that should be the root. Since 6 is smaller than 7, we attach node 7 and its right subtree to node 6’s right, and attach its left subtree to node 6’s left. We end up with this tree.

After inserting the tree may seem a little unbalanced, but don’t worry. If we do another operation, the tree will self-adjust to a more balance state. This is the charm of splay tree.

8. Delete

Different from other B.B.S.T., deletion in splay tree is rather easy. First we search for the node to delete. This will lift the node to root. After that we simply delete the node and end up with two subtrees. Here we again search for the deleted node in the right subtree. This will surely fail but it can lift the proper node to the root of right subtree. All we have to do now is use this node as root and attach left subtree to it. The process can be shown as below:

(Source: Deng Junhui, Data Structure (C++) 3rd edition, Beijing: Tsinghua University Press, 2013, p211.) （资料来源：邓俊辉：《数据结构（C++语言版）》，清华大学出版社2013年第3版，第211页。）

Let’s look at a concrete example.

We have this splay tree and want to delete node 7. First we search for 7 and this will lift it to the root.

After search operation we have this tree.

We just simply delete node 7 and end up with two subtrees.

Now we search for node 7 again and end up at a null. This search operation will lift the null’s parent to root.

This time we get these two subtrees.

Notice that now the root of right subtree must be the node closest to the deleted node in right subtree. All we have to do is attach the left subtree to it. Finally we get this.

9. Evaluation

Compared with other B.B.S.T., splay tree is easy to program and have O(logn) efficiency on average. Besides, it takes advantage of data locality, which may even increase its efficiency in practice. Overall this is a good choice. However, under certain conditions which need strict O(logn) efficiency, such as nuclear plant, don’t use splay tree.