Splay Tree – Part A

DinS          Written on 2017/5/1

1. Visually, what is a splay tree(伸展树)?

See featured image above.

Splay tree, designed by D. D. Sleator and R. E. Tarjan, is a very unique data structure even in the family of B.B.S.T. It has no explicit rule to control the height of tree. However, by applying a subtle trick, it takes advantage of data locality and can make search, insert and delete operations take O(logn) time.

2. Data locality for B.S.T.

In practice we’ll often find the common phenomenon of data locality. It means that

(1) The data accessed just before is very likely to be visited again in the near future.

(2) The next data to be visited is very likely to be adjacent to some data visited just before.

If we can take advantage of data locality, we can make our data structure more efficient in practice. An example is self-adjust list. By moving the node visited to the head, it can save much time for a single operation.

We can do the same thing in tree, and thus born the splay tree. Concretely, each time we visit a node, we move it to the root by rotations.

3. Single layer rotation(逐层伸展)

Intuitively, we move the node layer by layer to the root. See an example here.

Suppose we want to visit node 10 in this tree. To move it to the root, we first left rotate node 7 to uplift node 10.

After that node 10’s journey continues. We right rotate node 13 to lift node 10 again.

After that we keep on going. Left rotate node 5 to get node 10 to the root.

Finally node 10 makes its way to the root. We achieved our goal.

All seem good, except one thing: efficiency!

Suppose we have a list-like tree.

If we continue to visit the lowest leaf node, say node 5 this time, the tree will behave like this.

This needs O(n) rotations. Now if we visit node 1, the tree will become back to previous state and takes O(n) rotations. This O(n) is not what we are hoping for.

4. Double layer rotation(双层伸展)

Admittedly, the O(n) happens rarely because it ignores data locality completely. But by applying a trick we can make it an average O(logn) even in the worst cases. This trick is called double layer rotation. As the name suggest, this time we’ll not move the node layer by layer, but two layers at one time.

As is the usual case, we locate node v, node v’s parent p and node v’s grandparent g. Depending on the position of v, p, g, we apply different methods.

(1) If v, p, g are in a straight line, we first right/left rotate g and then right/left rotate p.

(2) If v, p, g are in a zigzag position, we first right/left rotate p and then left/right rotate g.

Now you may wonder: what’s the difference? It’s all about node v goes to the top. Well, just looking at these rotations won’t explain the subtlety. Let’s look at a concrete example.

(Source: Deng Junhui, Data Structure (C++) 3rd edition, Beijing: Tsinghua University Press, 2013, p207.) (资料来源:邓俊辉:《数据结构(C++语言版)》,清华大学出版社2013年第3版,第207页。)

Left side is splaying with single layer rotation; right side is splaying with double layer rotation. You should have caught something. Let’s look at an even more general display.

(Source: Deng Junhui, Data Structure (C++) 3rd edition, Beijing: Tsinghua University Press, 2013, p208.) (资料来源:邓俊辉:《数据结构(C++语言版)》,清华大学出版社2013年第3版,第208页。)

If we continue to visit the deepest nodes, the splay tree will change like this. By now you must have noticed that using double layer rotation strategy, not only can we push the node to root, we can also have the “halving height” benefit. This benefit helps keep the cost of rotations low by reducing the height of tree.

This article continues at Splay Tree – Part B.