RedBlack Tree – Part A

DinS          Written on 2017/5/21

1. Visually, what is a Red-Black Tree?

See featured image above.

Red-Black Tree, initially designed by R. Bayer in 1972 as symmetric binary B-tree and later improved by L. J. Guibas and R. Segewick in 1978, is a widely used and prominent member in the family of B.B.S.T. It can control the search, insert and delete operations all in O(logn) time. Moreover, it can ensure that the topological structure changes in constant times for all these operations. This is a key difference from AVL.

2. Rules for Red-Black Tree

The rules for red-black tree are quite complicated. There’re four rules as follows:

(1) Each node is either red or black.

(2) Root is always black.

(3) No two red nodes in a row.

(4) Every root-null path has the same number of black nodes.

These four rules are abstract, especially the third and fourth one. The first and second one is rather easy to understand. As the name suggests, red-black tree refer to nodes with red or black color. The third one actually means that red nodes can only have black children and black parent. The fourth rule is the hardest to understand. Literally we can know what this rule means: from root to leaf, we pass along many nodes. For every route, we must pass the same number of black nodes. However, what’s the consequence of this rule? More generally, we want to know how these four rules help to shape the tree.

Let’s look at the red-black tree in featured image above and analyze the relations between the tree and rules.

This tree looks complicated; the rules are abstract. But don’t be afraid, we’ll go through the rules one by one to confirm whether this is a red-black tree.

Rule #1. I’ve painted all the nodes in the tree. Either node is red or black. This tree meets rule #1.

Rule #2. Obviously, the root node, which is 52, is painted black. This tree meets rule #2.

Rule #3. Let’s check if there is a red node that has red child. If we find one, rule #3 is broken. If not, i.e. red nodes only have black children, the tree meets rule #3. We can see red node 5, 33, 39 and 77 only have black children. The other red nodes are all leaf nodes. No red nodes are in a row. Rule #3 passes.

Rule #4. We really need to do some work to check this rule. To help you out, I’ve drawn layers in the tree. The claim is EVERY root-null path has 3 black nodes. Let’s confirm this one by one. To simplify a bit, I’ll treat the nulls of a leaf node as one null.

You should really take your time to confirm this one by one. This will help you understand what is a red-black tree greatly.

3. How can these rules help control the height of tree?

Claim: every red-black tree with n nodes has height <= 2log2(n+1)

Proof: we can observe that if every root-null path has >= k nodes, then the tree includes(at the top) a perfectly balanced search tree of depth k. See an example here.

While we don’t know what’s going on downside, we are sure the top three layers must be like this. If any single node is missing, the condition that every root-null path has >= 3 nodes can’t hold.

From this observation we can arrive at a key conclusion:

The size n of the tree must be at least 2k – 1. In math terms is like this:

n >= 2k – 1

in other words

k <= log2(n+1)

Now let’s come back to red-black tree. According to the 4th rule and the key conclusion, every root-null path has <= log2(n+1) black nodes. According to the 3rd rule, red nodes are at most as many as black nodes. This means every root-null path has <= 2log2(n+1) nodes, which is exactly the definition of height. Thus we have proved that every red-black tree with n nodes has height <= 2log2(n+1). In other words, O(logn) for search, insert and delete operations.

4. Search

The search operation in red-black tree is exactly the same in a binary search tree.

This article continues at RedBlack Tree – Part B.