Introduction To Binary Trees A binary tree is made of nodes, where each node contains a "left" pointer, a "right" pointer, and a data element. The "root" pointer points to the topmost node in the tree. The left and right pointers recursively point to smaller "subtrees" on either side. A null pointer represents a binary tree with no elements -- the empty tree. The formal recursive definition is: A binary tree is either empty or is made of a single node, where the left and right pointers each point to a binary tree. AVL Trees An AVL tree is a special type of binary tree that is always "partially" balanced. The criteria that is used to determine the "level" of "balanced-ness" is the difference between the heights of subtrees of a root in the tree. The "height" of tree is the "number of levels" in the tree. Or to be more formal, the height of a tree is defined as follows: 1. The height of a tree with no elements is 0 2. The height of a tree with 1 element is 1 3. The height of a tree with > 1 element is equal to 1 + the height of its tallest subtree. An AVL tree is a binary tree in which the difference between the height of the right and left subtrees (or the root node) is never more than one. The idea behind maintaining the "AVL-ness" of an AVL tree is that whenever we insert or delete an item, if we have "violated" the "AVL-ness" of the tree in anyway, we must then restore it by performing a set of manipulations (called "rotations") on the tree. These rotations come in two : single rotations and double rotations . Implementing AVL Trees in C++ Before we begin our AVL tree implementation in C++, lets assume we have a template class named "Compare1" defined as follows: Code: // cmp_t is an enumeration type indicating the result of a // comparison. enum cmp_t { MIN_CMP = -1, // less than EQ_CMP = 0, // equal to MAX_CMP = 1 // greater than }; // Class "Compare" corresponds to an arbitrary Compare1 element // with a keyfield that has an ordering relation. The template parameter // KeyType is the "type" of the keyfield template <class KeyType> class Compare { private: KeyType myKey; public: Compare (KeyType key) : myKey(key) {}; // Use default copy-ctor, assignment, & destructor // Compare this item against the given key & return the result cmp_t Compare(KeyType key) const; // Get the key-field of an item KeyType Key() const { return myKey; } }; Like the "Compare1" class, our AVL tree will also be a template class parameterized by a KeyType: Code: // Class AvlNode represents a node in an AVL tree. The template parameter // KeyType is the "type" of the keyfield // template <class KeyType> class AvlNode { private: Compare1<KeyType> * myData; // Data field AvlNode<KeyType> * mySubtree[2]; // Subtree pointers short myBal; // Balance factor // ... many details omitted }; Calculating New Balances After a Rotation To calculate the new balances after a single left rotation. The left is what the tree looked like BEFORE the rotation and the right is what the tree looks like after the rotation. Capital letters are used to denote single nodes and lowercase letters are used to denote subtrees. The "balance" of a tree is the height of its right subtree less the height of its left subtree. Therefore, we can calculate the new balances of "A" and "B" as follows (ht is the height function): Code: NewBal(A) = ht(b) - ht(a) OldBal(A) = ht(B) - ht(a) = ( 1 + max (ht(b), ht(c)) ) - ht(a) subtracting the second equation from the first yields: NewBal(A) - OldBal(A) = ht(b) - ( 1 + max (ht(b), ht(c)) ) + ht(a) - ht(a) canceling out the ht(a) terms and adding OldBal(A) to both sides yields: NewBal(A) = OldBal(A) - 1 - (max (ht(b), ht(c)) - ht(b) ) Noting that max(x, y) - z = max(x-z, y-z), we get: NewBal(A) = OldBal(A) - 1 - (max (ht(b) - ht(b), ht(c) - ht(b)) ) But ht(c) - ht(b) is OldBal(B) so we get: NewBal(A) = OldBal(A) - 1 - (max (0, OldBal(B)) ) = OldBal(A) - 1 - max (0, OldBal(B)) Thus, for A, we get the equation: NewBal(A) = OldBal(A) - 1 - max (0, OldBal(B)) To calculate the Balance for B we perform a similar computation: NewBal(B) = ht(c) - ht(A) = ht(c) - (1 + max(ht(a), ht(b)) ) OldBal(B) = ht(c) - ht(b) subtracting the second equation from the first yields: NewBal(B) - OldBal(B) = ht(c) - ht(c) + ht(b) - (1 + max(ht(a), ht(b)) ) canceling, and adding OldBal(B) to both sides gives: NewBal(B) = OldBal(B) - 1 - (max(ht(a), ht(b)) - ht(b)) = OldBal(B) - 1 - (max(ht(a) - ht(b), ht(b) - ht(b)) But ht(a) - ht(b) is - (ht(b) - ht(a)) = -NewBal(A), so ... NewBal(B) = OldBal(B) - 1 - max( -NewBal(A), 0) Using the fact that min(x,y) = -max(-x, -y) we get: NewBal(B) = OldBal(B) - 1 + min( NewBal(A), 0) So, for a single left rotation we have shown the the new balances for the nodes A and B are given by the following equations: Code: NewBal(A) = OldBal(A) - 1 - max(OldBal(B), 0) NewBal(B) = OldBal(B) - 1 + min(NewBal(A), 0) If we perform the same calculations we will see that the new balances for a single right rotation are given by the following equations: Code: NewBal(A) = OldBal(A) + 1 - min(OldBal(B), 0) NewBal(B) = OldBal(B) + 1 + max(NewBal(A), 0) Hence, C++ code for single left and right rotations would be: Code: // Indices into a subtree array enum dir_t { LEFT = 0, RIGHT = 1 }; // Return the minumum of two numbers inline int MIN(int a, int b) { return (a < b) ? a : b; } // Return the maximum of two numbers inline int MAX(int a, int b) { return (a > b) ? a : b; } // Note that RotateLeft and RotateRight are *static* member // functions because otherwise they would have to re-assign // to the "this" pointer. template <class KeyType> void AvlNode<KeyType>::RotateLeft(AvlNode<KeyType> * & root) { AvlNode<KeyType> * oldRoot = root; // perform rotation root = root->mySubtree[RIGHT]; oldRoot->mySubtree[RIGHT] = root->mySubtree[LEFT]; root->mySubtree[LEFT] = oldRoot; // update balances oldRoot->myBal -= (1 + MAX(root->myBal, 0)); root->myBal -= (1 - MIN(oldRoot->myBal, 0)); } template <class KeyType> void AvlNode<KeyType>::RotateRight(AvlNode<KeyType> * & root) { AvlNode<KeyType> * oldRoot = root; // perform rotation root = root->mySubtree[LEFT]; oldRoot->mySubtree[LEFT] = root->mySubtree[RIGHT]; root->mySubtree[RIGHT] = oldRoot; // update balances oldRoot->myBal += (1 - MIN(root->myBal, 0)); root->myBal += (1 + MAX(oldRoot->myBal, 0)); } We can make this code more compact however by using only ONE rotate method which takes an additional parameter: the direction in which to rotate. Notice that I have defined LEFT, and RIGHT to be mnemonic constants to index into an array of subtrees. I can pass the constant LEFT or RIGHT to the rotation method and it can calculate the direction opposite the given direction by subtracting the given direction from the number one. It does not matter whether LEFT is 0 or RIGHT is 0 as long as one of them is 0 and the other is 1. If this is the case, then: 1 - LEFT = RIGHT and 1 - RIGHT = LEFT Using this and the same type definitions as before (and the same inline functions), the C++ code for a single rotation becomes: Code: inline dir_t Opposite(dir_t dir) { return dir_t(1 - int(dir)); } // RotateOnce -- static member function that performs a single // rotation for the given direction. template <class KeyType> void AvlNode<KeyType>::RotateOnce(AvlNode<KeyType> * & root, dir_t dir) { AvlNode<KeyType> * oldRoot = root; dir_t otherDir = Opposite(dir); // rotate root = tree->mySubtree[otherDir]; oldRoot->mySubtree[otherDir] = tree->mySubtree[dir]; root->mySubtree[dir] = oldRoot; // update balances if (dir == LEFT) { oldRoot->myBal -= (1 + MAX(root->myBal, 0)); root->myBal -= (1 - MIN(oldRoot->myBal, 0)); } else /* dir == RIGHT */ { oldRoot->myBal += (1 - MIN(root->myBal, 0) ); root->myBal += (1 + MAX(oldRoot->myBal, 0)); } }