Before we get into what a self-balancing tree is, let’s actually talk about why would it be good to to have one haha
Turns out there are some cons to a BST (or at least some things to watch out for).
It’s very important to the execution performance of the BST that it is roughly balanced.
This keeps the operations ~O(log n) which is proportional to the number of edges from the root to a leaf node.
An unbalanced BST could lead to slow performance, similar to a linked list.
This can happen when values are inserted in sorted order.
A complete binary tree is a BST that is filled at all levels except for the last which is filled from left to right.
If the tree is complete, then $log n$ bounds the longest path.
However, maintaining completeness is very costly, so it’s really not worth it in terms of performance.
Instead, it’s easier to maintain height balanced trees.
This means that for each node, the height difference between the left and right subtrees is at most 1.
Some important things to remember:
Null nodes have a height of -1
Leaf nodes have a height of 0
Mathematically, the longest path of a height balanced tree is at worst 44% longer than log n so the operations in a height balanced tree is still O(log n)!
But how do we maintain balance?
The AVL tree maintains height balancing through a series of node rotations.
This is sometimes straightforward but most times not, and some operations require not only single rotations but double rotations as well.
Luckily this class was nice enough to give us not only some nice drawings, but also the full balancing pseudocode!
Implementation of AVL Trees
The AVL node will be based on the following struct:
We will need some helper functions to calculate the heights of nodes for us:
Adding the 1 in the _setHeight function is important since the heights technically start at -1.
There is a lot of code to implement the AVL Tree, but the important thing to remember is that these are basically just binary search trees that need to be rebalanced.
Let’s define balance factor to be the height difference between the right and left child subtrees.
If _bf returns a positive value, then that means the current node is right heavy.
if it returns a negative value, then the current node is left heavy.
Now we have to write the functions to perform the rotations on a given subtree.
The last function we have to implement is our remove function, but much like the regular BST implementation, we need some helper functions for this one.
Luckily, we have a lot of the functions already written from BST, we just need to modify them slightly to incorporate rebalancing.
The user facing remove function assumes that we have a struct constructed called AVLTree which has a root node and a count field.
The _leftMost and _removeLeftMost functions are pretty much the same as their prior BST counterparts.
Here they are again, for convenience.
The last function we need to write is the _removeNode helper function itself~
Tree Sort
We can sort an array $A$ using an AVL tree!
The basic idea is that we:
Copy each element from A into an AVL tree
Copy each element from the AVL tree back into the array
Following this pseudocode, the treeSort function is pretty straightforward, but we do need to write a recursive helper function.