A tree is a data structure which can be used to represent data which falls into a natural hierarchy. These notes primarily deal with binary trees -- see the definitions below.
Vocabulary -- trees in general
A tree is made of nodes. Each node stores some kind of data for us.
One special node, at the top of the tree, is the root.
Each node may have child nodes. If node A has a child node B, we refer to A as the parent node.
Each node except the root node has exactly one parent node. The root node has no parent.
Two nodes are siblings if they have the same parent.
One node is an ancestor of another node if it is the parent of the second node, parent of the parent of the second node, etc.
Nodes that have no children are called leaf nodes.
Any node can be viewed as the root of a subtree: its children, their children, etc.
The depth of a node is the number of steps between it and the root. The root is at depth 0, and the children of a node at depth N are at depth N+1. Confusingly, this is sometimes called the height instead.
The depth (or height) of the tree as a whole is the maximum depth of any node.
Vocabulary -- binary trees in particular
If we limit the number of child nodes to 2 per parent, a left child and a right child, we have what is called a binary tree.
In a binary tree, a node has a left subtree (the subtree starting with its left child) and a right subtree (the subtree starting with its right child).
A binary tree is full if all leaves are at the same depth and every node has either 2 children or none.
A binary tree is complete if it can be viewed as a full tree to which we have added leaf nodes at the bottom starting at the left.
A complete binary tree does not have to be full. A full binary tree is complete.
Operations on a tree
There are various operations we might want to perform on a tree:
There are lots of other possibilities.
Many of these operations can be done using recursion: do something with the data in a node, and call the function recursively to act on the subtrees.
Implementation of a binary tree
A common way to implement a binary tree is to use dynamically allocated links:
class Node { private: int Value; // Of course, we could store any kind of data. Node * Left; Node * Right; public: // lots of stuff };
The root of the binary tree is simply a variable of type Node *.
An advantage of this implementation is that we only use as many nodes as we need.
One disadvantage is that it is hard to locate the parent of a given node.
Another way to implement a binary tree is to use an array of nodes:
An advantage here is that it is easy to identify the parent of a node.
A disadvantages are that there may be space wasted in the array for nodes that don't yet exist. We will need some way to mark the array entries that are not in use (a sentinel or flag value of some sort).
However we implement the tree, we will usually make it a class. If we are using the version with nodes and pointers, the Tree class will be a friend of the Node class. The main program that uses the tree will use only the public data members described in the Tree class's header file, and it will not necessarily know anything about the internal structure of the tree.
Traversing a tree
Sometimes we want to visit each node in a binary tree in some order and do something such as print the data in each node. There are several standard ways to do this using recursion:
Pre-order traversal
In-order traversal
Post-order traversal
In each of these, most of the program code is devoted to handling the root node and to making sure the recursion eventually stops.
Instead of simply printing the data, we could be performing some other function on each node. For instance, a binary tree can be used to help parse an arithmetic expression full of binary operations; each node could contain either an operator or a number. We could build the tree and then traverse it to evaluate the overall expression.
Binary Search Trees
Often we use a binary tree to store data in some sort of order. We will assume here that we want data to be in ascending order.
A binary tree is a binary search tree if:
By using these properties, we can easily search the tree, decide where to add a new node, etc.
Examples (as pseudocode)
Some of these examples are for binary trees in general and some are for binary search trees in particular.
Example
Search a binary search tree for a node with a given value, Target. The return value is a pointer to a node. We do this recursively:
if (Root == 0) return a value indicating failure else if (Root->Value == Target) return the root's location else if (Root->Value is >= tTarget) search Root->Left else search Root->Right endif endif endif
Example
Search a binary search tree for the parent of a node with a given value, Target. We return two values: pointer PParent to the parent and P to the child. A special case occurs when the root node contains the target value; in this case, we return two copies of the root pointer. If the target value is not found, we return P = 0 and PParent = its last value. This is not recursive.
if (Root->Value == Target) return P = Root and PParent = Root else P = Root while (P not = 0) and (P->Value not = Target) if (P->Value >= Target) PParent = P P = P->Left else PParent = P P = P->Right endif endwhile if (P == 0) PParent = 0 endif return P and PParent endif
Example
In a binary tree, suppose we want to find the parent of the in-order successor of a node P (which has a right child). We return two values: pointer SuccParent to the parent, and Succ to the child.
SuccParent = P Succ = P->Right while (Succ->Left not = 0) SuccParent = Succ Succ = Succ->Left endwhile return Succ and SuccParent
Example
Suppose we want to count all the leaves on a binary tree. We do this by calling a function LeafCount which takes one argument P, a pointer to a node. We pass it the Root pointer as its argument P.
if (P not = 0) if (P->Left == 0 and P->Right == 0) return 1; else return LeafCount(P->Left) + LeafCount(P->Right) else return 0
Notice this uses recursion. We start by checking for the simple case which will end the recursion. It is not easy to write this without using recursion.