Tuesday, September 26, 2006 and Thursday, September 28, 2006
Binary Trees

Topics

What is a binary tree?
    - A tree is very similar to a linked list, except there are two pointers in each node instead of a single "next" pointer.
    - Instead of a  Node* list   pointing to the first node in the linked list, a single Node* tree  points to the "root" of the tree.
    - Any node, including the root node, can point to up to 2 "children" nodes, usually called "left" and "right".  If either or both children are missing, then the "left" and/or "right" pointer is NULL.
    - Each node except the root has exactly one "parent" node pointing to it.  Every node except the root must have one and only one parent node.

     - A node with two NULL children is called a "leaf" node. 
    -  A node with at least one child is called an "interior" node.

 

How do we implement a binary tree?
    - Each data object contains the data it is designed to hold, and two pointers to the "left" and "right" children.
    - Binary trees also must have a pointer to the first object in the chain, usually called "tree" or "root".
 

Example  .h file
class TreeNode
{
public:
    TreeNode(int);

    void setData(int);
    int getData();
    void setLeft(TreeNode*);
    TreeNode* getLeft();   
 
    void setRight(TreeNode*);
    TreeNode* getRight();   
 


private:
    int data;
    TreeNode *left, *right;
};

 

Example  .cpp file

TreeNode::TreeNode(int d)

{
    data = d;
    left = right = NULL;
}

 

void TreeNode::setData(int d)
{
    data = d;
}

 

int TreeNode::getData()
{
    return data;
}

 

void TreeNode::setLeft(TreeNode* l)
{
    left = l;
}

 

TreeNode* TreeNode::getLeft()
{
    return left;
}

 

void TreeNode::setRight(TreeNode* r)
{
    right = r;
}

 

TreeNode* TreeNode::getRight()
{
    return right;
}

 

int main()
{
    int i;
    TreeNode *root = NULL;
    TreeNode *newnode, *current;

    // this is a simple, non-recursive tree building
    //   algorithm -- not normally used and not
    //   particularly important.
    bool goleft = false;

    for (i = 0; i < 16; i++)
    {
        newnode = new TreeNode(i);

        if (root == NULL)
            root = newnode;             
        else
        {
            w
hile (root->getLeft() != NULL &&
                   root->getRight() != NULL)
            {
                //assignment is intentional here
                // alternates between left and right
                if (goleft = !goleft)
                    root = root->getLeft();
                else
                    root = root->getRight();     
            }
            if (root->getLeft() == NULL)
                root->setLeft(newnode);
            else if (root->getRight() == NULL)
                root->setRight(newnode);
            else
                cout << "Algorithm error" << endl;
        }
    }

    // print out left-most leaf of tree
    current = root;
    while(current->getLeft() != NULL)
        current = current->getLeft();

    cout << "Left-most  leaf: " <<
            current->getData() << endl;

    // print out right-most leaf of tree
    current = root;
    while(current->getRight() != NULL)
        current = current->getLeft();

    cout << "Right-most leaf: " <<
            current->getData() << endl;


    return 0;
}

Output
Left-most  leaf: 15
Right-most leaf: 13
 
 

 

Use of  friend  to access private data in tree nodes   
    - The  friend   statement is often used in trees in a similar way it is used with linked lists -- to give another class or function the ability to access private instance variables directly inside the tree node.
    - As with linked lists, the 
friend   statement should not be abused.   


Binary Search Trees
    - There was no particular order to the tree created in the code immediately above.
    - This does not make for a particularly useful tree.

    - Rules are often added to binary trees that make them more useful.
    - One such tree is the Binary Search Tree.
    - Binary Search Trees are only useful with data that is comparable and orderable.
    -   In other words, the data must have some kind of < and == functions available.
    - Rules for a Binary Search Tree:

    -   For every node in the tree:
    -      If this node has a left child, the left child's data must be < this node's data.
    -      If this node has a right child, the right child's data must be > this node's data.
    -      Nodes with duplicate data are not allowed.
    - Result (can be proven mathematically):
    -      Every node to the left of any node has data less than that node.
    -      Every node to the right on any node has data greater than that node.
    - Binary trees are very efficient ways to store and retrieve a lot of ordered data.

 

Insertion into a binary search tree

   
    - Insertion into a binary search tree requires that you walk down the tree, going left if the value to insert is less than the current node and right if the value to insert is greater than the current node, until you get to the one and only NULL pointer at the bottom of the tree where the value can be inserted.  Then create a new node and replace the NULL pointer with the pointer to the new node.
    - Assume  current   is a pointer to the root of the binary search tree and newnode   is a pointer to the new node.
   
- First, compare the value in 
newnode   to the value in current.  If  newnode's value is less than current's value, and the left child is not NULL, do  current = current->left;  If the left child is NULL, insert  newnode at current->left by doing  current->left = newnode;  
 
-
Otherwise, if  newnode's value is greater than current's value, and the right child is not NULL, do  current = current->right;  If the right child is NULL, insert  newnode at current->right by doing  current->right = newnode;       
    - If  newnode's value is the same as current's value, do nothing.  Duplicate nodes do not work well with binary trees.  If you must keep track of duplicates, add a count variable to each node's data, and increment or decrement the count as needed.     

 

Example  .h file
class TreeNode
{
    friend int main();
public:
    TreeNode(int);

private:

    int data;
    TreeNode* left;
    TreeNode* right;
}

TreeNode* insert(int, TreeNode*};

 

 

Example  .cpp file

TreeNode* insert(int number, TreeNode* root}
{

   TreeNode* newnode;

   if (root == NULL)
   {
        newnode = new TreeNode;
        newnode->data = number;
        newnode->left = NULL;
        newnode->right = NULL;
        return newnode;    
   }
   // the following two else if's use recursion.  They
   //    could be easily replaced by a simple while loop.
   else if (number < root->data)
        root->left = insert(number, root->left);
   else if (number > root->data)
   	root->right= insert(number, root->right);
   // ignore if (number == root->data)
   return root;	
}

 

int main()
{
 

	Node* root = NULL;
	Node* current = NULL;
	Node* newnode = NULL;
	int number;
		
	root = insert(10, root);
	insert(5, root);
	insert(17, root);
	insert(8, root);
	insert(3, root);
	insert(15, root);
	insert(19, root);
	insert(2, root);
	insert(7, root);
	insert(9, root);
	insert(5, root);
	insert(5, root);
	
	cout << root->left->right->left->data << endl;
	
	cout << "Enter a number to see if it is in the tree: ";
	cin >> number;
	cin.ignore();
	
	Node* result;
	
	if ( result = search(number, root) )
		cout << result->data << " was found!" << endl;
	else
		cout << number << " is not in the tree." << endl;
	return 0;
} 

Output


Enter a number to see if it is in the tree:  34
34 is not in the tree.

 

Deletion from a binary search tree

    - There are common ways to delete from a tree. 

   -  The first common way deletes the selected node and moves the child pointers so that the Search Tree properties are preserved (for every node, all left children are less and all right children are more)

    - The algorithm:
    -  If the selected node has no children, set the pointer from the parent to NULL and delete the node.
    -  If the selected node as only one child, set the pointer from the parent to that child and delete the node.
    -  If the selected node has two children, and the left child has no right child, set the left->right pointer to the selected node's right child, set the pointer from the parent to the selected node's left child, and delete the node.
   -   If the selected node has two children, and the left child has a right child, go left, right, right, etc. until you get to a NULL right child.  Set that child to the selected node's right child, set the pointer from the parent of the select node to the selected node's left child, and delete the node.

   -  "Delete the node" in the description above means returning the node to dynamic memory using
delete.
  

Example  function that deletes the selected node directly:
 

Node* remove(int number, Node* root)
{
   if (root == NULL)
 	;
   // the following two else if's use recursion.  They
   //    could be easily replaced by a simple while loop.
   else if (number < root->data)
        root->left = remove(number, root->left);
   else if (number > root->data)
        root->right = remove(number, root->right);
   else // number == root->data
   {
   	Node* condemned;
	condemned = root;
   	if (root->right == NULL)
		root = root->left;
	else
	{
		Node* current = root->right;
		while (current->left)
			current = current->left;
		current->left = root->left;
		
		root = root->right;
	}
	delete condemned;
    }
   
   return root;	
}

 

   -  The second common way leaves the selected node, but copies the data from the next lowest node.  That node is found by going left, right, right, etc., until there is no right child.  The the node that just had its data copied is deleted.
    - The algorithm:
    -  If the selected node has no children, set the pointer from the parent to NULL and delete the node.
    -  If the selected node as only one child, set the pointer from the parent to that child and delete the node.
    -  If the selected node has two children, and the left child has no right child, copy the data from the left child, and delete the left child.
   -   If the selected node has two children, and the left child has a right child, go left, right, right, etc. until you get to a NULL right child.  Copy the data from that node to the selected node, and then delete that node.

   -  "Delete the node" in the description above means returning the node to dynamic memory using
delete.
  

Example  function that deletes by copying the next lowest node:
 

Node* removebydatacopy(int number, Node* root)
{
   if (root == NULL)
 	;
   // the following two else if's use recursion.  They
   //    could be easily replaced by a simple while loop.
   else if (number < root->data)
        root->left = remove(number, root->left);
   else if (number > root->data)
   	root->right = remove(number, root->right);
   else // number == root->data
   {
        Node* condemned = root;
 
 	if (root->left == NULL)
 		root = root->right;
	// there is a root->left
	else if (root->right == NULL)
		root = root->left;
	// there is both a root->left and a root->right
	else
	{
		// search for leaf at root->left->right->right...
		Node* current = root->left;
		if (current->right == NULL)
		{
			root->data = current->data;
			root->left = current->left;
			condemned = current;
		}
		else // current->right is not NULL
		{
			while (current->right->right)
				current = current->right;
			// current->right->right is now NULL
			condemned = current->right;
			root->data = condemned->data;
			current->right = current->right->left;
		}
	}
	delete condemned;
   }
   
   return root;	
}
 

Traversal (also called walking) of a binary search tree - visiting every node.

   - This is an iterative (looped) walk of every node of the binary search tree, in lowest value to highest value order. 

   -  The recursive code, which we will study later, is much more elegant.  You would rarely use iterative code like this to walk a tree.  You should try to generally understand how this code works, but it is not necessary that you be able to replicate it in your programs. 

   -  Without recursion, you must use a stack to write walk the tree.

   -  The code starts by pushing the root node, if any, on the stack.

   -  The code then pops the top node off the stack, and loops down that subtree to the left-most node, storing each right node and then each node visited on a stack for later processing.  When it gets to the leftmost node in the tree, it prints it and finishes the loop.

   -  By popping the top node on the stack, walking to the leftmost node of its subtree, and printing that leftmost node, eventually all nodes will be printed, in ascending order.

   - You must also keep track if the leftmost subtree has already been processed, as you push each node on the stack.  Then when you pop a node off the stack, you must prevent re-processing the same left subtree if it has already been processed.

 
void printtree(Treenode* root)
{ 
	// parallel arrays used so this print function can be entirely self-contained
	// stack of Treenode pointers and stack of boolean - if the left subtree has been walked yet.
	Treenode* stack[100] = {0};  // 100 is an arbitrary size for the array, and is somewhat dangerous
	int goleft[100] = {0};
	int top = -1;
	// printlevel array is only needed for formatting
	int printlevel[100] = {0};
	int level = 0;
	// these hold information about the current node
	Treenode* current;
	int doleft;
	int i;
	cout << endl;  // formatting only
	if (root == NULL)
		cout << "Empty Tree" << endl;
	else
	{
		// push root on stack and indicate the left subtree needs printing
		stack[++top] = root;
		goleft[top] = 1;
		printlevel[top] = 0;  // formatting only
		do
		{
			// pop current node off stack
			current = stack[top];
			doleft = goleft[top];
			level = printlevel[top--];  // formatting only
			
			// walk down the left side of the left subtree
			if (doleft)
			{
				// walk down the left, pushing any right node, and then the current node
				//   until there are no more left nodes.  Make sure bottom right node is also pushed
				while(1)
				{
					if (current->right != NULL)
					{
						stack[++top] = current->right;
						goleft[top] = 1;
						printlevel[top] = level + 1;  // formatting only
					}
					// no more left subnodes, and the right subtree, if any, has been pushed on the stack
					if (current->left == NULL)
						break;
					// since there is a left subtree, push this node on the stack and go to the left node
					stack[++top] = current;
					goleft[top] = 0;
					printlevel[top] = level++;  // formatting only
					current = current->left;
				}
			}
			for (i = 0; i < level; i++)  // formatting only
				cout << "  ";            // formatting only
			cout << current->data << endl;  // process data, in this case print it.
		} while (top != -1);
	}
	cout << endl;  // formatting only
}

Readings

Deitel and Deitel Fifth Edition, 21.7 - same caveats as the Deitel linked list readings .

Deitel and Deitel Fourth Edition, 17.7 - same caveats as the Deitel linked list readings .

 

 

Back to Csc 125 Computer Science II, Programming in C++
  Scott Badman   Office: B132   Phone: 353-2250   sbadman@parkland.edu  

Parkland College, 2400 W. Bradley Avenue, Champaign, IL 61821