PGDCA II Semester
Data Structures and Algorithms

Hierarchical Data Structure - Tree

1. Binary Tree
   a. Definition
   b. Terms: root, leaf, level, depth, father, son, brother
   c. Types
      i. Strictly Binary Tree
      ii. Complete Binary Tree
      iii. Almost Complete Binary Tree
2. Basic Operations in Binary Tree
3. Binary Tree Traversal
   a. Pre-order Traversal
   b. In-order Traversal
   c. Post-order Traversal
4. Binary Search Tree
   a. Definition
   b. Operations on Binary Search Tree
      i. Inserting a node
      ii. Searching a node
      iii. Deleting a node
5. Huffman Algorithm

Binary Tree >> Definition

The simplest form of tree is a binary tree. A binary tree consists of
a. a node (called the root node) and
b. left and right sub-trees.
Both sub-trees are themselves binary trees.

A binary tree is a hierarchical (tree) data structure in which each node has at most two children, typically called left and right. One common use of binary trees is binary search trees; another is binary heaps.

Definition [ Binary Tree ]
A binary tree is a finite set of elements that is either empty or consists of a node called the root and two other disjoint subsets. These two subsets are themselves binary trees, called the left sub-tree and the right sub-tree. Each element of a binary tree is called a node of the tree.

Figure 1: A binary tree

Binary Tree >> Terms

Root: the node at the "top" of a tree, the one from which all operations on the tree commence. The root node may not exist at all (a NULL tree with no nodes in it). In a binary tree it may have 0, 1 or 2 sons.

Consider Figure 1: node 56 is the root. Nodes 15 and 78 are sons of node 56. Node 15 is the father of two nodes, 10 and 45. A node that does not have any son is called a leaf. Node 80 is the right son of node 78, whereas node 10 is the left son of node 15. Two nodes are brothers if they are the left and right sons of the same father.
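
As a rough illustration of the terms above, the following C++ sketch models a node with links to its sons and its father. The type Node and the helpers attachLeft, attachRight, isLeaf, isRoot and brotherOf are names assumed here for illustration; they are not taken from the notes.

```cpp
// Hypothetical node layout; the field names mirror the terms used in the notes.
struct Node {
    int key;
    Node* left = nullptr;    // left son, or nullptr if there is none
    Node* right = nullptr;   // right son, or nullptr if there is none
    Node* father = nullptr;  // nullptr for the root
    explicit Node(int k) : key(k) {}
};

// Attach `son` as the left son of `parent` and record the father link.
void attachLeft(Node& parent, Node& son) {
    parent.left = &son;
    son.father = &parent;
}

// Attach `son` as the right son of `parent` and record the father link.
void attachRight(Node& parent, Node& son) {
    parent.right = &son;
    son.father = &parent;
}

// A leaf is a node that does not have any son.
bool isLeaf(const Node& n) {
    return n.left == nullptr && n.right == nullptr;
}

// The root is the only node without a father.
bool isRoot(const Node& n) {
    return n.father == nullptr;
}

// Returns the brother of n (the other son of the same father),
// or nullptr if n is the root or has no brother.
const Node* brotherOf(const Node& n) {
    if (n.father == nullptr) return nullptr;
    return (n.father->left == &n) ? n.father->right : n.father->left;
}
```

If the nodes named in the text for Figure 1 are attached this way (15 and 78 under 56, 10 and 45 under 15, 80 as the right son of 78), brotherOf applied to node 10 returns node 45, and isRoot holds only for node 56.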

The level of a node in a binary tree is defined as follows: the root of the tree has level 0, and the level of any other node in the tree is one more than the level of its father. For example, the level of node 45 is 2 and that of node 15 is 1. The depth of a binary tree is the maximum level of any leaf in the tree. This equals the length of the longest path from the root to any leaf. Thus the depth of the tree in Figure 1 is 2.

Binary Tree >> Types

Strictly Binary Tree
A tree is said to be a strictly binary tree if every non-leaf node has non-empty left and right sub-trees.

Figure 2    Figure 3    Figure 4

In the above diagrams, Figure 2 is not a strictly binary tree because node 78 has a left sub-tree but no right sub-tree. Similarly, Figure 3 is not a strictly binary tree because node 15 does not have a right sub-tree. Figure 4 is a strictly binary tree because each non-leaf node has both right and left sub-trees.

Complete Binary Tree
A complete binary tree of depth d is a strictly binary tree all of whose leaves are at level d.

Figure 5    Figure 6    Figure 7

In the above diagrams, Figure 5 is not a complete binary tree because it is not a strictly binary tree. Figure 6 is a strictly binary tree, but it is still not a complete binary tree because not every leaf is at the same level: leaf nodes 78 and 30 in Figure 6 are not at level d, the depth of the tree. Figure 7 is a complete binary tree because it is a strictly binary tree and every leaf node is at level d.

So, "every complete binary tree is a strictly binary tree, but not every strictly binary tree is a complete binary tree".

If d is the depth of a complete binary tree, what is the total number of nodes in it?

Almost Complete Binary Tree
A binary tree of depth d is an almost complete binary tree if:
1.
For any node n in the tree with a right descendant at level d, n must have a left son, and every left descendant of n is either a leaf at level d or has two sons.
2. Any node n at level less than d-1 has two sons.

Explanation of point 1: If any non-leaf node has a right descendant, then it must also have a left descendant. If a right descendant reaches level d (the depth), then each left descendant must either have two sons or be a leaf at level d.

Binary trees    Strictly Binary Tree    Complete Binary Tree    Almost Complete Binary Tree
(diagram)       No                      No                      No
(diagram)       No                      No                      Yes
(diagram)       Yes                     No                      No
(diagram)       Yes                     No                      Yes
(diagram)       Yes                     Yes                     Yes

Basic Operations in Binary Tree

Return type        Function name         Description
Pointer to node    getNode(value)        Creates a node and returns the pointer to that node.
Pointer to node    getLeftSon(p)         Returns a pointer to the left son of node(p).
Pointer to node    getRightSon(p)        Returns a pointer to the right son of node(p).
Pointer to node    getFather(p)          Returns a pointer to the father of node(p).
Pointer to node    getBrother(p)         Returns a pointer to the brother of node(p).
boolean            isRight(p)            Checks whether node(p) is a right son or not.
boolean            isLeft(p)             Checks whether node(p) is a left son or not.
boolean            isLeaf(p)             Checks whether node(p) is a leaf node or not.
void               setLeft(p, value)     Creates a node and attaches it as the left son of node(p).
void               setRight(p, value)    Creates a node and attaches it as the right son of node(p).

Binary Tree Traversal

There are basically three different ways in which we traverse a binary tree.
1. Preorder traversal (also known as depth-first order)
2. Inorder traversal (a.k.a. symmetric order)
3. Postorder traversal (a.k.a. end order)

Binary Tree Traversal >> Preorder Traversal

Preorder traversal of a binary tree consists of the following three recursive operations.
a. Visit the root.
b. Traverse the left sub-tree in preorder.
c. Traverse the right sub-tree in preorder.

    void doPreOrder(nodeptr &tree){
        nodeptr p = tree;
        if(p != NULL){
            printf("%d ", p->key);     /* visit the root */
            doPreOrder(p->left);       /* then the left sub-tree */
            doPreOrder(p->right);      /* then the right sub-tree */
        }
    }

Binary Tree Traversal >> Inorder Traversal

Inorder traversal of a binary tree consists of the following three recursive operations.
a. Traverse the left sub-tree in inorder.
b. Visit the root.
c. Traverse the right sub-tree in inorder.

    void doInOrder(nodeptr &tree){
        nodeptr p = tree;
        if(p != NULL){
            doInOrder(p->left);        /* left sub-tree first */
            printf("%d ", p->key);     /* then the root */
            doInOrder(p->right);       /* then the right sub-tree */
        }
    }

Binary Tree Traversal >> Postorder Traversal

Postorder traversal of a binary tree consists of the following three recursive operations.
a. Traverse the left sub-tree in postorder.
b. Traverse the right sub-tree in postorder.
c. Visit the root.

    void doPostOrder(nodeptr &tree){
        nodeptr p = tree;
        if(p != NULL){
            doPostOrder(p->left);      /* left sub-tree first */
            doPostOrder(p->right);     /* then the right sub-tree */
            printf("%d ", p->key);     /* the root last */
        }
    }

Figure 12: Examples of binary tree traversals

Result of preorder traversal  : 56 15 10 05 30 78 60 70 80
Result of inorder traversal   : 05 10 15 30 56 60 70 78 80
Result of postorder traversal : 05 10 30 15 70 60 80 78 56

Binary Search Tree >> Definition

Definition [ Binary Search Tree ]
A binary search tree is a binary tree that is either empty or in which each node contains a key that satisfies the following conditions:
• All keys (if any) in the left sub-tree of the root are smaller than the key in the root.
• The key in the root is smaller than all the keys (if any) in the right sub-tree.
• The left and right sub-trees of the root are again binary search trees.

Figure 13: A binary search tree.

The major advantage of binary search trees is that the related sorting and search algorithms can be very efficient; for example, an in-order traversal visits the keys in ascending order.
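
The three conditions of the definition can be checked mechanically. The sketch below is one way to do it in C++, assuming distinct int keys; Node, isBSTInRange, isBST and inOrder are illustrative names, not functions from the notes. It passes an exclusive lower and upper bound down the tree, so each key is compared against all of its ancestors rather than only its father.

```cpp
#include <climits>
#include <vector>

// Hypothetical node layout for this sketch only.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// True if every key in the tree rooted at t lies strictly between
// low and high and each sub-tree is itself a binary search tree.
bool isBSTInRange(const Node* t, long long low, long long high) {
    if (t == nullptr) return true;  // an empty tree is a binary search tree
    if (t->key <= low || t->key >= high) return false;
    return isBSTInRange(t->left, low, t->key) &&
           isBSTInRange(t->right, t->key, high);
}

// Checks the whole tree; the initial bounds admit every int key.
bool isBST(const Node* root) {
    return isBSTInRange(root, LLONG_MIN, LLONG_MAX);
}

// In-order traversal that collects the keys instead of printing them.
void inOrder(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;
    inOrder(t->left, out);
    out.push_back(t->key);
    inOrder(t->right, out);
}
```

Collecting the keys of a valid binary search tree with inOrder yields them in ascending order, which is why the in-order result listed for Figure 12 is sorted. Comparing a node only with its own sons would not be enough: a key can be larger than its father and still be too large for an ancestor higher up.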

Binary Search Tree >> Operations on Binary Search Tree >> Inserting a Node

    void insertNode(nodeptr &ptr, int v){
        nodeptr p, q;
        if(ptr == NULL){               /* empty tree: the new node becomes the root */
            ptr = getNode(v);
            return;
        }
        p = q = ptr;
        while(p != NULL){              /* walk down; q trails p as its father */
            q = p;
            if(v < p->key)
                p = p->left;
            else
                p = p->right;
        }
        if(v < q->key)
            setLeft(q, v);
        else
            setRight(q, v);
    }

Binary Search Tree >> Operations on Binary Search Tree >> Searching a Node

    nodeptr searchNode(nodeptr &ptr, int x){
        nodeptr p = ptr;
        while(p != NULL){
            if(x == p->key)
                return p;              /* found */
            else if(x < p->key)
                p = p->left;
            else
                p = p->right;
        }
        return NULL;                   /* not found */
    }

Binary Search Tree >> Operations on Binary Search Tree >> Deleting a Node

    /* Assumes every node stores a pointer to its father and that
       getMaxNode(t) returns the node with the largest key in t. */
    void deleteNode(nodeptr &tree, int value){
        nodeptr p = searchNode(tree, value);
        nodeptr f = NULL;
        nodeptr rightchild = NULL;
        nodeptr leftchild = NULL;
        if(p == NULL){
            printf("no record found.\n");
            return;
        }
        f = p->father;                 /* NULL when p is the root */
        /* deleting a leaf node */
        if(isLeaf(p)){
            if(f == NULL)              /* p was the only node */
                tree = NULL;
            else if(f->left == p)      /* p is a left son */
                f->left = NULL;
            else
                f->right = NULL;
            free(p);
        }
        /* deleting a node that has no left child but has a right child */
        else if(p->left == NULL){
            rightchild = p->right;
            if(f == NULL)              /* the right child becomes the root */
                tree = rightchild;
            else if(f->left == p)      /* p is a left son */
                f->left = rightchild;
            else
                f->right = rightchild;
            rightchild->father = f;
            free(p);
        }
        /* deleting a node that has no right child but has a left child */
        else if(p->right == NULL){
            leftchild = p->left;
            if(f == NULL)              /* the left child becomes the root */
                tree = leftchild;
            else if(f->left == p)      /* p is a left son */
                f->left = leftchild;
            else
                f->right = leftchild;
            leftchild->father = f;
            free(p);
        }
        /* deleting a node that has two children: remove the largest key
           of p's left sub-tree from the tree, then store it in p */
        else{
            nodeptr max = getMaxNode(p->left);
            int data = max->key;
            deleteNode(tree, data);
            p->key = data;
        }
    }

Huffman Coding

Huffman codes are a widely used and very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the data being compressed.
We consider the data to be a sequence of characters. Huffman's greedy algorithm uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string.

Prefix codes

We consider here only codes in which no codeword is also a prefix of some other codeword. Such codes are called prefix codes. It is possible to show (although we won't do so here) that the optimal data compression achievable by a character code can always be achieved with a prefix code, so there is no loss of generality in restricting attention to prefix codes.

Constructing a Huffman Tree

Huffman invented a greedy algorithm that constructs an optimal prefix code called a Huffman code. In the pseudocode that follows, we assume that C is a set of n characters and that each character c ∈ C is an object with a defined frequency f[c]. The algorithm builds the tree T corresponding to the optimal code in a bottom-up manner. It begins with a set of |C| leaves and performs a sequence of |C| - 1 "merging" operations to create the final tree. A min-priority queue Q, keyed on f, is used to identify the two least-frequent objects to merge together. The result of the merger of two objects is a new object whose frequency is the sum of the frequencies of the two objects that were merged.

    HUFFMAN(C)
    1  n ← |C|
    2  Q ← C
    3  for i ← 1 to n - 1
    4      do allocate a new node z
    5         left[z] ← x ← EXTRACT-MIN(Q)
    6         right[z] ← y ← EXTRACT-MIN(Q)
    7         f[z] ← f[x] + f[y]
    8         INSERT(Q, z)
    9  return EXTRACT-MIN(Q)    // Return the root of the tree.

For our example, Huffman's algorithm proceeds as shown in the following figure. Since there are 6 letters in the alphabet, the initial queue size is n = 6 and 5 merge steps are required to build the tree. The final tree represents the optimal prefix code. The codeword for a letter is the sequence of edge labels on the path from the root to the letter.
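
The HUFFMAN(C) pseudocode above maps fairly directly onto a standard-library priority queue. The following C++ sketch is one possible rendering, not the notes' own implementation: HuffNode, HuffmanTree, ByFrequency, buildHuffman and collectCodes are names assumed for illustration, and std::priority_queue with a reversed comparison stands in for the min-priority queue Q.

```cpp
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <vector>

// Hypothetical node type: leaves carry a character, internal nodes only a frequency.
struct HuffNode {
    char ch;          // meaningful only for leaves
    int freq;         // f[z] in the pseudocode
    HuffNode* left;
    HuffNode* right;
};

// Owns every node of the tree so that nothing has to be freed by hand.
struct HuffmanTree {
    std::vector<std::unique_ptr<HuffNode>> pool;
    HuffNode* root = nullptr;
};

// Reversed comparison: turns std::priority_queue into a min-priority queue on freq.
struct ByFrequency {
    bool operator()(const HuffNode* a, const HuffNode* b) const {
        return a->freq > b->freq;
    }
};

HuffmanTree buildHuffman(const std::map<char, int>& freq) {
    HuffmanTree t;
    std::priority_queue<HuffNode*, std::vector<HuffNode*>, ByFrequency> q;
    // Line 2 of the pseudocode: Q starts with one leaf per character.
    for (const auto& entry : freq) {
        t.pool.emplace_back(new HuffNode{entry.first, entry.second, nullptr, nullptr});
        q.push(t.pool.back().get());
    }
    // Lines 3-8: merge the two least-frequent nodes until one node is left.
    while (q.size() > 1) {
        HuffNode* x = q.top(); q.pop();   // first EXTRACT-MIN becomes the left child
        HuffNode* y = q.top(); q.pop();   // second EXTRACT-MIN becomes the right child
        t.pool.emplace_back(new HuffNode{'\0', x->freq + y->freq, x, y});
        q.push(t.pool.back().get());
    }
    // Line 9: the remaining node is the root (nullptr for an empty alphabet).
    t.root = q.empty() ? nullptr : q.top();
    return t;
}

// Walks the tree and records the codeword of each leaf:
// an edge to a left child adds '0', an edge to a right child adds '1'.
void collectCodes(const HuffNode* n, const std::string& prefix,
                  std::map<char, std::string>& codes) {
    if (n == nullptr) return;
    if (n->left == nullptr && n->right == nullptr) {
        codes[n->ch] = prefix;
        return;
    }
    collectCodes(n->left, prefix + "0", codes);
    collectCodes(n->right, prefix + "1", codes);
}
```

Calling buildHuffman with a frequency table and then collectCodes(tree.root, "", codes) fills codes with one bit string per character. When two nodes have equal frequency, the order in which the queue returns them is not specified here, so a different but equally cheap code may result. With a single-character alphabet the only codeword is the empty string, a corner case the sketch does not treat specially.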

Figure: The steps of Huffman's algorithm for the frequencies given in Figure 16.3. Each part shows the contents of the queue sorted into increasing order by frequency. At each step, the two trees with the lowest frequencies are merged. Leaves are shown as rectangles containing a character and its frequency. Internal nodes are shown as circles containing the sum of the frequencies of their children. An edge connecting an internal node with its children is labeled 0 if it is an edge to a left child and 1 if it is an edge to a right child. The codeword for a letter is the sequence of labels on the edges connecting the root to the leaf for that letter. (a) The initial set of n = 6 nodes, one for each letter. (b)-(e) Intermediate stages. (f) The final tree.

Line 2 initializes the min-priority queue Q with the characters in C. The for loop in lines 3-8 repeatedly extracts the two nodes x and y of lowest frequency from the queue and replaces them in the queue with a new node z representing their merger. The frequency of z is computed as the sum of the frequencies of x and y in line 7. The node z has x as its left child and y as its right child. (This order is arbitrary; switching the left and right child of any node yields a different code of the same cost.) After n - 1 mergers, the one node left in the queue (the root of the code tree) is returned in line 9.

Prepared by: Manoj Shakya