PGDCA II Semester
Data Structures and Algorithms
Hierarchical Data Structure - Tree
1. Binary Tree
a. Definition
b. Terms: root, leaf, level, depth, father, son, brother
c. Types
i. Strictly Binary Tree
ii. Complete Binary Tree
iii. Almost Complete Binary Tree
2. Basic Operations in Binary Tree
3. Binary Tree Traversal
a. Pre-order Traversal
b. In-order Traversal
c. Post-order Traversal
4. Binary Search Tree
a. Definition
b. Operations on Binary Search Tree
i. Inserting a node
ii. Searching a node
iii. Deleting a node
5. Huffman Algorithm
Binary Tree >> Definition
The simplest form of tree is a binary tree. A binary tree consists of
a. a node (called the root node) and
b. left and right sub-trees. Both the sub-trees are themselves binary trees.
A binary tree is a hierarchical (tree) data structure in which each node has at most two children.
Typically the child nodes are called left and right. One common use of binary trees is binary
search trees; another is binary heaps.
Definition [ Binary Tree ]
A binary tree is a finite set of elements that is either empty or consists of a node called the
root and two other disjoint subsets. These two subsets are themselves binary trees, called the
left sub-tree and the right sub-tree. Each element of a binary tree is called a node of the tree.
Figure 1: A binary tree
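The recursive definition maps directly onto a recursive data type. The following is a minimal sketch, assuming a node layout with `key`, `left` and `right` fields (the field names follow the code used later in this handout; `countNodes` is a hypothetical helper added only for illustration).

```cpp
#include <cstddef>

// One node of a binary tree: a key plus two (possibly empty) sub-trees.
struct node {
    int   key;
    node *left;   // left sub-tree, NULL if empty
    node *right;  // right sub-tree, NULL if empty
};
typedef node *nodeptr;

// Counts the nodes by following the definition: an empty tree has no
// nodes; otherwise the tree is the root plus its two sub-trees.
int countNodes(nodeptr tree) {
    if (tree == NULL) return 0;
    return 1 + countNodes(tree->left) + countNodes(tree->right);
}
```

Because both sub-trees are themselves binary trees, the function needs no special cases beyond the empty tree.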
Binary Tree >> Terms
Root: The node at the "top" of a tree, the one from which all operations on the tree commence.
The root node may not exist at all (a NULL tree with no nodes in it). It may have 0, 1 or 2 sons
in a binary tree. In figure 1, node 56 is the root.
Father, son: Nodes 15 and 78 are sons of node 56. Node 15 is the father of the two nodes 10 and
45. Node 80 is the right son of node 78, whereas node 10 is the left son of node 15.
Leaf: A node that does not have any son is called a leaf.
Brother: Two nodes are brothers if they are the left and right sons of the same father.
Level: The root of the tree has level 0, and the level of any other node in the tree is one more
than the level of its father. For example, the level of node 45 is 2 and that of node 15 is 1.
Depth: The depth of a binary tree is the maximum level of any leaf in the tree. This equals the
length of the longest path from the root to any leaf. Thus the depth of the tree in figure 1 is 2.
Prepared by: Manoj Shakya
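The definitions of level and depth can be sketched in code as follows. This assumes each node stores a `father` pointer (NULL for the root), and it assumes the convention that an empty tree has depth -1; neither detail is fixed by the text above.

```cpp
#include <algorithm>
#include <cstddef>

struct node {
    int   key;
    node *left;
    node *right;
    node *father;  // NULL for the root (assumed layout)
};

// Level of a node: the root has level 0, every other node is one
// level below its father.
int level(const node *p) {
    int lv = 0;
    while (p->father != NULL) {
        p = p->father;
        ++lv;
    }
    return lv;
}

// Depth of a tree: the maximum level of any leaf, i.e. the length of
// the longest path from the root to a leaf. Empty tree: -1 (assumed).
int depth(const node *tree) {
    if (tree == NULL) return -1;
    return 1 + std::max(depth(tree->left), depth(tree->right));
}
```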
Binary Tree >> Types
Strictly Binary Tree
A tree is said to be a strictly binary tree if every non-leaf node has non-empty left and right
sub-trees.
Figure 2
Figure 3
Figure 4
In the above diagrams, figure 2 is not a strictly binary tree because node 78 has a left sub-tree
but no right sub-tree. Similarly, figure 3 is not a strictly binary tree because node 15 does not
have a right sub-tree. Figure 4 is a strictly binary tree because each non-leaf node has both
left and right sub-trees.
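The rule can be checked recursively. The sketch below is illustrative only: `isStrictlyBinary` is a hypothetical name, and treating an empty tree as strictly binary is an assumption.

```cpp
#include <cstddef>

struct node {
    int   key;
    node *left;
    node *right;
};

// True if every non-leaf node has both a left and a right sub-tree.
// An empty tree is treated as strictly binary (assumption).
bool isStrictlyBinary(const node *tree) {
    if (tree == NULL) return true;
    bool hasLeft  = tree->left  != NULL;
    bool hasRight = tree->right != NULL;
    if (hasLeft != hasRight) return false;  // exactly one son: rule broken
    return isStrictlyBinary(tree->left) && isStrictlyBinary(tree->right);
}
```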
Complete Binary Tree
A complete binary tree of depth d is a strictly binary tree all of whose leaves are at level d.
Figure 5
Figure 6
Figure 7
In the above diagrams, figure 5 is not a complete binary tree because it is not a strictly binary
tree. Figure 6 is a strictly binary tree, but it is still not a complete binary tree because not
every leaf is at the same level: leaf nodes 78 and 30 in figure 6 are not at level d, the depth
of the tree. Figure 7 is a complete binary tree because it is a strictly binary tree and every
leaf node is at level d. So, "every complete binary tree is a strictly binary tree, but not every
strictly binary tree is a complete binary tree". If d is the depth of a complete binary tree,
what is the total number of nodes there?
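One way to answer the question: in a complete binary tree, level k holds 2^k nodes, so a tree of depth d has 2^0 + 2^1 + ... + 2^d = 2^(d+1) - 1 nodes in total. The short sketch below sums the levels explicitly; `completeTreeNodes` is a hypothetical name used only for illustration.

```cpp
// Number of nodes in a complete binary tree of depth d, obtained by
// adding up the 2^k nodes on each level k = 0 .. d.
long completeTreeNodes(int d) {
    long total = 0;
    long onLevel = 1;            // level 0 holds just the root
    for (int k = 0; k <= d; ++k) {
        total += onLevel;
        onLevel *= 2;            // each level has twice as many nodes
    }
    return total;
}
```

For example, depth 2 gives 1 + 2 + 4 = 7 nodes, which agrees with 2^(2+1) - 1.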
Almost Complete Binary Tree
A binary tree of depth d is an almost complete binary tree if:
1. For any node n in the tree with a right descendant at level d, n must have a left son and
every left descendant of n is either a leaf at level d or has two sons.
2. Any node n at level less than d-1 has two sons.
Explanation of point 1:
If a non-leaf node has a right descendant, then it must also have a left descendant.
If the right descendant reaches level d (the depth), then each left descendant must either have
two sons or be a leaf at level d.
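A common way to test this property is a level-order scan. The sketch below assumes that the two conditions above are equivalent to saying "every level except possibly the last is full, and the last level is filled from the left"; under that assumption, a node may never appear after an empty position when the tree is read level by level. `isAlmostComplete` is a hypothetical name.

```cpp
#include <cstddef>
#include <queue>

struct node {
    int   key;
    node *left;
    node *right;
};

// Level-order scan that also enqueues empty positions. Once an empty
// position has been seen, any later node violates the property.
bool isAlmostComplete(const node *tree) {
    std::queue<const node *> q;
    q.push(tree);
    bool seenGap = false;
    while (!q.empty()) {
        const node *p = q.front();
        q.pop();
        if (p == NULL) {
            seenGap = true;
            continue;
        }
        if (seenGap) return false;  // a node found after an empty position
        q.push(p->left);
        q.push(p->right);
    }
    return true;
}
```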
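Binary tree                Strictly Binary Tree   Complete Binary Tree   Almost Complete Binary Tree
(figure not reproduced)    No                     No                     No
(figure not reproduced)    No                     No                     Yes
(figure not reproduced)    Yes                    No                     No
(figure not reproduced)    Yes                    No                     Yes
(figure not reproduced)    Yes                    Yes                    Yes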
Basic Operations in Binary Tree
Return type        Function name        Description
Pointer to node    getNode(value)       Creates a node and returns the pointer to that node.
Pointer to node    getLeftSon(p)        Returns a pointer to the left son of node(p).
Pointer to node    getRightSon(p)       Returns a pointer to the right son of node(p).
Pointer to node    getFather(p)         Returns a pointer to the father of node(p).
Pointer to node    getBrother(p)        Returns a pointer to the brother of node(p).
boolean            isRight(p)           Checks whether node(p) is a right son or not.
boolean            isLeft(p)            Checks whether node(p) is a left son or not.
boolean            isLeaf(p)            Checks whether node(p) is a leaf node or not.
void               setLeft(p, value)    Creates a node and attaches it as the left son of node(p).
void               setRight(p, value)   Creates a node and attaches it as the right son of node(p).
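The operations listed above could be implemented along the following lines. This is a sketch under stated assumptions, not the handout's own implementation: the node layout (with a `father` pointer) is assumed, the root is taken to be neither a left nor a right son and to have no brother, and `setLeft`/`setRight` simply overwrite any existing son.

```cpp
#include <cstddef>

struct node {
    int   key;
    node *left;
    node *right;
    node *father;  // NULL for the root (assumed layout)
};
typedef node *nodeptr;

// Creates a node with no sons and no father.
nodeptr getNode(int value) {
    nodeptr p = new node;
    p->key = value;
    p->left = p->right = p->father = NULL;
    return p;
}

nodeptr getLeftSon(nodeptr p)  { return p->left; }
nodeptr getRightSon(nodeptr p) { return p->right; }
nodeptr getFather(nodeptr p)   { return p->father; }

// The root has no father, so it is neither a left nor a right son.
bool isLeft(nodeptr p)  { return p->father != NULL && p->father->left == p; }
bool isRight(nodeptr p) { return p->father != NULL && p->father->right == p; }
bool isLeaf(nodeptr p)  { return p->left == NULL && p->right == NULL; }

// The brother is the other son of the same father (NULL if none).
nodeptr getBrother(nodeptr p) {
    if (p->father == NULL) return NULL;
    return isLeft(p) ? p->father->right : p->father->left;
}

// Creates a node and attaches it as the left son of p.
void setLeft(nodeptr p, int value) {
    nodeptr q = getNode(value);
    q->father = p;
    p->left = q;
}

// Creates a node and attaches it as the right son of p.
void setRight(nodeptr p, int value) {
    nodeptr q = getNode(value);
    q->father = p;
    p->right = q;
}
```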
Binary Tree Traversal
There are three basic ways to traverse a binary tree.
1. Preorder traversal (also known as depth-first order)
2. Inorder traversal (a.k.a. symmetric order)
3. Postorder traversal (a.k.a. end order)
Binary Tree Traversal >> Preorder Traversal
Preorder traversal of a binary tree consists of the following three recursive operations.
a. Visit the root.
b. Traverse the left sub-tree in preorder.
c. Traverse the right sub-tree in preorder.
void doPreOrder(nodeptr &tree){
    nodeptr p = tree;
    if(p != NULL){
        printf("%d ", p->key);    /* visit the root */
        doPreOrder(p->left);      /* then the left sub-tree */
        doPreOrder(p->right);     /* then the right sub-tree */
    }
}
Binary Tree Traversal >> Inorder Traversal
Inorder traversal of a binary tree consists of the following three recursive operations.
a. Traverse the left sub-tree in inorder.
b. Visit the root.
c. Traverse the right sub-tree in inorder.
void doInOrder(nodeptr &tree){
    nodeptr p = tree;
    if(p != NULL){
        doInOrder(p->left);       /* left sub-tree first */
        printf("%d ", p->key);    /* then the root */
        doInOrder(p->right);      /* then the right sub-tree */
    }
}
Binary Tree Traversal >> Postorder Traversal
Postorder traversal of a binary tree consists of the following three recursive operations.
a. Traverse the left sub-tree in postorder.
b. Traverse the right sub-tree in postorder.
c. Visit the root.
void doPostOrder(nodeptr &tree){
    nodeptr p = tree;
    if(p != NULL){
        doPostOrder(p->left);     /* left sub-tree first */
        doPostOrder(p->right);    /* then the right sub-tree */
        printf("%d ", p->key);    /* the root comes last */
    }
}
Result of preorder traversal  : 56 15 10 05 30 78 60 70 80
Result of inorder traversal   : 05 10 15 30 56 60 70 78 80
Result of postorder traversal : 05 10 30 15 70 60 80 78 56
Figure 12: Examples of binary tree traversals
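The three traversals can be tried out on a small tree. The sketch below collects the visited keys into a vector instead of printing them. The tree shape built by `buildExample` is an assumption: it is the shape implied by the preorder and inorder results listed for Figure 12 (root 56), since the figure itself is not reproduced here. The helper names `mk` and `buildExample` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct node {
    int   key;
    node *left;
    node *right;
};

// Hypothetical helper: builds a node from a key and two sub-trees.
node *mk(int key, node *left = NULL, node *right = NULL) {
    node *p = new node;
    p->key = key;
    p->left = left;
    p->right = right;
    return p;
}

void preOrder(const node *t, std::vector<int> &out) {
    if (t == NULL) return;
    out.push_back(t->key);       // root, left, right
    preOrder(t->left, out);
    preOrder(t->right, out);
}

void inOrder(const node *t, std::vector<int> &out) {
    if (t == NULL) return;
    inOrder(t->left, out);       // left, root, right
    out.push_back(t->key);
    inOrder(t->right, out);
}

void postOrder(const node *t, std::vector<int> &out) {
    if (t == NULL) return;
    postOrder(t->left, out);     // left, right, root
    postOrder(t->right, out);
    out.push_back(t->key);
}

// Tree assumed from the traversal results of Figure 12.
node *buildExample() {
    return mk(56,
              mk(15, mk(10, mk(5)), mk(30)),
              mk(78, mk(60, NULL, mk(70)), mk(80)));
}
```

Calling each function on `buildExample()` yields the three key sequences listed for Figure 12.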
Binary Search Tree >> Definition
Definition [ Binary Search Tree ]
A binary search tree is a binary tree that is either empty or in which each node contains a key
that satisfies the following conditions:
• All keys (if any) in the left sub-tree of the root are smaller than the key in the root.
• The key in the root is smaller than all the keys (if any) in the right sub-tree.
• The left and right sub-trees of the root are again binary search trees.
Figure 13: A binary search tree.
The major advantage of binary search trees is that the related searching and sorting algorithms
can be very efficient; for example, an in-order traversal visits the keys in ascending order.
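The three conditions of the definition can be verified by passing down the range of keys each sub-tree is allowed to contain. This is a sketch with hypothetical names (`isBSTInRange`, `isBST`); it assumes keys are distinct `int` values and uses `long long` bounds so the limits lie outside the `int` range.

```cpp
#include <climits>
#include <cstddef>

struct node {
    int   key;
    node *left;
    node *right;
};

// Every key in the sub-tree must lie strictly between lo and hi.
bool isBSTInRange(const node *t, long long lo, long long hi) {
    if (t == NULL) return true;  // an empty tree is a binary search tree
    if (t->key <= lo || t->key >= hi) return false;
    return isBSTInRange(t->left, lo, t->key) &&    // left keys < root key
           isBSTInRange(t->right, t->key, hi);     // right keys > root key
}

bool isBST(const node *tree) {
    return isBSTInRange(tree, (long long)INT_MIN - 1, (long long)INT_MAX + 1);
}
```

Checking only each node against its two sons would not be enough, because a key deep in the left sub-tree could still exceed the root.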
Binary Search Tree >> Operations on Binary Search Tree >> Inserting a Node
void insertNode(nodeptr &ptr, int v){
    nodeptr p, q;
    /* empty tree: the new node becomes the root */
    if(ptr == NULL){
        ptr = getNode(v);
        return;
    }
    p = q = ptr;
    /* walk down until we fall off the tree; q trails one step behind p */
    while(p != NULL){
        q = p;
        if(v < p->key)
            p = p->left;
        else
            p = p->right;
    }
    /* q is now the father of the new node */
    if(v < q->key)
        setLeft(q, v);
    else
        setRight(q, v);
}
Binary Search Tree >> Operations on Binary Search Tree >> Searching a Node
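nodeptr searchNode(nodeptr &ptr, int x){
    nodeptr p = ptr;              /* start at the root */
    while(p != NULL){
        if(x == p->key)
            return p;             /* found */
        else if(x < p->key)
            p = p->left;          /* smaller keys are on the left */
        else
            p = p->right;         /* larger keys are on the right */
    }
    return NULL;                  /* not found */
}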
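Insertion and searching can be exercised together in a self-contained form. The sketch below is illustrative and uses hypothetical names (`insertKey`, `findKey`, `inOrder`); it works on a plain node without a father pointer and, like the insertion routine above, sends keys that are not smaller than the current node to the right.

```cpp
#include <cstddef>
#include <vector>

struct node {
    int   key;
    node *left;
    node *right;
};

// Iterative insertion: walk down until an empty position is found.
void insertKey(node *&root, int v) {
    node *fresh = new node;
    fresh->key = v;
    fresh->left = fresh->right = NULL;
    if (root == NULL) {          // empty tree: new node becomes the root
        root = fresh;
        return;
    }
    node *p = root;
    for (;;) {
        node *&next = (v < p->key) ? p->left : p->right;
        if (next == NULL) {      // free slot: attach the new node here
            next = fresh;
            return;
        }
        p = next;
    }
}

// Returns the node holding x, or NULL if x is not in the tree.
const node *findKey(const node *root, int x) {
    while (root != NULL && root->key != x)
        root = (x < root->key) ? root->left : root->right;
    return root;
}

// In-order traversal; on a binary search tree it lists keys in ascending order.
void inOrder(const node *t, std::vector<int> &out) {
    if (t == NULL) return;
    inOrder(t->left, out);
    out.push_back(t->key);
    inOrder(t->right, out);
}
```

Each step of either routine moves one level down, so the work is proportional to the depth of the tree.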
Binary Search Tree >> Operations on Binary Search Tree >> Deleting a Node
void deleteNode(nodeptr &tree, int value){
    nodeptr p = searchNode(tree, value);
    nodeptr f = NULL;
    nodeptr rightchild = NULL;
    nodeptr leftchild = NULL;
    if(p == NULL){
        printf("no record found.\n");
        return;
    }
    f = p->father;                /* NULL when p is the root */
    /* deleting a leaf node */
    if(isLeaf(p)){
        if(f == NULL)             /* p is the root and the only node */
            tree = NULL;
        else if(f->left == p)     /* p is the left son of f */
            f->left = NULL;
        else
            f->right = NULL;
        free(p);
    }
    /* deleting a node that has no left child but has a right child */
    else if(p->left == NULL){
        rightchild = p->right;
        if(f == NULL)             /* p is the root */
            tree = rightchild;
        else if(f->left == p)     /* p is the left son of f */
            f->left = rightchild;
        else
            f->right = rightchild;
        rightchild->father = f;
        free(p);
    }
    /* deleting a node that has no right child but has a left child */
    else if(p->right == NULL){
        leftchild = p->left;
        if(f == NULL)             /* p is the root */
            tree = leftchild;
        else if(f->left == p)     /* p is the left son of f */
            f->left = leftchild;
        else
            f->right = leftchild;
        leftchild->father = f;
        free(p);
    }
    /* deleting a node that has two children: replace its key with the
       largest key of its left sub-tree, then delete that node instead
       (keys are assumed to be distinct) */
    else{
        nodeptr max = getMaxNode(p->left);
        int data = max->key;
        deleteNode(tree, data);
        p->key = data;
    }
}
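The same three cases (leaf, one child, two children) can also be written recursively without father pointers. The sketch below is a different formulation from the routine above, not a copy of it: `removeKey` is a hypothetical name, it returns the new root of the sub-tree it was given, and it assumes distinct keys and nodes allocated with `new`.

```cpp
#include <cstddef>

struct node {
    int   key;
    node *left;
    node *right;
};

// Removes value from the sub-tree rooted at t and returns the new
// root of that sub-tree. The tree is left unchanged if value is absent.
node *removeKey(node *t, int value) {
    if (t == NULL) return NULL;                 // key not found
    if (value < t->key) {
        t->left = removeKey(t->left, value);
        return t;
    }
    if (value > t->key) {
        t->right = removeKey(t->right, value);
        return t;
    }
    // t holds the key to delete.
    if (t->left == NULL || t->right == NULL) {  // leaf or one child
        node *child = (t->left != NULL) ? t->left : t->right;
        delete t;
        return child;                           // child (or NULL) takes t's place
    }
    // Two children: copy the largest key of the left sub-tree into t,
    // then remove that key from the left sub-tree.
    node *max = t->left;
    while (max->right != NULL) max = max->right;
    t->key = max->key;
    t->left = removeKey(t->left, max->key);
    return t;
}
```

A caller would write `root = removeKey(root, value);` so that deleting the root itself is handled the same way as any other node.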
Huffman Coding
Huffman codes are a widely used and very effective technique for compressing data; savings of
20% to 90% are typical, depending on the characteristics of the data being compressed. We
consider the data to be a sequence of characters. Huffman's greedy algorithm uses a table of the
frequencies of occurrence of the characters to build up an optimal way of representing each
character as a binary string.
Prefix codes
We consider here only codes in which no codeword is also a prefix of some other codeword.
Such codes are called prefix codes. It is possible to show (although we won't do so here) that
the optimal data compression achievable by a character code can always be achieved with a
prefix code, so there is no loss of generality in restricting attention to prefix codes.
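Whether a given set of codewords is a prefix code can be checked by comparing every pair. This is a small illustrative sketch; `isPrefixCode` is a hypothetical name, and codewords are assumed to be strings of the characters '0' and '1'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if no codeword in the set is a prefix of another codeword.
bool isPrefixCode(const std::vector<std::string> &codes) {
    for (std::size_t i = 0; i < codes.size(); ++i) {
        for (std::size_t j = 0; j < codes.size(); ++j) {
            // Does codes[j] start with codes[i]?
            if (i != j && codes[j].compare(0, codes[i].size(), codes[i]) == 0)
                return false;
        }
    }
    return true;
}
```

The prefix property is what lets a decoder read an encoded bit string from left to right without separators: as soon as the bits read so far match a codeword, no longer codeword can also match.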
Constructing a Huffman Tree
Huffman invented a greedy algorithm that constructs an optimal prefix code called a Huffman
code. In the pseudocode that follows, we assume that C is a set of n characters and that each
character c ∈ C is an object with a defined frequency f[c]. The algorithm builds the tree T
corresponding to the optimal code in a bottom-up manner. It begins with a set of |C| leaves and
performs a sequence of |C| - 1 "merging" operations to create the final tree. A min-priority queue
Q, keyed on f , is used to identify the two least-frequent objects to merge together. The result of
the merger of two objects is a new object whose frequency is the sum of the frequencies of the
two objects that were merged.
HUFFMAN(C)
1 n ← |C|
2 Q ← C
3 for i ← 1 to n - 1
4     do allocate a new node z
5        left[z] ← x ← EXTRACT-MIN(Q)
6        right[z] ← y ← EXTRACT-MIN(Q)
7        f[z] ← f[x] + f[y]
8        INSERT(Q, z)
9 return EXTRACT-MIN(Q)    // Return the root of the tree.
For our example, Huffman's algorithm proceeds as shown in the following figure. Since there are
6 letters in the alphabet, the initial queue size is n = 6 and 5 merge steps are required to build the
tree. The final tree represents the optimal prefix code. The codeword for a letter is the sequence
of edge labels on the path from the root to the letter.
The steps of Huffman's algorithm for the frequencies given in Figure 16.3. Each part
shows the contents of the queue sorted into increasing order by frequency. At each step,
the two trees with lowest frequencies are merged. Leaves are shown as rectangles
containing a character and its frequency. Internal nodes are shown as circles containing
the sum of the frequencies of its children. An edge connecting an internal node with its
children is labeled 0 if it is an edge to a left child and 1 if it is an edge to a right child. The
codeword for a letter is the sequence of labels on the edges connecting the root to the leaf
for that letter. (a) The initial set of n = 6 nodes, one for each letter. (b)-(e) Intermediate
stages. (f) The final tree
Line 2 initializes the ascending/min-priority queue Q with the characters in C. The for loop in lines
3-8 repeatedly extracts the two nodes x and y of lowest frequency from the queue, and replaces
them in the queue with a new node z representing their merger. The frequency of z is computed
as the sum of the frequencies of x and y in line 7. The node z has x as its left child and y as its
right child. (This order is arbitrary; switching the left and right child of any node yields a different
code of the same cost.) After n - 1 mergers, the one node left in the queue, the root of the code
tree, is returned in line 9.
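The pseudocode can be turned into a short C++ sketch using `std::priority_queue` as the min-priority queue. The names `hnode`, `ByFreq`, `makeNode`, `huffman` and `collectCodes` are hypothetical, and the edge labelling (0 for left, 1 for right) follows the figure caption above. This is one possible rendering, not the handout's own implementation.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

struct hnode {
    char   symbol;  // meaningful only for leaves
    long   freq;    // f[z]
    hnode *left;
    hnode *right;
};

// Comparator that puts the node with the lowest frequency on top.
struct ByFreq {
    bool operator()(const hnode *a, const hnode *b) const {
        return a->freq > b->freq;
    }
};

hnode *makeNode(char symbol, long freq, hnode *left, hnode *right) {
    hnode *z = new hnode;
    z->symbol = symbol;
    z->freq = freq;
    z->left = left;
    z->right = right;
    return z;
}

// Mirrors HUFFMAN(C): start with one leaf per character, then merge
// the two least-frequent trees until a single tree remains.
hnode *huffman(const std::map<char, long> &freqs) {
    std::priority_queue<hnode *, std::vector<hnode *>, ByFreq> q;
    for (std::map<char, long>::const_iterator it = freqs.begin(); it != freqs.end(); ++it)
        q.push(makeNode(it->first, it->second, NULL, NULL));
    if (q.empty()) return NULL;
    while (q.size() > 1) {
        hnode *x = q.top(); q.pop();   // least frequent: left child
        hnode *y = q.top(); q.pop();   // next least frequent: right child
        q.push(makeNode('\0', x->freq + y->freq, x, y));
    }
    return q.top();                    // root of the code tree
}

// Reads the codewords off the tree: 0 for a left edge, 1 for a right edge.
void collectCodes(const hnode *t, const std::string &prefix,
                  std::map<char, std::string> &codes) {
    if (t == NULL) return;
    if (t->left == NULL && t->right == NULL) {
        codes[t->symbol] = prefix;
        return;
    }
    collectCodes(t->left, prefix + "0", codes);
    collectCodes(t->right, prefix + "1", codes);
}
```

As a usage example with assumed frequencies (the figure's own values are not reproduced in this handout), the six letters a:45, b:13, c:12, d:16, e:9, f:5 give the codewords a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100. When two frequencies tie, the order in which the queue returns them is not specified, so other inputs may yield a different but equally short code.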