* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Tree
Survey
Document related concepts
Transcript
CS121 Data Structures Trees Each entry in a List (Stack, Queue) has at most one predecessor and one successor. In a Tree each entry has at most one predecessor, but can have more than one successor. CS121 © JAS 2005 2-1 CS121 Data Structures In general trees can have any number of successors, but usually we restrict ourselves to a maximum of two successors - BINARY TREES. A tree is either • Empty, or • Consists of a value and two sub-trees (left and right) CS121 © JAS 2005 2-2 CS121 Data Structures For each non-empty tree there is a unique node with no predecessor – the root node Nodes which have no successors are termed leaf nodes All other nodes have exactly one predecessor and either one or two successors – internal nodes CS121 © JAS 2005 2-3 CS121 Data Structures Various subsets of trees can be identified – • Binary Trees (maximum of two successors) • Strictly Binary Trees (nodes have either 0 or 2 successors) • Full Binary Trees (all leaves at same depth) • Complete Binary Trees (all levels except last one are full, and last level is filled from left) CS121 © JAS 2005 2-4 CS121 Data Structures Various characteristics of trees can be measured • Number of nodes - Nodes • Number of leaves - Leaves • Depth (longest path from root to leaf) – Depth Nodes(EmptyTree) = 0 Nodes(Tree) = 1 + Nodes(Tree.Left) + Nodes(Tree.Right) CS121 © JAS 2005 2-5 CS121 Data Structures Leaves(EmptyTree) = 0 Leaves(RootOnly) = 1 Leaves(Tree) = Leaves(Tree.Left) + Leaves(Tree.Right) Depth(Tree) = 1 + maximum(Depth(Tree.Left), Depth(Tree.Right)) Depth(EmptyTree) = – 1 CS121 © JAS 2005 2-6 CS121 Data Structures Example Application – Representing an Arithmetic Expression Nodes represent operators Leaves represent operands ( b ^ 2 – 4 * a * c ) / ( 2 * a ) / – * ^ b * 2 * 4 2 a c a CS121 © JAS 2005 2-7 CS121 Data Structures Traversing a tree (visit all nodes in a tree) • Left, Root, Right • Left, Right, Root • Root, Left, Right - Inorder or Symmetric - Postorder - Preorder or Depth-First • Level by Level - Breadth-First Each of these traversals can be reversed CS121 © JAS 2005 2-8 CS121 Data Structures Defining a tree by operations/methods Create a tree either an empty a tree or, a tree consisting solely of a root IsEmpty – determine if a tree is empty CS121 © JAS 2005 2-9 CS121 Data Structures Accessing/Moving round a tree GetData – view data in a node GetLeft – move to/access left sub-tree GetRight – move to/access right sub-tree Adding to a tree can be application dependent simplest methods add a sub-tree to an empty left (or right) subtree Adding elsewhere involves reshaping the tree CS121 © JAS 2005 2-10 CS121 Data Structures Defining a Tree Class public class BTNode { private Object data; private BTNode left, right; public BTNode(Object initialData, BTNode initialLeft, BTNode initialRight) { data = initialData; left = initialLeft; right = initialRight; } CS121 © JAS 2005 2-11 CS121 Data Structures public Object getData() { return data; } public BTNode getLeft() { return left; } public BTNode getRight() { return right; } CS121 © JAS 2005 2-12 CS121 Data Structures public Object getLeftmostData() { if (left == null) return data; else return left.getLeftmostData(); } public Object getRightmostData() { ……… } CS121 © JAS 2005 2-13 CS121 Data Structures public void inorderPrint() { if (left != null) left.inorderPrint(); System.out.println(data); if (right != null) right.inorderPrint(); } CS121 © JAS 2005 2-14 CS121 Data Structures public BTNode removeLeftmost() { if (left == null) return right; //leftmost node is root else { left = left.removeLeftmost(); return this; } } CS121 © JAS 2005 2-15 CS121 Data Structures public void setData(Object newData) { data = newData; } public void setleft(BTNode newLeft) { left = newLeft; } CS121 © JAS 2005 2-16 CS121 Data Structures public static BTNode treeCopy(BTNode src) { BTNode leftCopy, RightCopy; if (src == null) return null; else { leftCopy = treeCopy(src,left); rightCopy = treeCopy(src.right); return new BTNode(src.data,leftCopy,rightCopy); } } CS121 © JAS 2005 2-17 CS121 Data Structures Binary Search Trees In a binary search tree, a node N with a key K is inserted so that • the keys in the left subtree of N are less than K, and • the keys in the right subtree of N are greater than K. To summarise, for each node N in a binary search tree: Keys in left subtree of N < Key K in node N < Keys in right subtree of N. CS121 © JAS 2005 2-18 CS121 Data Structures public BTNode addEntry(Object newData) { if (this == null) return (new BTNode(newData,null,null)); else { if newData.lessthan(data) left = left.addEntry(newData); else right = right.addEntry(newData); } return this; } CS121 © JAS 2005 2-19 CS121 Data Structures public boolean findEntry(Object wanted) { if this == null return false; else if wanted.lessthan(data) return left.findEntry(wanted); else if wanted.greaterthan(data) return right.findEntry(wanted); else //assume equal to data return true; } CS121 © JAS 2005 2-20 CS121 Data Structures Deleting From a Binary Search Tree • If node to be deleted is a leaf – no problem, just delete • If node has only one subtree – replace node with child • If node has two subtrees – Determine larger sub-tree (left/right) – Select least/greatest node of larger sub-tree – Move that node up – This will always be a leaf node or a node with only one sub-tree so process ends CS121 © JAS 2005 2-21 CS121 Data Structures h d e f b a l c e j g i CS121 © JAS 2005 n k m o 2-22 CS121 Data Structures Binary Tree Application - Huffman Codes Aim is to encode more frequent letters with shorter bit sequences Algorithm • Count frequencies of letters • Arrange into ascending order (linked list or binary search tree?) • Using two least frequent letters construct binary tree such that – Left leaf is least frequent letter; label left branch 1 – Right leaf is next least frequent letter; label right branch 0 – Root is given value = sum of leafs • Replace first two entries in frequency list with entry representing sum (ensure list remains sorted) • Repeat 2-23 CS121 © JAS 2005 CS121 Data Structures Encoding a string Replace each letter with bit code derived from path from root of tree to leaf containing that letter Decoding a string Read binary code such that 1 – descend left 0 – descend right at leaf – output letter CS121 © JAS 2005 2-24 CS121 Data Structures This is a simple message 0 1 1 1 m 0 1 g 0 1 h l 0 1 0 s mgh 4 lpat 5 ei 6 smgh 9 1 1 e 0 0 1 p a 0 t CS121 © JAS 2005 0 i a e g h i l m p s t gh lp at 2 3 1 1 3 1 2 1 5 1 2 2 3 2-25 CS121 Data Structures Binary Tree Application - Priority Queues A heap is a complete binary tree such that no node has a value bigger than its parent. This can provide a representation of a priority queue. The highest priority entry will always be at the root. Removal of the root does, however, require restructuring the tree. To maintain the complete tree property – move the last item in the tree (rightmost leaf on bottom level) to the root. To re-establish the priority property swap new root with highest priority child and repeat until it is moved to the correct position CS121 © JAS 2005 2-26 CS121 Data Structures Implementation of Heaps Obviously Heaps can be implemented using references, but as (almost) all the operations are based on swapping values and the trees are complete it is also feasible to use arrays to implement a heap tree. CS121 © JAS 2005 2-27 CS121 Data Structures Trees implemented using an array a b c d e f g h i j k l m n o a b d h c e i j f k l CS121 © JAS 2005 g m n o 2-28 CS121 Data Structures If a Tree is represented by array ATree Tree.root == ATree[0] If Tree == ATree[i] then Tree.left == ATree[2i+1] and Tree.right == ATree[2i+2] CS121 © JAS 2005 2-29 CS121 Data Structures Analysis of Binary Tree For a complete binary tree every comparison halves the size of the tree to be searched. Searching is thus said to be O(logn) n is the number of items to be searched log is the logarithm to base 2 If we inserted sorted data into a binary search tree we would end up with a linear list For a linear list the search time is O(n) CS121 © JAS 2005 2-30 CS121 Data Structures Balanced binary search trees have the best search times – O(logn) In the worst case, unbalanced trees can have O(n) search times AVL trees – as each new key is inserted, the tree remains balanced Guarantees O(logn) search times To keep tree balanced insertion worst case is O(n) CS121 © JAS 2005 2-31 CS121 Data Structures AVL (Adelson-Velskii and Landis) trees are almost balanced binary trees They have both O(logn) insertion (for one key) and search times Height of binary tree = length of longest path from the root to some leaf (The empty binary tree is considered to have a height of -1) The AVL property for a node, N, in a binary tree is that the heights of the left and right subtrees of node N are either equal or if they differ by 1 An AVL tree is a binary tree in which each of its nodes has the AVL property. CS121 © JAS 2005 2-32 CS121 Data Structures If inserting into an AVL tree causes the AVL property to be lost at a node, we can apply some shape-changing tree transformations, rotations, to restore the AVL property There are four different rotations Single left Single right Double left = single right then single left Double right = single left then single right CS121 © JAS 2005 2-33 CS121 Data Structures private AVLNode RotateRight( AVLNode tree) { AVLNode newTree = tree.Left; tree.Left = newTree.Right; newTree.Right = tree; return newTree; } CS121 © JAS 2005 2-34 CS121 Data Structures newTree tree ORY 210 0 210 JFK 10 0 10 BRU CS121 © JAS 2005 Right Rotation 2-35 CS121 Data Structures JFK BRU ARN ORY DUS ZRH MEX ORD GLA Right then Left Double Left GCM NRT CS121 © JAS 2005 2-36 CS121 Data Structures AVL Tree Summary Difference in height between the left and right subtrees of every node is 0 or 1 If this property is lost by inserting a new key, rotations can be performed on the subtree to regain the property An extra field will be required in the node record to retain the height of the subtrees It can be shown that in the worst case insertions, searches and deletions take at most O(logn) time Remember binary trees have a worst case of O(n) when a chain occurs. In fact it can be proven using fibonacci trees that AVL trees are at worst 44% less efficient than complete trees. Proof not required for course, but can be found in text books. CS121 © JAS 2005 2-37 CS121 Data Structures Two-Three Trees (2-3 Trees) Another option is to arrange for all subtrees to be perfectly balanced with respect to their heights, but to permit the number of search keys stored in nodes to vary Each node in a 2-3 tree is permitted to contain either one or two search keys, and to have either two or three descendants All leaves of the tree are empty trees that lie on exactly one bottom level. CS121 © JAS 2005 2-38 CS121 Data Structures H<L L H J<L<N J N D A E F I CS121 © JAS 2005 K L O P 2-39 CS121 Data Structures B<H B H B<D D AA B E F J I CS121 © JAS 2005 K N L O P 2-40 CS121 Data Structures H<M M H H L J<M<N D J J N N L^ A E F I CS121 © JAS 2005 KK LM O P 2-41 CS121 Data Structures The worst case 2-3 tree is one in which every node has 1 key and two children – a complete binary tree For a 2-3 tree with k levels and n elements we have (k+1) n=2 -1 Hence k+1=log(n+1) As splits are passed up the tree to parent nodes, there can only ever be O(logn) splits during the insertion Thus we can determine that search and deletions take O(logn) in the worst cases. CS121 © JAS 2005 2-42 CS121 Data Structures B-Trees Increase the number of children in a non-root node of a 2-3 tree to be, say, from 100 to 200, then we have a B-tree of order 200 Such a tree can store 8 million records in just 3 levels 2003 = 8 million or log2008000000=3 Therefore when storing records on disc we would only need 3 disc accesses in order to find any record in our tree of 8 million records A binary tree would require around 23 disc access A B-tree of order 200 would require 3 or in the worst case 5 accesses CS121 © JAS 2005 2-43 CS121 Data Structures Binary trees may be fast when stored in fast primary memory, but trees with higher branching factors are far more efficient when stored on slow external memory devices. For efficiency, all of an ordered key sequence in a B-tree is read into internal memory at once, and a fast binary search is used to find a key in this ordered sequence. This either gives the key itself, or the node which must be read from disc next. In general, in a B-tree of order m , each internal node except the root and the leaves must have between upper(m/2) and m children. The root can either be a leaf or it can contain from 2 to m children. All leaves lie on the same bottommost level and are empty. CS121 © JAS 2005 2-44 CS121 Data Structures Summary Concepts and terminology of trees– branches, nodes, children, parents, ancestors, descendants, internal nodes, leaves, levels, paths Binary trees, definition, and recursive definition, complete binary tree definition, representing expressions using binary trees, traversing binary trees Implementing trees in Java Binary search trees Huffman coding tree Deleting from a binary tree CS121 © JAS 2005 2-45 CS121 Data Structures Analysis of the binary tree – big-O notation Priority queues; represented by heaps; reheapifying AVL trees – definition, how to calculate the heights and balance factors, how to identify AVL and non-AVL trees, how to insert, how to perform rotations, benefits 2-3 trees – how to insert into, benefits B-trees – benefits, general idea of how they work CS121 © JAS 2005 2-46 CS121 Data Structures CS121 © JAS 2005 2-47