Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Unit 11: Data Structures & Complexity syllabus We discuss in this unit • • • • Graphs and trees Binary search trees Hashing functions Recursive sorting: quicksort, mergesort basic programming concepts object oriented programming topics in computer science 1 unit 11a Graphs and Trees Graph: a data representation which includes nodes and edges, where each edge connects two nodes Example: the internet Tree: a connected graph with no loops, and a root Example: inheritance tree 2 unit 11a Graphs and Trees b) a) ROOT c) internal vertices LEAF NODES 3 unit 11a Binary Trees A rooted tree is called a binary tree if every internal vertex has no more than 2 children The tree is called a full binary tree if every internal vertex has exactly 2 children An ordered rooted tree is a rooted tree where the children of each internal vertex are ordered; we call the children of a vertex the left child and the right child, if they exist A rooted binary tree of height H is called balanced if all its leaves are at levels H or H-1 4 unit 11a Binary tree: example Lou Hal Ed Max Ken Joe Sue Ted 5 unit 11a Tree Properties Theorem: A tree with N vertices has N-1 edges Theorem: There are at most 2 H leaves in a binary tree of height H Corallary: If a binary tree with L leaves is full and balanced, then its height is H = log2 L Theorem: There are at most (2 H+1–1) nodes in a binary tree of height H 6 unit 11a Binary Search Tree (BST) A special kind of binary tree in which: 1. Each vertex contains a distinct key value 2. The key values in the tree can be compared using “greater than” and “less than” 3. The key value of each vertex in the tree is less than every key value in its left subtree, and greater than every key value in its right subtree unit 11a Example: Binary Search Tree Lou Hal Ed Max Ken Joe Sue Ted 8 unit 11a Shape of a BST Depends on its key values and their order of insertion: Insert the elements ‘J’ ‘E’ ‘F’ ‘T’ ‘A’ in that order. The first value to be inserted is put into the root. ‘J’ 9 unit 11a Inserting ‘E’ into the BST Thereafter, each value to be inserted begins by comparing itself to the value in the root, moving left it is less, or moving right if it is greater. This continues at each level until it can be inserted as a new leaf. ‘J’ ‘E’ 10 unit 11a Inserting ‘F’ into the BST Begin by comparing ‘F’ to the value in the root, moving left it is less, or moving right if it is greater. This continues until it can be inserted as a leaf. ‘J’ ‘E’ ‘F’ 11 unit 11a Inserting ‘T’ into the BST Begin by comparing ‘T’ to the value in the root, moving left it is less, or moving right if it is greater. This continues until it can be inserted as a leaf. ‘J’ ‘T’ ‘E’ ‘F’ 12 unit 11a Inserting ‘A’ into the BST Begin by comparing ‘A’ to the value in the root, moving left it is less, or moving right if it is greater. This continues until it can be inserted as a leaf. ‘J’ ‘T’ ‘E’ ‘A’ ‘F’ 13 unit 11a Order of insertion what BST is obtained by inserting the elements ‘A’ ‘E’ ‘F’ ‘J’ ‘T’ in that order? ‘A’ ‘E’ ‘F’ ‘J’ ‘T’ 14 unit 11a Another binary search tree ‘J’ ‘T’ ‘E’ ‘A’ ‘H’ ‘M’ ‘P’ ‘K’ Add nodes containing these values in this order: ‘D’ ‘B’ ‘L’ ‘Q’ ‘S’ ‘V’ ‘Z’ 15 unit 11a Task: is ‘F’ in the tree? ‘J’ ‘T’ ‘E’ ‘A’ ‘D’ ‘B’ ‘V’ ‘M’ ‘H’ ‘Z’ ‘P’ ‘K’ ‘L’ ‘Q’ ‘S’ 16 unit 11a Search(x) 1. 2. 3. 4. start at the root of the tree which contains y: the tree is empty x is not present x = y (the item at the root) the root is returned x < y recursively search the left subtree x > y recursively search the right subtree 17 unit 11a Operations Search(x) Insert(x) Delete(x) tree algs/demo 18 unit 11a Complexity Search(x) – O(H) Insert(x) – O(H) Delete(x) – O(H) worst case O(n) when tree is a list best case O(log n) when tree is full and balanced 19 unit 11a Traversal Algorithms A traversal algorithm is a procedure for systematically visiting every vertex of an ordered binary tree Tree traversals are defined recursively Three traversals are named: preorder inorder postorder 20 unit 11a PREORDER Traversal Algorithm Let T be an ordered binary tree with root r: If T has only r, then r is the preorder traversal Otherwise, suppose T1, T2 are the left and right subtrees at r; the preorder traversal is • visit r • traverse T1 in preorder • traverse T2 in preorder 21 unit 11a Preorder Traversal Visit first ROOT ‘J’ ‘T’ ‘E’ ‘A’ ‘H’ Visit left subtree second ‘M’ ‘Y’ Visit right subtree last result: J E A H T M Y 22 unit 11a INORDER Traversal Algorithm Let T be an ordered binary tree with root r: If T has only r, then r is the inorder traversal Otherwise, suppose T1, T2 are the left and right subtrees at r; then • traverse T1 in inorder • visit r • traverse T2 in inorder 23 unit 11a Inorder Traversal Visit second ROOT ‘J’ ‘T’ ‘E’ ‘A’ ‘H’ Visit left subtree first ‘M’ ‘Y’ Visit right subtree last result: A E H J M T Y 24 unit 11a POSTORDER Traversal Algorithm Let T be an ordered binary tree with root r: If T has only r, then r is the postorder traversal Otherwise, suppose T1, T2 are the left and right subtrees at r; then • traverse T1 in postorder • traverse T2 in postorder • visit r 25 unit 11a Postorder Traversal Visit last ROOT ‘J’ ‘T’ ‘E’ ‘A’ ‘H’ Visit left subtree first ‘M’ ‘Y’ Visit right subtree second result: A H E M Y T J 26 unit 11a A Binary Expression Tree ROOT ‘-’ ‘8’ INORDER TRAVERSAL: ‘5’ 8 - 5 PREORDER TRAVERSAL: - 8 5 POSTORDER TRAVERSAL: 8 5 - has value 3 27 unit 11a Binary Expression Tree A special kind of binary tree in which: 1. Each leaf node contains a single operand 2. Each nonleaf node contains a single binary operator 3. The left and right subtrees of an operator node represent subexpressions that must be evaluated before applying the operator at the root of the subtree 28 unit 11a A Binary Expression Tree ‘*’ ‘+’ ‘4’ ‘3’ ‘2’ What value does it have? ( 4 + 2 ) * 3 = 18 29 unit 11a A Binary Expression Tree ‘*’ ‘+’ ‘4’ ‘3’ ‘2’ What infix, prefix, postfix expressions does it represent? 30 unit 11a A Binary Expression Tree ‘*’ ‘+’ ‘4’ ‘3’ ‘2’ Infix: ((4+2)*3) Prefix: * + 4 2 3 evaluate from right Postfix: 4 2 + 3 * evaluate from left 31 unit 11a Levels Indicate Precedence When a binary expression tree is used to represent an expression, the levels of the nodes in the tree indicate their relative precedence of evaluation Operations at higher levels of the tree are evaluated later than those below them; the operation at the root is always the last operation performed 32 unit 11a Example ‘*’ ‘-’ ‘8’ ‘/’ ‘5’ ‘3’ ‘+’ ‘4’ ‘2’ What infix, prefix, postfix expressions does it represent? 33 unit 11a A binary expression tree ‘*’ ‘/’ ‘-’ ‘8’ ‘5’ ‘3’ ‘+’ ‘4’ ‘2’ Infix: ((8-5)*((4+2)/3)) Prefix: *-85 /+423 Postfix: 8 5 - 4 2 + 3 / * has operators in order used 34 unit 11a Hash tables Goal: accesses data with complexity O(1), even when the data needs to be dynamically administered (mostly by inserting new or deleting existing data) Solution: array Problem: access is only O(1) if the index of the element is known – otherwise O(log n) Solution: each data item to be stored is associated with a hash value which gives array index: • generate array of size m, where m is prime and sufficiently large • assign unique numerical value N(k) to key of element k • h(k) = N(k) mod m 35 unit 11a Hash tables - cont Problem: hash value is not unique anymore! Solution: a collision procedure must determine the position for the new object hasshing 36 unit 11a Operations Search(x) Insert(x) Delete(x) Complexity: worst case O(n) when hash value is always the same (and table is really a list) best case O(1) when hash value is always distinct 37 unit 11a Example: keys are names list of unique identifiers • washington • lincoln • bush 103288042987600 5201793578 151444 the range currently is all positive integers, which is not possible to implement for m=10007 • washington • lincoln • bush 3249 4873 1339 38 unit 11a why modulus prime number? underlying reason: the new code numbers should appear random technical reason: the modulus operation should be a “field” example: • original codes are all even numbers • m is even • only even buckets in the table will get filled, half the table will be empty 39 unit 11a Back to sorting... Two recursive algorithms: MergeSort QuickSort 40 unit 11a MergeSort divide array to two equal parts recursively sort left part recursively sort right part merge the two sorted lists MergeSort 41 unit 11a QuickSort choose pivot arbitrarily divide to left part (< = than pivot) and right part (> than pivot) sort each part recursively quicksort 42 unit 11a Complexity MergeSort worst case O(n * log n) QuickSort worst case O(n2) average case O(n * log n) 43 unit 11a