Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Universal Hashing • When attempting to foil an malicious adversary, randomize the algorithm • Universal hashing: pick a hash function randomly when the algorithm begins – Guarantees good performance on average, no matter what keys adversary chooses – Need a family of hash functions to choose from – Think of quick-sort Universal Hashing • Let G be a (finite) collection of hash functions – …that map a given universe U of keys… – …into the range {0, 1, …, m - 1}. G is said to be universal if: – for each pair of distinct keys x, y U, the number of hash functions h G for which h(x) = h(y) is |G|/m – In other words: • With a random hash function from G the chance of a collision between x and y is exactly 1/m (x y) Universal Hashing • Theorem 11.3: – Choose h from a universal family of hash functions – Hash n keys into a table of m slots, n m – Then the expected number of collisions involving a particular key x is less than 1 – Proof: • For each pair of keys y, z, let cyx = 1 if y and z collide, 0 otherwise • E[cyz] = 1/m (by definition) • Let Cx be total number of collisions involving key x n 1 • E[Cx ] E[cxy ] yT yx • Since n m, we have E[Cx] < 1 m A Universal Hash Function • Choose table size m to be prime • Decompose key x into r+1 bytes, so that x = {x0, x1, …, xr} – Only requirement is that max value of byte < m – Let a = {a0, a1, …, ar} denote a sequence of r+1 elements chosen randomly from {0, 1, …, m - 1} – Define corresponding hash function ha G: r ha x ai xi mod m i 0 – With this definition, G has mr+1 members A Universal Hash Function G is a universal collection of hash functions (Theorem 11.5) • How to use: – Pick r based on m and the range of keys in U – Pick a hash function by (randomly) picking the a’s – Use that hash function on all keys Example • Let m = 5, and the size of each string is 2 bits (binary). Note the maximum value of a string is 3 and m = 5 • a = 1,3, chosen at random from 0,1,2,3,4 • Example for x = 4 = 01,00 (note r = 1) • ha(4) = 1 (01) + 3 (00) = 1 Open Addressing • Basic idea (details in Section 12.4): – To insert: if slot is full, try another slot, …, until an open slot is found (probing) – To search, follow same sequence of probes as would be used when inserting the element • If reach element with correct key, return it • If reach a NULL pointer, element is not in table • Good for fixed sets (adding but no deletion) • Table needn’t be much bigger than n Lecture 11: Binary Search Trees Shang-Hua Teng Data Format Keys Entry Keys Satellite data Insertion and Deletion on dynamic sets • Insert(S,x) – A modifying operation that augments the set S with the element pointed by x • Delete – Given a pointer x to an element in the set S, removes x from S – Notice that this operation uses a pointer to an element x, not a key value Querying on dynamic sets • Search(S,k) – given a set S and a key value k, returns a pointer x to an element in S such that key[x] = k, or NIL if no such element belongs S • Minimum(S) – on a totally ordered set S that returns a pointer to the element S with the smallest key • Maximum(S) • Successor(S,x) – Given an element x whose key is from a totally ordered set S, returns a pointer to the next larger element in S, or NIL if x is the maximum element • Predecessor(S,x) Trees root Edge Degree? interior node parent Node Depth/Level? path subtree leaf child Height? Binary Tree • Node – – – – Parent data left child right child parent (optional) Tree root Data Left Right root Trees • Full tree of height h – all leaves present at level h – all interior nodes full – total number of nodes in a full binary tree? • Complete tree of height h Applications - Expression Trees To represent infix expressions + * 5 - 3 8 (5*3)+(8-4) 4 Applications - Parse Trees Used in compilers to check syntax statement if cond then if statement cond then else statement statement Application - Game Trees Binary Search Trees • Binary Search Trees (BSTs) are an important data structure for dynamic sets • In addition to satellite data, elements have: – key: an identifying field inducing a total ordering – left: pointer to a left child (may be NULL) – right: pointer to a right child (may be NULL) – p: pointer to a parent node (NULL for root) Binary Search Trees • BST property: key[leftSubtree(x)] key[x] key[rightSubtree(x)] • Example: F B A H D K Traversals for a Binary Tree • Pre order – visit the node – go left – go right • In order – go left – visit the node – go right • Post order – go left – go right – visit the node • Level order / breadth first – for d = 0 to height • visit nodes at level d Traversal Examples Pre order A ABDGHCEFI In order B D G C E H GDHBAECFI Post order F GHDBEIFCA I Level order ABCDEFGHI Traversal Implementation • recursive implementation of preorder – base case? – self reference • visit node • pre-order(left child) • pre-order(right child) • What changes need to be made for in-order, post-order? Inorder Tree Walk • In order TreeWalk(x) TreeWalk(left[x]); print(x); TreeWalk(right[x]); • Prints elements in sorted (increasing) order In order Tree Walk F • Example: B A H D K • How long will a tree walk take? • In order walk prints in monotonically increasing order. Why? Evaluating an expression tree • Walk the tree in postorder • When visiting a node, use the results of its children to evaluate it. + * 5 - 3 8 4 Operations on BSTs: Search • Given a key and a pointer to a node, returns an element with that key or NULL: TreeSearch(x, k) if (x = NULL or k = key[x]) return x; if (k < key[x]) return TreeSearch(left[x], k); else return TreeSearch(right[x], k); BST Search: Example • Search for D and C: F B A H D K Operations of BSTs: Insert • Adds an element x to the tree so that the binary search tree property continues to hold • The basic algorithm – Like the search procedure above – Insert x in place of NULL – Use a “trailing pointer” to keep track of where you came from (like inserting into singly linked list) BST Insert: Example • Example: Insert C B F H A D C K BST Search/Insert: Running Time • What is the running time of TreeSearch() or TreeInsert()? – O(h), where h = height of tree • What is the height of a binary search tree? – worst case: h = O(n) when tree is just a linear string of left or right children • We’ll keep all analysis in terms of h for now • Later we’ll see how to maintain h = O(lg n) Sorting With Binary Search Trees • Informal code for sorting array A of length n: BSTSort(A) for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root); • Argue that this is (n lg n) • What will be the running time in the – Worst case? – Average case? (hint: remind you of anything?) Sorting With BSTs • Average case analysis for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root); – It’s a form of quicksort! 3 1 8 2 6 7 5 1 2 3 8 6 7 5 2 6 7 5 5 1 8 2 7 6 5 7 Sorting with BSTs • Same partitions are done as with quicksort, but in a different order – In previous example • Everything was compared to 3 once • Then those items < 3 were compared to 1 once • Etc. – Same comparisons as quicksort, different order! Sorting with BSTs • Since run time is proportional to the number of comparisons, same expected time as quicksort: O(n lg n) • Which do you think is better, quicksort or BSTsort? Why? Sorting with BSTs • Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) • Which do you think is better, quicksort or BSTSort? Why? • Answer: quicksort – Better constants – Sorts in place – Doesn’t need to build data structure More BST Operations • A priority queue supports – Insert – Minimum – Extract-Min • BSTs are good for more than sorting. For example, can implement a priority queue BST Operations: Minimum • How can we implement a Minimum() query? • What is the running time? – O(h) BST Operations: Successor • For deletion, we will need a Successor() operation • What is the successor of node 3? Node 15? Node 13? • What are the general rules for finding the successor of node x? (hint: two cases) BST Operations: Successor • Two cases: – x has a right subtree: successor is minimum node in right subtree – x has no right subtree: successor is first ancestor of x whose left child is also ancestor of x • Intuition: As long as you move to the left up the tree, you’re visiting smaller nodes. • Predecessor: similar algorithm BST Operations: Delete • Deletion is a bit tricky • 3 cases: – x has no children: • Remove x F B H A – x has one child: • Splice out x D C – x has two children: • Swap x with successor • Perform case 1 or 2 to delete it K Example: delete K or H or B BST Operations: Delete • Why will case 2 always go to case 0 or case 1? – because when x has 2 children, its successor is the minimum in its right subtree • Could we swap x with predecessor instead of successor? – Of course Next Lecture • Up next: guaranteeing an O(lg n) height tree Animation • Animated Binary Tree