* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Constant-Time LCA Retrieval
Survey
Document related concepts
Transcript
Ariel Rosenfeld Bar-Ilan Uni. In a rooted tree T, a node u is an ancestor of a node v if u is on the unique path from the root to v. In a rooted tree T, the Lowest Common Ancestor (LCA) of two nodes u and v is the deepest node in T that is the ancestor of both u and v. 1 2 3 4 5 6 Node 3 is the LCA of nodes 4 and 6. Node 1 is the LCA of node 2 and 5. The LCA problem is then, given a rooted tree T for preprocessing, preprocess it in a way so that the LCA of any two given nodes in T can be retrieved in constant time. Preprocessing Querying Restrictions\comments None O(n) O(n^3) O(1) Naïve (do it yourselves) O(n^2) O(1) DP O(n) O(1) When the tree is complete O(n) O(1) Using reduction. Article added. Let B denote a complete binary tree with n nodes. The key here is to encode the unique path from the root to a node in the node itself. We assign each node a path number, a logn bit number that encodes the unique path from the root to the node. For each node v in B we encode a path number in the following way: ◦ Counting from the left most bit, the i’th bit of the path number for v corresponds to the i’th edge on the path from the root to v. ◦ A 0 for the i’th bit from the left indicates that the i’th edge on the path goes to a left child, and a 1 indicates that it goes to a right child. ◦ Let k denote then number of edges on the path from the root to v, then we mark the k+1 bit (the height bit) of the path number 1, and the rest of the logn-k-1 bits 0. 1 0 node j 1 0 0 node i Node i’s path number is Node j’s path number is 0101 1010 The height bit is marked in blue Padded bits are marked in red. 1000 0100 0010 0001 1100 0110 0011 0101 1010 0111 1001 1011 1110 1101 1111 Path numbers can easily be assigned in a simple O(n) in-order traversal on B. (good programing exercise) Suppose now that u and v are two nodes in B, and that path(u) and path(v) are their appropriate path numbers. We denote the lowest common ancestor of u and v as lca(u,v). We denote the prefix bits in the path number, those that correspond to edges on the path from the root, as the path bits of the path number. First we calculate path(u) XOR path(v) and find the left most bit which equals 1. If there is no such bit than path(u) = path(v) and so u = v, so assume that the k’th bit of the result is 1. If both the k’th bit in path(u) and the k’th bit in path(v) are path bits, then this means that u and v agree on k-1 edges of their path from the root, meaning that the k-1 prefix of each node’s path number encodes within it the path from the root to lca(u,v). lca(u,v) u 0100 0010 v 0111 path(u) XOR path(v) = 0010 XOR 0111 0101 path(lca(u,v) = 0 1 0 0 height bit padded bits lca(u’,v’) 1010 u’ 1001 v’ 1011 path(u’) XOR path(v’) = 1001 XOR 1011 0010 path(lca(u,v) = 1 0 1 0 height bit padded bit This concludes that if we take the prefix k1 bits of the result of path(u) XOR path(v), add 1 as the k’th bit, and pad logn-k 0 suffix bits, we get path(lca(u,v)). If either the k’th bit in path(u) or the k’th bit in path(v) (or both) is not a path bit then one node is ancestor to the other, and lca(u,v) can easily be retrieved by comparing path(u) and path(v)’s height bit. The following are the two stages of the general LCA algorithm for any arbitrary tree T: First, we reduce the LCA problem to the Restricted Range Minima {RRM} (simple case of Range Minimum Query {RMQ}) problem. Problem of finding the smallest number in an interval of a fixed list of numbers, where the difference between two successive numbers in the list is exactly one. We’ll solve the Restricted Range Minima problem and thus solve the LCA problem. Let T denote an arbitrary tree Let lca(u,v) denote the lowest common ancestor of nodes u and v in T. First we execute a depth-first traversal of T to label the nodes in the depth-first order they are encountered. In that same traversal we maintain a list L, of nodes of T, in the same order that they were visited. The only property of the depth-first numbering we need is that the number given to any node is smaller then the number given to any of it’s descendents. 000 001 010 011 100 110 101 111 The depth-first traversal creates these depth numbers and the following list L: L = { 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0 } Now if want to find lca(u,v), we find the first occurrence of the two nodes in L, this defines an interval I in L. Suppose u occurs in L before v. Now, I describes the part of the traversal, from the point we first discovered u to the point we first discovered v. lca(u,v) can be retrieved by finding the minimum number in I. This is due to the following two simple facts: ◦ If u is an ancestor of v then all nodes visited between u and v are in u’s subtree, thus the depthnumber assigned to u is minimal in I. ◦ If u is not an ancestor of v, then all those nodes visited between u and v are in lca(u,v)’s subtree, and the traversal must visit lca(u,v). Thus the minimum of I is the depth-number assigned to lca(u,v). 000 001 010 011 100 101 110 111 L = { 0, 1, 0, 2, 3, 2, 4, 2, 5, 6, 5, 7, 5, 2, 0 } lca(3,7) = 2 ◦ lca(0,7) = 0 So far we’ve shown how to reduce the LCA problem to the range minima problem. This next step shows how to achieve reduction to the restricted range minima problem. Denote level(u) as the number of edges in the unique path from the root to node u in T. If L = { l1, l2, … , lz } then we build the following list : L’={level(l1),level(l2),…level(lz)}. We use L’ in the same manner we used L in the previous reduction scheme. This works because in every interval I = [u,v] in L, lca(u,v) is the lowest node in I for the same reasons mentioned earlier. The difference between two adjacent elements in L’ is exactly one. This completes the reduction to the restricted range minima problem. Denote n as the number of nodes in T. ◦ Depth-first traversal can be done in O( n ) space and time complexity. ◦ L is of size O( n ) and thus it’s creation and initialization can be done in O( n ) space and time complexity. ◦ To find lca(u,v) we need the first occurrence of u and v in L. This could be stored in a table of size O( n ). Thus the creation and initialization of this table can be done in O( n ) space and time complexity. The total space and time complexity of the reduction is then O( n ). The Range Minima problem is the problem of finding the smallest number in an interval of a fixed list of numbers. The Restricted Range Minima problem is an instance of the Range Minima problem where the difference between two successive numbers is exactly one. The Restricted Range Minima problem is stated formally in the following: Given a list L = { l1 , l2 , … , ln } of n real numbers, where for each i = 1… n-1 holds that | li - li+1 | = 1 preprocess the list so that for any interval [ li , li+1 , … , lj ] ,1 i < j n, the minimum over the interval can be retrieved in constant time. I added the article explaining how to solve RRM in O(n) time and space. Somewhat challenging.. If LCA had a good solution, would it solve (restricted) Range Maxima? In other words, Can we reduce RM to LCA? Surprisingly easy. A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 10 0 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 10 0 25 1 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 10 0 22 2 25 1 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 34 77 10 0 22 2 34 25 1 3 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 7 4 10 0 22 2 34 25 1 3 19 10 12 26 16 A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9] 10 25 22 34 7 19 10 12 26 16 4 0 6 2 1 5 7 3 9 8 O(n), O(1) -time solution for LCA, If there is an O(n), O(1) -time solution for then there is an RMQ. Let A be the input array to the RMQ problem. Let C be the Cartesian tree of A. RMQA(i,j) = LCAC(i,j) A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] 10 25 22 34 7 19 10 12 A[9] 26 16 4 0 6 2 1 5 7 3 9 8