ENCODING NEAREST LARGER VALUES

Pat Nicholson* and Rajeev Raman**
*MPII  **University of Leicester

DÉJÀ VU: THE ENCODING APPROACH

- Input Data (Relatively Big)
- Preprocess w.r.t. Some Query
- Encoding (Hope: much smaller)
- Auxiliary Data Structures (Should be smaller still)
- Succinct Data Structure: Minimum Space Possible
- Query (Hope: as fast as nonsuccinct counterpart)

NEAREST LARGER VALUES

Support nearest larger value queries on an array $A[1..n]$:
- Given index $i$, return position $j$ of the "nearest" value larger than $A[i]$

Example: 10 2 3 1 9 8 7 11 5 4

Two important questions:
- #1: What does "nearest" mean? Many possible variants of the problem:
  - Unidirectional: return the index of the NLV to the left ($j < i$)
  - Bidirectional: return the indices of the NLV to the left AND right
  - Nondirectional: return the index of the closest NLV (min. $|i - j|$)
- #2: Are all elements in the array distinct?

OVERVIEW: ENCODING NLV

Distinct  Problem                               Space                                                     Query   Notes
Yes       Unidirectional                        $2n + o(n)$                                               $O(1)$  Cartesian Tree
Yes       Bidirectional                         $2n + o(n)$                                               $O(1)$  Cartesian Tree
Yes       Nondirectional                        $< 1.9n + o(n)$, $> 1.\ldots n - O(\ldots)$               $O(1)$  This paper: NLV Tree
No        Unidirectional [Fischer et al. 2009]  $2n + o(n)$                                               $O(1)$  Cartesian Tree
No        Bidirectional [Fischer 2011]          $\log_2(3 + 2\sqrt{2})\,n + o(n) \approx 2.54n + o(n)$    $O(1)$  Schröder Trees (Navigate CSA)
No        Nondirectional                        ???                                                       ???

For the Cartesian tree and Schröder tree results: space bound is optimal to within lower order terms.

Still very open: What is the constant? $\log_2 3$? Prove it!

BIGGER PICTURE

Encoding 1D Range Minimum Queries
- Fischer and Heun [SICOMP 2011]
- Encoding also using $2n + o(n)$ bits via Cartesian tree

All-Nearest Larger Values
- Asano et al. [Mehlhorn's Festschrift 2009, WADS 2013]
- Trade-offs for computing the solutions to all NLV queries
- Berkman et al. [J. Alg 1993]
- Parallel algorithms for parenthesis matching, triangulating monotone polygons

Encoding 2D Nearest Larger Values
- Jo, Raman, and Rao [WALCOM 2015], Jayapaul et al. [IWOCA 2014]
- Encode NLV of $n \times n$ array under $L_1$ metric using $O(n^2)$ bits

Encoding 2D Range Minimum Queries
- See Brodal et al. [ESA 2010, 2012, and 2013]
- Encoding RMQ for an $m \times n$ matrix requires $\Omega(mn \log \min(m, n))$ bits

CARTESIAN TREES REVIEW

We can rebuild him. We have the technology.

NONDIRECTIONAL NLV TREE

Tie breaking rule: break ties by choosing the one to the right.

TIEBREAKING MATTERS?
Number of distinct structures for size $n$, by tie breaking rule:

Rule            n = 1    2    3    4    5    6    7     8     9     10
To the right        1    2    5   14   40  116  341  1010  3009   9012
To the smaller      1    2    5   14   42  126  383  1178  3640  11316
To the larger       1    2    5   12   32   88  248   702  1998   5696

Open problem: Does the tie breaking rule affect the constant factor, i.e., $\lim_{n \to \infty} \frac{\log x_n}{n}$ (with $x_n$ the count for size $n$)?

IDEA: COMPRESS RUNS

DIGRESSION: PATH (OR CHAIN) COMPRESSION

Figure labels: Degree two, Degree one, Terminal, Subtree

If there are $d$ deleted nodes and $c$ chains, then store:
- Path/chain-compressed tree: $\sim n - d$ bits
- Bitvector marking chain terminals: $\log \binom{n-d}{c}$ bits
- Bitvector of length $d$ with $c$ ones (unary chain lengths): $\log \binom{d}{c}$ bits
- Bitvector of length $d$ indicating a zig or zag for each deleted node

This works out to $2n + \Theta(\log n)$ bits
- Note: doesn't support queries, just recovers structure

COMPRESSING CARTESIAN TREES W.R.T. NLVS

Lemma: Excluding chains containing nodes representing array elements $A[1]$ or $A[n]$, if a chain contains $\ell_i$ deleted elements, then there are exactly $\ell_i + 1$ combinatorially distinct chains with respect to answering nearest larger value queries (breaking ties to the right).
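The notion of "distinct structures" used in the tie-breaking table and in the lemma can be made concrete for small $n$ by brute force. The following is a minimal sketch, not the authors' code. It assumes that two permutations have the same structure exactly when every nondirectional NLV query returns the same position, and that the three rules mean: on a distance tie, pick the candidate to the right, the smaller-valued candidate, or the larger-valued candidate. The names `nlv` and `count_structures` are illustrative, and indices are 0-based here.

```python
from itertools import permutations


def nlv(a, i, rule="right"):
    """Return the 0-based index of the nearest value larger than a[i], or None.

    `rule` only matters when the left and right candidates are equally far:
    "right" picks the right one, "smaller"/"larger" compare their values.
    """
    # Closest larger element on each side (None if there is none).
    left = next((j for j in range(i - 1, -1, -1) if a[j] > a[i]), None)
    right = next((j for j in range(i + 1, len(a)) if a[j] > a[i]), None)
    if left is None or right is None:
        return right if left is None else left
    dist_left, dist_right = i - left, right - i
    if dist_left != dist_right:
        return left if dist_left < dist_right else right
    # Distance tie: apply the tie-breaking rule.
    if rule == "right":
        return right
    if rule == "smaller":
        return left if a[left] < a[right] else right
    if rule == "larger":
        return left if a[left] > a[right] else right
    raise ValueError(f"unknown rule: {rule}")


def count_structures(n, rule="right"):
    """Count distinct vectors of NLV answers over all permutations of size n."""
    answers = {
        tuple(nlv(p, i, rule) for i in range(n))
        for p in permutations(range(n))
    }
    return len(answers)
```

Under these assumptions, `count_structures(4, "right")` gives 14 and `count_structures(4, "larger")` gives 12, matching the corresponding table entries. The enumeration visits all $n!$ permutations, so it is only practical for small $n$.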
Forget about whether it zigs or zags, just store # in prefix…

THE ENCODING

If there are $d$ deleted nodes and $c$ chains, then store:
- Path/chain-compressed tree: $\sim n - d$ bits
- Bitvector marking chain terminals: $\log \binom{n-d}{c}$ bits
- Bitvector of length $d$ with $c$ ones (unary chain lengths): $\log \binom{d}{c}$ bits
- For each deleted chain of length $\ell_i$, store the number of nodes in the prefix
  - This takes no more than $\log \prod_{i=1}^{c} (\ell_i + 1) \le c \log\left(\frac{d}{c} + 1\right)$ bits

If we maximize this expression in terms of $d$ and $c$:
- Upper bounded by $1.9198n + o(n)$ bits

We can improve the encoding of the chain lengths:
- Encode multiset of lengths to zeroth order empirical entropy
- This improves the upper bound constant factor to about 1.9 (> 1.8999)

SUB-OPTIMALITY EXAMPLES

ENCODING → DATA STRUCTURE

Tree decomposition: Mini-Micro trees (Farzan and Munro)
- Davoodi et al. showed how to support select-inorder for binary tree
- We simply plug our compression into this framework
- Need to support two additional operations: is_chain_prefix/suffix
- Decompress fingerprints, use lookup tables: tree + inorder position

Theorem: Space bound the same as encoding + $o(n)$ bits and supports nondirectional NLV query in $O(1)$ time

LOWER BOUND SKETCH

"Computer assisted" lower bound idea:
- Rough Idea 1: Use the computer to count the number of distinct structures on $\beta$ elements for some integer $\beta > 0$. Call this value the unrestricted count.
- Rough Idea 2: Given an instance of size $n$, glue pieces of size $\beta$ together without restricting the number of possible configurations of each piece
  - Two adjacent $\beta$-structures can obviously interfere with each other in non-trivial ways

LOWER BOUND SKETCH

Clearer Idea:
- Fix tiebreaking rule: to the right
- The unrestricted count is the number of distinct $\beta$-sized NLV structures
- The restricted count is the number of distinct $\beta$-sized NLV structures with added restrictions

Example: 10 2 3 1 9 8 7 11 5 4

- Break permutation into upper and lower half:
  - Green blocks come from upper half, blue from lower half
  - Interleave: each green block can take any of the unrestricted configurations, each blue block only the restricted ones
  - Only the max in each green block will "exit"

CONCLUSIONS AND OPEN PROBLEMS

For nearest larger value problems the details are crucial:
- Distinct elements?
- Definition of nearest?
- Tiebreaking rules?

We have considered encodings of nondirectional NLV
- For an array containing distinct elements these can be encoded using less than $1.9n + o(n)$ bits: slightly less than the Cartesian tree

Open Problems:
- What is the optimal space bound for the nondirectional NLV?
- Distinct vs. nondistinct?
- Does the tiebreaking rule affect the constant factor?
- Other formulations?

THANK YOU
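The $1.9198n + o(n)$ figure from THE ENCODING, and the roughly $2n$ cost of plain chain compression from the digression, can be sanity-checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' calculation: it writes the number of deleted nodes as $xn$ and the number of chains as $yn$, uses the standard approximation $\log_2 \binom{N}{K} \approx N \cdot H(K/N)$ with $H$ the binary entropy, ignores lower-order terms, and replaces exact maximization by a grid search. The function names are hypothetical.

```python
from math import log2


def entropy(p):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bits_per_element(x, y, nlv_aware):
    """Leading-order bits per array element for x*n deleted nodes, y*n chains."""
    tree = 1 - x                                   # compressed tree: ~(n - d) bits
    terminals = (1 - x) * entropy(y / (1 - x))     # log C(n - d, c)
    lengths = x * entropy(y / x)                   # log C(d, c)
    if nlv_aware:
        chains = y * log2(x / y + 1)               # c * log(d/c + 1), from the lemma
    else:
        chains = x                                 # one zig/zag bit per deleted node
    return tree + terminals + lengths + chains


def maximize(nlv_aware, steps=200):
    """Grid search over feasible (x, y): chains need c <= d and c <= n - d."""
    best = 0.0
    for i in range(1, steps):
        x = i / steps
        for j in range(1, steps):
            y = j / steps
            if y <= x and y <= 1 - x:
                best = max(best, bits_per_element(x, y, nlv_aware))
    return best
```

Under these assumptions `maximize(False)` is 2, the Cartesian tree cost, while `maximize(True)` comes out near 1.92, in line with the bound stated on the slide. The further improvement to about 1.9 relies on the entropy coding of chain lengths, which this sketch does not model.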