ENCODING NEAREST LARGER VALUES
Pat Nicholson* and Rajeev Raman**
*MPII  **University of Leicester

DÉJÀ VU: THE ENCODING APPROACH
- Input Data (Relatively Big)
- Preprocess w.r.t. Some Query
- Encoding (Hope: much smaller)
- Auxiliary Data Structures (Should be smaller still)
- Succinct Data Structure: Minimum Space Possible
- Query (Hope: as fast as nonsuccinct counterpart)

NEAREST LARGER VALUES
Support nearest larger value queries on an array A[1..n]:
- Given index i, return position j of the "nearest" value larger than A[i]
[Figure: example array; values shown include 10, 2, 3, 1, 9, 8, 7, 11, 5, 4]
Two important questions:
- #1: What does "nearest" mean?
  - Many possible variants of the problem:
    - Unidirectional: return the index of the NLV to the left (j < i)
    - Bidirectional: return the indices of the NLV to the left AND right
    - Nondirectional: return the index of the closest NLV (min. |i - j|)
- #2: Are all elements in the array distinct?

OVERVIEW: ENCODING NLV

Distinct  Problem                                Space (bits)                            Query  Notes
Yes       Unidirectional                         2n + o(n)                               O(1)   Cartesian Tree
Yes       Bidirectional                          2n + o(n)                               O(1)   Cartesian Tree
Yes       Nondirectional                         ???                                     ???
No        Unidirectional [Fischer et al. 2009]   2n + o(n)                               O(1)   Cartesian Tree
No        Bidirectional [Fischer 2011]           log₂(3 + 2√2)·n + o(n) ≈ 2.54n + o(n)   O(1)   Schröder Trees (Navigate CSA)
No        Nondirectional                         ???                                     ???

For all these results: space bound is optimal to within lower order terms

OVERVIEW: ENCODING NLV

Distinct  Problem                                Space (bits)                            Query  Notes
Yes       Unidirectional                         2n + o(n)                               O(1)   Cartesian Tree
Yes       Bidirectional                          2n + o(n)                               O(1)   Cartesian Tree
Yes       Nondirectional                         < 1.9n + o(n); > [illegible]            O(1)   This paper: NLV Tree
No        Unidirectional [Fischer et al. 2009]   2n + o(n)                               O(1)   Cartesian Tree
No        Bidirectional [Fischer 2011]           log₂(3 + 2√2)·n + o(n) ≈ 2.54n + o(n)   O(1)   Schröder Trees (Navigate CSA)
No        Nondirectional                         ???                                     ???

Still very open: What is the constant? log₂ 3? Prove it!

BIGGER PICTURE
Encoding 1D Range Minimum Queries
- Fischer and Heun [SICOMP 2011]
- Encoding also using 2n + o(n) bits via Cartesian tree
All-Nearest Larger Values
- Asano et al. [Mehlhorn's Festschrift 2009, WADS 2013]
- Trade-offs for computing the solutions to all NLV queries
- Berkman et al. [J. Alg 1993]
- Parallel algorithms for parenthesis matching, triangulating monotone polygons
Encoding 2D Nearest Larger Values
- Jo, Raman, and Rao [WALCOM 2015], Jayapaul et al. [IWOCA 2014]
- Encode NLV of n × n array under L1 metric using O(n²) bits
Encoding 2D Range Minimum Queries
- See Brodal et al. [ESA 2010, 2012, and 2013]
- Encoding RMQ for an m × n matrix requires Ω(mn log min(m, n)) bits

CARTESIAN TREES REVIEW
We can rebuild him. We have the technology.

NONDIRECTIONAL NLV TREE
Tie breaking rule: break ties by choosing the one to the right.

TIEBREAKING MATTERS?
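One way to probe the question in this slide title is by brute force: enumerate all permutations of a small array, answer every nondirectional NLV query under a given tiebreaking rule, and count how many distinct answer vectors arise. The sketch below is an illustration only, not the authors' counting method. The function names `nlv` and `count_structures`, the rule labels `"right"`, `"smaller"`, `"larger"`, and the choice of `None` for "no larger value exists" are all assumptions made here.

```python
from itertools import permutations


def nlv(a, i, rule="right"):
    """Return the 0-based index of the nearest value larger than a[i], or None.

    Assumes distinct values. `rule` decides ties between two equally
    distant larger values (names are hypothetical, chosen for this sketch):
    "right" picks the right one, "smaller"/"larger" pick by value.
    """
    n = len(a)
    for d in range(1, n):
        # Candidate positions at distance d, kept only if they hold a larger value.
        left = i - d if i - d >= 0 and a[i - d] > a[i] else None
        right = i + d if i + d < n and a[i + d] > a[i] else None
        if left is None and right is None:
            continue
        if left is None:
            return right
        if right is None:
            return left
        # Both sides have a larger value at the same distance: apply the rule.
        if rule == "right":
            return right
        if rule == "smaller":
            return left if a[left] < a[right] else right
        if rule == "larger":
            return left if a[left] > a[right] else right
        raise ValueError(f"unknown rule: {rule}")
    return None  # a[i] is the maximum


def count_structures(n, rule):
    """Count distinct NLV answer vectors over all permutations of n values."""
    seen = set()
    for p in permutations(range(n)):
        seen.add(tuple(nlv(p, i, rule) for i in range(n)))
    return len(seen)


if __name__ == "__main__":
    for rule in ("right", "smaller", "larger"):
        print(rule, [count_structures(n, rule) for n in range(1, 8)])
```

The enumeration is factorial in n, so it is only practical for small arrays. If this reading of the three rules matches the authors', the printed counts can be compared directly against the per-rule counts reported in the talk.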

Rule \ n        1  2  3   4   5    6    7     8     9     10
To the right    1  2  5  14  40  116  341  1010  3009   9012
To the smaller  1  2  5  14  42  126  383  1178  3640  11316
To the larger   1  2  5  12  32   88  248   702  1998   5696

Open problem: Does the tie breaking rule affect the constant factor, i.e., lim_{n→∞} (log x_n)/n?

IDEA: COMPRESS RUNS

DIGRESSION: PATH (OR CHAIN) COMPRESSION
[Figure labels: Degree two, Degree one, Terminal, Subtree]
If there are m deleted nodes, and c chains, then store:
- Path/chain-compressed tree: ~ n - m bits
- Bitvector marking chain terminals: lg C(n - m, c) bits
- Bitvector of length m with c ones: unary chain lengths, lg C(m, c) bits
- Bitvector of length m indicating a zig or zag for each deleted node
This works out to 2n + Θ(log n) bits
- Note: doesn't support queries, just recovers structure

COMPRESSING CARTESIAN TREES W.R.T. NLVS
Lemma: Excluding chains containing nodes representing array elements A[1] or A[n], if a chain contains m_i deleted elements, then there are exactly m_i + 1 combinatorially distinct chains with respect to answering nearest larger value queries (breaking ties to the right).
Forget about whether it zigs or zags, just store # in prefix…

THE ENCODING
If there are m deleted nodes, and c chains, then store:
- Path/chain-compressed tree: ~ n - m bits
- Bitvector marking chain terminals: lg C(n - m, c) bits
- Bitvector of length m with c ones: unary chain lengths, lg C(m, c) bits
- For each deleted chain of length m_i, store number of nodes in prefix
  - This takes no more than lg ∏_{i=1..c} (m_i + 1) ≤ c·lg(m/c + 1) bits
If we maximize this expression in terms of m and c:
- Upper bounded by 1.9198n + o(n) bits
We can improve the encoding of the chain lengths:
- Encode multiset of lengths to zeroth order empirical entropy
- This improves the upper bound constant factor to about 1.9 (> 1.8999)

SUB-OPTIMALITY EXAMPLES

ENCODING → DATA STRUCTURE
Tree decomposition: Mini-Micro trees, Farzan and Munro
- Davoodi et al. showed how to support select-inorder for binary tree
- We simply plug our compression into this framework
- Need to support two additional operations: is_chain_prefix/suffix
- Decompress fingerprints, use lookup tables: tree + inorder position
Theorem: Space bound the same as encoding + o(n) bits and supports nondirectional NLV query in O(1) time

LOWER BOUND SKETCH
"Computer assisted" lower bound idea:
- Rough Idea 1: Use the computer to count number of distinct structures on β elements for some integer β > 0. Call this value t_β.
- Rough Idea 2: Given an instance of size n, glue pieces of size β together without restricting the number of possible configurations of each piece
- Two adjacent β-structures can obviously interfere with each other in non-trivial ways

LOWER BOUND SKETCH
Clearer Idea:
- Fix tiebreaking rule: to the right
- t_β is the number of distinct β sized NLV structures
- s_β is the number of distinct β sized NLV structures with added restrictions
[Figure: example array; values shown include 10, 2, 3, 1, 9, 8, 7, 11, 5, 4]
- Break permutation into upper and lower half:
  - Green blocks come from upper half, blue from lower half
  - Interleave: green guys can have t_β configurations, blue guys: s_β
  - Only the max in each green block will "exit"

CONCLUSIONS AND OPEN PROBLEMS
For nearest larger value problems the details are crucial:
- Distinct elements?
- Definition of nearest?
- Tiebreaking rules?
We have considered encodings of nondirectional NLV
- For an array containing distinct elements these can be encoded using less than 1.9n + o(n) bits: slightly less than the Cartesian tree
Open Problems:
- What is the optimal space bound for the nondirectional NLV?
- Distinct vs. nondistinct?
- Does the tiebreaking rule affect the constant factor?
- Other formulations?

THANK YOU