Download Worst Case Constant Time Priority Queue

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Array data structure wikipedia , lookup

Linked list wikipedia , lookup

Lattice model (finance) wikipedia , lookup

Quadtree wikipedia , lookup

Interval tree wikipedia , lookup

Red–black tree wikipedia , lookup

Binary tree wikipedia , lookup

Binary search tree wikipedia , lookup

B-tree wikipedia , lookup

Transcript
Worst Case Constant Time Priority Queue
Andrej Brodnik
Svante Carlsson
Johan Karlsson
Abstract
J. Ian Munro
When talking about the neighbours of e we mean
the left and the right neighbour. We let m denote
lg M .
We present a new data structure of size O(M ) for
solving the vEB problem. When this data structure
is used in combination with a new memory topology
Lower Bounds and some Matching
it provides an O(1) worst case time solution.
Upper Bounds
1
Introduction
In this paper we will look at a problem, which was
introduced by van Emde Boas et al. [12] as the extended repertoire of the Priority Queue problem,
commonly called vEB. We present a solution to
the vEB problem using a new data structure called
Split Tagged Tree. Together with a new memory
topology we achieve a worst case constant time for
the vEB operations.
Definition 1 The vEB problem is to, given a set
N of size N with elements drawn from an ordered
bounded universe M of size M , support the following operations:
Insert(e)
: N := N ∪ {e}
Delete(e)
: N := N \{e}
Member(e)
: Find whether e ∈ N
Min
: Compute the smallest element of N
Max
: Compute the largest element of N
DeleteMin
: Delete the smallest element
of N
DeleteMax
: Delete the largest element
of N
Predecessor(e) : Compute the largest element of N < e
Successor(e)
: Compute the smallest element of N > e
The Priority Queue problem only supports
Insert(e), DeleteMin and Min.
We will refer to an element e’s predecessor as its left
neighbour and its successor as its right neighbour.
1
The obvious lower bound Ω(1) for the the vEB operations has been improved in various models of
computation. Under the pointer machine model
(cf. [10]) Mehlhorn et al. proved an amortized
lower bound of Ω(lg lg M ) [8]. Recently,
p Beame and
Fich [2] gave a lower bound of Ω( lg N/ lg lg N )
under the communication game model (cf. [9, 13])
which also apply to the cell probe (cf. [14]) and
RAM (cf. [11]) models. They also gave a matching
upper bound for the static vEB problem. Andersson and Thorup [1] gave a data structure and an
algorithm with a matching worst case time for the
dynamic version.
Our Model of Computation
Our model is based on the RAM model of computation which includes branching and the arithmetic operations addition and subtraction. We will
also need bitwise Boolean operations and multiplication/division (cf. MBRAM [11]). However, we
do not want the model to be unrealistic and therefore we restrict the model to only use bounded registers. The registers we use are w = O(lg M ) bits
wide; i.e. a memory locations can store w-bit values
and all operations are defined for arguments with
w bits. This model of computation is implemented
by any standard computer today and we denote it
with extended RAM (ERAM).
Operations to search for Least (Most) Significant
Bit (LSB, MSB) can be implemented to run in O(1)
time in the ERAM model, using a technique called
Word-Size-Parallelism [3]. Hence, we let these operation be defined in the model as well.
2
Previous Related Work
The splitting node ν is a left (right) splitting
node of e if e is a leaf in the left (right) subtree
To solve the vEB problem, van Emde Boas et al. of ν. The first left (right) splitting node on a path
[12] introduced a data structure called Stratified from e to the root is the lowest left (right) splitting
Tree.
node of e.
When we talk about the lowest splitting nodes we
refer to the lowest left and to the lowest right splitting node.
Remember that the stratified tree consists of
tagged leaves which are linked together in a doubly
linked list and internal nodes that can have pointer
to either other internal node or leaves.
Definition 2 A Stratified Tree is a complete binary tree on M in which:
• Each leaf representing an element of N is
tagged and contains pointers to its predecessor
and successor;
• Nodes can be active and if so they contain:
Definition 4 A Split Tagged Tree is a complete
binary tree on M in which:
– pointers to the smallest and largest active
node or leaf in each subtree.
– an indication if there exists branching
nodes on the path from the node to its root
or not.
• Each leaf representing an element of N is
tagged;
• A splitting node is tagged and has pointers to
the leaf representing the largest (smallest) element x (y) in its left (right) subtree, where
x ∈ N (y ∈ N ); and
Tagged means marked with a boolean flag or likewise. To find neighbours and insert/delete elements
in a stratified tree, a binary search in performed
on the path from the leaf to the root to find the
• There is a special super node, placed on top of
proper internal node or leaf (see [12] for details).
the tree, and it is tagged if the set contains at
This yields a O(lg lg M ) worst case time for all the
least one element. If the super node is tagged
vEB operations, which matches the lower bound by
it contains pointers to the leaves representing
Mehlhorn et al.
the minimum and maximum elements.
Since we can not lower the worst case time to
The super node is both a left and a right splitting
constant time for the different vEB operations in
node for all the elements in the set. Hence, all
any of the above models of computation we need a
elements, even minimum and maximum, have both
new model.
left and right splittings nodes.
Note that the internal nodes in the STT only
can
have pointer to leaves not other internal nodes.
3 The Split Tagged Tree
Since leaves represents elements of M we will talk
We now introduce a new data structure called Split about the leaves and the corresponding element inTagged Tree, which is a slightly modified strati- terchangeably.
Internal nodes are represented by the structure
fied tree, and identify the problem stoping us from
reaching constant worst case time. Then we look at STT node in Algorithm 1 which are stored in stanwhat kind of model of computation that can solve dard heap order in an array. The super node is
our problem in order to reach the worst case con- also represented by this structure, though its left
(right) points at maximum (minimum), stored at
stant time for the vEB operations.
Before we give the definition of the Split Tagged location 0 in the array. The leaves only need a
Tree (STT) we need some auxiliary definitions of tag and hence can be stored in a boolean array.
nodes in a complete binary tree that has leaves for The pointer to the largest (smallest) element in the
left (right) subtree then becomes an index into the
every element in a universe M (cf trie):
boolean array.
Definition 3 An internal node is a splitting node
There are a number of properties that the STT
if there is at least one tagged leaf in each of its has. We state these properties in the form of “half”
subtrees.
Lemmata as the other halves are symmetrical.
2
typedef struct node {
boolean tag;
/* Tag
int left;
/* Largest element in left subtree
int right; /* Smallest element in right subtree
} STT node;
typedef struct STT {
STT node nodes[M]; /* Array of internal nodes
boolean leaves[M];
/* Array of leaf tags
} STT;
exists in the right subtree of the lowest common ancestor ν of z and e and; if e < z then no element
v ∈ N exists in the left subtree of the lowest common ancestor ν of e and z.
*/
*/
*/
With these properties we can describe how to
perform updates and answer queries in the STT.
By Lemma 2, queries about neighbours can be answered. By a counting argument we see that when
inserting an element e 6∈ N , exactly one internal
node becomes a splitting node and one leaf gets
tagged. The leaf is the leaf corresponding to e. To
find the new splitting node we get the element z
pointed at by the lowest left splitting node νl of e.
If z < e then, by Lemma 3, the lowest common ancestor ν of e and z should be e’s new lowest right
splitting node with one pointer to z and one to e,
and νl should be updated to point at e instead of
z. If e < z then, by Lemma 3, the lowest common
ancestor ν of e and z should be e’s new lowest left
splitting node with one pointer to e and one to e’s
right neighbour found in e’s lowest right splitting
node νr . νr should be updated to point at e instead
of e’s right neighbour.
By a similar reasoning we see that deleting is
done by removing the tags in the leaf and in the
lower of the left and right lowest splitting nodes,
and by updating the pointers in the other node.
Since only a constant number of nodes need to
be updated, the time to update (and also to search)
the STT is asymptotically bounded by the time to
traverse the tree. The tree is of height lg M + 1,
which implies that the time to update and search
the tree is O(lg M ) in the ERAM and pointer machine models.
*/
*/
Algorithm 1: Efficient Representation of the STT
Lemma 1 Let e ∈ N and let νl be its lowest left
splitting node. Then, νl is the only of e’s left splitting nodes that points at e.
Lemma 1 implies that in the STT there is a constant number of nodes that have pointers to an element e ∈ N . Moreover, the nodes containing the
pointers in question are the lowest left and lowest
right splitting nodes of e. We proceed by showing
how to find the neighbours of e:
Lemma 2 Let νl and νr be e’s lowest left and lowest right splitting nodes respectively, and let x ∈ N
be the left neighbour of e, hence x < e. Then, if
e ∈ N the pointer at νr refer to x; and if e 6∈ N
then either a pointer in νl or in νr points to x.
Finally, we show that during insertion, of an element e, it is sufficient to find either the lowest left or
the lowest right splitting node of e to decide where
the new splitting node shall be (see Figure 1):
Yggdrasil implementation of RAMBO
νl
νr
If we can avoid traversing the tree from a leaf towards the root when we want to find the lowest
ν
splitting nodes and instead find them in constant
time we will have a worst case constant time solution for the vEB problem.
z
e
e
z
The RAMBO model of computation is a RAM
model in which, in one part of the memory, regisFigure 1: The two scenarios before insertion of e.
ters can share bits; i.e. bytes overlap (cf. RAMBO
introduced by Fredman et al. [7] and further deLemma 3 Let νl be the lowest left splitting node of scribed by Brodnik [4]). One particular implemene 6∈ N and let z ∈ N be the largest element in the tation of the RAMBO called Yggdrasil can help us
left subtree of νl . If z < e then no element v ∈ N to solve our problem.
νl
νr
ν
3
then proceed as above. This gives us the lowest left
(right) splitting node in constant time.
The lowest common ancestor νk of the elements
e and f is the last common node on the paths from
the root to e and f respectively. The path from
the root to e can be deduced from the binary representation of e. Hence, the node νk corresponds
to the least significant common bit j in the binary
(1) representations of e and f . Let i = e div 2 then
eq. (1) gives
In the Yggdrasil memory implementation the
registers overlap in the form of a complete binary
tree (Figure 2) with bits Bk enumerated in standard heap order. The most significant bit m − 1 of
such a register is found in the root of the tree and
in general:
reg[i].bit[j] = Bk where
k = (i div 2j ) + 2m−j−1
k = ((e div 2) div 2j ) + 2m−j−1
= (e div 2j+1 ) + 2m−j−1
Bit j in register
:
3
B1
reg[5]
B2
2
1
The least significant common bit j is next to, toward the top, the most significant non common bit
of e and f , i.e.
B3
B4
B5
B6
(2)
j = MSB(e XOR f ) + 1
B7
(3)
Algorithm 2: Representation of the STT with
Yggdrasil memory
Now we can find both the lowest splitting nodes
of an element and the lowest common ancestor of
two elements in constant time. Following pointers
and updating those and the tags can also be done
in constant time. Hence, given Yggdrasil memory,
updates and queries can be performed in constant
time. The Yggdrasil memory has ben developed
by Priqueue AB, as a SDRAM memory module according to the PC100 standard [5].
Note that the STT still contains redundant information, the leaves can be removed since the leaf
is tagged if and only if it is pointed at by an internal node. Moreover, the information in the internal nodes can be stored using pointers of variable
length. This reduces the space requirements from
2M w+M +O(w) bits to 4M +O(w) bits of conventional memory for the vEB problem. However, reducing the size increases the constant for the time.
Or we can use perfect hashing (cf. [6]) to store the
internal nodes that are tagged. This gives a space
requirement of O(N ) word instead of O(M/ lg M )
words of ordinary memory.
The above discussion leads to the following Theorem:
To find the lowest splitting node νk of the element e we read the register reg[e div 2], find
the least significant set bit j and compute k using eq. (1). To find the lowest right (left) splitting
node we use e (e) to mask the register value and
Theorem 1 Using the Split-Tagged-Tree together
with the Yggdrasil memory we can solve the vEB
problem in O(1) time per operation using only
4M + O(w) bits of ordinary memory and M bits
of Yggdrasil memory.
0
B8
Register:
0
B9
B10
B11
B12
B13
B14
B15
1
2
3
4
5
6
7
Figure 2: Overlapped memory Yggdrasil.
We now store the tags of the STT in the Yggdrasil memory, reg. The tree of internal pointers
is stored in an array, denoted nodes, while the tags
of the leaves are stored in the boolean array, leaves
(see Algorithm 2).
typedef struct node {
int left;
/* Largest element in left subtree */
int right; /* Smallest element in right subtree */
} STT node;
typedef struct STT {
STT node nodes[M]; /* Array of internal nodes */
boolean leaves[M];
/* Array of leaf tags */
Yggdrasil reg[M/2];
/* Register with byte
overlapping */
} STT;
4
[9] P. B. Miltersen. Lower bounds for Union-SplitFind related problems on random access machines. In Proceedings of the Twenty-Sixth
Annual ACM Symposium on the Theory of
Computing, pages 625–634, Montréal, Québec,
Canada, 23–25 May 1994.
If we want to solve the Priority Queue problem
or the vEB problem with support for either predecessor or successor but not both we can simply
remove the pointer to one or the other element in
the nodes. This saves half of the ordinary memory
and it reduces the constant for updates times.
[10] A. Schönhage. Storage modifications machines.
SIAM Journal on Computing,
9(3):490–508, August 1980.
References
[1] A. Andersson and M. Thorup. Tight(er) worst- [11] P. van Emde Boas. Machine models and simcase bounds on dynamic searching and priority
ulations. volume A: Algorithms and Complexqueues. Manuscript to apper in STOC2000.
ity, pages 3–66. Elsevier/MIT Press, Amsterdam, 1990.
[2] P. Beame and F. E. Fich. Optimal bounds
for the predecessor problem. In Proceedings [12] P. van Emde Boas, R. Kaas, and E. Zijlstra.
of the Thirty-First Annual ACM Symposium
Design and implementation of an efficient prion Theory of Computing, pages 295–304, New
ority queue. Mathematical Systems Theory,
York, May 1–4 1999. ACM Press.
10:99–127, 1977.
[3] A. Brodnik. Computation of the least signif- [13] A. Yao. Some complexity questions related to
distributive computing. In Proceedings of the
icant set bit. In Proceedings Electrotechnical
11th Annual ACM Symposium on Theory of
and Computer Science Conference, volume B,
Computing, pages 209–213, 1979.
pages 7–10, Portorož, Slovenia, 1993.
[4] A. Brodnik. Searching in Constant Time [14] A. Yao. Should tables be sorted? Journal of
the ACM, 28(3):614–628, July 1981.
and Minimum Space ( Minimæ Res Magni
Momenti Sunt). PhD thesis, University of
Waterloo, Waterloo, Ontario, Canada, 1995.
(Also published as technical report CS-95-41.).
[5] A. Brodnik, J. Karlsson, R. Leben, M. Miletić,
M. Špegel, and A. Trost. Design of high performance memory module on PC100. In Proceedings Electrotechnical and Computer Science Conference, Slovenia, 1999.
[6] M. L. Fredman, J. Komlós, and E. Szemerédi.
Storing a sparse table with O(1) worst case access time. Journal of the Association for Computing Machinery, 31(3):538–544, July 1984.
[7] M. L. Fredman and M. E. Saks. The cell
probe complexity of dynamic data structures.
In 21st ACM Symposium on Theory of Computing, pages 345–354, 1989.
[8] K. Mehlhorn, S. Näher, and H. Alt. A lower
bound on the complexity of the union-splitfind problem. SIAM Journal on Computing,
17(6):1093–1102, 1988.
5