Chapter 7
Data Structure Transformations
Basheer Qolomany
Outline
• 7.1 Making Structures Dynamic
• 7.2 Making Structures Persistent
7.1 Making Structures Dynamic
• Two types of data structures for solving searching
problems:
– A static structure is built once and then searched
many times; insertions and deletions of elements are
not allowed.
– A dynamic structure is initially empty, and the three
operations available on it are inserting a new element,
deleting a current element, and performing a search.
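As a point of reference, here is a minimal Python sketch of the two interfaces; the class and method names are illustrative, not taken from the chapter.

```python
from abc import ABC, abstractmethod

class StaticStructure(ABC):
    """Built once over the complete element set; afterwards only searched."""
    def __init__(self, elements):
        self._build(elements)          # preprocessing happens exactly once

    @abstractmethod
    def _build(self, elements): ...

    @abstractmethod
    def query(self, q): ...


class DynamicStructure(ABC):
    """Starts empty; elements may be inserted and deleted between searches."""
    @abstractmethod
    def insert(self, x): ...

    @abstractmethod
    def delete(self, x): ...

    @abstractmethod
    def query(self, q): ...
```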
Static & Dynamic Structures
• To describe the performance of the static structure A, we
give three functions of N:
– P_A(N) = the preprocessing time required to build A,
– Q_A(N) = the query time required to perform a search in A, and
– S_A(N) = the storage required to represent A.
• We analyze the performance of the dynamic structure B
by giving the functions
– I_B(N) = the insertion time for B,
– D_B(N) = the deletion time for B,
– Q_B(N) = the query time required to perform a search in B, and
– S_B(N) = the storage required to represent B.
• To compare the performance of the static structure A with
that of the dynamic structure B, we define the
"insertion" time for the static structure A as
I_A(N) = P_A(N) / N,
which is the cost of building an N-element structure
amortized over the N elements it represents.
• We define the cost of "preprocessing" the dynamic
structure B to be
P_B(N) = Σ_{1 ≤ i ≤ N} I_B(i).
The Transformations
• The two well-studied problems here are how to make a
static structure dynamic and how to allow queries in old
states of a dynamic data structure.
• Static structures, such as interval trees, are built once
and then allow queries, but no changes to the underlying
data.
• To make them dynamic, we want to allow changes in the
underlying data.
• There are efficient construction methods that take the static
data structure as a black box and use it to build the new
dynamic structure. The most important such class is the
decomposable searching problems.
Decomposable searching problems
• The notion of decomposable search problems, and the
idea of a static-to-dynamic transformation, goes back to
Bentley (1979).
• The underlying idea is always that the current set is
partitioned into a number of blocks
X = X_1 ∪ · · · ∪ X_m.
Each block is stored by one static structure; queries are
answered by querying each of these static structures, and
updates are performed by rebuilding one or several
blocks.
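A minimal Python sketch of this block scheme for a membership problem; the use of a sorted tuple with binary search as the black-box static structure, and OR as the combining operator, are illustrative choices.

```python
import bisect

class StaticSet:
    """Black-box static structure: built once, then only queried.
    Here: a sorted tuple searched by binary search (membership queries)."""
    def __init__(self, elements):
        self.data = tuple(sorted(elements))            # preprocessing
    def query(self, q):
        i = bisect.bisect_left(self.data, q)
        return i < len(self.data) and self.data[i] == q

class BlockedSet:
    """X = X_1 ∪ ... ∪ X_m, one static structure per block.  A query is
    answered by querying every block and combining the answers with the
    constant-time operator of the decomposable problem (here: or)."""
    def __init__(self, blocks):
        self.blocks = [StaticSet(b) for b in blocks]
    def query(self, q):
        return any(blk.query(q) for blk in self.blocks)

# Example: three blocks, the query is asked in each of them.
s = BlockedSet([[1, 5, 9], [2, 4], [7]])
print(s.query(4), s.query(6))    # True False
```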
Decomposable searching problems
• The original method in Bentley (1979) uses only blocks
whose size is a power of two, and only one block of each
size. Thus there are at most log n blocks.
• This gives a bad worst-case complexity because we
might have to rebuild everything into one structure; but
the structure of size 2^i is rebuilt only when the i-th bit of
n changes, which is every 2^(i−1)-th step.
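A sketch of the exponential-blocks insertion in Python, reusing the StaticSet black box from the previous sketch; keeping the blocks in a list indexed by the exponent i is an illustrative implementation choice.

```python
class ExponentialBlocks:
    """Bentley (1979): at most one block of each size 2^i.  An insertion
    works like incrementing a binary counter: a new one-element block is
    merged with the existing blocks of sizes 1, 2, 4, ... until a free
    exponent is found, and that exponent receives one rebuilt static block."""
    def __init__(self):
        self.blocks = []          # blocks[i] is None or a StaticSet of size 2^i

    def insert(self, x):
        carry = [x]               # elements to be rebuilt into a single block
        i = 0
        while True:
            if i == len(self.blocks):
                self.blocks.append(None)
            if self.blocks[i] is None:
                self.blocks[i] = StaticSet(carry)      # cost preproc(2^i)
                return
            carry.extend(self.blocks[i].data)          # absorb existing block
            self.blocks[i] = None
            i += 1

    def query(self, q):           # at most log n blocks are queried
        return any(b.query(q) for b in self.blocks if b is not None)
```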
Insertion
• If preproc(k) is the time to build a static structure
of size k, then the total time of the first n inserts
is at most
Σ_{0 ≤ i ≤ ⌈log n⌉} (n / 2^i) · preproc(2^i).
• Thus the amortized insertion time in a set of n
elements is
Σ_{0 ≤ i ≤ ⌈log n⌉} preproc(2^i) / 2^i.
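Plugging in the assumption of the theorem below, preproc(k) = O(k (log k)^c), the amortized bound can be evaluated as follows (a short reconstruction in LaTeX of the step the slides leave implicit):

```latex
\[
  \underbrace{\sum_{i=0}^{\lceil \log n \rceil}
      \frac{\mathrm{preproc}(2^i)}{2^i}}_{\text{amortized insertion time}}
  \;=\; \sum_{i=0}^{\lceil \log n \rceil} O\!\left(i^{\,c}\right)
  \;=\; O\!\left((\log n)^{c+1}\right).
\]
% A query asks each of the at most \lceil \log n \rceil blocks, each in
% O(\log 2^i) = O(\log n) time, giving the worst-case O((\log n)^2) query bound.
```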
Theorem
• Given a static structure for a decomposable searching
problem that can be built in time O(n (log n)^c) and that
answers queries in time O(log n) for an n-element set,
the exponential-blocks transformation gives a structure
for the same problem that supports insertion in
amortized O((log n)^(c+1)) time and queries in worst-case
O((log n)^2) time.
• This method is not useful for deletion; if we delete an
element from the largest block, we have to rebuild
everything, so we can easily construct a sequence of
alternating insert and delete operations in which the
entire structure has to be rebuilt every time.
• A method that also supports deletion partitions the set
into Θ(√n) blocks of size O(√n).
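A Python sketch of the √n-blocks idea, again with the StaticSet black box from the first sketch; the simple repartitioning trigger below (rebuild everything once the partition drifts a factor of 2 away from √n) is an illustrative choice and does not reproduce the worst-case bounds of the theorem that follows.

```python
import math

class SqrtBlocks:
    """√n-blocks transformation: the set is kept in Θ(√n) blocks of size
    O(√n).  Insertion and deletion rebuild a single block; a query asks
    every block."""
    def __init__(self):
        self.blocks = []                      # list of StaticSet
        self.n = 0

    def _repartition(self):
        elems = [x for b in self.blocks for x in b.data]
        size = max(1, math.isqrt(len(elems)))
        self.blocks = [StaticSet(elems[i:i + size])
                       for i in range(0, len(elems), size)]

    def _maybe_repartition(self):
        target = max(1, math.isqrt(self.n))
        if (len(self.blocks) > 2 * target
                or any(len(b.data) > 2 * target for b in self.blocks)):
            self._repartition()

    def insert(self, x):
        self.n += 1
        if not self.blocks:
            self.blocks.append(StaticSet([x]))
        else:                                 # rebuild the smallest block with x
            i = min(range(len(self.blocks)),
                    key=lambda j: len(self.blocks[j].data))
            self.blocks[i] = StaticSet(list(self.blocks[i].data) + [x])
        self._maybe_repartition()

    def delete(self, x):
        for i, b in enumerate(self.blocks):
            if b.query(x):                    # rebuild this block without x
                rest = [y for y in b.data if y != x]
                self.blocks[i:i + 1] = [StaticSet(rest)] if rest else []
                self.n -= 1
                return

    def query(self, q):
        return any(b.query(q) for b in self.blocks)
```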
Theorem
• Given a static structure for a decomposable searching
problem that can be built in time O(n (log n)^c) and that
answers queries in time O(log n) for an n-element set,
the √n-blocks transformation gives a structure for the
same problem that supports insertion and deletion in
O(√n (log n)^c) time and queries in O(√n log n) time, all
times worst case.
Weak deletion
• A weak deletion deletes the element, so that subsequent
queries are answered correctly, but the time bound for
subsequent queries and weak deletions does not
decrease.
• If we combine weak deletion with the exponential-blocks
idea, we get the following structure: the current
set is partitioned into blocks, where each block has a
nominal size and an actual size. The nominal size is a
power of 2, with each power occurring at most once. The
actual size of a block with nominal size 2^i is between
2^(i−1) + 1 and 2^i.
• To delete an element, we find its block and perform a
weak deletion, decreasing the actual size. If by this the
actual size of the block becomes 2^(i−1), we check whether
there is a block of nominal size 2^(i−1); if there is none, we
rebuild the block of actual size 2^(i−1) as a block of nominal
size 2^(i−1). Else, we rebuild the block of actual size 2^(i−1)
together with the elements of the block of nominal size
2^(i−1) as a block of nominal size 2^i.
• To insert an element, we create a block of size 1 and
perform the binary addition of the blocks, based on their
nominal size.
• To query, we perform the query for each block.
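A Python sketch of just the deletion step described above, with the blocks kept in a dictionary keyed by the exponent i of their nominal size 2^i; `find_exponent`, `weak_delete`, and `rebuild` are hypothetical stand-ins for the underlying static structure's operations, and the smallest block (i = 0) is left out to keep the case analysis readable.

```python
def delete(blocks, x, find_exponent, weak_delete, rebuild):
    """blocks: dict mapping exponent i -> block of nominal size 2**i.
    Every block knows its actual size and can list its remaining elements;
    find_exponent(blocks, x) returns the exponent of the block containing x."""
    i = find_exponent(blocks, x)
    blk = blocks[i]
    weak_delete(blk, x)                      # queries stay correct, bound does not shrink
    blk.actual_size -= 1
    if blk.actual_size == 2 ** (i - 1):      # fell to half of the nominal size 2**i
        if (i - 1) not in blocks:
            # no block of nominal size 2**(i-1) yet: shrink this one
            del blocks[i]
            blocks[i - 1] = rebuild(list(blk.elements()))
        else:
            # merge with the existing block of nominal size 2**(i-1)
            other = blocks.pop(i - 1)
            blocks[i] = rebuild(list(blk.elements()) + list(other.elements()))
```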
Theorem
• Given a static structure for a decomposable searching
problem that can be built in time preproc(n) and that
supports weak deletion in time weakdel(n), and answers
queries in time query(n) for an n-element set, the
exponential-blocks transformation with weak deletion
gives a structure for the same problem that supports
insertion in amortized O((log n) preproc(n)/n) time,
deletions in amortized O(weakdel(n) + preproc(n)/n)
time, and queries in worst-case O(log n query(n)) time.
7.2 Making Structures Persistent
• A dynamic data structure changes over time, and
sometimes it is useful if we can access old versions of it.
• Obvious applications are revision control, the
implementation of the “undo” command in editors,
multiple file versions, and error recovery.
• For example, given a database containing a company's
personnel administration, it might be important to be able
to ask questions like: how many people had a salary >= x
one year ago? To answer this kind of so-called in-the-past
query, we require that the data structure can remember
relevant information concerning its own history.
Techniques to access earlier versions
• “Partial persistence”: the most natural form of persistence,
which allows queries to previous versions; these can be
identified by timestamps or version numbers.
• “Full persistence”: past versions can also be
changed, giving rise to a version tree without any special
current version.
• “Confluent persistence”: studied first for double-ended
queues; in a confluently persistent structure, one may also
join different versions. These stronger variants of
persistence seem to be only of theoretical interest.
• “Backtracking”: setting the current version back to an old
version and discarding all changes since then. The use of a
stack for old versions predates all persistence considerations.
Fat nodes
• The “fat nodes” method is a transformation that replaces
each node of the pointer-based structure by a search
tree that finds the correct version of the node, using the
query time as the search key. Each time the underlying
structure is modified, any “fat” node whose content is
modified just receives a new version entry in its search
tree, and newly created nodes contain new search trees,
initially with one version only.
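A small Python sketch of a fat node; instead of a balanced search tree over the versions it uses a sorted list with bisect, which keeps the example short but gives the same "latest version <= query time" lookup (all names are illustrative).

```python
import bisect

class FatNode:
    """One node of the persistent structure: every field keeps its whole
    version history as a list of (version, value) pairs ordered by version.
    Reading a field at time t returns the value written by the latest
    update whose version is <= t."""
    def __init__(self, version, **fields):
        self.history = {name: [(version, value)] for name, value in fields.items()}

    def set(self, version, name, value):
        # a modification only appends a new version entry to the fat node
        self.history.setdefault(name, []).append((version, value))

    def get(self, version, name):
        entries = self.history[name]
        i = bisect.bisect_right(entries, version, key=lambda e: e[0]) - 1
        if i < 0:
            raise KeyError(f"field {name!r} did not exist at version {version}")
        return entries[i][1]

# Example: a list node whose next pointer is changed at version 5.
node = FatNode(1, key=42, next=None)
node.set(5, 'next', 'some_other_node')
print(node.get(3, 'next'), node.get(7, 'next'))    # None some_other_node
```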
Theorem
• Any dynamic structure in the pointer-machine model that
supports queries in time query(n) and updates in time
update(n) on a set with n elements can be made
persistent, allowing queries to past versions, with a query
time O(log n query(n)) for past versions, O(query(n)) for
the current version, and update time O(update(n)), using
the “fat nodes” method combined with a search tree that
allows constant-time queries and updates at the maximum
end.
Theorem
• Any dynamic structure in the pointer-machine model
that supports queries in time query(n) and updates in
time update(n) on a set with n elements can be made
to support backtracking, using stacks for “fat nodes,”
with a query time O(query(n)), update time
O(update(n)), and backtrack time amortized O(1),
with a sequence of a updates, b queries, and c
backtracks, starting on an initially empty set, taking
O(a update(a) + b query(a) + c).
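For backtracking, the per-node version tree degenerates into a stack, because only the newest versions are ever discarded; a minimal sketch of such a node follows (in a complete structure one would additionally log which nodes each update touched, so that a backtrack only visits those nodes).

```python
class BacktrackNode:
    """Fat node specialised for backtracking: each field keeps a stack of
    (version, value) pairs; the current value is the top of the stack."""
    def __init__(self, version, **fields):
        self.history = {name: [(version, value)] for name, value in fields.items()}

    def set(self, version, name, value):
        self.history.setdefault(name, []).append((version, value))

    def get(self, name):
        return self.history[name][-1][1]       # current version only: O(1)

    def backtrack(self, version):
        """Discard every value written after `version`."""
        for stack in self.history.values():
            while stack and stack[-1][0] > version:
                stack.pop()

# Example: set a field at versions 1 and 4, then backtrack to version 2.
node = BacktrackNode(1, key='a')
node.set(4, 'key', 'b')
node.backtrack(2)
print(node.get('key'))    # a
```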
Thank You