Application of Data Structures
Christopher Moh 2005
Overview

- Priority Queue structures
  - Heaps
  - Application: Dijkstra's algorithm
- Cumulative Sum Data Structures on Intervals
- Augmenting data structures with extra info to solve questions
Priority Queue (PQ) Structures

- Stores elements in a list, ordered by comparing a key field
  - Often has other satellite data
  - For example, when sorting pixels by their R value, we consider R as the key field and GB as satellite data
- Priority queues allow us to sort elements by their key field.
Common PQ operations

- Create()
  - Creates an empty priority queue
- Find_Min()
  - Returns the smallest element (by key field)
- Insert(x)
  - Insert element x (with predefined key field)
- Delete(x)
  - Delete position x from the queue
- Change(x, k)
  - Change the key field of position x to k
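For orientation, a small illustration (not from the slides) of how these operations map onto C++'s std::priority_queue, which covers Create, Insert, Find_Min, and delete-min, but not Delete or Change at arbitrary positions:

// Illustration only: std::priority_queue used as a min-priority queue.
#include <queue>
#include <vector>
#include <functional>

int main() {
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq;  // Create()
    pq.push(5); pq.push(2); pq.push(9);   // Insert(x)
    int smallest = pq.top();              // Find_Min() -> 2
    pq.pop();                             // delete the minimum element
    return smallest == 2 ? 0 : 1;
}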
Optional PQ operations

- Union(a, b)
  - Combines two PQs a and b
- Search(k)
  - Returns the position of the element in the heap with key value k
Considerations when implementing a PQ in competition

- How complicated is it?
  - Is the code likely to be buggy?
- How fast does it need to be?
  - Does a constant factor also come into the equation?
- Do I need to store extra data to do a Search?

During the course of this presentation, we shall assume that there exists extra data which allows us to do a Search in O(1) time. The handling of this data structure will be assumed and not covered.
Linear Array

- Unsorted Array
  - Create, Insert, Change in O(1) time
  - Find_min, Delete in O(n) time
- Sorted Array
  - Create, Find_min in O(1) time
  - Insert, Delete, Change in O(n + log n) = O(n) time
Binary Heaps

- Will be the most common structure implemented in a competition setting
  - Efficient for most applications
  - Easy to implement
- A heap is a structure where the value of a node is less than the values of all of its children
- A binary heap is a heap where the maximum number of children for each node is 2.
Array implementation

- Consider a heap of size nheap in an array BHeap[1..nheap] (define BHeap[nheap+1 .. (nheap*2)+1] to be INFINITY for practical reasons)
  - The children of BHeap[x] are BHeap[x*2] and BHeap[x*2+1]
  - The parent of BHeap[x] is BHeap[x/2]
  - This gives a near uniform binary heap where we can ensure that the number of levels in the heap is O(log n)
  - Some properties wrt key values: BHeap[x] >= BHeap[x/2], BHeap[x] <= BHeap[x*2], BHeap[x] <= BHeap[x*2+1], BHeap[x*2] ?? BHeap[x*2+1]
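A minimal C++ sketch of this 1-based layout (the names BHeap and nheap follow the slides; MAXN is an assumed size limit, and later sketches use nheap bounds checks instead of the INFINITY padding):

// Sketch of the 1-based array layout described above.
const int MAXN = 1 << 20;
int BHeap[MAXN + 1];                        // BHeap[1..nheap] holds the key fields
int nheap = 0;                              // current heap size

int parent(int x)      { return x / 2; }    // parent of BHeap[x]
int left_child(int x)  { return x * 2; }    // children of BHeap[x]
int right_child(int x) { return x * 2 + 1; }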
PQ Operations on a BHeap

- We define BTree(x) to be the binary tree rooted at BHeap[x]
- We define Heapify(x) to be an operation that does the following:
  - Assume: BTree(x*2) and BTree(x*2+1) are binary heaps, but BTree(x) is not necessarily a binary heap
  - Produce: BTree(x) is a binary heap
  - Details of Heapify in later slides – but for now, we assume Heapify is O(log n)
- For the rest of the presentation, we assume the variable n refers to nheap
Operations on a BHeap

- Create is trivial – O(1) time
- Find_min:
  1. Return BHeap[1]
  - O(1) time
- Insert (element with key value x)
  1. nheap++
  2. BHeap[nheap] = x
  3. T = nheap
  4. While (T != 1 && BHeap[T] < BHeap[T/2])
     1. Swap(BHeap[T], BHeap[T/2])
     2. T = T/2
  - O(log n) time as the number of levels is O(log n)
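As a rough C++ sketch of Find_min and Insert, continuing the array layout sketched earlier (this mirrors the pseudocode above; it is one possible rendering, not the only one):

#include <algorithm>   // std::swap

int find_min() {
    return BHeap[1];                             // the smallest key sits at the root
}

void insert(int x) {
    BHeap[++nheap] = x;                          // place the new key at the bottom
    int T = nheap;
    while (T != 1 && BHeap[T] < BHeap[T / 2]) {  // bubble up while smaller than parent
        std::swap(BHeap[T], BHeap[T / 2]);
        T /= 2;
    }
}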
Operations on a BHeap

ChangeDown (position x, new key value k)
- Assume: k < existing BHeap[x]
1. BHeap[x] = k
2. T = x
3. While (T != 1 && BHeap[T] < BHeap[T/2])
   1. Swap(BHeap[T], BHeap[T/2])
   2. T = T/2
- Complexity: O(log n)
- This procedure is known as "bubbling up" the heap
Operations on a BHeap

ChangeUp (position x, new key value k)
- Assume: k > existing BHeap[x]
1. BHeap[x] = k
2. Heapify(x)
- O(log n) as the complexity of Heapify is O(log n)
Operations on a BHeap

Delete (position x on the heap)
1. BHeap[x] = BHeap[nheap]
2. nheap--
3. Heapify(x)
4. T = x
5. While (T != 1 && BHeap[T] < BHeap[T/2])
   1. Swap(BHeap[T], BHeap[T/2])
   2. T = T/2
- Complexity is O(log n)
- Why must I do both Heapify and "bubble up"?
Operations on a BHeap

Heapify (position x on the heap)
1. T = min(BHeap[x], BHeap[x*2], BHeap[x*2+1])
2. If (T == BHeap[x]) return
3. K = position where BHeap[K] = T
4. Swap(BHeap[x], BHeap[K])
5. Heapify(K)
- O(log n) as the maximum number of levels in the heap is O(log n) and Heapify only goes through each level at most once
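A possible C++ rendering of Heapify and Delete over the same array, written iteratively (same behaviour as the recursive pseudocode; nheap bounds checks stand in for the INFINITY padding):

#include <algorithm>   // std::swap

void heapify(int x) {
    while (true) {
        int l = x * 2, r = x * 2 + 1, K = x;
        if (l <= nheap && BHeap[l] < BHeap[K]) K = l;   // smallest of x and its
        if (r <= nheap && BHeap[r] < BHeap[K]) K = r;   // two children
        if (K == x) return;                             // heap property holds here
        std::swap(BHeap[x], BHeap[K]);
        x = K;                                          // continue one level down
    }
}

void delete_pos(int x) {
    BHeap[x] = BHeap[nheap--];                    // move the last element into the hole
    heapify(x);                                   // the moved key may be too large: push it down
    while (x != 1 && BHeap[x] < BHeap[x / 2]) {   // ...or too small: bubble it up
        std::swap(BHeap[x], BHeap[x / 2]);
        x /= 2;
    }
}

This also answers the question on the Delete slide: the element moved into position x can violate the heap property in either direction, so both passes are needed, although only one of them will actually move it.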
BHeap Operations: Summary

- Create, Find_min in O(1) time
- Change (includes both ChangeUp and ChangeDown), Insert, and Delete are O(log n) time
- Union operations are how long?
  - By repeated Insertion: O(n log n) union
  - By rebuilding with Heapify: O(n) union
Corollary: Heapsort

- We can convert an unsorted array to a heap using Heapify (why does this work?):
  1. For (i = n/2; i >= 1; i--)
     1. Heapify(i)
- We can then return a sorted list (list initially empty):
  1. For (i = 1; i <= n; i++)
     1. Append the value of Find_min to the list
     2. Delete(1)
- Complexity is O(n log n)
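Putting the pieces together, a heapsort sketch along the lines of the slide, reusing the heapify and delete_pos sketches above (the input/output types are an assumption for illustration):

#include <vector>

std::vector<int> heap_sort(const std::vector<int>& a) {
    nheap = (int)a.size();
    for (int i = 1; i <= nheap; ++i) BHeap[i] = a[i - 1];
    for (int i = nheap / 2; i >= 1; --i) heapify(i);   // build the heap bottom-up

    std::vector<int> sorted;
    int n = nheap;
    for (int i = 1; i <= n; ++i) {
        sorted.push_back(BHeap[1]);   // Find_min
        delete_pos(1);                // Delete it, O(log n)
    }
    return sorted;                    // ascending order, O(n log n) overall
}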
Binomial Trees

- Define Binomial Tree B(k) as follows:
  - B(0) is a single node
  - B(n), n != 0, is formed by merging two B(n-1) trees in the following way:
    - The root of the B(n) tree is the root of one of the B(n-1) trees, and the (new) leftmost child of this root is the root of the other B(n-1) tree.
- Within the tree, the heap property holds, i.e. the key field of any node is less than the key fields of all its children (as with the min-heaps above).
Properties of Binomial Trees

- The number of nodes in B(k) is exactly 2^k.
- The height of B(k) is exactly (k + 1) levels
- For any tree B(k)
  - The root of B(k) has exactly k children
  - If we take the children of B(k) from left to right, they form the roots of a B(k-1), B(k-2), …, B(0) tree, in that order
Binomial Heaps

- Binomial Heaps are a forest of binomial trees with the following properties:
  - All the binomial trees are of different sizes
  - The binomial trees are ordered (from left to right) by increasing size
- If we consider the fact that the size of B(k) is 2^k, the binomial tree B(k) exists in a binomial heap of n nodes iff the bit representing 2^k is "1" in the binary representation of n
  - For example: 13 (decimal) = 1101 (binary), so the binomial heap with 13 nodes consists of the binomial trees B(0), B(2), and B(3).
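A tiny C++ illustration of this correspondence (a hypothetical example, simply reading the bits of n):

// Which binomial trees make up a heap of n nodes? Read n's binary representation.
#include <cstdio>

int main() {
    int n = 13;                          // example from the slide: 13 = 1101 (binary)
    for (int k = 0; (1 << k) <= n; ++k)
        if (n & (1 << k))                // bit k set  =>  B(k) is present
            std::printf("B(%d) with %d nodes\n", k, 1 << k);
    return 0;                            // prints B(0), B(2), B(3)
}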
Binomial Heap Implementation

- Each node will store the following data:
  - Key field
  - Pointers (if non-existent, points to NIL) to:
    - Parent
    - Next Sibling (ordered left to right; a sibling must have the same parent); for roots of binomial trees, next sibling points to the root of the next binomial tree
    - Leftmost child
  - Number of children, in a field degree
  - Any other data that might be useful for the program
- The binomial heap is represented by a head pointer that points to the root of the smallest binomial tree (which is the leftmost binomial tree)
Operations on Binomial Trees

Link (h1, h2)
- Links two binomial trees with roots h1 and h2 of the same order k to form a new binomial tree of order (k+1)
- We assume h1->key < h2->key, which implies that h1 is the root of the new tree
1. T = h1->leftchild
2. h1->leftchild = h2
3. h2->parent = h1
4. h2->next_sibling = T
- O(1) time
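A C++ sketch of the node record and Link (the field names are an assumption chosen to mirror the slides, and degree is maintained here as well):

#include <utility>   // std::swap

struct Node {
    int key;
    int degree = 0;             // number of children
    Node* parent = nullptr;
    Node* leftchild = nullptr;  // leftmost child
    Node* sibling = nullptr;    // next sibling; for roots, the next binomial tree
};

// Link two binomial trees of the same order k into one tree of order k+1.
Node* link(Node* h1, Node* h2) {
    if (h2->key < h1->key) std::swap(h1, h2);  // ensure h1 holds the smaller key
    Node* T = h1->leftchild;
    h1->leftchild = h2;         // h2 becomes the new leftmost child of h1
    h2->parent = h1;
    h2->sibling = T;            // old leftmost child is now h2's next sibling
    h1->degree++;
    return h1;                  // root of the new order-(k+1) tree
}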
Operations on binomial heaps

Create – Create a new binomial heap with one node (key field set)
- Set Parent, Leftmost child, Next sibling to NIL
- O(1) time

Find_min
1. X = head, min = INFINITY
2. While (X != NIL)
   1. If (X->key < min) min = X->key
   2. X = X->next_sibling
3. Return min
- O(log n) time as there are at most log n binomial trees (log n bits)
More Operations

Merge (h1, h2, L)
- Given binomial heaps with head pointers h1 and h2, create a list L of all the binomial trees of h1 U h2 arranged in ascending order of size
- For any order k, there may be zero, one, or two binomial trees of order k in this list.
More Operations

Merge (h1, h2, L)
- Assume that NIL is a node of infinitely large order, so the comparison below always picks the non-NIL side
1. L = empty
2. While (h1 != NIL || h2 != NIL)
   1. If (h1->degree < h2->degree)
      1. Append the (binomial) tree with root h1 to L
      2. h1 = h1->next_sibling
   2. Else
      1. Apply the above steps to h2 instead
More Operations

Union (h1, h2)
- The fundamental operation involving binomial heaps
- Takes two binomial heaps with head pointers h1 and h2 and creates a new binomial heap of the union of h1 and h2
More Operations

Union (h1, h2)
1. Start with an empty binomial heap
2. Merge (h1, h2, L)
3. Go by increasing k in the list L until L is empty
   1. If there is exactly one or exactly three (how can this happen?) binomial trees of order k in L, append one binomial tree of order k to the binomial heap and remove that tree from L
   2. If there are two trees of order k, remove both trees, use Link to form a tree of order (k+1), and prepend this tree to L
- Union is O(log n)
More Operations

Inserting a new node with key field set
- Create a new binomial heap with that one node
- Union (existing heap with head h, new heap)
- O(log n) time

ChangeDown (node at position x, new value)
- Decreasing the key value of a node
- Same idea as a binary heap: "bubble" up the binomial tree containing this node (exchange only key fields and satellite data! What's the complexity if you physically change the node?)
- O(log n) time
More Operations

Delete (node at position x)
- Deleting position x from the heap
1. ChangeDown(x, -INFINITY)
   - Now x is at the root of its binomial tree
   - Suppose that this binomial tree is of order k
   - Recall that the children of the root of the binomial tree, from right to left, are binomial trees of order 0, 1, 2, 3, 4, …, k-1
2. Form a new binomial heap with the children of the root of this binomial tree as the roots in the new binomial heap
3. Remove the original binomial tree from the original binomial heap
4. Union (original heap, new heap)
- O(log n) complexity
More Operations

ChangeUp (node at position x, new value)
1. Delete (x)
2. Insert (new value)
- O(log n) time
Summary – Binomial Heaps

- Create in O(1) time
- Union, Find_min, Delete, Insert, and Change operations take O(log n) time
- In general, because they are more complicated, in competition it is far more prudent (saves time coding and debugging) to use a binary heap instead
  - Unless there are MANY Union operations
Application of heaps: Dijkstra

- The following describes how Dijkstra's algorithm can be coded with a binary heap
- Initializing phase:
  1. Let n be the number of nodes
  2. Create a heap of size n, all key fields initialized to INFINITY
  3. Change_val(s, 0), where s is the source node
Running of Dijkstra's algorithm

1. While (heap is not empty)
   1. X = node corresponding to the Find_min value
   2. Delete (position of X in heap = 1)
   3. For all nodes k that are adjacent to X
      1. If (cost[X] + distance[X][k] < cost[k])
         1. ChangeDown (position of k in heap, cost[X] + distance[X][k])
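For comparison, a common C++ variant of the same loop uses std::priority_queue with "lazy deletion" instead of ChangeDown, which avoids tracking each node's position in the heap. This is a sketch under the assumption of an adjacency-list graph, not the slides' exact decrease-key method:

#include <queue>
#include <vector>
#include <utility>
#include <functional>
#include <limits>

using Edge = std::pair<int, long long>;                 // (neighbour k, edge weight)

std::vector<long long> dijkstra(const std::vector<std::vector<Edge>>& adj, int s) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> cost(adj.size(), INF);
    using State = std::pair<long long, int>;            // (distance, node)
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;

    cost[s] = 0;
    pq.push({0, s});
    while (!pq.empty()) {
        auto [d, x] = pq.top();                          // Find_min
        pq.pop();                                        // Delete it
        if (d != cost[x]) continue;                      // stale entry: skip it
        for (auto [k, w] : adj[x])
            if (cost[x] + w < cost[k]) {                 // relax edge (x, k)
                cost[k] = cost[x] + w;
                pq.push({cost[k], k});                   // re-insert instead of ChangeDown
            }
    }
    return cost;
}

The extra stale entries change the bound from O([m+n] log n) to O(m log m), which is the same up to constant factors for simple graphs.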
Analysis of running time

- At most n nodes are deleted
  - O(n log n)
- Let m be the number of edges. Each edge is relaxed at most once.
  - O(m log n)
- Total running time O([m+n] log n)
- This is faster than using a basic array list unless the graph is very dense, in which case m is about O(n^2), which leads to a running time of O(n^2 log n)
Cumulative Sum on Intervals

- Problem: We have a line that runs from x coordinate 1 to x coordinate N. At x coordinate X [X an integer between 1 and N], there is g(X) gold. Given an interval [a,b], how much gold is there between a and b?
- How efficiently can this be done if we dynamically change the amount of gold and the interval [a,b] keeps changing?
Cumulative Sum Array

- Let us define C(0) = 0, and C(x) = C(x-1) + g(x), where g(x) is the amount of gold at position x
- C(x) then gives the total amount of gold from position 1 to position x
- The amount of gold in interval [a,b] is simply C(b) – C(a-1)
  - For any change in a or b, we can perform the update in O(1) time
- However, if we change g(x), we will have to change C(x), C(x+1), C(x+2), …, C(N)
  - Any change in gold results in an update in O(N) time
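A small C++ sketch of this plain cumulative-sum array (O(N) to rebuild after any change to g, O(1) per interval query; the struct name and 0-indexed input are assumptions for illustration):

#include <vector>
#include <cstddef>

struct PrefixSum {
    std::vector<long long> C;   // C[x] = g(1) + ... + g(x), with C[0] = 0
    explicit PrefixSum(const std::vector<long long>& g) : C(g.size() + 1, 0) {
        for (std::size_t x = 1; x < C.size(); ++x)
            C[x] = C[x - 1] + g[x - 1];   // g is 0-indexed here: g[x-1] holds g(x)
    }
    long long query(int a, int b) const { return C[b] - C[a - 1]; }  // gold in [a, b]
};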
Cumulative Sum Tree

- We can use the binary representation of any number to come up with a cumulative sum tree
- For example, let us take 13 (decimal) = 1101 (binary)
  - The cumulative sum g(1) + g(2) + … + g(13) can be represented as the sum of:
    - g(1) + g(2) + … + g(8) [ 8 elements ]
    - g(9) + g(10) + … + g(12) [ 4 elements ]
    - g(13) [ 1 element ]
  - Notice that the number of elements in each case corresponds to a bit that is "1" in the binary representation of the number
Cumulative Sum Tree

- Another example: C(19)
  - 19 (decimal) is 10011 (binary)
  - C(19) is the sum of the following:
    - g(1) + g(2) + … + g(16) [ 16 elements ]
    - g(17) + g(18) [ 2 elements ]
    - g(19) [ 1 element ]
Cumulative Sum Tree

- Let us define C2(x) to be the sum of g(x) + g(x-1) + … + g(p + 1), where p is the number with the same binary representation as x except that the least significant bit of x (the rightmost bit of x that is "1") is "0"
- Examples of x and the corresponding p:
  - x = 6 [110], p = 4 [100]
  - x = 13 [1101], p = 12 [1100]
  - x = 16 [10000], p = 0 [00000]
Cumulative Sum Tree

- If we want to find the cumulative sum C(x) = g(1) + g(2) + … + g(x), we can trace through the values of C2 using the binary representation of x
- Examples:
  - C(13) = C2(8) + C2(8+4) + C2(8+4+1)
  - C(16) = C2(16)
  - C(21) = C2(16) + C2(16+4) + C2(16+4+1)
  - C(99) = C2(64) + C2(64+32) + C2(64+32+2) + C2(64+32+2+1)
- This allows us to find C(x) in O(log x) time
  - Hence the amount of gold in interval [a,b] = C(b) – C(a-1) can be found in O(log N) time, which implies updates of a and b can be done in O(log N)
Cumulative Sum Tree

What happens when we change g(x)?
- If g(x) is changed, we only need to update the values C2(y) where C2(y) covers g(x)
- We can go through all the necessary C2(y) in the following way:
  1. While (x <= N)
     1. Update C2(x)
     2. Add the value of the least significant bit of x to x
- This runs in O(log N) time
- Hence updates to g can also be done in O(log N) time, which is a great improvement over the O(N) needed for an array.
Cumulative Sum Tree

- Examples [binary representation in brackets]
  - Change to g(5) [ 101 ]: Update C2(5), C2(6), C2(8), C2(16), and all C2(power of 2 > 16)
  - Change to g(13) [ 1101 ]: Update C2(13), C2(14), C2(16), and all C2(power of 2 > 16)
  - Change to g(35) [ 100011 ]: Update C2(35), C2(36), C2(40), C2(48), C2(64), and all C2(power of 2 > 64)
- We can implement a cumulative sum tree very simply: by using a linear array to store the values of C2.
- Can we extend a cumulative sum tree to 2 or more dimensions?
  - See IOI 2001 Day 1 Question 1
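The whole cumulative sum tree fits in one array of C2 values; a C++ sketch follows (this structure is commonly known as a Fenwick or binary indexed tree; `x & -x` extracts the least significant set bit of x, and the names here are illustrative):

#include <vector>

struct CumulativeSumTree {
    std::vector<long long> C2;                     // C2[x] covers g(p+1 .. x), 1-based
    explicit CumulativeSumTree(int N) : C2(N + 1, 0) {}

    void change(int x, long long delta) {          // g(x) += delta
        for (; x < (int)C2.size(); x += x & -x)    // add the least significant bit of x
            C2[x] += delta;
    }
    long long prefix(int x) const {                // C(x) = g(1) + ... + g(x)
        long long s = 0;
        for (; x > 0; x -= x & -x)                 // strip the least significant bit
            s += C2[x];
        return s;
    }
    long long query(int a, int b) const {          // gold in the interval [a, b]
        return prefix(b) - prefix(a - 1);
    }
};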
Sum of Intervals Tree

- Another way to solve the question is to use a "Sum of Intervals" binary tree
- Each node in the tree is represented by (L, R), and the value of (L,R) is g(L) + g(L+1) + … + g(R)
- The root of the tree has L = 1 and R = N
- Every leaf has L = R
- Every non-leaf has children (L, [L+R]/2) [left child] and ([L+R]/2+1, R) [right child]
- The number of nodes in the tree is O(2*N) [ why? ]
- In an implementation, every node should have pointers to its children and its parent
Sum of Intervals Tree

How to find C(x) = g(1) + g(2) + … + g(x)?
- We trace from the root downwards
1. L = 1, R = N, C = 0
2. While (L != R)
   1. M = (L + R) / 2
   2. If (M < x)
      1. C += value of the left child (L, M)
      2. Set L and R to the right child of the current node
   3. Else
      1. Set L and R to the left child of the current node
3. C += value at (L,R) [ = (x,x), as L = R ]
- Time complexity: O(log N)
Sum of Intervals Tree

What happens when g(x) is changed?
- Trace from (x,x) upwards to the root
1. Let L = R = x
2. While (L,R) is not the root
   1. Update the value of (L,R)
   2. Set (L,R) to the parent of (L,R)
3. Update the root
- Complexity of O(log N)
- Hence all updates of interval [a,b] and g(x) can be done in O(log N) time
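A C++ sketch of the sum-of-intervals tree, written over an array instead of explicit parent/child pointers (nodes 2*i and 2*i+1 are the children of node i, so no parent pointer is needed; the query handles any interval [a,b], which subsumes the prefix sums C(x); names are illustrative):

#include <vector>

struct IntervalTree {
    int N;
    std::vector<long long> val;                    // val[node] = sum over that node's (L, R)
    explicit IntervalTree(int n) : N(n), val(4 * n, 0) {}

    void change(int node, int L, int R, int x, long long g) {   // set g(x) = g
        if (L == R) { val[node] = g; return; }                  // leaf (x, x)
        int M = (L + R) / 2;
        if (x <= M) change(node * 2, L, M, x, g);               // left child  (L, M)
        else        change(node * 2 + 1, M + 1, R, x, g);       // right child (M+1, R)
        val[node] = val[node * 2] + val[node * 2 + 1];          // re-sum on the way up
    }
    long long query(int node, int L, int R, int a, int b) const {
        if (b < L || R < a) return 0;              // (L, R) is disjoint from [a, b]
        if (a <= L && R <= b) return val[node];    // (L, R) lies fully inside [a, b]
        int M = (L + R) / 2;
        return query(node * 2, L, M, a, b) + query(node * 2 + 1, M + 1, R, a, b);
    }
    void set(int x, long long g)      { change(1, 1, N, x, g); }
    long long sum(int a, int b) const { return query(1, 1, N, a, b); }
};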
Augmenting Data Structures

- It is often useful to change the data structure in some way, by adding additional data to each node or changing what each node represents.
- This allows us to use the same data structure to solve other problems
- For example, we can use so-called "interval trees" to solve more than just cumulative sum problems
  - We can use properties of elements in the interval (L,R) that are related to L and R.
Other data structures

- Balanced (and unbalanced) binary trees
  - Red-Black trees
  - 2-3-4 trees
  - Splay trees
- Suffix Trees
- Fibonacci Heaps