Download 2-3 trees 2-3

Document related concepts

Linked list wikipedia , lookup

Lattice model (finance) wikipedia , lookup

Quadtree wikipedia , lookup

Redโ€“black tree wikipedia , lookup

Binary tree wikipedia , lookup

Interval tree wikipedia , lookup

B-tree wikipedia , lookup

Binary search tree wikipedia , lookup

Transcript
Multi-Way Search Trees
Manolis Koubarakis
Data Structures and Programming
Techniques
1
Multi-Way Search Trees
โ€ข Multi-way trees are trees such that each internal
node can have many children.
โ€ข Let us assume that the entries we store in a
search tree are pairs of the form (๐‘˜, ๐‘ฅ) where ๐‘˜ is
the key and ๐‘ฅ the value associated with the key.
โ€ข Example: Assume we store information about
students. The key can be the student ID while the
value can be information such as name, year of
study etc.
Data Structures and Programming
Techniques
2
Definitions
โ€ข A tree is ordered if there is a linear ordering
defined for the children of each node; that is,
we can identify children of a node as being the
first, the second, third and so on.
โ€ข Let ๐‘ฃ be a node of an ordered tree. We say
that ๐‘ฃ is a ๐’…-node if ๐‘ฃ has ๐‘‘ children.
Data Structures and Programming
Techniques
3
Definitions (contโ€™d)
โ€ข A multi-way search tree is an ordered tree ๐‘‡ that
has the following properties:
โ€“ Each internal node of ๐‘‡ has at least 2 children. That is,
each internal node is a ๐‘‘-node such that ๐‘‘ โ‰ฅ 2.
โ€“ Each internal ๐‘‘-node of ๐‘‡ with children ๐‘ฃ1 , โ‹ฏ , ๐‘ฃ๐‘‘
stores an ordered set of ๐‘‘ โˆ’ 1 key-value entries
๐‘˜1 , ๐‘ฅ1 , โ‹ฏ , ๐‘˜๐‘‘โˆ’1 , ๐‘ฅ๐‘‘โˆ’1 , where ๐‘˜1 โ‰ค โ‹ฏ โ‰ค ๐‘˜๐‘‘โˆ’1 .
โ€“ Let us conveniently define ๐‘˜0 = โˆ’โˆž and ๐‘˜๐‘‘ = +โˆž.
For each entry (๐‘˜, ๐‘ฅ) stored at a node in the subtree
of ๐‘ฃ rooted at ๐‘ฃ๐‘– , ๐‘– = 1, โ‹ฏ , ๐‘‘, we have that ๐‘˜๐‘–โˆ’1 โ‰ค
๐‘˜ โ‰ค ๐‘˜๐‘– .
Data Structures and Programming
Techniques
4
Definitions (contโ€™d)
โ€ข By the above definition, the external nodes of a
multi-way search tree do not store any entries
and are โ€œdummyโ€ nodes (i.e., our trees are
extended trees).
โ€ข When ๐‘š โ‰ฅ 2 is the maximum number of
children that a node is allowed to have, then we
have an ๐’Ž-way search tree.
โ€ข A binary search tree is a special case of a multiway search tree, where each internal node stores
one entry and has two children (i.e., ๐‘š = 2).
Data Structures and Programming
Techniques
5
Example Multi-Way Search Tree (๐‘š =
3)
22
5
3
4
6
10
8
11
25
14
13
23
24
27
17
Data Structures and Programming
Techniques
6
Proposition
โ€ข Let ๐‘‡ be an ๐‘š-way search tree with height โ„Ž,
๐‘› entries and ๐‘›๐ธ external nodes. Then, the
following inequalities hold:
1. โ„Ž โ‰ค ๐‘› โ‰ค ๐‘šโ„Ž โˆ’ 1
2. log ๐‘š (๐‘› + 1) โ‰ค โ„Ž โ‰ค ๐‘›
3. ๐‘›๐ธ = ๐‘› + 1
โ€ข Proof?
Data Structures and Programming
Techniques
7
Proof
โ€ข We will prove (1) first.
โ€ข The lower bound can be seen by considering
an ๐‘š-way search tree like the one given on
the next slide where we have one internal
node and one entry in each node for levels
0, 1, 2, โ‹ฏ , โ„Ž โˆ’ 1 and level โ„Ž contains only
external nodes.
Data Structures and Programming
Techniques
8
Proof (contโ€™d)
0
1
2
โ‹ฎ
โ„Žโˆ’1
โ„Ž
Data Structures and Programming
Techniques
9
Proof (contโ€™d)
โ€ข For the upper bound, consider an ๐‘š-way search
tree of height โ„Ž where each internal node in the
levels 0 to โ„Ž โˆ’ 1 has exactly ๐‘š children (the
external nodes are at level โ„Ž ).
๐‘šโ„Ž โˆ’1
in
๐‘šโˆ’1
๐‘–
โ€ข These internal nodes are โ„Žโˆ’1
๐‘š
=
๐‘–=0
total.
โ€ข Since each of these nodes has ๐‘š โˆ’ 1 entries, the
total number of entries in the internal nodes is
๐‘šโ„Ž โˆ’ 1.
Data Structures and Programming
Techniques
10
Proof (contโ€™d)
โ€ข To prove the lower bound of (2), rewrite (1)
and take logarithms in base ๐‘š. The upper
bound in (2) is the same as the lower bound in
(1).
โ€ข The proof of (3) is left as an exercise.
Data Structures and Programming
Techniques
11
Searching in a Multi-Way Search Tree
โ€ข Let ๐‘‡ be a multi-way search tree and ๐‘˜ be a key.
โ€ข The algorithm for searching for an entry with key ๐‘˜ is
simple.
โ€ข We trace a path in ๐‘‡ starting at the root.
โ€ข When we are at a ๐‘‘-node ๐‘ฃ during the search, we
compare the key ๐‘˜ with the keys ๐‘˜1 , โ‹ฏ , ๐‘˜๐‘‘โˆ’1 stored at
๐‘ฃ.
โ€ข If ๐‘˜ = ๐‘˜๐‘– , for some ๐‘–, the search is successfully
completed. Otherwise, we continue the search in the
child ๐‘ฃ๐‘– of ๐‘ฃ such that ๐‘˜๐‘–โˆ’1 < ๐‘˜ < ๐‘˜๐‘– .
โ€ข If we reach an external node, then we know that there
is no entry with key ๐‘˜ in ๐‘‡.
Data Structures and Programming
Techniques
12
Example Multi-Way Search Tree
22
5
3
4
6
10
8
11
25
14
13
23
24
27
17
Data Structures and Programming
Techniques
13
Search for Key 12
22
5
3
4
6
10
8
11
25
14
13
23
24
27
17
Unsuccessful search
Data Structures and Programming
Techniques
14
Search for Key 24
22
5
3
4
6
10
8
25
14
23
24
27
Successful search
11
13
17
Data Structures and Programming
Techniques
15
Insertion in a Multi-Way Search Tree
โ€ข If we want to insert a new pair (๐‘˜, ๐‘ฅ) into a multi-way
search tree, then we start by searching for this entry.
โ€ข If we find the entry, then we do not need to reinsert it.
โ€ข If we end up in an external node, then the entry is not
in the tree. In this case, we return to the parent ๐‘ฃ of
the external node and attempt to insert the key there.
โ€ข If ๐‘ฃ has space for one more key then we insert the
entry there. If not, we create a new node, we insert the
entry in this node and make this node a child of ๐‘ฃ in
the appropriate position.
Data Structures and Programming
Techniques
16
Insert Key 28 (๐‘š = 3)
22
5
3
4
6
10
8
25
14
23
24
27
Unsuccessful search
11
13
17
Data Structures and Programming
Techniques
17
Key 28 Inserted
22
5
3
4
6
10
8
11
25
14
13
23
24
27 28
17
Data Structures and Programming
Techniques
18
Insert Key 32
22
5
3
4
6
10
8
25
14
23
24
27 28
Unsuccessful search
11
13
17
Data Structures and Programming
Techniques
19
Key 32 Inserted
22
5
3
4
6
10
8
11
25
14
13
23
24
17
Data Structures and Programming
Techniques
27 28
32
20
Insert Key 12
22
5
3
4
6
10
8
11
25
14
13
23
24
17
27 28
32
Unsuccessful Search
Data Structures and Programming
Techniques
21
Key 12 Inserted
22
5
3
4
6
10
25
14
8
11
13
23
24
17
27 28
32
12
Data Structures and Programming
Techniques
22
Deletion from a Multi-Way Search Tree
โ€ข The algorithm for deletion from a multi-way
search tree is left as an exercise.
Data Structures and Programming
Techniques
23
Complexity of Operations
โ€ข Let us consider the time to search a ๐‘š-way
search tree for a given key.
โ€ข The time spent at a ๐‘‘-node depends on the
implementation of the node. If we use a sorted
array then, using binary search, we can search a
node in ๐‘‚(log ๐‘‘) time.
โ€ข Thus the time for a search operation in the tree is
๐‘‚ โ„Ž log ๐‘š .
โ€ข The complexity of insertion and deletion is also
๐‘‚ โ„Ž log ๐‘š .
Data Structures and Programming
Techniques
24
Efficiency Considerations
โ€ข We know that maintaining perfect balance in
binary search trees yields shortest average search
paths, but the attempts to maintain perfect
balance when we insert or delete nodes can incur
costly rebalancing in which every node of the tree
needs to be rearranged.
โ€ข AVL trees showed us one way to solve this
problem by abandoning the goal of perfect
balance and adopt the goal of keeping the trees
โ€œalmost balancedโ€.
Data Structures and Programming
Techniques
25
Efficiency Considerations (contโ€™d)
โ€ข Multi-way search trees give us another way to
solve this problem.
โ€ข The primary efficiency goal for a multi-way search
tree is to keep the height as small as possible but
permit the number of keys at each node to vary.
โ€ข We want the height of the tree โ„Ž to be a
logarithmic function of ๐‘›, the total number of
entries stored in the tree.
โ€ข A search tree with logarithmic height is called a
balanced search tree.
Data Structures and Programming
Techniques
26
Balanced Multi-way Search Trees
โ€ข We will study two kinds of balanced multi-way
search trees:
โ€“ 2-3 trees
โ€“ 2-3-4 trees or (2,4) trees.
Data Structures and Programming
Techniques
27
2-3 Trees
โ€ข A 2-3 tree is a multi-way search tree which has
the following properties:
โ€ข Size property: Each internal node contains
one or two entries, and has either two or
three children.
โ€ข Depth property: All leaves of the tree are
empty trees that have the same depth (lie on
a single bottom level).
Data Structures and Programming
Techniques
28
Example of 2-3 Tree
H
J
D
A
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
29
Searching in 2-3 Trees
โ€ข To search for the key L, for example, we start
at the root and since L > H, we follow the
pointer to the right subtree of the root node.
โ€ข Now we note that L lies between J and N, so
we follow the middle pointer between the
nodes J and N to the node containing the keys
K and L.
โ€ข L is found and the search terminates.
Data Structures and Programming
Techniques
30
Search for Key L
H
J
D
A
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
31
Insertion of New Keys
โ€ข Suppose we want to insert the new key B into the tree.
โ€ข Since B<H, we follow the left pointer from the root
node to the node containing D in the second row.
โ€ข Then we follow the left pointer of Dโ€™s node to the node
containing A.
โ€ข Since B>A, we follow the right pointer of Aโ€™s node
which lead us to an empty tree (an external node).
โ€ข Then we go back to the parent of the external node
and try to store the new key there.
โ€ข The node containing A has room for one more key so
we store key B there. We also add a new empty child to
this node.
Data Structures and Programming
Techniques
32
Example: Insert B
H
J
D
A
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
33
Example: Insert B
H
J
D
A
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
34
Example: Result
H
J
D
A
B
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
35
Insertion of New Keys (contโ€™d)
โ€ข Let us now insert key M. This leads to the attempt
to add M to the node containing K and L which
now overflows with keys K, L and M.
โ€ข The strategy for such cases is to split the
overflowed node into two nodes and pass the
middle key to the parent.
โ€ข Hence we split the overflowed node into two new
nodes containing K and M respectively.
โ€ข We also pass the middle key L to the parent node
containing J and N.
Data Structures and Programming
Techniques
36
Insertion of New Keys (contโ€™d)
โ€ข The attempt to add L to this parent node
results in a new overflowed node in which the
key L lies between keys J and N.
โ€ข So we split this parent node into new nodes
containing J and N respectively, and we pass
the middle key L up to the root.
โ€ข The root has room for L so we store it there.
Data Structures and Programming
Techniques
37
Example: Insert M
H
J
D
A
B
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
38
Example: Insert M (contโ€™d)
H
J
D
A
B
E
F
I
Data Structures and Programming
Techniques
K
N
L
O
P
M overflows
this node
39
Example: Insert M (contโ€™d)
H
J
D
N
L
A
B
E
F
I
K
M
The node is split in
two and L is passed
Data Structures and Programming
to the parent node
Techniques
O
P
40
Example: Insert M (contโ€™d)
H
J
D
L overflows
this node
N
L
A
B
E
F
I
Data Structures and Programming
Techniques
K
M
O
P
41
Example: Insert M (contโ€™d)
L
H
D
A
B
E
The node is split in
two and L is passed
up to the parent
N
J
F
I
K
Data Structures and Programming
Techniques
M
O
P
42
Example: Result
H
D
A
B
E
L is inserted in the root node
L
N
J
F
I
K
Data Structures and Programming
Techniques
M
O
P
43
Insertion of New Keys (contโ€™d)
โ€ข Let us now insert key Q. Q should be entered
in the node containing O and P which now
overflows.
โ€ข Thus, this node is split up to two nodes one
containing O and the other containing R, and
the middle key is passed up towards the
parent node.
โ€ข The parent node has only key N so there is
space for P and it is inserted there.
Data Structures and Programming
Techniques
44
Example: Insert Q
H
D
A
B
E
L
N
J
F
I
K
Data Structures and Programming
Techniques
M
O
P
Q overflows
this node
45
Example: Insert Q (contโ€™d)
H
D
L
N
J
P
A
B
E
F
I
K
Data Structures and Programming
Techniques
M
O
QP
This node is split up
and P is passed up
46
Example: Result
H
D
A
B
E
L
N P
J
F
I
K
Data Structures and Programming
Techniques
M
O
QP
47
Insertion of New Keys (contโ€™d)
โ€ข Let us now insert key R. R is inserted in the
node with Q where there is space.
Data Structures and Programming
Techniques
48
Example: Insert R
H
D
A
B
E
L
N P
J
F
I
K
Data Structures and Programming
Techniques
M
O
QP R
49
Inserting New Keys (contโ€™d)
โ€ข Let us now insert key S. S should be inserted in
the node with Q and R.
โ€ข This node overflows. Thus, it is split into two
nodes one containing Q and the other containing
S and R is passed up to the parent node.
โ€ข R now oveflows this node where N and P are also
stored. Thus, this node is split into two nodes one
containing N and the other containing R and the
middle key P is sent up to the parent (the root).
Data Structures and Programming
Techniques
50
Inserting New Keys (contโ€™d)
โ€ข The root now overflows with the addition of P
so it is split into two nodes one containing H
and the other containing P and the middle key
L is used to create a new root.
โ€ข Thus, we have added one more level to the
tree.
Data Structures and Programming
Techniques
51
Example: Insert S
H
D
A
B
E
L
N P
J
F
I
K
Data Structures and Programming
Techniques
M
O
QP R
S overflows
this node
52
Example: Insert S (contโ€™d)
H
D
A
B
E
L
N P
J
F
I
R
K
Data Structures and Programming
Techniques
M
O
Q
S
This node is split
and R is sent up
53
Example: Insert S (contโ€™d)
H
L
R overflows this
node
D
A
B
E
N P
J
F
I
R
K
Data Structures and Programming
Techniques
M
O
Q
S
54
Example: Insert S (contโ€™d)
H
D
A
B
E
P
L
I
R
N
J
F
This node is split up
and P is sent up
K
Data Structures and Programming
Techniques
M
O
Q
S
55
Example: Insert S (contโ€™d)
P overflows the root
H
D
A
B
E
P
L
F
I
R
N
J
K
Data Structures and Programming
Techniques
M
O
Q
S
56
Example: Result
The root splits and
L becomes the new root
L
H
D
A
B
P
J
E
F
I
R
N
K
Data Structures and Programming
Techniques
M
O
Q
S
57
Complexity of Insertion in 2-3 Trees
โ€ข When we insert a key at level ๐‘˜, in the worst case we need
to split ๐‘˜ + 1 nodes (one at each of the ๐‘˜ levels plus the
root).
โ€ข A 2-3 tree containing ๐‘› keys with the maximum number of
levels takes the form of a binary tree where each internal
node has one key and two children.
โ€ข In such a tree ๐‘› = 2๐‘˜+1 โˆ’ 1 where ๐‘˜ is the number of the
lowest level.
โ€ข This implies that ๐‘˜ + 1 = log(๐‘› + 1) from which we see
that the splits are in the worst case ๐‘‚ log ๐‘› .
โ€ข So insertion in a 2-3 tree takes at worst ๐‘ถ ๐ฅ๐จ๐  ๐’ time.
โ€ข Similarly we can prove that searches and deletions take
๐‘ถ ๐ฅ๐จ๐  ๐’ time.
Data Structures and Programming
Techniques
58
(2,4) Trees
โ€ข A (2,4) tree or 2-3-4 tree is a multi-way search
tree which has the following two properties:
โ€“ Size property: Every internal node contains at
least one and at most three keys, and has at least
two and at most four children.
โ€“ Depth property: All the external nodes are empty
trees that have the same depth (lie on a single
bottom level).
Data Structures and Programming
Techniques
59
Result
โ€ข Proposition. The height of a (2,4) tree storing
๐‘› entries is ๐‘‚ log ๐‘› .
โ€ข Proof: Let โ„Ž be the height of a (2,4) tree ๐‘‡
storing ๐‘› entries. We justify the proposition by
showing that
1
log(๐‘›
2
+ 1) โ‰ค โ„Ž
and
โ„Ž โ‰ค log(๐‘› + 1).
Data Structures and Programming
Techniques
60
Result (contโ€™d)
โ€ข Note that by the size property, we have at most 4
nodes at depth 1, at most 42 nodes at depth 2,
and so on. Thus, the number of external nodes of
๐‘‡ is at most 4โ„Ž .
โ€ข Similarly, by the size property, we have at least 2
nodes at depth 1, at least 22 nodes at depth 2,
and so on. Thus, the number of external nodes in
๐‘‡ is at least 2โ„Ž .
โ€ข We also know that the number of external nodes
is ๐‘› + 1.
Data Structures and Programming
Techniques
61
Result (contโ€™d)
โ€ข Therefore, we obtain
2โ„Ž โ‰ค ๐‘› + 1
and
๐‘› + 1 โ‰ค 4โ„Ž .
โ€ข Taking the logarithm in base 2 of each of the
above terms, we get that
โ„Ž โ‰ค log(๐‘› + 1)
and
log(๐‘› + 1) โ‰ค 2โ„Ž.
โ€ข These inequalities prove our claims.
Data Structures and Programming
Techniques
62
Insertion in (2,4) Trees
โ€ข To insert a new entry (๐‘˜, ๐‘ฅ), with key ๐‘˜, into a
(2,4) tree ๐‘‡, we first perform a search for ๐‘˜.
โ€ข Assuming that ๐‘‡ has no entry with key ๐‘˜, this
search terminates unsuccessfully at an
external node ๐‘ง.
โ€ข Let ๐‘ฃ be the parent of ๐‘ง. We insert the new
entry into node ๐‘ฃ and add a new child (an
external node) to ๐‘ฃ on the left of ๐‘ง.
Data Structures and Programming
Techniques
63
Insertion (contโ€™d)
โ€ข Our insertion method preserves the depth
property, since we add a new external node at
the same level as existing external nodes.
โ€ข But it might violate the size property. If a node ๐‘ฃ
was previously a 4-node, then it may become a 5node after the insertion which causes the tree to
longer be a (2,4) tree.
โ€ข This type of violation of the size property is called
an overflow node at node ๐‘ฃ, and it must be
resolved in order to restore the properties of a
(2,4) tree.
Data Structures and Programming
Techniques
64
Dealing with Overflow Nodes
โ€ข Let ๐‘ฃ1 , โ‹ฏ , ๐‘ฃ5 be the children of ๐‘ฃ, and let ๐‘˜1 , โ‹ฏ , ๐‘˜4 be
the keys stored at ๐‘ฃ. To remedy the overflow at node ๐‘ฃ,
we perform a split operation on ๐‘ฃ as follows.
โ€ข Replace ๐‘ฃ with two nodes ๐‘ฃโ€ฒ and ๐‘ฃโ€ฒโ€ฒ, where
โ€“ ๐‘ฃโ€ฒ is a 3-node with children ๐‘ฃ1 , ๐‘ฃ2 , ๐‘ฃ3 storing keys ๐‘˜1 and
๐‘˜2
โ€“ ๐‘ฃโ€ฒโ€ฒ is a 2-node with children ๐‘ฃ4 , ๐‘ฃ5 , storing key ๐‘˜4 .
โ€ข If ๐‘ฃ was the root of ๐‘‡, create a new root node ๐‘ข. Else,
let ๐‘ข be the parent of ๐‘ฃ.
โ€ข Insert key ๐‘˜3 into ๐‘ข and make ๐‘ฃโ€ฒ and ๐‘ฃโ€ฒโ€ฒ children of ๐‘ข,
so that if ๐‘ฃ was child ๐‘– of ๐‘ข, then ๐‘ฃโ€ฒ and ๐‘ฃโ€ฒโ€ฒ become
children ๐‘– and ๐‘– + 1 of ๐‘ข, respectively.
Data Structures and Programming
Techniques
65
Overflow at a 5-node
โ„Ž1 โ„Ž2
๐‘ข1
๐‘ข
๐‘ฃ = ๐‘ข2
๐‘ข3
๐‘˜1 ๐‘˜2 ๐‘˜3 ๐‘˜4
๐‘ฃ1
๐‘ฃ2
๐‘ฃ3
๐‘ฃ4
Data Structures and Programming
Techniques
๐‘ฃ5
66
The third key of ๐‘ฃ inserted into the
parent node ๐‘ข
โ„Ž1
๐‘ข1
๐‘ฃ = ๐‘ข2
๐‘˜1 ๐‘˜2
๐‘ฃ1
๐‘ฃ2
๐‘ฃ3
โ„Ž2
๐‘ข
๐‘˜3
๐‘ข3
๐‘˜4
๐‘ฃ4
Data Structures and Programming
Techniques
๐‘ฃ5
67
Node ๐‘ฃ replaced with a 3-node ๐‘ฃโ€ฒ and
a 2-node ๐‘ฃโ€ฒโ€ฒ
โ„Ž1 ๐‘˜3 โ„Ž2 ๐‘ข
๐‘ข1 ๐‘ฃ โ€ฒ = ๐‘ข2
๐‘ฃ โ€ฒโ€ฒ = ๐‘ข3 ๐‘ข4
๐‘˜1 ๐‘˜2
๐‘ฃ1
๐‘ฃ2
๐‘˜4
๐‘ฃ3
๐‘ฃ4
Data Structures and Programming
Techniques
๐‘ฃ5
68
Example
โ€ข Let us now see an example of a few insertions
into an initially empty (2,4) tree.
Data Structures and Programming
Techniques
69
Insert 4
4
Data Structures and Programming
Techniques
70
Insert 6
4 6
Data Structures and Programming
Techniques
71
Insert 12
4 6 12
Data Structures and Programming
Techniques
72
Insert 15 - Overflow
4 6 12 15
Data Structures and Programming
Techniques
73
Creation of New Root Node
12
4 6
15
Data Structures and Programming
Techniques
74
Split
12
4 6
15
Data Structures and Programming
Techniques
75
Insert 3
12
3 4 6
15
Data Structures and Programming
Techniques
76
Insert 5 - Overflow
12
3 4 5 6
15
Data Structures and Programming
Techniques
77
5 is Sent to the Parent Node
12
5
3 4
6
15
Data Structures and Programming
Techniques
78
Split
5
3 4
12
6
Data Structures and Programming
Techniques
15
79
Insert 10
5
3 4
12
6 10
Data Structures and Programming
Techniques
15
80
Insert 8
5
3 4
12
6 8 10
Data Structures and Programming
Techniques
15
81
Insertion (contโ€™d)
โ€ข Let us now see a more complicated example
of insertion in a (2,4) tree.
Data Structures and Programming
Techniques
82
Initial Tree
5 10 12
3 4
6 8
11
13 14 15
Data Structures and Programming
Techniques
83
Insert 17 - Overflow
5 10 12
3 4
6 8
11
13 14 15 17
Data Structures and Programming
Techniques
84
15 is Sent to the Parent Node
5 10 12
3 4
6 8
11
Data Structures and Programming
Techniques
15
13 14
17
85
Split
5 10 12 15
3 4
6 8
11
13 14
Data Structures and Programming
Techniques
17
86
Overflow at the Root
5 10 12 15
3 4
6 8
11
Data Structures and Programming
Techniques
13 14
17
87
Creation of New Root
12
5 10
3 4
6 8
15
11
Data Structures and Programming
Techniques
13 14
17
88
Split
12
15
5 10
3 4
6 8
11
Data Structures and Programming
Techniques
13 14
17
89
Final Tree
12
15
5 10
3 4
6 8
11
Data Structures and Programming
Techniques
13 14
17
90
Complexity Analysis of Insertion
โ€ข A split operation affects a constant number of nodes
of the tree and a constant number of entries stored at
such nodes. Thus, it can be implemented in ๐‘ถ(๐Ÿ) time.
โ€ข As a consequence of a split operation on node ๐‘ฃ, a new
overflow may arise at the parent ๐‘ข of ๐‘ฃ. A split
operation either eliminates the overflow or it
propagates it into the parent of the current node.
Hence, the number of split operations is bounded by
the height of the tree, which is ๐‘‚ log ๐‘› .
โ€ข Therefore, the total time to perform an insertion is
๐‘ถ ๐’๐’๐’ˆ ๐’ .
Data Structures and Programming
Techniques
91
Removal in (2,4) Trees
โ€ข Let us now consider the removal of an entry with key ๐‘˜ from a (2,4)
tree ๐‘‡.
โ€ข Removing such an entry can always be reduced to the case where
the entry to be removed is stored at a node ๐‘ฃ whose children are
external nodes.
โ€ข Suppose that the entry we wish to remove is stored in the ๐‘–-th
entry (๐‘˜๐‘– , ๐‘ฅ๐‘– ) at a node ๐‘ง that has only internal nodes as children. In
this case, we swap the entry (๐‘˜๐‘– , ๐‘ฅ๐‘– ) with an appropriate entry that
is stored at a node ๐‘ฃ with external-node children as follows:
โ€“ We find the right-most internal node ๐‘ฃ in the subtree rooted at the ๐‘–th child of ๐‘ง, noting that the children of node ๐‘ฃ are all external nodes.
โ€“ We swap the entry (๐‘˜๐‘– , ๐‘ฅ๐‘– ) at ๐‘ง with the last entry of ๐‘ฃ. The key of this
entry is the predecessor of ๐‘˜๐‘– in the natural ordering of the keys of the
tree.
Data Structures and Programming
Techniques
92
Removal (contโ€™d)
โ€ข Once we ensure that the entry to be removed is
stored at a node ๐‘ฃ with only external-node
children, we simply remove the entry from ๐‘ฃ and
remove the ๐‘–-th external node of ๐‘ฃ.
โ€ข Removing an entry from a node ๐‘ฃ preserves the
depth property, because we always remove an
external node child from a node ๐‘ฃ with only
external-node children.
โ€ข However, we might violate the size property at ๐‘ฃ.
Data Structures and Programming
Techniques
93
Removal (contโ€™d)
โ€ข If ๐‘ฃ was previously a 2-node, then, after the removal, it
becomes a 1-node with no entries.
โ€ข This type of violation of the size property is called an
underflow node at ๐‘ฃ.
โ€ข To remedy an underflow, we check whether an immediate
sibling of ๐‘ฃ is a 3-node or a 4-node. If we find such a sibling
๐‘ค, then we perform a transfer operation, in which we
move a child of ๐‘ค to ๐‘ฃ, a key of ๐‘ค to the parent ๐‘ข of ๐‘ฃ and
๐‘ค, and a key of ๐‘ข to ๐‘ฃ.
โ€ข If ๐‘ฃ has only one sibling, or if both immediate siblings of ๐‘ฃ
are 2-nodes, then we perform a fusion operation, in which
we merge ๐‘ฃ with a sibling, creating a new node ๐‘ฃโ€ฒ, and
move a key from the parent ๐‘ข of ๐‘ฃ to ๐‘ฃโ€ฒ.
Data Structures and Programming
Techniques
94
Removal (contโ€™d)
โ€ข A fusion operation at a node ๐‘ฃ may cause a new
underflow to occur at the parent ๐‘ข of ๐‘ฃ, which in
turn triggers a transfer or fusion at ๐‘ข.
โ€ข Hence, the number of fusion operations is
bounded by the height of the tree which is
๐‘‚ log ๐‘› .
โ€ข Therefore, a removal operation can take
๐‘ถ ๐’๐’๐’ˆ ๐’ time in the worst case.
โ€ข If an underflow propagates all the way up to the
root, then the root is simply deleted.
Data Structures and Programming
Techniques
95
Examples
โ€ข Let us now see some examples of removal
from a (2,4) tree.
Data Structures and Programming
Techniques
96
Initial Tree
12
15
5 10
4
6 8
11
Data Structures and Programming
Techniques
13 14
17
97
Remove 4
12
4
๐‘ฃ
15
5 10
6 8
11
Data Structures and Programming
Techniques
13 14
17
98
Transfer
12
๐‘ข
15
10
5
6
๐‘ฃ
๐‘ค
8
11
Data Structures and Programming
Techniques
13 14
17
99
After the Transfer
12
๐‘ข
15
6 10
๐‘ค
๐‘ฃ
5
8
11
Data Structures and Programming
Techniques
13 14
17
100
Remove 12
12
6 10
15
11
๐‘ค
๐‘ฃ
5
8
Data Structures and Programming
Techniques
13 14
17
101
Fusion of ๐‘ค and ๐‘ฃ
11
๐‘ข
15
6
10
๐‘ค
๐‘ฃ
5
8
๐‘ฃ
Data Structures and Programming
Techniques
13 14
17
102
After the Fusion
11
6
5
15
8 10
Data Structures and Programming
Techniques
13 14
17
103
Remove 13
11
6
15
13
5
8 10
Data Structures and Programming
Techniques
14
17
104
After the Removal of 13
11
6
5
15
8 10
Data Structures and Programming
Techniques
14
17
105
Remove 14 - Underflow
11
6
15
14
5
8 10
Data Structures and Programming
Techniques
17
106
Fusion
11
๐‘ข
6
๐‘ฃ
5
8 10
Data Structures and Programming
Techniques
15
17
107
Underflow at ๐‘ข
11
๐‘ข
6
5
8 10
Data Structures and Programming
Techniques
15 17
108
Fusion
11
๐‘ข
6
5
8 10
Data Structures and Programming
Techniques
15 17
109
Remove the Root
6 11
5
8 10
Data Structures and Programming
Techniques
15 17
110
Final Tree
6 11
5
8 10
Data Structures and Programming
Techniques
15 17
111
Readings
โ€ข T. A. Standish. Data Structures, Algorithms and
Software Principles in C.
โ€“ Section 9.9
โ€ข M. T. Goodrich, R. Tamassia and D. Mount.
Data Structures and Algorithms in C++.
โ€“ Section 10.4
โ€ข R. Sedgewick. ฮ‘ฮปฮณฯŒฯฮนฮธฮผฮฟฮน ฯƒฮต C. 3ฮท
ฮ‘ฮผฮตฯฮนฮบฮฑฮฝฮนฮบฮฎ ฮˆฮบฮดฮฟฯƒฮท. ฮ•ฮบฮดฯŒฯƒฮตฮนฯ‚ ฮšฮปฮตฮนฮดฮฌฯฮนฮธฮผฮฟฯ‚.
โ€“ Section 13.3
Data Structures and Programming
Techniques
112