* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Is it a Tree?
Survey
Document related concepts
Transcript
Binary Search Trees
Understand tree terminology
Understand and implement tree traversals
Define the binary search tree property
Implement binary search trees
Implement the TreeSort algorithm
October 2004
John Edgar
2
A set of nodes (or vertices)
with a single starting point
called the root
Each node is connected by
an edge to another node
A tree is a connected graph
There is a path to every node
in the tree
A tree has one fewer edges
than the number of nodes
October 2004
John Edgar
4
NO!
yes!
All the nodes are
not connected
yes! (but not
a binary tree)
NO!
There is an extra
edge (5 nodes
and 5 edges)
October 2004
John Edgar
yes! (it’s actually
the same graph as
the blue one)
5
Node v is said to be a child
of u, and u the parent of v if
root
A
There is an edge between the
nodes u and v, and
u is above v in the tree,
edge
This relationship can be
generalized
B
C
parent of
B, C, D
D
E and F are descendants of A
D and A are ancestors of G
E
F
G
B, C and D are siblings
F and G are?
October 2004
John Edgar
6
A leaf is a node with no children
A path is a sequence of nodes v1 … vn
where vi is a parent of vi+1 (1 i n-1)
A subtree is any node in the tree along with all
of its descendants
A binary tree is a tree with at most two children
per node
The children are referred to as left and right
We can also refer to left and right subtrees
October 2004
John Edgar
7
A
path from
A to D to G
C
B
subtree
rooted at B
leaves:
E
October 2004
D
John Edgar
F
C,E,F,G
G
8
A
B
left subtree
of A
D
C right child of A
E
F
G
right
subtree of C
H
October 2004
John Edgar
I
J
9
The height of a node v is the length of the
longest path from v to a leaf
The height of the tree is the height of the root
The depth of a node v is the length of the path
from v to the root
This is also referred to as the level of a node
Note that there is a slightly different formulation
of the height of a tree
Where the height of a tree is said to be the number of
different levels of nodes in the tree (including the root)
October 2004
John Edgar
10
A
height of the tree is ?
3
height of node B is ?
B
C
level 1
2
D
H
October 2004
John Edgar
E
depth of
node E is ?
2
F
G
I
level 2
J
level 3
11
yes!
yes!
However, these trees are not “beautiful” (for some applications)
October 2004
John Edgar
13
A binary tree is perfect, if
A
No node has only one child
And all the leaves have the
same depth
A perfect binary tree of
height h has how many
nodes?
B
D
C
E
F
G
2h+1 – 1 nodes, of which 2h
are leaves
October 2004
John Edgar
14
Each level doubles the number of nodes
Level 1 has 2 nodes (21)
Level 2 has 4 nodes (22) or 2 times the number in Level 1
Therefore a tree with h levels has 2h+1 - 1nodes
The root level has 1 node 01
11
12
21
31
October 2004
22
32
the bottom level
has 2h nodes, that
is, just over ½ the
nodes are leaves
33
John Edgar
23
34
35
24
36
37
38
15
A binary tree is complete if
A
The leaves are on at most two
different levels,
The second to bottom level is
completely filled in, and
The leaves on the bottom
level are as far to the left as
possible
B
D
C
E
F
Perfect trees are also
complete
October 2004
John Edgar
16
A binary tree is balanced if
Leaves are all about the same distance from the root
The exact specification varies
Sometimes trees are balanced by comparing
the height of nodes
e.g. the height of a node’s right subtree is at most
one different from the height of its left subtree
Sometimes a tree's height is compared to the
number of nodes
e.g. red-black trees
October 2004
John Edgar
17
A
A
B
D
B
C
E
F
D
C
E
F
G
October 2004
John Edgar
18
A
A
B
C
B
E
D
C
D
F
October 2004
John Edgar
19
A traversal algorithm for a binary tree visits each
node in the tree
Typically, it will do something while visiting each node!
Traversal algorithms are naturally recursive
There are three traversal methods
Inorder
Preorder
Postorder
October 2004
John Edgar
21
C++
// InOrder traversal algorithm
void inOrder(Node *n) {
if (n != 0) {
inOrder(n->leftChild);
visit(n);
inOrder(n->rightChild);
}
}
October 2004
John Edgar
22
InOrder Traversal
A
B
D
October 2004
John Edgar
C
E
F
23
C++
// PreOrder traversal algorithm
void preOrder(Node *n) {
if (n != 0) {
visit(n);
preOrder(n->leftChild);
preOrder(n->rightChild);
}
}
October 2004
John Edgar
24
visit(n)
1
preOrder(n->leftChild)
17
preOrder(n->rightChild)
2
3
9
13
visit
preOrder(l)
preOrder(r)
visit
preOrder(l)
preOrder(r)
5
16
visit
preOrder(l)
preOrder(r)
4
October 2004
11
6
7
20
visit
preOrder(l)
preOrder(r)
27
visit
preOrder(l)
preOrder(r)
8
39
visit
preOrder(l)
preOrder(r)
visit
preOrder(l)
preOrder(r)
John Edgar
25
C++
// PostOrder traversal algorithm
void postOrder(Node *n) {
if (n != 0) {
postOrder(n->leftChild);
postOrder(n->rightChild);
visit(n);
}
}
October 2004
John Edgar
26
postOrder(n->leftChild)
8
postOrder(n->rightChild)
17
visit(n)
4
2
9
13
postOrder(l)
postOrder(r)
visit
postOrder(l)
postOrder(r)
visit
3
16
postOrder(l)
postOrder(r)
visit
1
7
5
20
postOrder(l)
postOrder(r)
visit
27
postOrder(l)
postOrder(r)
visit
6
39
postOrder(l)
postOrder(r)
visit
11
postOrder(l)
postOrder(r)
visit
October 2004
John Edgar
27
The binary tree can be implemented using
a number of data structures
Reference structures (similar to linked lists)
Arrays
We will look at three implementations
Binary search trees (reference / pointers)
Red – black trees (reference / pointers)
Heap (arrays)
October 2004
John Edgar
29
Consider maintaining data in some order
The data is to be frequently searched on the sort key
e.g. a dictionary
Possible solutions might be:
A sorted array
▪ Access in O(logn) using binary search
▪ Insertion and deletion in linear time
An ordered linked list
▪ Access, insertion and deletion in linear time
Neither of these is efficient
October 2004
John Edgar
30
The data structure should be able to perform all
these operations efficiently
Create an empty dictionary
Insert
Delete
Look up
The insert, delete and look up operations should
be performed in at most O(logn) time
October 2004
John Edgar
31
A binary search tree (BST) is a binary tree
with a special property
For all nodes in the tree:
▪ All nodes in a left subtree have labels less than the
label of the node
▪ All nodes in a right subtree have labels greater than
or equal to the label of the node
Binary search trees are fully ordered
October 2004
John Edgar
32
17
13
9
27
16
20
39
11
October 2004
John Edgar
33
inOrder(n->leftChild)
5
An inorder traversal retrieves
the data in sorted order
visit(n)
17
inOrder(n->rightChild)
3
1
9
inOrder(l)
visit
inOrder(r)
13
inOrder(l)
visit
inOrder(r)
4
16
inOrder(l)
visit
inOrder(r)
2
October 2004
11
7
6
20
inOrder(l)
visit
inOrder(r)
27
inOrder(l)
visit
inOrder(r)
8
39
inOrder(l)
visit
inOrder(r)
inOrder(l)
visit
inOrder(r)
John Edgar
34
Binary search trees can be implemented using a
reference structure
Tree nodes contain data and two pointers to
nodes
Node *leftChild
data
Node *rightChild
data to be stored
in the tree
pointers to Nodes
October 2004
John Edgar
35
To find a value in a BST search from the root
node:
If the target is less than the value in the node search its
left subtree
If the target is greater than the value in the node search
its right subtree
Otherwise return true, or return data, etc.
How many comparisons?
One for each node on the path
Worst case: height of the tree + 1
October 2004
John Edgar
36
The BST property must hold after insertion
Therefore the new node must be inserted in the
correct position
This position is found by performing a search
If the search ends at the (null) left child of a node
make its left child refer to the new node
If the search ends at the right child of a node make its
right child refer to the new node
The cost is about the same as the cost for the
search algorithm, O(height)
October 2004
John Edgar
37
insert 43
create new node
find position
insert new node
47
32
19
43
10
7
October 2004
63
41
23
12
John Edgar
37
30
54
44
43
53
79
59
57
96
91
97
38
After deletion the BST property must hold
Deletion is not as straightforward as search or
insertion
So much so that sometimes it is not even
implemented!
Deleted nodes are marked as deleted in some way
There are a number of different cases that must
be considered
October 2004
John Edgar
39
The node to be deleted has no children
The node to be deleted has one child
The node to be deleted has two children
October 2004
John Edgar
40
The node to be deleted has no children
Remove it (assigning null to its parent’s reference)
October 2004
John Edgar
41
delete 30
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
79
59
57
96
91
97
42
The node to be deleted has one child
Replace the node with its subtree
October 2004
John Edgar
43
delete 79
replace with subtree
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
79
59
57
96
91
97
44
delete 79
after deletion
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
59
57
96
91
97
45
The node to be deleted has two children
Replace the node with its successor, the left most
node of its right subtree
▪ It is also possible to replace the node with its predecessor,
the right most node of its left subtree
If that node has a child (and it can have at most one
child) attach it to the node’s parent
▪ Why can a predecessor or successor have at most one child?
October 2004
John Edgar
46
delete 32
find successor and detach
47
32
63
19
10
41
23
37
54
44
53
79
59
96
temp
7
October 2004
12
30
John Edgar
57
91
97
47
delete 32
find successor
attach target node’s
children to
32 37
successor
47
63
temp
19
10
41
23
37
54
44
53
79
59
96
temp
7
October 2004
12
30
John Edgar
57
91
97
48
delete 32
- find successor
- attach target’s
children to
32
successor
- make successor
child of
target’s 19
parent
10
7
October 2004
47
37
23
12
63
temp
41
54
44
30
John Edgar
53
79
59
57
96
91
97
49
delete 32
note: successor
had no subtree
47
37
63
temp
19
10
7
October 2004
41
23
12
54
44
30
John Edgar
53
79
59
57
96
91
97
50
delete 63
- find predecessor*: note
it has a subtree
47
32
63
19
41
54
*predecessor used instead
of successor to show its
location - an
implementation would have
to pick one or the other
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
51
delete 63
- find predecessor
- attach predecessor’s
subtree to its
32
parent
19
47
63
41
54
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
52
delete 63
- find predecessor
- attach subtree
- attach target’s
32
children to
predecessor
47
temp
63
19
41
54
59
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
53
delete 63
- find predecessor
- attach subtree
- attach children
32
- attach pre.
to target’s
parent
19
10
7
October 2004
23
12
47
temp
63
41
37
30
John Edgar
54
44
59
79
53
96
57
91
97
54
delete 63
47
32
59
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
79
53
96
57
91
97
55
The efficiency of BST operations depends on
the height of the tree
All three operations (search, insert and delete)
are O(height)
If the tree is complete the height is log(n)
What if it isn’t complete?
October 2004
John Edgar
56
Insert 7
Insert 4
Insert 1
Insert 9
Insert 5
It’s a complete tree!
October 2004
John Edgar
7
4
9
height = log(5) = 2
1
5
57
Insert 9
Insert 1
Insert 7
Insert 4
Insert 5
It’s a linked list with a lot
of extra pointers!
9
1
7
height =
n – 1 = 4 = O(n)
4
5
October 2004
John Edgar
58
It would be ideal if a BST was always
close to complete
i.e. balanced
How do we guarantee a balanced BST?
We have to make the insertion and deletion
algorithms more complex
▪ e.g. red – black trees.
October 2004
John Edgar
59
It is possible to sort an array using a binary
search tree
Insert the array items into an empty tree
Write the data from the tree back into the array using an
InOrder traversal
Running time = n*(insertion cost) + traversal
Insertion cost is O(h)
Traversal is O(n)
Total = O(n) * O(h) + O(n), i.e. O(n * h)
If the tree is balanced = O(n * log(n))
October 2004
John Edgar
60
Tree Quiz I
Write a recursive function to print the
items in a BST in descending order
class Node {
public:
int data;
Node *leftc;
Node *rightc;
};
October 2004
John Edgar
62
Tree Quiz II
Write a recursive function to delete a BST
stored in dynamic memory
class Node {
public:
int data;
Node *leftc;
Node *rightc;
};
October 2004
John Edgar
63
Summary
Trees
Terminology: paths, height, node relationships, …
Binary search trees
Traversal
▪ Post-order, pre-order, in-order
Operations
▪ Insert, delete, search
Balanced trees
Binary search tree operations are efficient for
balanced trees
October 2004
John Edgar
65
Readings
Carrano Ch. 10
October 2004
John Edgar
66