Download Is it a Tree?

Document related concepts

Linked list wikipedia , lookup

Lattice model (finance) wikipedia , lookup

Quadtree wikipedia , lookup

Red–black tree wikipedia , lookup

Interval tree wikipedia , lookup

B-tree wikipedia , lookup

Binary tree wikipedia , lookup

Binary search tree wikipedia , lookup

Transcript
Binary Search Trees





Understand tree terminology
Understand and implement tree traversals
Define the binary search tree property
Implement binary search trees
Implement the TreeSort algorithm
October 2004
John Edgar
2

A set of nodes (or vertices)
with a single starting point
 called the root


Each node is connected by
an edge to another node
A tree is a connected graph
 There is a path to every node
in the tree
 A tree has one fewer edges
than the number of nodes
October 2004
John Edgar
4
NO!
yes!
All the nodes are
not connected
yes! (but not
a binary tree)
NO!
There is an extra
edge (5 nodes
and 5 edges)
October 2004
John Edgar
yes! (it’s actually
the same graph as
the blue one)
5

Node v is said to be a child
of u, and u the parent of v if
root
A
 There is an edge between the
nodes u and v, and
 u is above v in the tree,

edge
This relationship can be
generalized
B
C
parent of
B, C, D
D
 E and F are descendants of A
 D and A are ancestors of G
E
F
G
 B, C and D are siblings
 F and G are?
October 2004
John Edgar
6


A leaf is a node with no children
A path is a sequence of nodes v1 … vn
 where vi is a parent of vi+1 (1  i  n-1)
A subtree is any node in the tree along with all
of its descendants
 A binary tree is a tree with at most two children
per node

 The children are referred to as left and right
 We can also refer to left and right subtrees
October 2004
John Edgar
7
A
path from
A to D to G
C
B
subtree
rooted at B
leaves:
E
October 2004
D
John Edgar
F
C,E,F,G
G
8
A
B
left subtree
of A
D
C right child of A
E
F
G
right
subtree of C
H
October 2004
John Edgar
I
J
9

The height of a node v is the length of the
longest path from v to a leaf
 The height of the tree is the height of the root

The depth of a node v is the length of the path
from v to the root
 This is also referred to as the level of a node

Note that there is a slightly different formulation
of the height of a tree
 Where the height of a tree is said to be the number of
different levels of nodes in the tree (including the root)
October 2004
John Edgar
10
A
height of the tree is ?
3
height of node B is ?
B
C
level 1
2
D
H
October 2004
John Edgar
E
depth of
node E is ?
2
F
G
I
level 2
J
level 3
11
yes!
yes!
However, these trees are not “beautiful” (for some applications)
October 2004
John Edgar
13

A binary tree is perfect, if
A
 No node has only one child
 And all the leaves have the
same depth

A perfect binary tree of
height h has how many
nodes?
B
D
C
E
F
G
 2h+1 – 1 nodes, of which 2h
are leaves
October 2004
John Edgar
14
 Each level doubles the number of nodes
 Level 1 has 2 nodes (21)
 Level 2 has 4 nodes (22) or 2 times the number in Level 1
 Therefore a tree with h levels has 2h+1 - 1nodes
 The root level has 1 node 01
11
12
21
31
October 2004
22
32
the bottom level
has 2h nodes, that
is, just over ½ the
nodes are leaves
33
John Edgar
23
34
35
24
36
37
38
15

A binary tree is complete if
A
 The leaves are on at most two
different levels,
 The second to bottom level is
completely filled in, and
 The leaves on the bottom
level are as far to the left as
possible

B
D
C
E
F
Perfect trees are also
complete
October 2004
John Edgar
16

A binary tree is balanced if
 Leaves are all about the same distance from the root
 The exact specification varies

Sometimes trees are balanced by comparing
the height of nodes
 e.g. the height of a node’s right subtree is at most
one different from the height of its left subtree

Sometimes a tree's height is compared to the
number of nodes
 e.g. red-black trees
October 2004
John Edgar
17
A
A
B
D
B
C
E
F
D
C
E
F
G
October 2004
John Edgar
18
A
A
B
C
B
E
D
C
D
F
October 2004
John Edgar
19

A traversal algorithm for a binary tree visits each
node in the tree
 Typically, it will do something while visiting each node!


Traversal algorithms are naturally recursive
There are three traversal methods
 Inorder
 Preorder
 Postorder
October 2004
John Edgar
21
C++
// InOrder traversal algorithm
void inOrder(Node *n) {
if (n != 0) {
inOrder(n->leftChild);
visit(n);
inOrder(n->rightChild);
}
}
October 2004
John Edgar
22
InOrder Traversal
A
B
D
October 2004
John Edgar
C
E
F
23
C++
// PreOrder traversal algorithm
void preOrder(Node *n) {
if (n != 0) {
visit(n);
preOrder(n->leftChild);
preOrder(n->rightChild);
}
}
October 2004
John Edgar
24
visit(n)
1
preOrder(n->leftChild)
17
preOrder(n->rightChild)
2
3
9
13
visit
preOrder(l)
preOrder(r)
visit
preOrder(l)
preOrder(r)
5
16
visit
preOrder(l)
preOrder(r)
4
October 2004
11
6
7
20
visit
preOrder(l)
preOrder(r)
27
visit
preOrder(l)
preOrder(r)
8
39
visit
preOrder(l)
preOrder(r)
visit
preOrder(l)
preOrder(r)
John Edgar
25
C++
// PostOrder traversal algorithm
void postOrder(Node *n) {
if (n != 0) {
postOrder(n->leftChild);
postOrder(n->rightChild);
visit(n);
}
}
October 2004
John Edgar
26
postOrder(n->leftChild)
8
postOrder(n->rightChild)
17
visit(n)
4
2
9
13
postOrder(l)
postOrder(r)
visit
postOrder(l)
postOrder(r)
visit
3
16
postOrder(l)
postOrder(r)
visit
1
7
5
20
postOrder(l)
postOrder(r)
visit
27
postOrder(l)
postOrder(r)
visit
6
39
postOrder(l)
postOrder(r)
visit
11
postOrder(l)
postOrder(r)
visit
October 2004
John Edgar
27

The binary tree can be implemented using
a number of data structures
 Reference structures (similar to linked lists)
 Arrays

We will look at three implementations
 Binary search trees (reference / pointers)
 Red – black trees (reference / pointers)
 Heap (arrays)
October 2004
John Edgar
29

Consider maintaining data in some order
 The data is to be frequently searched on the sort key
e.g. a dictionary

Possible solutions might be:
 A sorted array
▪ Access in O(logn) using binary search
▪ Insertion and deletion in linear time
 An ordered linked list
▪ Access, insertion and deletion in linear time
 Neither of these is efficient
October 2004
John Edgar
30

The data structure should be able to perform all
these operations efficiently
 Create an empty dictionary
 Insert
 Delete
 Look up

The insert, delete and look up operations should
be performed in at most O(logn) time
October 2004
John Edgar
31

A binary search tree (BST) is a binary tree
with a special property
 For all nodes in the tree:
▪ All nodes in a left subtree have labels less than the
label of the node
▪ All nodes in a right subtree have labels greater than
or equal to the label of the node

Binary search trees are fully ordered
October 2004
John Edgar
32
17
13
9
27
16
20
39
11
October 2004
John Edgar
33
inOrder(n->leftChild)
5
An inorder traversal retrieves
the data in sorted order
visit(n)
17
inOrder(n->rightChild)
3
1
9
inOrder(l)
visit
inOrder(r)
13
inOrder(l)
visit
inOrder(r)
4
16
inOrder(l)
visit
inOrder(r)
2
October 2004
11
7
6
20
inOrder(l)
visit
inOrder(r)
27
inOrder(l)
visit
inOrder(r)
8
39
inOrder(l)
visit
inOrder(r)
inOrder(l)
visit
inOrder(r)
John Edgar
34
Binary search trees can be implemented using a
reference structure
 Tree nodes contain data and two pointers to
nodes

Node *leftChild
data
Node *rightChild
data to be stored
in the tree
pointers to Nodes
October 2004
John Edgar
35

To find a value in a BST search from the root
node:
 If the target is less than the value in the node search its
left subtree
 If the target is greater than the value in the node search
its right subtree
 Otherwise return true, or return data, etc.

How many comparisons?
 One for each node on the path
 Worst case: height of the tree + 1
October 2004
John Edgar
36


The BST property must hold after insertion
Therefore the new node must be inserted in the
correct position
 This position is found by performing a search
 If the search ends at the (null) left child of a node
make its left child refer to the new node
 If the search ends at the right child of a node make its
right child refer to the new node

The cost is about the same as the cost for the
search algorithm, O(height)
October 2004
John Edgar
37
insert 43
create new node
find position
insert new node
47
32
19
43
10
7
October 2004
63
41
23
12
John Edgar
37
30
54
44
43
53
79
59
57
96
91
97
38


After deletion the BST property must hold
Deletion is not as straightforward as search or
insertion
 So much so that sometimes it is not even
implemented!
 Deleted nodes are marked as deleted in some way

There are a number of different cases that must
be considered
October 2004
John Edgar
39



The node to be deleted has no children
The node to be deleted has one child
The node to be deleted has two children
October 2004
John Edgar
40

The node to be deleted has no children
 Remove it (assigning null to its parent’s reference)
October 2004
John Edgar
41
delete 30
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
79
59
57
96
91
97
42

The node to be deleted has one child
 Replace the node with its subtree
October 2004
John Edgar
43
delete 79
replace with subtree
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
79
59
57
96
91
97
44
delete 79
after deletion
47
32
63
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
53
59
57
96
91
97
45

The node to be deleted has two children
 Replace the node with its successor, the left most
node of its right subtree
▪ It is also possible to replace the node with its predecessor,
the right most node of its left subtree
 If that node has a child (and it can have at most one
child) attach it to the node’s parent
▪ Why can a predecessor or successor have at most one child?
October 2004
John Edgar
46
delete 32
find successor and detach
47
32
63
19
10
41
23
37
54
44
53
79
59
96
temp
7
October 2004
12
30
John Edgar
57
91
97
47
delete 32
find successor
attach target node’s
children to
32 37
successor
47
63
temp
19
10
41
23
37
54
44
53
79
59
96
temp
7
October 2004
12
30
John Edgar
57
91
97
48
delete 32
- find successor
- attach target’s
children to
32
successor
- make successor
child of
target’s 19
parent
10
7
October 2004
47
37
23
12
63
temp
41
54
44
30
John Edgar
53
79
59
57
96
91
97
49
delete 32
note: successor
had no subtree
47
37
63
temp
19
10
7
October 2004
41
23
12
54
44
30
John Edgar
53
79
59
57
96
91
97
50
delete 63
- find predecessor*: note
it has a subtree
47
32
63
19
41
54
*predecessor used instead
of successor to show its
location - an
implementation would have
to pick one or the other
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
51
delete 63
- find predecessor
- attach predecessor’s
subtree to its
32
parent
19
47
63
41
54
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
52
delete 63
- find predecessor
- attach subtree
- attach target’s
32
children to
predecessor
47
temp
63
19
41
54
59
79
temp
10
7
October 2004
23
12
37
30
John Edgar
44
53
59
57
96
91
97
53
delete 63
- find predecessor
- attach subtree
- attach children
32
- attach pre.
to target’s
parent
19
10
7
October 2004
23
12
47
temp
63
41
37
30
John Edgar
54
44
59
79
53
96
57
91
97
54
delete 63
47
32
59
19
10
7
October 2004
41
23
12
37
30
John Edgar
54
44
79
53
96
57
91
97
55
The efficiency of BST operations depends on
the height of the tree
 All three operations (search, insert and delete)
are O(height)
 If the tree is complete the height is log(n)
 What if it isn’t complete?

October 2004
John Edgar
56






Insert 7
Insert 4
Insert 1
Insert 9
Insert 5
It’s a complete tree!
October 2004
John Edgar
7
4
9
height = log(5) = 2
1
5
57






Insert 9
Insert 1
Insert 7
Insert 4
Insert 5
It’s a linked list with a lot
of extra pointers!
9
1
7
height =
n – 1 = 4 = O(n)
4
5
October 2004
John Edgar
58

It would be ideal if a BST was always
close to complete
 i.e. balanced

How do we guarantee a balanced BST?
 We have to make the insertion and deletion
algorithms more complex
▪ e.g. red – black trees.
October 2004
John Edgar
59

It is possible to sort an array using a binary
search tree
 Insert the array items into an empty tree
 Write the data from the tree back into the array using an
InOrder traversal

Running time = n*(insertion cost) + traversal
 Insertion cost is O(h)
 Traversal is O(n)
 Total = O(n) * O(h) + O(n), i.e. O(n * h)
 If the tree is balanced = O(n * log(n))
October 2004
John Edgar
60
Tree Quiz I

Write a recursive function to print the
items in a BST in descending order
class Node {
public:
int data;
Node *leftc;
Node *rightc;
};
October 2004
John Edgar
62
Tree Quiz II

Write a recursive function to delete a BST
stored in dynamic memory
class Node {
public:
int data;
Node *leftc;
Node *rightc;
};
October 2004
John Edgar
63
Summary

Trees
 Terminology: paths, height, node relationships, …

Binary search trees
 Traversal
▪ Post-order, pre-order, in-order
 Operations
▪ Insert, delete, search

Balanced trees
 Binary search tree operations are efficient for
balanced trees
October 2004
John Edgar
65
Readings

Carrano Ch. 10
October 2004
John Edgar
66