Advanced C Programming
Ivor Page
C++ Notes 1995
Copyright Ivor Page 1995
Copyright Notice
This Microsoft Powerpoint presentation data file is the
sole property of Ivor P. Page. It may only be viewed, or
shown to individuals, or exhibited to a class, or
broadcast by any other means, with the owner’s written
permission. This Powerpoint data file may not be
copied without the owner’s written consent. Printed
notes may not be made from this data file for any use
without the owner’s written consent. All copies of this
file MUST BE DESTROYED by October 1st 1996.
Copyrighted, Ivor P. Page, 1995,
410 Pleasant Valley Lane,
Richardson,
TX 75080
Data Structures: The Data
For almost all purposes, data structures are used to store
many “records” of data, where the records have
something in common. The records have many fields, such
as in a personnel records system.
The records may be fixed or variable in size. If the size has a
“small” bound, it may be that fixed sized records can be
used, where every record is allocated space for the largest
possible record. Alternatively the data structure must be
designed for variable length records.
The maximum number of records to be held may be known
or unknown.
Data Structures: The Data
In most applications, there is at least one ordering
relationship that can be applied to the data, e.g. names
can be alphabetically ordered; personnel can be
ordered by personnel-number within each department,
and then the departments can be ordered by
department code, etc.
It may be necessary to search the data given a key. A key
is a value corresponding to one field of the records,
such as a name, or a personnel-number. Sometimes we
may need to search on more than one field.
Abstract Data Types
It is convenient (and valuable) to design all data
structures using the same basic strategy. An Abstract
Data Type is simply some data, together with a set of
interface functions that manipulate the data.
Encapsulation: The actual data should never be
manipulated (nor be seen if possible) by users of the
data structure. There may also be some internal
functions that directly operate on the data, but do not
form part of the interface. These should be hidden if
possible from the users.
Abstract Data Types
Abstraction: Every operation that the users need to
perform on the data must be provided for by the
interface. The interface functions should be very easy
to use and to understand (unlike the detailed
manipulations of the data itself) and should typically be
cast in the syntax of the application to be supported.
The users can then ignore the inner workings of the
data type and concentrate on the application level.
Abstract Data Types Cont’d
Here is the structure that we will use for all data structures:
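[Figure: an ADT. The caller reaches the hidden data only through the interface functions f1, f2, f3; the hidden functions hf1, hf2 operate directly on the data.]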
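As a minimal sketch of this structure in C++ (the class Int_table, its capacity, and all member names are hypothetical and chosen only for illustration), the data and a helper function are private, and callers see only the interface functions:

```cpp
#include <cstddef>

// Hypothetical ADT: a small fixed-capacity table of ints.
class Int_table {
public:
    static const std::size_t CAPACITY = 8;   // illustrative bound

    Int_table() : used_(0) {}

    // Interface functions: the only way callers touch the data.
    bool insert(int value) {
        if (used_ == CAPACITY) return false;  // table is full
        cells_[used_++] = value;
        return true;
    }
    bool contains(int value) const { return find_index(value) != CAPACITY; }
    std::size_t size() const { return used_; }

private:
    // Hidden function: works on the data but is not part of the interface.
    std::size_t find_index(int value) const {
        for (std::size_t i = 0; i < used_; ++i)
            if (cells_[i] == value) return i;
        return CAPACITY;                      // sentinel meaning "not found"
    }

    // Hidden data.
    int cells_[CAPACITY];
    std::size_t used_;
};
```

A caller can write t.insert(5) and t.contains(5), but cannot reach cells_ or find_index(), so the representation could later change without affecting callers.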
Table
Interface:
insert(index,data)
delete(index)
data = get_data(index)
index = search(data)
index = find_free_cell()
Queue
Interface:
add_to_tail(data)
data = remove_from_hd()
[Figure: a queue of four items A, B, C, D, with its tail and head marked.]
Stack
Interface:
push(data)
data = pop()
data = top()
bool = empty()
Tree
Interface:
n2 = parent(n1)
n2 = leftmost_child(n1)
n2 = right_sibling(n1)
data = get_data(n)
add_left_child(n,data)
add_right_sibling(n,data)
remove_node(n)
n = search(data)
..
Graph
Interface:
n = search(data)
add_neighbor(n,data)
add_link(n1,n2)
delete_link(n1,n2)
delete_node(n)
Implementation with Arrays
All the above data structures can be implemented with
arrays, but in most situations, the fixed size of an array
makes it unsuitable, particularly when the amount of
data to be held is unknown.
Tables:
Tables include data dictionaries, for example a name
and address table. Here the table may be searched
given the name. The search maps the given name into
an index corresponding to the entry containing that
name.
[Figure: a table of N entries, each holding name, address, age, etc., selected by index.]
Implementation with Arrays
Linear search: If the entries are unordered, the search
takes O(N) time, where each “probe” requires a string
comparison. Find_free_cell() also requires O(N) time.
Given an index, insert() and delete() take constant time,
O(1).
Binary search: For binary search, the entries must be
ordered (sorted alphabetically by name if we want to
search on the name.) Keeping the table entries ordered
implies O(N) time for insertion and deletion. The
search proceeds as follows in O(log N) time:
Binary Search
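Base_Index = 0;
Top_Index = N-1;
found = 0;                              /* false */
while (!found && Base_Index <= Top_Index) {
    Index = (Base_Index + Top_Index)/2;
    found = probe_element(Index);       /* match? */
    if (found) break;
    if (key is beyond Index)
        Base_Index = Index+1;           /* continue in the upper half */
    else
        Top_Index = Index-1;            /* continue in the lower half */
}   /* value of found tells if the search was successful */
/* The loop ends when the range becomes empty (Base_Index > Top_Index), */
/* so a missing key cannot cause the same cell to be probed forever.    */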
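The same search can be sketched as a self-contained C++ function. The function name, the use of an array of C strings, and the return value of -1 for "not found" are assumptions made for this example:

```cpp
#include <cstring>

// Returns the index of key in the sorted array names[0..n-1], or -1 if absent.
int binary_search_names(const char *const names[], int n, const char *key)
{
    int base = 0;
    int top = n - 1;
    while (base <= top) {                       // range [base, top] is non-empty
        int index = base + (top - base) / 2;    // midpoint, avoiding overflow
        int cmp = std::strcmp(key, names[index]);
        if (cmp == 0) return index;             // match
        if (cmp > 0) base = index + 1;          // key is beyond index
        else top = index - 1;                   // key is before index
    }
    return -1;                                  // range became empty: not found
}
```

Each pass halves the remaining range, which is where the O(log N) bound comes from.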
Hash Tables
In a closed hash table, there is a mathematical mapping
from the key (name) to the initial probe index. For
example, the hash function for a table of size N entries,
might be:
hash = (sum of characters in name) % N
This function has the disadvantage that names such as
Fred, redF, derF, etc, all hash to the same initial probe
index. There are much better hash functions available:
hash = (sum of name[i]*prime[i]) % N
Here we sum the chars multiplied by successive weights (1, followed by
the prime numbers 2, 3, 5, ...), so with ASCII codes
IVOR gives (I*1 + V*2 + O*3 + R*5) % N = 892 % N.
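A minimal sketch of this weighted hash follows. The weight table beyond 1, 2, 3, 5, the wrap-around for long names, and the function names are assumptions made for this example, and the character values assume ASCII:

```cpp
#include <cstddef>

// Weights 1, 2, 3, 5 come from the example above; the rest of the table
// (further primes) and the wrap-around for long names are assumptions.
static const unsigned WEIGHTS[] = {1, 2, 3, 5, 7, 11, 13, 17, 19, 23};
static const std::size_t NUM_WEIGHTS = sizeof WEIGHTS / sizeof WEIGHTS[0];

// Sum of each character code multiplied by the weight for its position.
unsigned weighted_sum(const char *name)
{
    unsigned sum = 0;
    for (std::size_t i = 0; name[i] != '\0'; ++i)
        sum += static_cast<unsigned char>(name[i]) * WEIGHTS[i % NUM_WEIGHTS];
    return sum;
}

// Initial probe index for a table of table_size entries (table_size > 0).
unsigned hash_name(const char *name, unsigned table_size)
{
    return weighted_sum(name) % table_size;
}
```

With ASCII codes, weighted_sum("IVOR") is 892, and the anagrams Fred and redF no longer produce the same sum.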
Hash Tables
The search is as follows:
index = hash(name);
count = found = 0;
while(!found && count < N) {
found = probe(index);
if(found)
break;
index = rehash(name, index);
count++;
}
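The slide does not say what rehash() does. As one possible sketch, assuming linear probing (try the next cell, wrapping round) and the simple additive hash, a closed hash table of names might look like this. The class name, table size, and return conventions are hypothetical:

```cpp
#include <cstring>

class Name_table {
public:
    static const int N = 11;            // table size (illustrative)

    Name_table() {
        for (int i = 0; i < N; ++i) cells_[i] = 0;   // 0 marks a free cell
    }

    // Insert name; returns the index used, or -1 if the table is full.
    int insert(const char *name) {
        int index = hash(name);
        for (int count = 0; count < N; ++count) {
            if (cells_[index] == 0) {
                cells_[index] = name;
                return index;
            }
            index = rehash(index);
        }
        return -1;
    }

    // Search for name; returns its index, or -1 if it is not in the table.
    int search(const char *name) const {
        int index = hash(name);
        for (int count = 0; count < N; ++count) {
            if (cells_[index] == 0) return -1;   // a free cell ends the probe sequence
            if (std::strcmp(cells_[index], name) == 0) return index;
            index = rehash(index);
        }
        return -1;
    }

private:
    // Additive hash: sum of the character codes, modulo the table size.
    static int hash(const char *name) {
        unsigned sum = 0;
        for (const char *p = name; *p != '\0'; ++p)
            sum += static_cast<unsigned char>(*p);
        return static_cast<int>(sum % N);
    }
    // Assumed rehash policy: linear probing.
    static int rehash(int index) { return (index + 1) % N; }

    const char *cells_[N];              // the table stores pointers, not copies
};
```

Fred and redF collide on the initial probe here, so the second one inserted lands in the following cell. Deletion is deliberately left out: with this scheme a deleted cell would need a special "deleted" marker so that later searches do not stop early.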
Hash Table Performance
Average Times (measured in probes)
Insertion or successful search takes -(1/a) ln(1-a)
Deletion or unsuccessful search takes 1/(1-a)
where a = fraction of cells occupied and ln is the natural logarithm
Queues Using arrays
The circular list, or ring buffer is often used for a queue:
[Figure: the ring buffer drawn as an array of cells 0 to N-1 and as a ring. The queue elements a, b, ~, y, z lie between head and tail, and may wrap round from cell N-1 to cell 0.]
tail is the index of the first free cell.
Queues Using arrays
Insert at tail, if not full:
put data in cell tail;
tail = (tail+1) % N;
Delete from head, if not empty:
remove data from cell head;
head = (head+1) % N;
Queues Using arrays
When the queue is either full or empty, the 2 indices have
the same value, so we cannot distinguish these two cases.
A better way is to use an index for the head, and a count of the
number of cells occupied.
[Figure: an array of cells 0 to N-1 with head and tail marked.]
Queues Using arrays
Code when we use an index for head and a count:
(head+count)%N gives the first free cell
Insert at tail, if not full:
    put data in cell (head+count)%N;
    count++;
Delete from head, if not empty:
    remove data from cell head;
    head = (head+1) % N;
    count--;
[Figure: an array of cells 0 to N-1 with head marked and count cells occupied.]
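The head-and-count scheme can be sketched as a small C++ class. The class name, the capacity of 4, and the bool return values used to report "full" and "empty" are assumptions made for this example:

```cpp
class Int_queue {
public:
    static const int N = 4;             // capacity (illustrative)

    Int_queue() : head_(0), count_(0) {}

    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == N; }

    // Insert at tail, if not full.
    bool add_to_tail(int data) {
        if (full()) return false;
        cells_[(head_ + count_) % N] = data;   // first free cell
        ++count_;
        return true;
    }

    // Delete from head, if not empty; the removed value is stored in data.
    bool remove_from_hd(int &data) {
        if (empty()) return false;
        data = cells_[head_];
        head_ = (head_ + 1) % N;
        --count_;
        return true;
    }

private:
    int cells_[N];
    int head_;      // index of the oldest element
    int count_;     // number of cells occupied
};
```

Because count_ distinguishes 0 from N, the full and empty states are no longer confused, and all N cells can be used.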
Stacks using arrays
We use an array in which the data doesn’t move when we
do a push() or a pop():
[Figure: an array of cells 0, 1, ~, with top marking the most recently pushed cell.]
push if not full:
    insert data at cell top+1;
    top++;
pop if not empty:
    remove data from cell top;
    top--;
Binary Trees using Arrays
All trees can be implemented using binary trees. Here is
an array structure for a binary tree:
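Array contents = a b c d e f g
Array Index    = 1 2 3 4 5 6 7
Note that array element zero is not used.
[Figure: the corresponding binary tree. Root a (index 1) has children b (2) and c (3); b has children d (4) and e (5); c has children f (6) and g (7).]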
A node with index i (i>1) has its parent at node i/2.
The left child of node with index i is at node 2*i.
The right child of node with index i is at node 2*i+1.
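These index rules can be sketched directly in C++. The function names and the preorder walk that collects node labels into an output array are assumptions made for this example:

```cpp
// Index arithmetic for a binary tree stored in an array, root at index 1.
inline int parent(int i)      { return i / 2; }      // integer division; valid for i > 1
inline int left_child(int i)  { return 2 * i; }
inline int right_child(int i) { return 2 * i + 1; }

// Preorder walk over tree[1..n]; visited labels are appended to out[],
// and len counts how many have been written so far.
void preorder(const char tree[], int n, int i, char out[], int &len)
{
    if (i > n) return;                  // index beyond the last node: no such child
    out[len++] = tree[i];
    preorder(tree, n, left_child(i), out, len);
    preorder(tree, n, right_child(i), out, len);
}
```

For the seven-node tree a to g, the walk starting at index 1 visits a, b, d, e, c, f, g.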
Data Structures using Pointers
Structures using pointers are particularly useful when the
maximum number of elements that must be stored is not
known. This is often the case in practice, especially when
writing modules for a software library, where the actual
users will have many different needs.
Linked List:
[Figure: head -> (d, p) -> (d, p) -> ~ -> (d, NULL)]
Each node contains some data d (a record), and a pointer p.
A pointer head points to the first element of the list. The
pointer of the last element of the list contains NULL. The
nodes are implemented using structs.
Linked Lists in C
Here is the declaration of the node type and the head pointer (a struct is
used so that the list functions can access the members):
struct List_node {
    char * name;
    int age;
    ~
    List_node * next;
};
List_node * head = NULL;
Linked Lists in C
Here is a function to add new_node at the head of the list:
void add_to_list(List_node * new_node)
{
    new_node -> next = head;    /* step 1 */
    head = new_node;            /* step 2 */
}
[Figure: step 1 points new_node's next pointer at the old first node; step 2 points head at new_node.]
Linked Lists in C cont’d
Function to remove the first element from the list and
return a pointer to it:
List_node * remove_first_element(void)
{
    List_node *first = head;    /* step 1 */
    if(head != NULL)
        head = head -> next;    /* step 2 */
    return first;
}
[Figure: step 1 sets first to the first node; step 2 advances head to the second node.]
Linked Lists in C cont’d
Function to search a linked list for a certain name and
return a pointer to the node if a match is found:
List_node * search_list(const char * key)
{
    List_node * ptr = head;
    while(ptr != NULL)
    {
        if(strcmp(key, ptr -> name) == 0)   /* needs <string.h> */
            break;
        ptr = ptr -> next;
    }
    return ptr;     /* NULL if no match was found */
}
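The three list functions can be gathered into one self-contained sketch. Wrapping them in a class named Name_list, so that the head pointer is hidden as in an ADT, is a choice made for this example and not something the slides prescribe:

```cpp
#include <cstring>

struct List_node {
    const char *name;
    int age;
    List_node *next;
};

// The list links nodes supplied by the caller; it does not allocate or free them.
class Name_list {
public:
    Name_list() : head_(0) {}

    void add_to_list(List_node *new_node) {
        new_node->next = head_;         // step 1
        head_ = new_node;               // step 2
    }

    // Unlinks and returns the first node, or a null pointer if the list is empty.
    List_node *remove_first_element() {
        List_node *first = head_;
        if (head_ != 0) head_ = head_->next;
        return first;
    }

    // Returns the first node whose name matches key, or a null pointer.
    List_node *search_list(const char *key) const {
        for (List_node *p = head_; p != 0; p = p->next)
            if (std::strcmp(key, p->name) == 0) return p;
        return 0;
    }

private:
    List_node *head_;
};
```

Since nodes are added at the head, they come back from remove_first_element() in the reverse of the order in which they were added.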
Queues using Linked Lists
In a queue, data is added at one end (the tail) and removed
from the other end (the head). A linked list is adequate
for this purpose with the addition of a pointer to the tail:
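struct Queue_node {
    char * name;
    int age;
    ~
    Queue_node * next;
};
struct Queue {
    Queue_node * head;
    Queue_node * tail;
    ~
};
Queue q;    /* the queue used by the functions that follow; as a global, both pointers start as NULL */
[Figure: a Queue whose head points to the first node of the list (d, p) -> (d, p) -> (d, NULL) and whose tail points to the last node.]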
Inserting at the queue tail
void add_to_tail(Queue_node * elem) {
    elem -> next = NULL;        /* elem becomes the last node */
    if(q.tail == NULL)          /* queue is empty */
        q.head = q.tail = elem;
    else {
        q.tail -> next = elem;
        q.tail = elem;
    }
}
[Figure: elem is linked after the old last node (1), then tail is moved to elem (2).]
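The slides show only insertion at the tail. As a sketch of the complete queue, the version below adds a matching remove_from_hd(). The class layout, the int datum field, and the convention of returning a null pointer when the queue is empty are assumptions made for this example:

```cpp
struct Queue_node {
    int datum;
    Queue_node *next;
};

// The queue links nodes supplied by the caller; it does not allocate or free them.
class Queue {
public:
    Queue() : head_(0), tail_(0) {}

    void add_to_tail(Queue_node *elem) {
        elem->next = 0;                 // elem becomes the last node
        if (tail_ == 0)                 // queue is empty
            head_ = tail_ = elem;
        else {
            tail_->next = elem;
            tail_ = elem;
        }
    }

    // Unlinks and returns the node at the head, or a null pointer if empty.
    Queue_node *remove_from_hd() {
        Queue_node *first = head_;
        if (first != 0) {
            head_ = first->next;
            if (head_ == 0) tail_ = 0;  // queue became empty: reset tail too
            first->next = 0;
        }
        return first;
    }

private:
    Queue_node *head_;
    Queue_node *tail_;
};
```

Resetting tail_ when the last node is removed matters: otherwise the next add_to_tail() would link the new node after a node that is no longer in the queue.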
Stacks Using Linked Lists
Data is added and removed from the same end (the head) of
the list. Implementation is trivial using a singly linked list:
struct Stack_node {
    int datum;
    Stack_node * next;
};
struct Int_stack {
    Stack_node * front;     /* points to the top node, or NULL if the stack is empty */
};
Int_stack s;
[Figure: Int_stack's front pointer leads to the list (d, p) -> (d, p) -> ~ -> (d, NULL).]
Stacks Using Linked Lists cont’d
Here the interface uses the actual data, not pointers to nodes:
void push(int new_datum)
{
    Stack_node *ptr = new Stack_node;
    ptr -> datum = new_datum;
    ptr -> next = s.front;      /* step 1 */
    s.front = ptr;              /* step 2 */
}
[Figure: step 1 points the new node at the old top node; step 2 points front at the new node.]
Stacks Using Linked Lists cont’d
pop() must free the space occupied by the top node:
int pop(void)
{
    Stack_node *ptr = s.front;          /* step 1 */
    int result;
    if(ptr != NULL) {
        result = ptr -> datum;
        s.front = ptr -> next;          /* step 2 */
        delete ptr;                     /* step 3: the node came from new, so use delete, not free() */
        return result;
    }
    return ERROR;   /* a special value to indicate empty stack */
}
Stacks Using Linked Lists cont’d
Pop in action:
[Figure: ptr points to the top node (step 1); front is moved to the second node (step 2); the old top node is released (step 3).]
Binary Trees Using Pointers
The basic structure uses two pointers in each node:
struct Tree_node {
    char * name;
    int age;
    ~
    Tree_node * left_child;
    Tree_node * right_child;
};
[Figure: a node with the fields name, age, ~, left_child, right_child.]
In some applications it is also necessary to include a
pointer to the parent node.
Preorder Traversals
Trees can be traversed in a number of standard orders:
void preorder(Tree_node *n)     /* pass in pointer to the root, it visits all nodes */
{
    if(n == NULL) return;
    print(n);                   /* access data */
    preorder(n -> left_child);
    preorder(n -> right_child);
}
[Figure: a binary tree with root a (1), whose children are b (2) and c (3); b has children d (4) and e (5); c has children f (6) and g (7).]
gives order 1, 2, 4, 5, 3, 6, 7
Postorder Traversals
void postorder(Tree_node * n)
{
    if(n == NULL) return;
    postorder(n -> left_child);
    postorder(n -> right_child);
    print(n);
}
[Figure: a binary tree with root a (1), whose children are b (2) and c (3); b has children d (4) and e (5); c has children f (6) and g (7).]
gives order 4, 5, 2, 6, 7, 3, 1
Inorder Traversal
void inorder(Tree_node * n)
{
    if(n == NULL) return;       /* guards nodes that have only one child */
    if(n is a leaf node) print(n);
    else {
        inorder(n -> left_child);
        print(n);
        inorder(n -> right_child);
    }
}
We can tell if *n is a leaf node by testing to see if both its child
pointers are NULL.
[Figure: a binary tree with root a (1), whose children are b (2) and c (3); b has children d (4) and e (5); c has children f (6) and g (7).]
gives order 4, 2, 5, 1, 6, 3, 7
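The three traversals can be compared side by side in a runnable sketch. Here each node carries only an int label, and the visit order is appended to a std::vector instead of being printed; both are assumptions made for this example. All three functions use the same test for an empty sub-tree:

```cpp
#include <vector>

struct Tree_node {
    int label;
    Tree_node *left_child;
    Tree_node *right_child;
};

// Visit the node, then its left sub-tree, then its right sub-tree.
void preorder(const Tree_node *n, std::vector<int> &out)
{
    if (n == 0) return;
    out.push_back(n->label);
    preorder(n->left_child, out);
    preorder(n->right_child, out);
}

// Visit the left sub-tree, then the node, then the right sub-tree.
void inorder(const Tree_node *n, std::vector<int> &out)
{
    if (n == 0) return;
    inorder(n->left_child, out);
    out.push_back(n->label);
    inorder(n->right_child, out);
}

// Visit both sub-trees first, then the node.
void postorder(const Tree_node *n, std::vector<int> &out)
{
    if (n == 0) return;
    postorder(n->left_child, out);
    postorder(n->right_child, out);
    out.push_back(n->label);
}
```

On the seven-node tree labelled 1 to 7, these produce 1, 2, 4, 5, 3, 6, 7 (preorder), 4, 2, 5, 1, 6, 3, 7 (inorder), and 4, 5, 2, 6, 7, 3, 1 (postorder).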
Binary Search Trees
In order to enable fast searches, the elements of the left
and right sub-trees of any node n, must have key values
that are related to the key value in that node. This
relationship must apply to all interior nodes, including
the root node.
In the binary search tree, the following relationship holds:
keys in left sub-tree of n < key of n < keys in right subtree of n
[Figure: node n holds key x; every key in the left sub-tree of n is < x and every key in the right sub-tree of n is > x. Example tree: root 7 with children 4 and 9; 4 has children 2 and 6; 9 has children 8 and 11.]
Binary Search Trees
Search, insert, and delete can all be done in O(log n) time on
average. If the tree remains balanced, these operations take
O(log n) in the worst case as well; a badly unbalanced tree can
degrade to O(n).
Here is the node type:
struct Tree_node {
    int key;
    ~
    Tree_node * left_child, * right_child;
};
Binary Search Tree: Search
Tree_node * search(int skey, Tree_node * n)
{
    if(n == NULL) return NULL;              /* not found */
    if(skey == n -> key) return n;          /* found: n points to the matching node */
    if(skey < n -> key)
        return(search(skey, n -> left_child));
    else return(search(skey, n -> right_child));
}
[Figure: the example tree: root 7 with children 4 and 9; 4 has children 2 and 6; 9 has children 8 and 11.]
Binary Search Tree: Insertion
void insert(Tree_node * new_node, Tree_node ** n)
{
    if(*n == NULL) {
        /* we have reached a leaf */
        *n = new_node;      /* n now pts to new_node */
        new_node -> left_child = NULL;
        new_node -> right_child = NULL;
    }
    else if(new_node -> key < (*n) -> key)
        insert(new_node, &((*n) -> left_child));
    else if(new_node -> key > (*n) -> key)
        insert(new_node, &((*n) -> right_child));
    /* if the keys are equal, new_node is not inserted */
}
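Search and insertion can be exercised together in a short sketch. The versions below are iterative rather than recursive, and the names bst_search and bst_insert and the bool result of the insertion are assumptions made for this example:

```cpp
struct Tree_node {
    int key;
    Tree_node *left_child, *right_child;
};

// Returns the node holding skey, or a null pointer if the key is absent.
Tree_node *bst_search(int skey, Tree_node *n)
{
    while (n != 0 && skey != n->key)
        n = (skey < n->key) ? n->left_child : n->right_child;
    return n;
}

// Links new_node into the tree whose root pointer is *n.
// Returns false (and leaves the tree unchanged) if the key is already present.
bool bst_insert(Tree_node *new_node, Tree_node **n)
{
    while (*n != 0) {
        if (new_node->key == (*n)->key) return false;
        // Step down to the address of the child pointer on the correct side.
        n = (new_node->key < (*n)->key) ? &(*n)->left_child
                                        : &(*n)->right_child;
    }
    new_node->left_child = new_node->right_child = 0;
    *n = new_node;          // the parent's child pointer (or the root) now points here
    return true;
}
```

Inserting the keys 7, 4, 9, 2, 6, 8, 11 in that order into an empty tree gives root 7 with children 4 and 9, whose children are 2 and 6, and 8 and 11.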
Insertion example
[Figure: a binary search tree holding 10, 2, 1, 6 and 40, where ~ means NULL. n holds the address of the right_child pointer of node 40, which the insertion of 45 changes.]
The insertion begins with a search for the key value to be
inserted. When this position is located, n contains the
address of the left-child/right-child pointer of the parent
node. This pointer value is changed to point to the new
node.
Binary Search Trees, Delete_min
We will use value semantics this time. This function deletes the
node with the smallest key and returns its value (the tree must not be empty):
int delete_min(Tree_node **n)
{
    int result;
    if((*n) -> left_child == NULL) {
        Tree_node *old = *n;            /* remember the node before unlinking it */
        result = old -> key;
        *n = old -> right_child;
        free(old);                      /* assumes the node came from malloc() */
        return result;
    }
    else return delete_min(&((*n)->left_child));
}
[Figure: the example tree: root 7 with children 4 and 9; 4 has right child 6; 9 has children 8 and 11.]
Binary Search Trees, Deletion
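delete_node() removes the node holding a given key (delete is a reserved
word in C++, so the function cannot be called delete):
void delete_node(int value, Tree_node **n) {
    if(*n == NULL) return;              /* key not found */
    if(value < (*n) -> key) delete_node(value, &((*n) -> left_child));
    else if(value > (*n) -> key) delete_node(value, &((*n) -> right_child));
    else if((*n) -> left_child == NULL || (*n) -> right_child == NULL) {
        Tree_node *old = *n;            /* at most one child: splice it in */
        *n = (old -> left_child != NULL) ? old -> left_child : old -> right_child;
        free(old);                      /* assumes the node came from malloc() */
    }
    else (*n) -> key = delete_min(&((*n) -> right_child));  /* two children */
}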
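Deletion can be tried out with the following self-contained sketch. It allocates nodes with new and releases them with delete, and the names insert_key and delete_key are hypothetical, chosen for this example:

```cpp
struct Tree_node {
    int key;
    Tree_node *left_child, *right_child;
};

// Helper for the example: allocates a node for key and links it into the tree.
void insert_key(int key, Tree_node **n)
{
    if (*n == 0) {
        Tree_node *node = new Tree_node;
        node->key = key;
        node->left_child = node->right_child = 0;
        *n = node;
    }
    else if (key < (*n)->key) insert_key(key, &(*n)->left_child);
    else if (key > (*n)->key) insert_key(key, &(*n)->right_child);
}

// Removes the node with the smallest key and returns that key.
// Precondition: the tree is not empty.
int delete_min(Tree_node **n)
{
    if ((*n)->left_child == 0) {
        Tree_node *old = *n;
        int result = old->key;
        *n = old->right_child;          // splice out the minimum node
        delete old;
        return result;
    }
    return delete_min(&(*n)->left_child);
}

// Removes the node holding value, if there is one.
void delete_key(int value, Tree_node **n)
{
    if (*n == 0) return;                                    // key not found
    if (value < (*n)->key) delete_key(value, &(*n)->left_child);
    else if (value > (*n)->key) delete_key(value, &(*n)->right_child);
    else if ((*n)->left_child == 0 || (*n)->right_child == 0) {
        Tree_node *old = *n;                                // at most one child
        *n = (old->left_child != 0) ? old->left_child : old->right_child;
        delete old;
    }
    else
        (*n)->key = delete_min(&(*n)->right_child);         // two children
}
```

In the two-children case the node itself stays in place: its key is overwritten with the smallest key of its right sub-tree, and that smaller node is the one actually removed.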
Completely Balanced Binary Trees
Keeping a binary tree completely balanced implies O(n)
time for insertion (actually rebalancing after insertion):
[Figure: the tree with root 5, children 3 and 7, and leaves 2, 4 and 6 receives the key 1 below node 2. After rebalancing, the root is 4, its children are 2 and 6, and the leaves are 1, 3, 5, 7.]
The rebalance operation here required a change to every node,
taking O(n) time. For this reason we use “almost balanced binary
trees,” such as AVL and 2-3 trees.
AVL Trees
First we define the height of a binary tree to be the length of
the longest path from the root to a leaf node.
The AVL property: if N is a node in a binary tree, node N has
the AVL property if the heights of its left and right sub-trees
differ by no more than 1. If all nodes, including the root,
have the AVL property, then the tree is an AVL tree.
It is possible to do insert, delete, and search in O(log n) time in
an AVL tree.
2-3 Trees
A 2-3 tree has the following properties:
• Each interior node has 2 or 3 children
• Each path from the root to a leaf node has the same length.
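[Figure: a 2-3 tree. The root holds 7 and 16; its three children hold (5, -), (8, 12) and (19, -); the leaves are 2, 5, 7, 8, 12, 16, 19. The two values in an interior node are the smallest descendant of its 2nd child and the smallest descendant of its 3rd child, or - if there is no 3rd child.]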
In this version, all data are recorded in leaf nodes
Searching a 2-3 Tree
Assume that the search has reached the y/z node below
and we are searching for the key x. If x<y, go to the left
child next; if y<=x and either x<z or there is no third child, go to
the 2nd child next; and if x>=z, go to the third child next.
[Figure: a node (y, z) with three children is left via x<y, y<=x<z or x>=z; a node (y, -) with two children is left via x<y or x>=y.]
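The descent rule can be sketched as follows for the version of the 2-3 tree in which all data are held in leaf nodes. The node layout (an is_leaf flag and an array of three child pointers) and the function name search_23 are assumptions made for this example:

```cpp
struct two_three_node {
    bool is_leaf;
    int key;                    // used by leaf nodes only
    int low_of_second;          // interior: smallest key under the 2nd child (y)
    int low_of_third;           // interior: smallest key under the 3rd child (z)
    two_three_node *child[3];   // child[2] is null when there are only 2 children
};

// Returns the leaf holding x, or a null pointer if x is not in the tree.
const two_three_node *search_23(const two_three_node *n, int x)
{
    while (n != 0 && !n->is_leaf) {
        if (x < n->low_of_second)
            n = n->child[0];                            // x < y
        else if (n->child[2] == 0 || x < n->low_of_third)
            n = n->child[1];                            // y <= x < z, or no 3rd child
        else
            n = n->child[2];                            // x >= z
    }
    return (n != 0 && n->key == x) ? n : 0;
}
```

Every search follows one root-to-leaf path, and since all such paths have the same length, the cost is proportional to the height of the tree.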
Insertion into 2-3 Trees
Say we wish to insert new_node with key x. We first
search for x and stop at an interior node N, just above
the leaf nodes where the node containing x would be if
it existed.
If N has only 2 children, new_node becomes a new child of
N, placed in the proper order:
[Figure: node (5, -) with leaves 2 and 5. Adding 4 gives node (4, 5) with leaves 2, 4, 5; adding 6 instead gives node (5, 6) with leaves 2, 5, 6.]
Insertion into 2-3 Trees cont’d
A node has to be split if it already has 3 children:
[Figure: node (4, 5) with leaves 2, 4, 5. Adding 3 splits it into node (3, -) with leaves 2, 3 and node (5, -) with leaves 4, 5.]
Insertion into 2-3 Trees cont’d
A special case occurs when the root has to be split:
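[Figure: the root (3, 7) with leaves 2, 3, 7. Adding 4 splits it into node (3, -) with leaves 2, 3 and node (7, -) with leaves 4, 7, under a new root (4, -).]
In this example, where the children of the root are leaves, this only
occurs when there are exactly 3 data elements in the tree before the
insertion. In a taller tree, a split lower down can also propagate up
to the root.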
Deletion
When a leaf node is deleted, its parent P may be left with
only one child. If P has a neighboring sibling Q with 3
leaves, the leaves of P and Q can be shared, 2 each,
between them:
[Figure: before, the parent (4, -) has children P (3, -) with leaves 2, 3 and Q (6, 7) with leaves 4, 6, 7. Deleting 3 gives parent (6, -) with children P (4, -) with leaves 2, 4 and Q (7, -) with leaves 6, 7.]
This process does not lead to further recursion up the tree.
Deletion
If P does not have a neighboring sibling with 3 children,
then it must have a neighboring sibling Q with 2
children. We combine P and Q, giving all three leaves to
the combined node PQ:
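[Figure: before, the parent (4, -) has children P (3, -) with leaves 2, 3 and Q (7, -) with leaves 4, 7. Deleting 3 gives parent (-, -) with the single child PQ (4, 7) with leaves 2, 4, 7.]
As we can see, this process may leave the parent of PQ with 1 child,
so the process must continue up the tree recursively.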
Deletion
When the root node has only one child, it may be deleted;
its single child becomes the new root node:
[Figure: the old root (-, -) has the single child PQ (4, 7) with leaves 2, 4, 7. The old root is removed and PQ becomes the new root.]
We may reach the root, leaving it in this condition, by recursion
up the tree.
Struct for 2-3 trees
struct two_three_node {
    int kind;                           // 0=leaf, 1=interior
    int low_of_second, low_of_third;
    int key;
    ~
    two_three_node * left_child;
    two_three_node * middle_child;      // an interior node needs up to 3 child pointers
    two_three_node * right_child;       // NULL if the node has only 2 children
};
two_three_node * root;                  // pointer to root node
Could use a union of the 2 node types.
We shall not pursue the details of the coding of the algorithms.
Rotations in AVL Trees
Rotations are used to reestablish the AVL property after
an insertion:
[Figure: a single rotation to the right and a single rotation to the left. Each exchanges the parent and child roles of nodes A and B and re-attaches the sub-trees T1, T2, T3 in the same left-to-right order.]
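Rotations only rearrange a few pointers. A minimal sketch follows, in which the function names and the local labels a and b are assumptions made for this example and are not necessarily the labelling of the figure. A double rotation is written as two single rotations:

```cpp
struct AVL_node {
    int key;
    AVL_node *left_child;
    AVL_node *right_child;
};

// Single rotation to the right: the left child b of a becomes the sub-tree root.
AVL_node *rotate_right(AVL_node *a)
{
    AVL_node *b = a->left_child;
    a->left_child = b->right_child;     // b's right sub-tree moves under a
    b->right_child = a;
    return b;                           // caller re-links b where a used to be
}

// Single rotation to the left: the right child b of a becomes the sub-tree root.
AVL_node *rotate_left(AVL_node *a)
{
    AVL_node *b = a->right_child;
    a->right_child = b->left_child;     // b's left sub-tree moves under a
    b->left_child = a;
    return b;
}

// Double rotation to the right: for a left child that is right-heavy.
AVL_node *double_rotate_right(AVL_node *c)
{
    c->left_child = rotate_left(c->left_child);
    return rotate_right(c);
}

// Double rotation to the left: for a right child that is left-heavy.
AVL_node *double_rotate_left(AVL_node *a)
{
    a->right_child = rotate_right(a->right_child);
    return rotate_left(a);
}
```

Each function returns the new root of the rotated sub-tree, so the caller must store the result in whichever pointer previously led to the old sub-tree root. Balance factors are not maintained in this sketch.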
Double Rotations in AVL Trees
[Figure: a double rotation to the right and a double rotation to the left. Each rearranges the nodes A, B, C and re-attaches the sub-trees T1, T2, T3, T4 in the same left-to-right order.]
Example of building an AVL Tree
Say we have the keys 19, 10, 3, 5, 20, 13, 17, 15, 1, 8, 6 to
be inserted in that order:
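[Figure: inserting 19, 10, 3 gives a chain leaning to the left, which a single rotation to the right (SRR) turns into root 10 with children 3 and 19. Adding 5, 20, 13, 17 needs no rotation: 3 has right child 5; 19 has children 13 and 20; 13 has right child 17.]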
Example of building an AVL Tree
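The addition of 15 causes an imbalance at node 13, so a
double left rotation is required:
[Figure: the sub-tree 13 -> 17 -> 15 becomes 15 with children 13 and 17 (DRL). After adding 1 and 8, the tree is: root 10; left child 3 with children 1 and 5 (5 has right child 8); right child 19 with children 15 (which has children 13 and 17) and 20.]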
AVL Tree Example cont’d
The addition of 6 causes an imbalance
at node 5. A double left rotation is
needed to fix it.
[Figure: the sub-tree 5 -> 8 -> 6 becomes 6 with children 5 and 8 (DRL).]
This gives the final AVL Tree:
[Figure: root 10; left child 3 with children 1 and 6 (6 has children 5 and 8); right child 19 with children 15 (which has children 13 and 17) and 20.]
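The whole example can be reproduced with a short insertion routine. This sketch stores the height of each sub-tree in the node and computes the balance factor from it, which is a simplification chosen for the example and differs from storing the balance factor itself. The function names are hypothetical, and nodes are never freed here:

```cpp
#include <algorithm>

struct AVL_node {
    int key;
    int height;                 // height of the sub-tree rooted here; a leaf has 0
    AVL_node *left_child, *right_child;
};

int height(const AVL_node *n) { return n ? n->height : -1; }

void update(AVL_node *n)
{
    n->height = 1 + std::max(height(n->left_child), height(n->right_child));
}

// Left height minus right height; must stay within -1..+1 in an AVL tree.
int balance_factor(const AVL_node *n)
{
    return height(n->left_child) - height(n->right_child);
}

AVL_node *rotate_right(AVL_node *a)
{
    AVL_node *b = a->left_child;
    a->left_child = b->right_child;
    b->right_child = a;
    update(a);                  // a is now below b, so update it first
    update(b);
    return b;
}

AVL_node *rotate_left(AVL_node *a)
{
    AVL_node *b = a->right_child;
    a->right_child = b->left_child;
    b->left_child = a;
    update(a);
    update(b);
    return b;
}

// Inserts key below n and returns the (possibly new) root of that sub-tree.
AVL_node *avl_insert(AVL_node *n, int key)
{
    if (n == 0) {
        AVL_node *leaf = new AVL_node;
        leaf->key = key;
        leaf->height = 0;
        leaf->left_child = leaf->right_child = 0;
        return leaf;
    }
    if (key < n->key) n->left_child = avl_insert(n->left_child, key);
    else if (key > n->key) n->right_child = avl_insert(n->right_child, key);
    else return n;                                      // duplicate key: ignored

    update(n);
    int bf = balance_factor(n);
    if (bf > 1) {                                       // left side too tall
        if (balance_factor(n->left_child) < 0)          // double rotation needed
            n->left_child = rotate_left(n->left_child);
        return rotate_right(n);
    }
    if (bf < -1) {                                      // right side too tall
        if (balance_factor(n->right_child) > 0)         // double rotation needed
            n->right_child = rotate_right(n->right_child);
        return rotate_left(n);
    }
    return n;
}
```

Calling root = avl_insert(root, k) for k = 19, 10, 3, 5, 20, 13, 17, 15, 1, 8, 6 in turn, starting from a null root, performs one single rotation to the right (after 3) and two double rotations (after 15 and after 6), and ends with 10 at the root.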
AVL Tree Implementation
The only addition to the nodes is a balance factor, which gives
the difference between the heights of the left and right sub-trees, and should be +1, 0, or -1 in an AVL tree.
struct AVL_node {
    int balance_factor;
    int key;
    ~
    AVL_node * left_child;
    AVL_node * right_child;
};
We shall not pursue the details of the coding of the algorithms.