Advanced C Programming
Ivor Page

C++ Notes 1995, Copyright Ivor Page 1995

Copyright Notice

This Microsoft PowerPoint presentation data file is the sole property of Ivor P. Page. It may only be viewed, or shown to individuals, or exhibited to a class, or broadcast by any other means, with the owner’s written permission. This PowerPoint data file may not be copied without the owner’s written consent. Printed notes may not be made from this data file for any use without the owner’s written consent. All copies of this file MUST BE DESTROYED by October 1st 1996.

Copyrighted, Ivor P. Page, 1995, 410 Pleasant Valley Lane, Richardson, TX 75080

Data Structures: The Data

For almost all purposes, data structures are used to store many “records” of data, where the records have something in common. The records usually have many fields, as in a personnel records system.

The records may be fixed or variable in size. If the record size has a small upper bound, fixed-size records can be used, with every record allocated space for the largest possible record. Otherwise the data structure must be designed for variable-length records.

The maximum number of records to be held may be known or unknown.

In most applications there is at least one ordering relationship that can be applied to the data. For example, names can be ordered alphabetically; personnel can be ordered by personnel number within each department, and the departments can then be ordered by department code.

It may be necessary to search the data given a key. A key is a value corresponding to one field of the records, such as a name or a personnel number. Sometimes we need to search on more than one field.

Abstract Data Types

It is convenient (and valuable) to design all data structures using the same basic strategy.
An Abstract Data Type (ADT) is simply some data, together with a set of interface functions that manipulate the data.

Encapsulation: The actual data should never be manipulated (nor be seen, if possible) by users of the data structure. There may also be some internal functions that operate directly on the data but do not form part of the interface. These should be hidden from the users if possible.

Abstraction: Every operation that the users need to perform on the data must be provided by the interface. The interface functions should be very easy to use and to understand (unlike the detailed manipulations of the data itself) and should typically be cast in the vocabulary of the application to be supported. The users can then ignore the inner workings of the data type and concentrate on the application level.

Abstract Data Types Cont’d

Here is the structure that we will use for all data structures:

[Figure: an ADT box containing the hidden data and the hidden functions hf1 and hf2; the caller reaches the data only through the interface functions f1, f2, and f3.]

Table

Interface:
    insert(index, data)
    delete(index)
    data = get_data(index)
    index = search(data)
    index = find_free_cell()

Queue

Interface:
    add_to_tail(data)
    data = remove_from_hd()

[Figure: a queue of cells A, B, C, D with tail and head markers.]

Stack

Interface:
    push(data)
    data = pop()
    data = top()
    bool = empty()

Tree

Interface:
    n2 = parent(n1)
    n2 = leftmost_child(n1)
    n2 = right_sibling(n1)
    data = get_data(n)
    add_left_child(n, data)
    add_right_sibling(n, data)
    remove_node(n)
    n = search(data)
    ..
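To make the ADT structure concrete, here is a minimal sketch of the Stack interface listed above, written as a C++ class. It is only an illustration under stated assumptions: the name IntStack, the fixed capacity of 100, and the full() helper are not part of the notes. The array and the counter are private (the hidden data), and callers see only the interface functions.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity stack of ints, written as an ADT.
// The array and the element count are hidden; callers use only
// push(), pop(), top() and empty() (plus full(), an assumed helper).
class IntStack {
public:
    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == kCapacity; }

    // Stores value on top of the stack; returns false if the stack is full.
    bool push(int value) {
        if (full()) return false;
        cells_[count_++] = value;
        return true;
    }

    // Returns the top element without removing it.
    // The caller must check empty() first.
    int top() const { return cells_[count_ - 1]; }

    // Removes and returns the top element.
    // The caller must check empty() first.
    int pop() { return cells_[--count_]; }

private:
    static constexpr std::size_t kCapacity = 100;  // assumed bound
    int cells_[kCapacity];
    std::size_t count_ = 0;
};
```

A caller would write, for example, `IntStack s; s.push(1); int x = s.pop();` and never touch the array. The storage could later be replaced by a linked list without changing any calling code, which is the point of the encapsulation rule.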
Graph

Interface:
    n = search(data)
    add_neighbor(n, data)
    add_link(n1, n2)
    delete_link(n1, n2)
    delete_node(n)

Implementation with Arrays

All the above data structures can be implemented with arrays, but in most situations the fixed size of an array makes it unsuitable, particularly when the amount of data to be held is unknown.

Tables: Tables include data dictionaries, for example a name and address table. Here the table may be searched given the name. The search maps the given name into an index corresponding to the entry containing that name.

[Figure: a table of N entries, selected by index, with columns name, address, age, etc.]

Linear search: If the entries are unordered, the search takes O(N) time, where each “probe” requires a string comparison. find_free_cell() also requires O(N) time. Given an index, insert() and delete() take constant time, O(1).

Binary search: For binary search, the entries must be ordered (sorted alphabetically by name if we want to search on the name). Keeping the table entries ordered implies O(N) time for insertion and deletion. The search proceeds as follows in O(log N) time.

Binary Search

In C-like pseudocode (the test "key is beyond Index" stands for a comparison of the key with entry Index):

    Base_Index = 0;
    Top_Index = N-1;
    found = 0;                                  /* false */
    while (!found && Base_Index <= Top_Index) { /* stop when the range is empty */
        Index = (Base_Index + Top_Index)/2;
        found = probe_element(Index);           /* match? */
        if (found) break;
        if (key is beyond Index) Base_Index = Index+1;
        else Top_Index = Index-1;
    }
    /* value of found tells if the search was successful */

Hash Tables

In a closed hash table, there is a mathematical mapping from the key (name) to the initial probe index. For example, the hash function for a table of N entries might be:

    hash = (sum of characters in name) % N

This function has the disadvantage that names made of the same letters, such as Fred, redF, and derF, all hash to the same initial probe index.
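The collision described above is easy to check. The following is a small sketch of the sum-of-characters hash; the function name simple_hash and the table size 101 used in the note below are assumptions chosen for illustration, not values from the notes.

```cpp
#include <cstddef>
#include <string>

// Sum-of-characters hash: (sum of characters in name) % N.
// table_size plays the role of N and must be non-zero.
std::size_t simple_hash(const std::string& name, std::size_t table_size) {
    std::size_t sum = 0;
    for (unsigned char c : name) {
        sum += c;  // the order of the characters does not affect the sum
    }
    return sum % table_size;
}
```

With ASCII codes, F + r + e + d = 70 + 114 + 101 + 100 = 385, so `simple_hash("Fred", 101)`, `simple_hash("redF", 101)`, and `simple_hash("derF", 101)` all give 385 % 101 = 82. Any rearrangement of the same letters collides, because addition ignores position.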
There are much better hash functions available:

    hash = (sum of name[i]*prime[i]) % N

Here each character is multiplied by a position-dependent weight (1, then the primes 2, 3, 5, ...) before summing. With ASCII codes I=73, V=86, O=79, R=82, the name IVOR gives (I*1 + V*2 + O*3 + R*5) % N = 892 % N.

The search is as follows:

    index = hash(name);
    count = found = 0;
    while (!found && count < N) {
        found = probe(index);
        if (found) break;
        index = rehash(name, index);
        count++;
    }

Hash Table Performance

Average times, in probes, where a = fraction of cells occupied and log is the natural logarithm:

    Insertion or successful search:    -(1/a) log(1-a)
    Deletion or unsuccessful search:   1/(1-a)

Queues Using Arrays

The circular list, or ring buffer, is often used for a queue:

[Figure: an array of cells 0 to N-1 holding the queue a, b, ..., y, z, shown once stored contiguously and once wrapped around the end of the array, together with the equivalent ring drawing.]

tail is the index of the first free cell.

Insert at tail, if not full:
    put data in cell tail;
    tail = (tail+1) % N;

Delete from head, if not empty:
    remove data from cell head;
    head = (head+1) % N;

When the queue is either full or empty, the two indices have the same value, so we cannot distinguish these two cases. A better way is to use an index for the head, and a count of the number of cells occupied.
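As a hedged sketch of the head-plus-count idea, the class below keeps a head index and a count instead of head and tail indices. The class name RingQueue, the template parameter, and the bool return values are assumptions made for this illustration; the operation names follow the Queue interface given earlier.

```cpp
#include <cstddef>

// Hypothetical ring-buffer queue of ints holding up to N items.
// It stores the head index and a count rather than head and tail,
// so empty (count == 0) and full (count == N) are distinguishable.
template <std::size_t N>
class RingQueue {
public:
    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == N; }

    // Insert at the tail; returns false if the queue is full.
    bool add_to_tail(int value) {
        if (full()) return false;
        cells_[(head_ + count_) % N] = value;  // first free cell
        ++count_;
        return true;
    }

    // Remove from the head into out; returns false if the queue is empty.
    bool remove_from_hd(int& out) {
        if (empty()) return false;
        out = cells_[head_];
        head_ = (head_ + 1) % N;  // wrap around the end of the array
        --count_;
        return true;
    }

private:
    int cells_[N];
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

With `RingQueue<3> q;`, three calls to add_to_tail succeed and a fourth returns false. With head and tail indices alone, that full state would look identical to the empty one.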
[Figure: array cells 0 to N-1 with head and tail markers.]

Queues Using Arrays

Code when we use an index for head and a count: (head+count)%N gives the first free cell.

Insert at tail, if not full:
    put data in cell (head+count)%N;
    count++;

Delete from head, if not empty:
    remove data from cell head;
    head = (head+1) % N;
    count--;

[Figure: array cells 0 to N-1 with a head marker and a count.]

Stacks Using Arrays

We use an array in which the data doesn’t move when we do a push() or a pop(); top is the index of the top element:

push, if not full:
    insert data at cell top+1;
    top++;

pop, if not empty:
    remove data from cell top;
    top--;

[Figure: array cells 0, 1, ... with a top marker.]

Binary Trees Using Arrays

All trees can be implemented using binary trees. Here is an array structure for a binary tree:

    Array index:   1  2  3  4  5  6  7
    Node:          a  b  c  d  e  f  g

Note that array element zero is not used.

[Figure: the tree with root a (1), children b (2) and c (3), and leaves d (4), e (5), f (6), g (7).]

A node with index i (i>1) has its parent at node i/2 (integer division).
The left child of the node with index i is at node 2*i.
The right child of the node with index i is at node 2*i+1.

Data Structures Using Pointers

Structures using pointers are particularly useful when the maximum number of elements that must be stored is not known. This is often the case in practice, especially when writing modules for a software library, where the actual users will have many different needs.

Linked List:

[Figure: head points to a chain of nodes, each holding data d and a pointer p; the last node’s pointer is NULL.]

Each node contains some data d (a record) and a pointer p. A pointer head points to the first element of the list. The pointer of the last element of the list contains NULL. The nodes are implemented using structs.
Linked Lists in C

Here is the declaration of the node type and the head pointer. A struct is used so that the fields are accessible to the functions that follow (~ stands for further fields):

    #include <stddef.h>     /* NULL */
    #include <string.h>     /* strcmp, used by search_list */

    struct List_node {
        char * name;
        int age;
        /* ~ */
        List_node * next;
    };
    List_node * head = NULL;

Here is a function to add new_node at the head of the list:

    void add_to_list(List_node * new_node)
    {
        new_node -> next = head;    /* step 1 */
        head = new_node;            /* step 2 */
    }

[Figure: step 1 links new_node to the old first node; step 2 moves head to new_node.]

Linked Lists in C cont’d

Function to remove the first element from the list and return a pointer to it (NULL if the list is empty):

    List_node * remove_first_element(void)
    {
        List_node *first = head;                    /* step 1 */
        if (head != NULL) head = head -> next;      /* step 2 */
        return first;
    }

[Figure: step 1 points first at the first node; step 2 moves head to the second node.]

Function to search a linked list for a certain name and return a pointer to the node if a match is found (NULL otherwise):

    List_node * search_list(char * key)
    {
        List_node * ptr = head;
        while (ptr != NULL) {
            if (strcmp(key, ptr -> name) == 0) break;
            ptr = ptr -> next;
        }
        return ptr;
    }

Queues Using Linked Lists

In a queue, data is added at one end (the tail) and removed from the other end (the head). A linked list is adequate for this purpose with the addition of a pointer to the tail:

    struct Queue_node {
        char * name;
        int age;
        /* ~ */
        Queue_node * next;
    };

    struct Queue {
        Queue_node * head;
        Queue_node * tail;
        /* ~ */
    };

    Queue q = { NULL, NULL };   /* assumed: the slide uses q without declaring it */

[Figure: a Queue record whose head points to the first node and whose tail points to the last node of the list.]

Inserting at the Queue Tail

    void add_to_tail(Queue_node * elem)
    {
        elem -> next = NULL;        /* elem becomes the last node */
        if (q.tail == NULL)         /* queue is empty */
            q.head = q.tail = elem;
        else {
            q.tail -> next = elem;
            q.tail = elem;
        }
    }

[Figure: the old last node is linked to elem, then tail is moved to elem.]

Stacks Using Linked Lists

Data is added and removed from the same end (the head) of the list.
Implementation is trivial using a singly linked list:

    struct Stack_node {
        int datum;
        Stack_node * next;
    };

    struct Int_stack {
        Stack_node * front;
    };

    Int_stack s;    /* global, so s.front starts as NULL */

[Figure: Int_stack s, whose front pointer leads to a chain of nodes ending in NULL.]

Stacks Using Linked Lists cont’d

Here the interface uses the actual data, not pointers to nodes:

    void push(int new_datum)
    {
        Stack_node *ptr = new Stack_node;
        ptr -> datum = new_datum;
        ptr -> next = s.front;      /* step 1 */
        s.front = ptr;              /* step 2 */
    }

[Figure: step 1 links the new node to the old front node; step 2 moves front to the new node.]

pop() must free the space occupied by the top node. Because push() allocates with new, the node is released with delete:

    int pop(void)
    {
        Stack_node *ptr = s.front;          /* step 1 */
        int result;
        if (ptr != NULL) {
            result = ptr -> datum;
            s.front = ptr -> next;          /* step 2 */
            delete ptr;                     /* step 3 */
            return result;
        }
        return ERROR;   /* a special value to indicate empty stack */
    }

Pop in action:

[Figure: step 1 points ptr at the front node; step 2 moves front to the next node; step 3 frees the node at ptr.]

Binary Trees Using Pointers

The basic structure uses two pointers in each node:

    struct Tree_node {
        char * name;
        int age;
        /* ~ */
        Tree_node * left_child;
        Tree_node * right_child;
    };

[Figure: a node with fields name, age, ~, left_child, right_child.]

In some applications it is also necessary to include a pointer to the parent node.
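As a small illustration of the parent-pointer variant, the sketch below stores only an int key and adds a parent field. The names PNode, set_left, set_right, and depth are hypothetical and chosen for this example; the notes do not define them.

```cpp
// Hypothetical variant of Tree_node with a parent pointer.
// Only an int key is stored, to keep the sketch short.
struct PNode {
    int key;
    PNode* parent;
    PNode* left_child;
    PNode* right_child;
    explicit PNode(int k)
        : key(k), parent(nullptr), left_child(nullptr), right_child(nullptr) {}
};

// Attach c as the left child of p, keeping the parent pointer consistent.
void set_left(PNode* p, PNode* c) {
    p->left_child = c;
    if (c != nullptr) c->parent = p;
}

// Attach c as the right child of p, keeping the parent pointer consistent.
void set_right(PNode* p, PNode* c) {
    p->right_child = c;
    if (c != nullptr) c->parent = p;
}

// Number of links from n up to the root (the root has depth 0).
int depth(const PNode* n) {
    int d = 0;
    while (n->parent != nullptr) {
        n = n->parent;
        ++d;
    }
    return d;
}
```

The cost of the extra pointer is that every operation that relinks a child must also update that child's parent field, which is why the two attach helpers are the only places where child pointers are assigned.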
Preorder Traversals

Trees can be traversed in a number of standard orders. Pass in a pointer to the root and the function visits all nodes:

    void preorder(Tree_node *n)
    {
        if (n == NULL) return;
        print(n);                       /* access data */
        preorder(n -> left_child);
        preorder(n -> right_child);
    }

gives order 1, 2, 4, 5, 3, 6, 7

[Figure: the example tree with root a (1), children b (2) and c (3), and leaves d (4), e (5), f (6), g (7).]

Postorder Traversals

    void postorder(Tree_node * n)
    {
        if (n == NULL) return;
        postorder(n -> left_child);
        postorder(n -> right_child);
        print(n);
    }

gives order 4, 5, 2, 6, 7, 3, 1

Inorder Traversal

    void inorder(Tree_node * n)
    {
        if (n == NULL) return;          /* also covers nodes with one child */
        inorder(n -> left_child);
        print(n);
        inorder(n -> right_child);
    }

gives order 4, 2, 5, 1, 6, 3, 7

We can tell if *n is a leaf node by testing whether both its child pointers are NULL, but the NULL test above makes a separate leaf case unnecessary and is safe when a node has only one child.

Binary Search Trees

In order to enable fast searches, the elements of the left and right sub-trees of any node n must have key values that are related to the key value in that node. This relationship must apply to all interior nodes, including the root node. In the binary search tree, the following relationship holds:

    keys in left sub-tree of n < key of n < keys in right sub-tree of n

[Figure: node n with key x, a left sub-tree of keys < x and a right sub-tree of keys > x; example tree with root 7, children 4 and 9, and leaves 2, 6, 8, 11.]

Search, insert, and delete can all be done in O(log n) time on average. If the tree is kept balanced, these operations take O(log n) time in the worst case as well.
Here is the node type:

    struct Tree_node {
        int key;
        /* ~ */
        Tree_node * left_child, * right_child;
    };

Binary Search Tree: Search

    Tree_node * search(int skey, Tree_node * n)
    {
        if (n == NULL) return NULL;             /* not found */
        if (skey == n -> key) return n;         /* found */
        if (skey < n -> key)
            return search(skey, n -> left_child);
        else
            return search(skey, n -> right_child);
    }

[Figure: example tree with root 7, children 4 and 9, and leaves 2, 6, 8, 11.]

Binary Search Tree: Insertion

    void insert(Tree_node * new_node, Tree_node ** n)
    {
        if (*n == NULL) {           /* we have reached an empty position */
            *n = new_node;          /* *n now points to new_node */
            new_node -> left_child = NULL;
            new_node -> right_child = NULL;
        }
        else if (new_node -> key < (*n) -> key)
            insert(new_node, &((*n) -> left_child));
        else if (new_node -> key > (*n) -> key)
            insert(new_node, &((*n) -> right_child));
        /* equal keys: the new node is not inserted */
    }

Insertion Example

[Figure: an example tree rooted at 10 into which key 45 is inserted; ~ means NULL. The insertion changes a right_child pointer from NULL to the address of the new node.]

The insertion begins with a search for the key value to be inserted. When this position is located, n contains the address of the left_child or right_child pointer of the parent node. This pointer value is changed to point to the new node.

Binary Search Trees, Delete_min

We will use value semantics this time.
This function deletes the node with the smallest key and returns its value. It must be called on a non-empty tree, and it assumes the nodes were allocated with malloc() (free() is declared in <stdlib.h>):

    int delete_min(Tree_node **n)
    {
        int result;
        if ((*n) -> left_child == NULL) {
            Tree_node *old = *n;            /* the node with the smallest key */
            result = old -> key;
            *n = old -> right_child;        /* splice it out of the tree */
            free(old);                      /* free the removed node, not its successor */
            return result;
        }
        else return delete_min(&((*n) -> left_child));
    }

[Figure: example tree with root 7, children 4 and 9, and nodes 6, 8, 11 below them.]

Binary Search Trees, Deletion

delete_node() removes the node holding a given key. (It cannot be called delete, which is a reserved word in C++.) A node with two children is not unlinked; its key is replaced by the smallest key of its right sub-tree:

    void delete_node(int value, Tree_node **n)
    {
        if (*n == NULL) return;                     /* key not present */
        if (value < (*n) -> key)
            delete_node(value, &((*n) -> left_child));
        else if (value > (*n) -> key)
            delete_node(value, &((*n) -> right_child));
        else if ((*n) -> left_child != NULL && (*n) -> right_child != NULL)
            (*n) -> key = delete_min(&((*n) -> right_child));
        else {                                      /* zero or one child */
            Tree_node *old = *n;
            *n = (old -> left_child != NULL) ? old -> left_child
                                             : old -> right_child;
            free(old);
        }
    }

Completely Balanced Binary Trees

Keeping a binary tree completely balanced implies O(n) time for insertion (actually for rebalancing after insertion):

[Figure: inserting 1 into the tree with root 5, children 3 (with children 2 and 4) and 7 (with child 6), then rebalancing, gives the tree with root 4, children 2 (with children 1 and 3) and 6 (with children 5 and 7).]

The rebalance operation here required a change to every node, taking O(n) time. For this reason we use “almost balanced binary trees,” such as AVL and 2-3 trees.

AVL Trees

First we define the height of a binary tree to be the length of the longest path from the root to a leaf node.

The AVL property: if N is a node in a binary tree, node N has the AVL property if the heights of its left and right sub-trees differ by no more than 1.

If all nodes, including the root, have the AVL property, then the tree is an AVL tree. It is possible to do insert, delete, and search in O(log n) time in an AVL tree.

2-3 Trees

A 2-3 tree has the following properties:
• Each interior node has 2 or 3 children.
• Each path from the root to a leaf node has the same length.
[Figure: an example 2-3 tree. Each interior node records the smallest descendant of its 2nd child and the smallest descendant of its 3rd child, or "-" if it has no 3rd child. The root holds (7, 16); its children hold (5, -), (8, 12), and (19, -); the leaves are 2 5, 7 8 12, and 16 19.]

In this version, all data are recorded in leaf nodes.

Searching a 2-3 Tree

Assume that the search has reached the interior node holding (y, z) and we are searching for the key x. If x < y, go to the first child next. If y <= x and either x < z or there is no third child, go to the 2nd child next. If x >= z, go to the third child next.

[Figure: a node (y, z) with branches labelled x<y, y<=x<z, and z<=x; a node (y, -) with branches labelled x<y and x>=y.]

Insertion into 2-3 Trees

Say we wish to insert new_node with key x. We first search for x and stop at an interior node N, just above the leaf nodes, where the leaf containing x would be if it existed. If N has only 2 children, new_node becomes a new child of N, placed in the proper order:

[Figure: a node (5, -) with leaves 2 and 5, shown after inserting 4 and, separately, after inserting 6.]

Insertion into 2-3 Trees cont’d

A node has to be split if it already has 3 children:

[Figure: inserting 3 under a node (4, 5) with leaves 2, 4, 5 splits it into two nodes with two leaves each.]

A special case occurs when the root has to be split: a new root is created above the two halves and the tree grows by one level.

[Figure: inserting 4 under a root (3, 7) with leaves 2, 3, 7 splits the root and creates a new root above the two halves.]

The case shown, in which the root is also the parent of the leaves, occurs only when there are exactly 3 data elements in the tree before the insertion. In a larger tree the root is split when splitting propagates all the way up from the leaves.

Deletion

When a leaf node is deleted, its parent P may be left with only one child. If P has a neighboring sibling Q with 3 leaves, the leaves of P and Q can be shared, 2 each, between them:

[Figure: deleting leaf 3 leaves P with the single leaf 2, while Q has leaves 4, 6, 7. After sharing, P holds leaves 2 and 4 and Q holds leaves 6 and 7.]

This process does not lead to further recursion up the tree.

If P does not have a neighboring sibling with 3 children, then it must have a neighboring sibling Q with 2 children.
We combine P and Q, giving all three leaves to the combined node PQ:

[Figure: deleting leaf 3 leaves P with the single leaf 2, while Q has leaves 4 and 7. The combined node PQ holds leaves 2, 4, 7.]

As we can see, this process may leave the parent of PQ with 1 child, so the process must continue up the tree recursively.

When the root node has only one child, the root may be deleted, and its single child becomes the new root node:

[Figure: the old root, with PQ as its only child, is removed and PQ (leaves 2, 4, 7) becomes the new root.]

We may reach the root, leaving it in this condition, by recursion up the tree.

Struct for 2-3 Trees

    struct two_three_node {
        int kind;                           // 0=leaf, 1=interior
        int low_of_second, low_of_third;
        int key;
        /* ~ */
        two_three_node * left_child;
        two_three_node * right_child;
    };
    two_three_node * root;                  // pointer to root node

Could use a union of the 2 node types.

We shall not pursue the details of the coding of the algorithms.

Rotations in AVL Trees

Rotations are used to reestablish the AVL property after an insertion:

[Figure: Single Rotation to the right. Node A, with left child B (sub-trees T1, T2) and right sub-tree T3, becomes node B with left sub-tree T1 and right child A (sub-trees T2, T3). Single Rotation to the left is the mirror image.]

Double Rotations in AVL Trees

[Figure: Double Rotation to Right and Double Rotation to Left, drawn on nodes A, B, C with sub-trees T1, T2, T3, T4. Each double rotation is two single rotations and lifts the middle key to the top of the three nodes.]

Example of Building an AVL Tree

Say we have the keys 19, 10, 3, 5, 20, 13, 17, 15, 1, 8, 6 to be inserted in that order:

[Figure: inserting 19, 10, 3 gives a left chain, which a single rotation to the right (SRR) turns into root 10 with children 3 and 19. Adding 5, 20, 13, 17 gives root 10; 3 with right child 5; 19 with children 13 (right child 17) and 20.]

The addition of 15 causes an imbalance at node 13, so a double left rotation (DRL) is required:

[Figure: after the DRL, 15 replaces 13 as the left child of 19, with children 13 and 17. Keys 1 and 8 are then added as the left child of 3 and the right child of 5.]

AVL Tree Example cont’d

The addition of 6 causes an imbalance at node 5. A double left rotation is needed to fix it.
[Figure: after the DRL, 6 replaces 5 as the right child of 3, with children 5 and 8.]

This gives the final AVL tree:

[Figure: root 10; left child 3 with children 1 and 6 (which has children 5 and 8); right child 19 with children 15 (which has children 13 and 17) and 20.]

AVL Tree Implementation

The only addition to the nodes is a balance factor, which gives the difference between the heights of the left and right sub-trees and should be +1, 0, or -1 in an AVL tree.

    struct AVL_node {
        int balance_factor;
        int key;
        /* ~ */
        AVL_node * left_child;
        AVL_node * right_child;
    };

We shall not pursue the details of the coding of the algorithms.
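For readers who want to see the rotations in code, here is one possible sketch of AVL insertion. It is not the author's implementation and it departs from the node layout above in one respect: each node stores the height of its sub-tree (counted in nodes, so an empty tree has height 0) instead of a balance factor, because that keeps the bookkeeping short. All names (AvlNode, rotate_right, rotate_left, avl_insert) are assumptions for this illustration, and nodes are never freed.

```cpp
#include <algorithm>

// Hypothetical AVL node that stores a height rather than a balance factor.
struct AvlNode {
    int key;
    int height;  // nodes on the longest path down to a leaf (a leaf has 1)
    AvlNode* left;
    AvlNode* right;
    explicit AvlNode(int k) : key(k), height(1), left(nullptr), right(nullptr) {}
};

int height(const AvlNode* n) { return n ? n->height : 0; }

void update(AvlNode* n) {
    n->height = 1 + std::max(height(n->left), height(n->right));
}

// Left height minus right height; -1, 0 or +1 when n has the AVL property.
int balance(const AvlNode* n) { return height(n->left) - height(n->right); }

// Single rotation to the right: the left child becomes the sub-tree root.
AvlNode* rotate_right(AvlNode* a) {
    AvlNode* b = a->left;
    a->left = b->right;
    b->right = a;
    update(a);
    update(b);
    return b;
}

// Single rotation to the left: the right child becomes the sub-tree root.
AvlNode* rotate_left(AvlNode* a) {
    AvlNode* b = a->right;
    a->right = b->left;
    b->left = a;
    update(a);
    update(b);
    return b;
}

// Inserts key (duplicates are ignored) and returns the new sub-tree root.
AvlNode* avl_insert(AvlNode* n, int key) {
    if (n == nullptr) return new AvlNode(key);
    if (key < n->key) n->left = avl_insert(n->left, key);
    else if (key > n->key) n->right = avl_insert(n->right, key);
    else return n;
    update(n);
    if (balance(n) > 1) {  // left side too tall
        // Zig-zag shape: rotate the child first (a double rotation).
        if (balance(n->left) < 0) n->left = rotate_left(n->left);
        return rotate_right(n);
    }
    if (balance(n) < -1) {  // right side too tall
        if (balance(n->right) > 0) n->right = rotate_right(n->right);
        return rotate_left(n);
    }
    return n;
}
```

Starting from `AvlNode* root = nullptr;` and calling `root = avl_insert(root, k);` for the keys 19, 10, 3, 5, 20, 13, 17, 15, 1, 8, 6 in that order builds the same final tree as the worked example: root 10, left child 3 with children 1 and 6, and right child 19 with children 15 and 20.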