More on Data Structures in C
CS-2301 System Programming
D-term 2009
(Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and
from C: How to Program, 5th and 6th editions, by Deitel and Deitel)
CS-2301 D-term 2009
More on Data Structures
in C
Linked List Review
• Linear data structure
• Easy to grow and shrink
• Easy to add and delete items
• Time to search for an item – O(n)
Linked List (continued)
struct listItem *head;
[Diagram: head points to a chain of four items, each holding a payload and a next pointer]
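The list pictured above could be declared and searched roughly as in the following minimal sketch. The int payload type and the function name findItem are assumptions for illustration; the slides do not specify them.

```c
#include <stddef.h>

/* One list item; an int payload is assumed for illustration. */
struct listItem {
    int payload;
    struct listItem *next;
};

/* Walk the list from head and return the first item whose payload
 * equals key, or NULL if there is none. In the worst case every
 * item is visited, which is where the O(n) search time comes from. */
struct listItem *findItem(struct listItem *head, int key) {
    for (struct listItem *p = head; p != NULL; p = p->next)
        if (p->payload == key)
            return p;
    return NULL;
}
```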
Doubly-Linked List (review)
struct listItem *head, *tail;
[Diagram: head and tail point to the two ends of a chain of four items, each holding a payload, a prev pointer, and a next pointer]
AddAfter(item *p, item *new)
Simple linked list
{ new->next = p->next;
  p->next = new;
}
Doubly-linked list
{ new->next = p->next;
  if (p->next)
    p->next->prev = new;
  new->prev = p;
  p->next = new;
}
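A compilable version of the doubly-linked case might look like the sketch below. The item type with an int payload, the parameter name newItem, and the optional tail parameter are assumptions for illustration; the slide's fragment leaves any tail pointer to the caller.

```c
#include <assert.h>
#include <stddef.h>

typedef struct item {
    int payload;              /* assumed payload type */
    struct item *prev, *next;
} item;

/* Insert newItem immediately after p. If p was the last item and a
 * tail pointer is supplied, *tail is updated to point at newItem. */
void addAfter(item *p, item *newItem, item **tail) {
    assert(p != NULL && newItem != NULL);
    newItem->next = p->next;
    if (p->next != NULL)
        p->next->prev = newItem;
    else if (tail != NULL)
        *tail = newItem;      /* p was the tail */
    newItem->prev = p;
    p->next = newItem;
}
```

The order of the assignments matters: p->next is read before it is overwritten in the last statement.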
deleteNext(item *p)
Simple linked list
{ if (p->next != NULL)
    p->next = p->next->next;
}
Doubly-linked list
• Complicated: the prev pointer of the following item must also be updated
• Easier to deleteItem
deleteItem(item *p)
Simple linked list
• Not possible without having a pointer to previous item!
Doubly-linked list
{ if (p->next != NULL)
    p->next->prev = p->prev;
  if (p->prev != NULL)
    p->prev->next = p->next;
}
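The doubly-linked fragment above leaves the list's head and tail pointers, and the storage of the deleted item, to the caller. A sketch that handles those cases follows. The item type, the head and tail parameters, and the assumption that items were allocated with malloc are illustrative and not taken from the slides.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct item {
    int payload;              /* assumed payload type */
    struct item *prev, *next;
} item;

/* Unlink p from the list described by *head and *tail, then free it.
 * Assumes p is in that list and was allocated with malloc. */
void deleteItem(item **head, item **tail, item *p) {
    assert(head != NULL && tail != NULL && p != NULL);
    if (p->next != NULL)
        p->next->prev = p->prev;
    else
        *tail = p->prev;      /* p was the last item */
    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        *head = p->next;      /* p was the first item */
    free(p);
}
```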
[Diagram: three doubly-linked items, each holding a payload, a prev pointer, and a next pointer]
Special Cases of Linked Lists
• Queue:–
– Items always added to tail
– Items always removed from head
• Stack:–
– Items always added to head
– Items always removed from head
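Both special cases can be sketched with a singly-linked list, since removal always happens at the head. The item type and the function names push, enqueue, and removeHead below are assumptions chosen for illustration.

```c
#include <stddef.h>

typedef struct item {
    int payload;              /* assumed payload type */
    struct item *next;
} item;

/* Stack: add at the head. Returns the new head. */
item *push(item *head, item *newItem) {
    newItem->next = head;
    return newItem;
}

/* Queue: add at the tail. */
void enqueue(item **head, item **tail, item *newItem) {
    newItem->next = NULL;
    if (*tail != NULL)
        (*tail)->next = newItem;
    else
        *head = newItem;      /* queue was empty */
    *tail = newItem;
}

/* Stack and queue: remove from the head. Returns the removed item,
 * or NULL if the list is empty. Pass tail == NULL for a stack. */
item *removeHead(item **head, item **tail) {
    item *p = *head;
    if (p != NULL) {
        *head = p->next;
        if (*head == NULL && tail != NULL)
            *tail = NULL;     /* list became empty */
        p->next = NULL;
    }
    return p;
}
```

Keeping a tail pointer is what makes adding to the end of the queue a constant-time operation rather than a walk down the list.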
Bubble Sort a Linked List
item *BubbleSort(item *p) {
  if (p->next != NULL) {
    item *q = p->next, *qq = p;
    for (; q != NULL; qq = q, q = q->next)
      if (p->payload > q->payload) {
        /* swap p and q */
      }
    p->next = BubbleSort(p->next);
  };
  return p;
}
Bubble Sort a Linked List
• p: head of (sub)list being sorted
• q: pointer to step thru (sub)list
• qq: pointer to item previous to q in (sub)list
item *BubbleSort(item *p) {
  if (p->next != NULL) {
    item *q = p->next, *qq = p;
    for (; q != NULL; qq = q, q = q->next)
      if (p->payload > q->payload) {
        item *temp = p->next;
        p->next = q->next; q->next = temp;
        qq->next = p; p = q;
      }
    p->next = BubbleSort(p->next);
  };
  return p;
}
Potential Exam Questions
• Analyze BubbleSort to determine if it is
correct, and fix it if incorrect.
• Hint: you need to define “correct”
• Hint2: you need to define a loop invariant to
convince yourself
• Draw a diagram showing the nodes,
pointers, and actions of the algorithm
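For comparison when working through these questions, here is one possible sorting routine that is easier to reason about because it exchanges payloads instead of relinking nodes. It is a swapped-in variant, not the BubbleSort from the slides; the int payload and the name bubbleSortPayloads are assumptions. The loop invariant is stated in the comment.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct item {
    int payload;              /* assumed payload type */
    struct item *next;
} item;

/* Sort the list into ascending payload order by repeatedly swapping
 * the payloads of adjacent out-of-order items.
 * Invariant: after each pass, every item from `sorted` to the end of
 * the list holds its final value. */
void bubbleSortPayloads(item *head) {
    item *sorted = NULL;      /* start of the already-sorted suffix */
    bool swapped = true;
    while (head != NULL && swapped) {
        swapped = false;
        item *p = head;
        while (p->next != sorted) {
            if (p->payload > p->next->payload) {
                int tmp = p->payload;
                p->payload = p->next->payload;
                p->next->payload = tmp;
                swapped = true;
            }
            p = p->next;
        }
        sorted = p;           /* largest remaining value is now at p */
    }
}
```

"Correct" here means that on return the payloads appear in non-decreasing order and the list still contains exactly the original values.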
Observations:–
• What is the order (Big-O notation) of the
Bubble Sort algorithm?
• Answer: O(n²)
• Note that Quicksort is faster – O(n log n) on
average
• Pages 87 & 110 in Kernighan and Ritchie
• Potential exam question:– why?
Questions?
Binary Tree (review)
• A linked list but with
two links per item
struct treeItem {
  type payload;
  struct treeItem *left;
  struct treeItem *right;
};
[Diagram: a binary tree of items, each holding a payload, a left pointer, and a right pointer]
Binary Trees (continued)
• Two-dimensional data structure
• Easy to grow and shrink
• Easy to add and delete items at leaves
• More work needed to insert or delete branch nodes
• Search time is O(log n)
• If tree is reasonably balanced
• Degenerates to O(n) in worst case if unbalanced
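The O(log n) search time assumes the tree is kept ordered, that is, used as a binary search tree. A minimal sketch follows, assuming an int payload and the hypothetical function names treeInsert and treeSearch.

```c
#include <stdlib.h>

struct treeItem {
    int payload;              /* assumed payload type */
    struct treeItem *left, *right;
};

/* Insert value as a new leaf and return the (possibly new) root.
 * Duplicates are ignored. If malloc fails the tree is unchanged. */
struct treeItem *treeInsert(struct treeItem *root, int value) {
    if (root == NULL) {
        struct treeItem *t = malloc(sizeof *t);
        if (t != NULL) {
            t->payload = value;
            t->left = t->right = NULL;
        }
        return t;
    }
    if (value < root->payload)
        root->left = treeInsert(root->left, value);
    else if (value > root->payload)
        root->right = treeInsert(root->right, value);
    return root;
}

/* Each step discards one whole subtree, so a reasonably balanced
 * tree needs about log2(n) steps; a degenerate one needs up to n. */
struct treeItem *treeSearch(struct treeItem *root, int value) {
    while (root != NULL && root->payload != value)
        root = (value < root->payload) ? root->left : root->right;
    return root;
}
```

Inserting keys in already-sorted order produces the unbalanced worst case: every new item becomes the right child of the previous one, and the tree behaves like a linked list.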
Order of Traversing Binary Trees
• In-order
• Traverse left sub-tree (in-order)
• Visit node itself
• Traverse right sub-tree (in-order)
• Pre-order
• Visit node first
• Traverse left sub-tree
• Traverse right sub-tree
• Post-order
• Traverse left sub-tree
• Traverse right sub-tree
• Visit node last
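The three orders differ only in where the visit falls relative to the two recursive calls, as the sketch below shows. To keep it self-contained, "visiting" a node here means appending its payload to an output array; the int payload and the function signatures are assumptions for illustration.

```c
#include <stddef.h>

struct treeItem {
    int payload;              /* assumed payload type */
    struct treeItem *left, *right;
};

/* Each function appends payloads to out[] starting at index n and
 * returns the new count. The caller must make out[] large enough. */
int inOrder(const struct treeItem *t, int out[], int n) {
    if (t == NULL) return n;
    n = inOrder(t->left, out, n);
    out[n++] = t->payload;            /* visit between the subtrees */
    return inOrder(t->right, out, n);
}

int preOrder(const struct treeItem *t, int out[], int n) {
    if (t == NULL) return n;
    out[n++] = t->payload;            /* visit first */
    n = preOrder(t->left, out, n);
    return preOrder(t->right, out, n);
}

int postOrder(const struct treeItem *t, int out[], int n) {
    if (t == NULL) return n;
    n = postOrder(t->left, out, n);
    n = postOrder(t->right, out, n);
    out[n++] = t->payload;            /* visit last */
    return n;
}
```

For a root holding 2 with left child 1 and right child 3, in-order yields 1, 2, 3; pre-order yields 2, 1, 3; post-order yields 1, 3, 2.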
Example of Binary Tree
x = (a.real*b.imag - b.real*a.imag) /
    sqrt(a.real*b.real - a.imag*b.imag)
[Expression tree; children are indented under their operator]
=
  x
  /
    -
      *
        .
          a
          real
        .
          b
          imag
      *
        .
          b
          real
        .
          a
          imag
    sqrt
      …
Question
• What kind of traversal order is required for
this expression?
• In-order?
• Pre-order?
• Post-order?
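One way to explore this question is to evaluate a small expression tree by hand or in code. Applying an operator needs the values of both operands first, which suggests visiting the operator node after its subtrees. The sketch below assumes a hypothetical exprNode type restricted to numeric leaves and the four arithmetic operators; it is not how any particular compiler represents expressions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node: op is '+', '-', '*' or '/' for an
 * operator node, or 0 for a leaf holding a number in value. */
struct exprNode {
    char op;
    double value;
    struct exprNode *left, *right;
};

/* Evaluate both operand subtrees, then apply the operator stored at
 * this node: the node itself is visited last. */
double eval(const struct exprNode *e) {
    assert(e != NULL);
    if (e->op == 0)
        return e->value;
    double l = eval(e->left);
    double r = eval(e->right);
    switch (e->op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return l / r;   /* '/' */
    }
}
```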
Binary Trees in Compilers
• Used to represent the structure of the
compiled program
• Optimizations
• Common sub-expression detection
• Code simplification
• Loop unrolling
• Parallelization
• Reductions in strength, e.g., substituting additions for multiplications
• Many others
Questions about Trees?
or about
Programming Assignment 6?
New Challenge
• What if we require a data structure that has
to be accessed by value in constant time?
• I.e., O(log n) is not good enough!
• Need to be able to add or delete items
• Total number of items unknown
• But an approximate maximum might be known
Examples
• Anti-virus scanner
• Symbol table of compiler
• Virtual memory tables in operating system
• Bank account for an individual
Observation
• Arrays provide constant time access …
• … but you have to know which element you want!
• We only know the contents of the item we want!
• Also
• Not easy to grow or shrink
• Not open-ended
• Can we do better?
Answer – Hash Table
• Definition:– Hash Table
• A data structure comprising an array (for constant time access)
• A set of linked lists (one list for each array element)
• A hashing function to convert search key to array index
Definition
• Search key:– a value stored as (part of) the
payload of the item you are looking for
• Need to find the item containing that value
(i.e., key)
• Definition:– Hashing function (or simply hash function)
• A function that takes the search key in question and “randomizes” it to produce an index
• So that non-random keys do not concentrate too many elements around a few indices in the array
• See §6.6 in Kernighan & Ritchie
Hash Table Structure
[Diagram: an array of items; each array element heads a linked list of items, each holding data and a next pointer]
Guidelines for Hash Tables
• Lists from each array element should be short
• I.e., with short search time (approximately constant)
• Size of array should be based on expected # of
entries
• Err on large side if possible
• Hashing function
• Should “spread out” the values relatively uniformly
• Multiplication and division by prime numbers usually work well
Example Hashing Function
• P. 144 of K & R
#define HASHSIZE 101
unsigned int hash(char *s) {
  unsigned int hashval;
  for (hashval = 0; *s != '\0'; s++)
    hashval = *s + 31 * hashval;
  return hashval % HASHSIZE;
}
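To see the function at work, the sketch below repeats it in compilable form and adds a hypothetical helper, bucketCounts, that tallies how many keys land in each array slot. The helper, its name, and the const qualifiers are assumptions for illustration and are not part of K&R.

```c
#include <stddef.h>

#define HASHSIZE 101

/* Form a hash value for string s: multiply the running value by 31,
 * add the next character, and finally reduce modulo the table size. */
unsigned int hash(const char *s) {
    unsigned int hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* Count how many of the n keys fall into each of the HASHSIZE slots.
 * Large counts in a few slots indicate a poor spread for these keys. */
void bucketCounts(const char *keys[], size_t n,
                  unsigned int counts[HASHSIZE]) {
    for (size_t i = 0; i < HASHSIZE; i++)
        counts[i] = 0;
    for (size_t i = 0; i < n; i++)
        counts[hash(keys[i])]++;
}
```

For example, with ASCII character codes, hash("a") is 97 and hash("ab") is (98 + 31 * 97) % 101 = 75, so the two keys fall into different slots.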
Using a Hash Table
struct item *lookup(char *s) {
  struct item *np;
  for (np = hashtab[hash(s)]; np != NULL; np = np->next)
    if (strcmp(s, np->data) == 0)
      return np; /* found */
  return NULL; /* not found */
}
Using a Hash Table (continued)
struct item *addItem(char *s, …) {
  struct item *np;
  unsigned int hv;
  if ((np = lookup(s)) == NULL) {
    np = malloc(sizeof(struct item));
    if (np == NULL)
      return NULL; /* out of memory */
    /* fill in s and data */
    np->next = hashtab[hv = hash(s)];
    hashtab[hv] = np;
  }
  return np;
}
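Putting the pieces together, a complete table for string keys might look like the sketch below. The layout of struct item (a single data field holding a private copy of the key), the global hashtab array, and the one-argument form of addItem are assumptions for illustration; the slides leave the remaining arguments and payload fields unspecified.

```c
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct item {
    char *data;               /* the search key (a private copy) */
    struct item *next;        /* next item in the same list */
};

static struct item *hashtab[HASHSIZE];  /* all lists start out empty */

unsigned int hash(const char *s) {
    unsigned int hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* Search only the one list selected by the hash value. */
struct item *lookup(const char *s) {
    for (struct item *np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->data) == 0)
            return np;        /* found */
    return NULL;              /* not found */
}

/* Return the existing item for s, or add a new one at the head of
 * its list. Returns NULL if memory cannot be allocated. */
struct item *addItem(const char *s) {
    struct item *np = lookup(s);
    if (np == NULL) {
        np = malloc(sizeof *np);
        if (np == NULL)
            return NULL;
        np->data = malloc(strlen(s) + 1);
        if (np->data == NULL) {
            free(np);
            return NULL;
        }
        strcpy(np->data, s);
        unsigned int hv = hash(s);
        np->next = hashtab[hv];
        hashtab[hv] = np;
    }
    return np;
}
```

Adding at the head of the list keeps insertion constant-time once the lookup has confirmed the key is absent.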
Hash Table Summary
• Widely used for constant time access
• Easy to build and maintain
• There is both an art and a science to the choice of hashing functions
• Consult textbooks, web, etc.
Questions?