Download Container data structures Topics for this lecture

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Rainbow table wikipedia , lookup

Hash table wikipedia , lookup

Linked list wikipedia , lookup

Array data structure wikipedia , lookup

Transcript
Chair of Software Engineering
Einführung in die Programmierung
Introduction to Programming
Prof. Dr. Bertrand Meyer
October 2006 – February 2007
Piotr Nienaltowski
Lecture 18: Container data structures
Topics for this lecture
Containers and genericity
Container operations
Lists
Arrays
Assessing algorithm performance: Big-O notation
Hash tables
Stacks and queues
Intro. to Programming, lecture 18: Container data structures
2
Intro. to Programming, lecture 18: Container data structures
3
1
The ultimate question
LINKED_LIST or ARRAY?
Intro. to Programming, lecture 18: Container data structures
4
Container data structures
Contain other objects (“items”)
Possible operations on a container:
¾ Insert an item
¾ Find out if an element is present
¾ Remove an item
¾ “Traverse” the structure to apply an operation to
every item
¾ ...
The way in which a container stores its items determines
the required storage space and the operation speed.
Intro. to Programming, lecture 18: Container data structures
5
A familiar container: list
after
before
item
start
count
index
1
forth
back
finish
(cursor)
Intro. to Programming, lecture 18: Container data structures
6
2
A standardized naming scheme
Container classes in EiffelBase use standard names for
basic container operations:
is_empty : BOOLEAN
has (v : G): BOOLEAN
count : INTEGER
item : G
make
put (v : G)
Intro. to Programming, lecture 18: Container data structures
7
Containers and genericity
How do we handle variants of a container class
distinguished only by the type of their items?
Solution: using genericity allows explicit type
parameterization consistent with static typing principles.
Container data structures are typically implemented as
generic classes.
LINKED_LIST [G]
l1 : LINKED_LIST [PERSON]
l2 : LINKED_LIST [STRING]
l3 : LINKED_LIST [ANY]
Intro. to Programming, lecture 18: Container data structures
8
Lists
A list is a container keeping items in a certain order.
Lists in EiffelBase have cursors.
after
before
item
index
1
start
back
count
forth
finish
(cursor)
Intro. to Programming, lecture 18: Container data structures
9
3
Cursor properties
The cursor ranges from 0 to count + 1:
0 <= index <= count + 1
If the cursor is at position 0 before is True:
before = (index = 0)
If the cursor is at position count + 1 after is True:
after = (index = count + 1)
In an empty list the cursor is at position 0:
is_empty = (count = 0)
Intro. to Programming, lecture 18: Container data structures
10
A specific implementation: (singly) linked lists
Intro. to Programming, lecture 18: Container data structures
11
Intro. to Programming, lecture 18: Container data structures
12
Adding a cell
4
The corresponding command
put_right (v : G)
-- Add v to right of cursor position; do not move cursor.
require
not_after: not after
local
p : LINKABLE [G]
do
create p.make (v)
if before then
p.put_right (first_element)
first_element := p
active := p
else
p.put_right (active.right)
active.put_right (p)
end
count := count + 1
ensure
next_exists: active.right /= Void
inserted: (not old before) implies active.right.item = v
inserted_before: (old before) implies active.item = v
end
Intro. to Programming, lecture 18: Container data structures
13
Intro. to Programming, lecture 18: Container data structures
14
Removing a cell
The corresponding command
Do remove as exercise.
Intro. to Programming, lecture 18: Container data structures
15
5
Inserting at the end: extend
Intro. to Programming, lecture 18: Container data structures
16
Arrays
An array is a container storing items in contiguous memory
locations.
Each memory location is identified by an integer index.
lower
1
item (4)
2
3
4
upper
5
6
7
Valid index values
Intro. to Programming, lecture 18: Container data structures
17
Bounds and indexes
Arrays are bounded:
lower : INTEGER
-- Minimum index.
upper : INTEGER
-- Maximum index.
The capacity of an array is:
capacity = upper – lower + 1
The number of array items ranges from 0 to capacity :
0 <= count <= capacity
An empty array has no elements:
is_empty = (count = 0)
Intro. to Programming, lecture 18: Container data structures
18
6
Accessing and modifying array items
item (i : INTEGER): G
-- Entry at index i, if in index interval.
require
valid_key: valid_index (i )
put (v : like item; i : INTEGER)
-- Replace i-th entry, if in index interval, by v.
require
valid_key: valid_index (i )
ensure
inserted: item (i ) = v
Intro. to Programming, lecture 18: Container data structures
19
Resizing an array
At any point in time arrays have a fixed lower and upper
bound, and thus a fixed capacity.
Unlike most other programming languages, Eiffel allows
resizing an array (resize).
Feature force resizes an array if required.
Resizing usually requires reallocating the array and copying
the old values. Such operations are costly!
Intro. to Programming, lecture 18: Container data structures
20
Linked list or array?
The choice of a container data structure depends on the
speed of its container operations.
The speed of a container operation depends on how it is
implemented, on its underlying algorithm.
Intro. to Programming, lecture 18: Container data structures
21
7
How fast is an algorithm?
Depends on the hardware, operating system, load on the
machine...
But most fundamentally depends on the algorithm!
Intro. to Programming, lecture 18: Container data structures
22
Big-O notation
f is O (g (n)) means there exists a constant k such that
for all n:
|f (n)| / g (n) ≤ k
Provides measure as function of size (count) of data
structure.
Defines function not by exact formula but by order of
magnitude, e.g. O(1), O(log count), O(count), O(count 2),
O(2count).
7count 2 + 20count + 4 ⇒ O(count 2)
Intro. to Programming, lecture 18: Container data structures
23
Examples
put_right of LINKED_LIST: O (1)
Regardless of the number of elements in the linked list it
takes a constant time to insert an item at cursor
position.
force of ARRAY: O (count)
At worst the time for this operation grows proportionally
to the number of elements in the array.
Intro. to Programming, lecture 18: Container data structures
24
8
Cost of (singly-) linked list operations
Operation
Feature
Complexity
put_right
O (1)
extend
O (count)
O (1)
remove_right
O (1)
remove
O (count)
Index-based access
i_th
O (count)
Search
has
O (count)
Insert right to cursor
Insert at end
Remove right neighbor
Remove at cursor position
Intro. to Programming, lecture 18: Container data structures
25
Cost of (doubly-) linked list operations
Operation
Feature
Complexity
put_right
O (1)
extend
O (1)
remove_right
O (1)
remove
O (1)
Index-based access
i_th
O (count)
Search
has
Insert right to cursor
Insert at end
Remove right neighbor
Remove at cursor position
O (count)
Intro. to Programming, lecture 18: Container data structures
26
Cost of array operations
Operation
Feature
Complexity
Index-based access
item
O (1)
Index-based replacement
put
O (1)
Index-based replacement
outside of current bounds
force
O (count)
has
O (count)
-
O (log count)
Search
Search in sorted array
Intro. to Programming, lecture 18: Container data structures
27
9
Hash tables
Both arrays and hash tables are indexed structures; item
manipulation requires an index or, in case of hash tables,
a key.
Unlike arrays hash tables allow keys other than integers.
Intro. to Programming, lecture 18: Container data structures
28
An example
helen : PERSON
personnel_directory : HASH_TABLE [PERSON, STRING]
create personnel_directory.make (100)
Storing an element
create helen
personnel_directory.put (helen, ”Marie-Helene”)
Retrieving an element
helen := personnel_directory.item (”Marie-Helene”)
Intro. to Programming, lecture 18: Container data structures
29
Hash function
The hash function maps K, the set of possible keys, into an
integer interval a..b.
A perfect hash function gives a different integer value for
every element of K.
Whenever two different keys give the same hash value a
collision occurs.
Intro. to Programming, lecture 18: Container data structures
30
10
Collision handling
Open hashing:
ARRAY [LINKED_LIST [G]]
Intro. to Programming, lecture 18: Container data structures
31
A better technique: closed hashing
Class HASH_TABLE [G, H] implements closed hashing:
HASH_TABLE [G, H] uses a single ARRAY [G] to store the
items. At any time some of positions are occupied and
some free:
Intro. to Programming, lecture 18: Container data structures
32
Closed hashing continued
If the hash function yields an already occupied position,
the mechanism will try a succession of other positions
(i1, i2, i3) until it finds a free one:
With this policy and a good choice of hash function
search and insertion in a hash table are essentially
O (1).
Intro. to Programming, lecture 18: Container data structures
33
11
Cost of hash table operations
Operation
Feature
Complexity
item
O (1)
O (count)
put, extend
O (1)
O (count)
Removal
remove
O (1)
O (count)
Key-based replacement
replace
O (1)
O (count)
has
O (1)
O (count)
Key-based access
Key-based insertion
Search
Intro. to Programming, lecture 18: Container data structures
34
Dispensers
Unlike indexed structures, as arrays and hash tables, there
is no key or other identifying information for dispenser
items.
Dispensers are container data structures that prescribe a
specific retrieval policy:
¾ Last In First Out (LIFO): choose the element inserted
most recently -> stack.
¾ First In First Out (FIFO): choose the oldest element
not yet removed -> queue.
¾ Priority queue: choose the element with the highest
priority.
Intro. to Programming, lecture 18: Container data structures
35
Stacks
A stack is a dispenser applying a LIFO policy. The basic
operations are:
Push an item to the top of
the stack (put)
Pop the top element
(remove)
Access the top element
(item)
Top
Body, what
would
remain after
popping.
A new item
would be
pushed
here
Intro. to Programming, lecture 18: Container data structures
36
12
Using stacks
from
until
loop
“All terms of Polish expression have been read”
“Read next term x in Polish expression”
if “x is an operand” then
s put (x)
else -- x is a binary operator
-- Obtain and pop two top operands:
op1 := s item; s remove
op2 := s item; s remove
-- Apply operator to operands and push result:
s put (application (x, op2, op1))
end
.
.
.
.
.
.
end
Intro. to Programming, lecture 18: Container data structures
37
Evaluating 2 a b + c d - * +
b
2
c
a
a
(a+b)
(a+b)
2
2
2
2
d
c
(c-d)
(a+b)
(a+b)
(a+b)*(c-d)
2
2
2
2+(a+b)*(c-d)
Intro. to Programming, lecture 18: Container data structures
38
The run-time stack
The run-time stack contains the activation records for all
currently active routines.
An activation record contains a routine’s locals (arguments
and local entities).
Intro. to Programming, lecture 18: Container data structures
39
13
Implementing stacks
Common stack implementations are either arrayed or linked.
Intro. to Programming, lecture 18: Container data structures
40
LINKED_LIST or ARRAY or …?
Use linked lists:
¾ for data structures that are unbounded
¾ when sequencing of elements is vital
¾ when elements are mainly accessed in the given
sequence
Use arrays:
¾ for data structures that are bounded
¾ when elements have an integer index
¾ when elements are mainly accessed based on their
indexes
Intro. to Programming, lecture 18: Container data structures
41
LINKED_LIST or ARRAY or …?
Use hash table:
¾ when elements have an associated key
¾ when elements are mainly accessed based on their
keys
¾ (rather) for data structures that are bounded
Use stacks:
¾ when there is need for a LIFO dispenser
Use queues:
¾ when there is need for a FIFO dispenser
Intro. to Programming, lecture 18: Container data structures
42
14
End of lecture 18
Intro. to Programming, lecture 18: Container data structures
43
15