Basic Data Structures
Algorithms and Data Structures
Dr. Rolf Haenni
CAS SD01
BFH-TI: Softwareschule Schweiz
Berner Fachhochschule Technik und Informatik

Outline
- Data Structures and Abstract Data Types
- Linear Data Structures
- Implementing Linear Data Structures
- Implementation with Arrays
- Implementation with Growable Arrays
- Implementation with Linked Lists
- Iterators

Data Structures and Abstract Data Types

Data Structures
- A data structure is a way of storing complex data in a computer so that it can be used efficiently
- Carefully chosen data structures are crucial for building efficient algorithms
- Therefore, the quality and performance of large systems depend heavily on choosing the best data structure
- Different data structures are suited to different kinds of applications, and some are highly specialized for certain tasks
- Many basic data structures are included in the standard libraries of modern programming languages (e.g. the Java Collections API)
- The fundamental building blocks of most data structures are arrays, records, and references

Abstract Data Types
- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies
  - Data stored
  - Operations on the data
  - Error conditions associated with the data
- The object-oriented programming paradigm supports the creation of complex ADTs
  - ADTs are specified as interfaces
  - ADTs are implemented as classes (which themselves implement the ADT interface)
  - Concrete data structures are wrapped into objects of the corresponding classes

Properties of Well-Designed ADTs
- Universality: The same ADT can be used in different programs
- Encapsulation: The interface provides an impenetrable barrier
- Simplicity: The implementation details are entirely hidden
- Integrity: Internal data is protected against improper use or bugs
- Flexibility: The internal implementation can be changed without affecting the main application(s)
- Modularity: Important sub-problems are solved independently

Linear Data Structures

A linear data structure is a collection of linearly arranged elements with various ways to access its elements
- Stack: Insert/remove elements at one end of the collection
- Queue: Insert/remove elements at different ends of the collection
- Vector: Access elements w.r.t. their rank within the collection
- List: Access elements w.r.t. their position within the collection
- Sequence: Access elements w.r.t. both ranks and positions
The elements of a linear data structure all have the same “rights” (no priorities, no hierarchy)

The Stack ADT
- The stack ADT is a linear data structure that stores arbitrary elements according to the last-in-first-out (LIFO) scheme
- Thus insertions and deletions take place at the same “end” of the data structure
- Think of a spring-loaded plate dispenser
- Applications of stacks
  - Visited-page history in a web browser
  - Undo sequence in a text editor
  - Towers of Hanoi problem
  - Chain of method calls in the Java Virtual Machine (JVM)
  - Parsing of arithmetic expressions

The Queue ADT
- The queue ADT is a linear data structure that stores arbitrary elements according to the first-in-first-out (FIFO) scheme
- Thus insertions and deletions take place at opposite “ends” of the data structure
- Think of the queue at the airport security check
- Applications of queues
  - Waiting lists
  - Access to shared resources (e.g. a printer)
  - Multiprogramming

The Vector ADT
- The vector ADT extends the notion of an array by storing a sequence of arbitrary elements
  - An element can be accessed, inserted, or removed by specifying its rank ∈ {0, 1, 2, ...}, i.e. the number of elements preceding it
  - The size (number of stored elements) of a vector changes when elements are inserted or deleted
  - Proper vectors have no fixed maximal size
  - The ranks of some elements may change when elements are inserted or deleted
- Think of a ranking in ski racing events
- An exception is thrown if an incorrect rank is specified (e.g. a negative rank)

The List ADT
- The list ADT models a linear sequence of positions
- Each position stores an arbitrary element
- List manipulations are always performed relative to some given positions
- Special positions are the first and the last position in the list
- To be as general as possible, we need a position ADT with two simple operations:
  - getElement(): returns the element stored at the position
  - setElement(e): sets the stored element to e
- The position ADT gives a unified view of diverse ways of storing data (cell of an array, node of a linked list, etc.)
- A list establishes a before/after relation between positions

The Sequence ADT
- The sequence ADT is the union of the vector and the list ADT
- Elements can be accessed by their rank and/or their position
- To transform ranks into positions and vice versa, two “bridge” operators are needed
  - atRank(r): returns the position at rank r
  - rankOf(p): returns the rank of a position p
- The sequence ADT is thus a general-purpose data structure for storing linearly ordered collections of elements
- Stacks, queues, vectors, and lists are included as special cases

Operations for Linear Structures
- General operations
  - All: isEmpty(), size()
- Accessing elements/positions
  - Stack: top()
  - Queue: front()
  - Vector: elemAtRank(r)
  - List: first(), last(), before(p), after(p)
- Inserting elements
  - Stack: push(e)
  - Queue: enqueue(e)
  - Vector: insertAtRank(r,e)
  - List: insertFirst(e), insertLast(e), insertBefore(p,e), insertAfter(p,e)
- Removing elements
  - Stack: pop()
  - Queue: dequeue()
  - Vector: removeAtRank(r)
  - List: removeElement(p)
- Replacing/swapping elements
  - Vector: replaceAtRank(r,e), swapAtRanks(r,q)
  - List: replaceElement(p,e), swapElements(p,q)
- Rank/position conversion
  - Sequence: atRank(r), rankOf(p)

UML Diagram
[Figure: UML class diagram of the following interfaces and their operations]
- <<interface>> Collection: size(), isEmpty()
- <<interface>> Stack: push(e), pop(), top()
- <<interface>> Queue: enqueue(e), dequeue(), front()
- <<interface>> Vector: elemAtRank(r), insertAtRank(r,e), removeAtRank(r), replaceAtRank(r,e), swapAtRanks(r,q)
- <<interface>> List: first(), last(), before(p), after(p), isFirst(p), isLast(p), insertFirst(e), insertLast(e), insertBefore(p,e), insertAfter(p,e), removeElement(p), replaceElement(p,e), swapElements(p,q)
- <<interface>> Sequence: atRank(r), rankOf(p)
- <<interface>> Position: element()

BasicSequence Interface in Java

    public interface BasicSequence {
        public int size();
        public boolean isEmpty();
    }

- For more information on Java interfaces, see §6.10 in “Java ist auch eine Insel”

Stack Interface in Java

    public interface Stack extends BasicSequence {
        public Object top() throws EmptyStackException;
        public void push(Object e);
        public Object pop() throws EmptyStackException;
    }

- Stack inherits general methods from BasicSequence
- Requires the definition of a class EmptyStackException
- Generic stacks of a particular type can be defined using a technique called Java Generics (see §6.12 in “Java ist auch eine Insel”)

Sequence Interface in Java

    public interface Sequence extends Vector, List {
        public Position atRank(int r);   // returns the position at rank r
        public int rankOf(Position p);   // returns the rank of position p
    }

- Example of multiple inheritance of interfaces (not allowed for classes)

The Collection Interface
The java.util package provides some predefined interfaces and classes
[Figure: class diagram with the interfaces Iterable, Collection, Queue, and List and the classes AbstractCollection, AbstractList, AbstractSequentialList, LinkedList, Stack, Vector, ArrayList, AttributeList, RoleList, and RoleUnsolvedList]
See §12 in “Java ist auch eine Insel”

Implementing Linear Data Structures

- Linear data structures can be implemented in multiple ways:
  - Arrays
  - Records and references (singly/doubly-linked lists)
  - Combinations of arrays and linked lists
- The choice of the implementation determines the running times and space requirements of the basic operations
- Choosing the “right” implementation ...
  - depends on the intended application
  - is a trade-off between running time, memory space, and simplicity

Implementing Linear Data Structures in Java
- In Java, implementing an ADT means writing a class which implements the interface
- You may have several implementations of the same interface
[Figure: UML class diagram with the following elements]
- <<interface>> Collection: size(), isEmpty()
- <<interface>> Stack: push(e), pop(), top()
- <<interface>> Position: element()
- ArrayStack: fields n: Integer, S: Array; operations size(), isEmpty(), push(e), pop(), top()
- LinkedListStack: fields n: Integer, top: ListNode; operations size(), isEmpty(), push(e), pop(), top()
- ListNode: fields element: Object, next: ListNode; operation element()

Arrays
Most programming languages provide arrays as a simple linear data structure
- Arrays have a fixed size N
- The elements are usually indexed by i ∈ {0, ..., N − 1}
  - A[0] = first element
  - A[i] = (i+1)-th element
  - A[N−1] = last element
- The running time for accessing elements is usually O(1) → random access
- Arrays are similar to vectors (but not identical)

Linked Lists
A linked list is another fundamental data structure which is easy to implement in most programming languages
- It consists of a linked sequence of nodes
- Each node is a record (or object) which contains:
  - A data field to store an element (number, string, object, etc.)
  - One or two references (links, pointers) pointing to the next and/or the previous nodes
- Unlike an array, a linked list decouples the order of the list elements from the order in which they are stored in memory or on disk
- It allows only sequential (no random) access to its elements
- Nodes should implement the interface of the position ADT

Singly Linked Lists
- A singly linked list is the simplest form of a linked list
- Each node contains a reference to the next node
- A single reference to the first node is kept in memory
- Sometimes it is useful to keep a reference to the last node
[Figure: singly linked list with nodes A, B, C; each node record has the fields next and element; labels: Main Reference, Optional Reference]

Doubly-Linked Lists
- A doubly-linked list is another simple form of a linked list
- Each node contains two references, one to the next node and one to the previous node
- Usually references to both ends are kept
- For simplicity, special header and trailer nodes are often added
[Figure: doubly-linked list with nodes A, B, C; each node record has the fields prev, next, and element; labels: Reference 1, Reference 2]
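The remark about header and trailer nodes can be illustrated with a minimal Java sketch. It assumes generics instead of the Object-based style of the slides, and the names Position<E>, DNode, and DList are hypothetical, chosen only for this illustration. Because every real node always has a non-null neighbour on both sides, insertion and removal need no special cases for the ends of the list.

```java
// Hypothetical sketch: a doubly-linked list with header/trailer sentinel nodes.
// The names Position<E>, DNode, and DList are illustrative assumptions.
interface Position<E> {
    E getElement();
    void setElement(E e);
}

class DNode<E> implements Position<E> {
    E element;
    DNode<E> prev;
    DNode<E> next;

    DNode(E element) { this.element = element; }

    @Override public E getElement() { return element; }
    @Override public void setElement(E e) { element = e; }
}

class DList<E> {
    // Sentinels store no element; they only mark the two ends.
    private final DNode<E> header = new DNode<>(null);
    private final DNode<E> trailer = new DNode<>(null);
    private int n = 0;

    DList() {
        header.next = trailer;
        trailer.prev = header;
    }

    int size() { return n; }
    boolean isEmpty() { return n == 0; }

    // Inserts e directly after node p (p may be the header); O(1).
    DNode<E> insertAfter(DNode<E> p, E e) {
        DNode<E> x = new DNode<>(e);
        x.prev = p;
        x.next = p.next;
        p.next.prev = x;
        p.next = x;
        n++;
        return x;
    }

    DNode<E> insertFirst(E e) { return insertAfter(header, e); }
    DNode<E> insertLast(E e) { return insertAfter(trailer.prev, e); }

    // Unlinks node p; p must be a real node of this list (not checked); O(1).
    E removeElement(DNode<E> p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        p.prev = null;
        p.next = null;
        n--;
        return p.element;
    }
}
```

Without the sentinels, insertFirst, insertLast, and removeElement would each need extra branches for an empty list and for the first or last node.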
Implementation with Arrays

Array-Based Stacks
- The elements are added from “left” to “right” and removed from “right” to “left”
- A variable n keeps track of the stack size (= next available location in the array)
[Figure: array S with cells 0, 1, 2, ..., N−1; n marks the next available cell]
- When the array becomes full, i.e. for n = N, a push operation will throw a FullStackException
- This exception is implementation-specific (not intrinsic to the stack ADT)

Implementing Array-Based Stacks

    Algorithm size()      // runs in O(1) time
        return n

    Algorithm isEmpty()   // runs in O(1) time
        return (n = 0)

    Algorithm top()       // runs in O(1) time
        if isEmpty() then
            throw EmptyStackException
        else
            return S[n − 1]

    Algorithm push(e)     // runs in O(1) time
        if n = N then
            throw FullStackException
        else
            S[n] ← e
            n ← n + 1

    Algorithm pop()       // runs in O(1) time
        if isEmpty() then
            throw EmptyStackException
        else
            n ← n − 1
            return S[n]

Array-Based Queues
- To implement the queue ADT, the array should be used in a circular fashion
- The elements are added and removed from “left” to “right”
- Two variables f and r keep track of the index of the front element and of the index just past the rear element (the location r is kept empty)
[Figure: array Q with cells 0, 1, 2, ..., N−1 in two configurations: f to the left of r, and r to the left of f (wrapped around)]
- When the array becomes full, an enqueue operation will throw a FullQueueException (implementation-specific exception)
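The circular use of the array described above can be sketched in Java as follows. This is only one possible rendering under stated assumptions: the class name ArrayQueue is hypothetical, generics are used instead of Object, and the two exception classes are declared here as unchecked exceptions, which the slides do not specify.

```java
// Hypothetical sketch of a circular array-based queue (class name is an assumption).
class EmptyQueueException extends RuntimeException {}
class FullQueueException extends RuntimeException {}

class ArrayQueue<E> {
    private final Object[] q; // circular buffer of fixed size N
    private int f = 0;        // index of the front element
    private int r = 0;        // index just past the rear element (always kept empty)

    ArrayQueue(int n) {
        if (n < 2) throw new IllegalArgumentException("N must be at least 2");
        q = new Object[n];
    }

    // Adding N before taking the remainder keeps the result non-negative when r < f.
    int size() { return (q.length + r - f) % q.length; }

    boolean isEmpty() { return f == r; }

    @SuppressWarnings("unchecked")
    E front() {
        if (isEmpty()) throw new EmptyQueueException();
        return (E) q[f];
    }

    void enqueue(E e) {
        // One cell stays empty so that "full" and "empty" can be told apart:
        // the usable capacity is N - 1.
        if (size() == q.length - 1) throw new FullQueueException();
        q[r] = e;
        r = (r + 1) % q.length;
    }

    @SuppressWarnings("unchecked")
    E dequeue() {
        if (isEmpty()) throw new EmptyQueueException();
        E e = (E) q[f];
        q[f] = null;              // drop the reference so it can be garbage collected
        f = (f + 1) % q.length;
        return e;
    }
}
```

With N = 4, three elements fit; after one dequeue, the next enqueue stores its element in cell 3 and r wraps around to 0.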
Implementing Array-Based Queues

    Algorithm size()      // runs in O(1) time
        return (N + r − f) mod N

    Algorithm isEmpty()   // runs in O(1) time
        return (f = r)

    Algorithm front()     // runs in O(1) time
        if isEmpty() then
            throw EmptyQueueException
        else
            return Q[f]

    Algorithm enqueue(e)  // runs in O(1) time
        if size() = N − 1 then    // the capacity is only N−1!
            throw FullQueueException
        else
            Q[r] ← e
            r ← (r + 1) mod N

    Algorithm dequeue()   // runs in O(1) time
        if isEmpty() then
            throw EmptyQueueException
        else
            e ← Q[f]
            f ← (f + 1) mod N
            return e

Array-Based Vectors
- Vectors are most naturally implemented with arrays (i.e. ranks = array indices)
- A variable n keeps track of the size of the vector (number of elements stored)
[Figure: array V with cells 0, 1, 2, ..., N−1; r marks a rank, n marks the next available cell]
- Operation elemAtRank(r) is implemented in O(1) time by returning V[r]
- When the array becomes full, an insertAtRank operation will throw a FullVectorException (implementation-specific exception)

Inserting Elements
- In the operation insertAtRank(r,e), we need to make room for the new element
  - shift forward the n − r elements V[r], ..., V[n − 1]
[Figure: array V before the shift, after the shift, and with e stored at rank r]
- In the worst case, i.e.
for r = 0, this takes O(n) time

Removing Elements
- In the operation removeAtRank(r), we need to fill the hole left by the removed element
  - shift backward the n − r − 1 elements V[r + 1], ..., V[n − 1]
[Figure: array V before the removal, with the hole at rank r, and after the shift]
- In the worst case, i.e. for r = 0, this takes O(n) time

Implementation with Growable Arrays

Growable Array-Based Stack
- To solve the FullStackException problem, replace the array with a larger one when necessary
  - Incremental strategy: increase the size by a constant c
  - Doubling strategy: double the size

    Algorithm push(e)
        if size() = N then
            A ← new array of size ...
            for i ← 0 to n − 1 do
                A[i] ← S[i]
            S ← A
        S[n] ← e
        n ← n + 1

Running Times of Strategies
[Figure: two plots, “Incremental Strategy” and “Doubling Strategy”; x-axis: current number of elements (0 to 16), y-axis: running time of push(e)]

Comparison of the Strategies
- Consider the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size
  - c, for the incremental strategy
  - 1, for the doubling strategy
- We call T(n)/n, the average time taken by a push operation over the series of operations, the amortized running time
- We only consider one primitive operation: storing an object in the array

Incremental Strategy
- Let n be a multiple of c, i.e. n = kc
- The array needs to be replaced k − 1 times
- For the total running time T(n) we get

\[
T(n) = n + c + 2c + 3c + \dots + (k-1)c
     = n + c\,(1 + 2 + 3 + \dots + (k-1))
     = n + c\,\frac{(k-1)k}{2}
     = \dots
     = n + \frac{1}{2c}\,n^2 - \frac{1}{2}\,n
\]

- The total running time for n push operations is O(n²)
- The amortized running time for a push operation is O(n)

Doubling Strategy
- Let n be a power of 2, i.e. n = 2^k
- The array needs to be replaced k = log n times
- For the total running time T(n) we get

\[
T(n) = n + 1 + 2 + 4 + \dots + 2^{k-1} = n + 2^k - 1 = 2n - 1
\]

- The total running time for n push operations is O(n)
- The amortized running time for a push operation is O(1)
- As a consequence, the doubling strategy outperforms the incremental strategy for large stacks

Performance
Growable arrays can also be used to implement queues and vectors (and lists, but this is not very natural)

    Operation                 Doubling Strategy   Incremental Strategy
    size                      O(1)                O(1)
    isEmpty                   O(1)                O(1)
    top, front, elemAtRank    O(1)                O(1)
    push, enqueue             O(1)                O(n)
    insertAtRank              O(n)                O(n)
    pop, dequeue              O(1)                O(1)
    removeAtRank              O(n)                O(n)

Implementation with Linked Lists

Stacks and Queues with Singly Linked Lists
- Stacks and queues are often implemented as singly linked lists
- Nodes implement the position ADT by storing:
  - Element
  - Reference next to the next node
- The top/front element is stored at the first node
- For stacks, only a reference to the first node needs to be kept (called top)
- For queues, references to both ends of the list need to be kept (called front and rear)
- Keep track of the current size n of the stack/queue
- All operations run in O(1) time, memory grows in O(n)

Implementing Stack with Singly Linked Lists

    Algorithm push(e)     // runs in O(1) time
        new ← new Node(e)
        new.next ← top
        top ← new
        n ← n + 1

    Algorithm pop()       // runs in O(1) time
        if isEmpty() then
            throw EmptyStackException
        else
            e ← top.getElement()
            top ← top.next
            n ← n − 1
            return e
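The slides give pseudocode for the stack only. For the queue case, with references to both ends of a singly linked list, a possible Java sketch looks as follows. The class name LinkedQueue, the nested Node class, the use of generics, and the unchecked EmptyQueueException are assumptions made for this illustration.

```java
// Hypothetical sketch of a queue on a singly linked list (names are assumptions).
class EmptyQueueException extends RuntimeException {}

class LinkedQueue<E> {
    private static final class Node<E> {
        final E element;
        Node<E> next;
        Node(E element) { this.element = element; }
    }

    private Node<E> front; // first node: the element dequeued next
    private Node<E> rear;  // last node: new elements are linked behind it
    private int n = 0;     // current number of elements

    int size() { return n; }
    boolean isEmpty() { return n == 0; }

    E front() {
        if (isEmpty()) throw new EmptyQueueException();
        return front.element;
    }

    // Appends at the rear in O(1); an empty queue needs front set as well.
    void enqueue(E e) {
        Node<E> node = new Node<>(e);
        if (isEmpty()) {
            front = node;
        } else {
            rear.next = node;
        }
        rear = node;
        n++;
    }

    // Removes at the front in O(1); when the last node goes, rear is cleared too.
    E dequeue() {
        if (isEmpty()) throw new EmptyQueueException();
        E e = front.element;
        front = front.next;
        if (front == null) {
            rear = null;
        }
        n--;
        return e;
    }
}
```

Elements are removed at the first node because a singly linked node has no reference to its predecessor, so removing at the rear would cost O(n).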
Doubly-Linked List Implementation
- A doubly-linked list provides a natural implementation of the list ADT
- Nodes implement the position ADT by storing
  - Element
  - Reference prev to the previous node
  - Reference next to the next node
- The list itself stores two references first and last
[Figure: doubly-linked list with nodes A, B, C, D, E and the references first and last]

Inserting Elements
- All four insertion operations only need to create some new links and redirect some existing ones, so they run in O(1) time
[Figure: insertion of a new node X (position q) into the list A, B, C, D, E next to position p]

Removing Elements
- The operation removeElement(p), which needs to redirect some existing links, runs in O(1) time
[Figure: removal of the node X at position p from the list A, B, C, X, D, E]

Performance Overview

    Sequence Operation                    Array (circular, growable)   Doubly Linked List   Singly Linked List
    size, isEmpty                         O(1)                         O(1)                 O(1)
    insertFirst, insertLast               O(1)                         O(1)                 O(1)
    replaceElement, swapElements          O(1)                         O(1)                 O(1)
    first, last, isFirst, isLast, after   O(1)                         O(1)                 O(1)
    before                                O(1)                         O(1)                 O(n)
    insertAfter                           O(n)                         O(1)                 O(1)
    insertBefore, removeElement           O(n)                         O(1)                 O(n)
    atRank, rankOf, elemAtRank            O(1)                         O(n)                 O(n)
    replaceAtRank, swapAtRanks            O(1)                         O(n)                 O(n)
    insertAtRank, removeAtRank            O(n)                         O(n)                 O(n)

Iterators

The Iterator ADT
- An iterator abstracts the process of scanning through a sequence by keeping a
“pointer” to the “current element”
- Methods of an iterator ADT
  - element(): returns the current element
  - hasNext(): checks whether there is a next element, i.e. whether the iteration has not yet completed
  - nextElement(): advances the pointer to the next element
  - reset(): resets the iterator
- Can be realized for array or linked-list implementations
- The order in which the elements are traversed is not necessarily the “normal” rank-based or position-based order
- We can use several iterators for the same sequence
- See §7.1.2 in “Java ist auch eine Insel” (interface Iterable)
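As a rough illustration of how this maps onto Java, the sketch below implements the standard Iterable and java.util.Iterator interfaces for a small array-backed sequence. The class ArraySequence is hypothetical. Note that Java's Iterator differs from the ADT above: next() returns the current element and advances in one step, and there is no reset(); a fresh iterator is requested instead.

```java
// Hypothetical sketch: an array-backed sequence exposing Java's standard iterator.
class ArraySequence<E> implements Iterable<E> {
    private final E[] elements;

    @SafeVarargs
    ArraySequence(E... elements) {
        this.elements = elements.clone(); // defensive copy of the caller's array
    }

    @Override
    public java.util.Iterator<E> iterator() {
        // Each call creates an independent iterator with its own cursor,
        // so several iterators can scan the same sequence at the same time.
        return new java.util.Iterator<E>() {
            private int cursor = 0; // the "pointer" to the current element

            @Override
            public boolean hasNext() {
                return cursor < elements.length;
            }

            @Override
            public E next() {
                if (!hasNext()) throw new java.util.NoSuchElementException();
                return elements[cursor++]; // return current element, then advance
            }
        };
    }
}
```

Because ArraySequence implements Iterable, it can be used directly in a for-each loop such as `for (String s : seq) { ... }`.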