Chapter 2 The Big Picture

Overview
● The big picture answers several questions:
    ● What are data structures?
    ● What data structures do we study?
    ● What are Abstract Data Types?
    ● Why Object-Oriented Programming (OOP) and Java for data structures?
    ● How do I choose the right data structures?

2.1 What Are Data Structures
● A data structure is an aggregation of data components that, together, constitute a meaningful whole.
    ● The components themselves may be data structures.
    ● The decomposition stops at some “atomic” unit.
● The definition of the “atomic” unit depends on the observer.
    ● To a reader, the characters that make up a word are the atomic units.
    ● To one who prints the paper, the atomic units are the lithographic components that go into making each character appear on the page in ink.

2.2 What Data Structures Do We Study?
● An array is an aggregation of entries arranged contiguously, with single-step random access to any entry.
● Numerous situations require more sophisticated data structures.
● Data structure categories, based on how the data is conceptually organized or aggregated:
    ● Linear
    ● Non-linear
● One may also classify data structures according to the physical or abstract systems they model.
    ● Trees model hierarchies.
    ● Graphs model relationships among entities, symmetric or asymmetric.
● Linear structures
    ● List, Queue, and Stack are linear collections; each serves as a repository in which entries may be added or removed at will.
    ● They differ in how entries may be accessed once they are added.
● List
    ● A linear collection of entries in which entries may be added, removed, and searched for without restrictions.
    ● Two kinds of list: Ordered List and Unordered List.
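The array and List behaviors described above can be sketched in Java. This is only an illustration under stated assumptions: the class name `ListSketch` and the helper methods `addUnordered` and `addOrdered` are hypothetical, and `java.util.ArrayList` stands in for whatever List implementation the course develops later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ListSketch {
    // Unordered list: a new entry simply goes at the end.
    static void addUnordered(List<Integer> list, int entry) {
        list.add(entry);
    }

    // Ordered list: a new entry is placed so that the list stays sorted.
    static void addOrdered(List<Integer> sorted, int entry) {
        int pos = Collections.binarySearch(sorted, entry);
        if (pos < 0) {
            // A miss is reported as -(insertionPoint) - 1.
            pos = -(pos + 1);
        }
        sorted.add(pos, entry);
    }

    public static void main(String[] args) {
        int[] array = {40, 10, 30};
        System.out.println(array[2]); // single-step random access by index

        List<Integer> unordered = new ArrayList<>();
        List<Integer> ordered = new ArrayList<>();
        for (int entry : array) {
            addUnordered(unordered, entry);
            addOrdered(ordered, entry);
        }
        System.out.println(unordered);              // [40, 10, 30]
        System.out.println(ordered);                // [10, 30, 40]
        System.out.println(unordered.contains(30)); // search without restriction
        unordered.remove(Integer.valueOf(10));      // remove by value, not by index
        System.out.println(unordered);              // [40, 30]
    }
}
```

The only difference between the two kinds of list in this sketch is where a new entry lands; adding, removing, and searching remain unrestricted in both.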
● Queue
    ● Entries may only be removed in the order in which they were added.
    ● First In, First Out (FIFO) data structure.
    ● No search for an entry in the Queue.
● Stack
    ● Entries may only be removed in the reverse of the order in which they were added.
    ● Last In, First Out (LIFO) data structure.
    ● No search for an entry in the Stack.
● Trees
    ● Non-linear arrangement of entries.
    ● There are various tree structures.
● Binary Tree
    ● Consists of entries, each of which contributes to the tree as a whole based on its position in the tree.
    ● Moving an entry from one position to another changes the meaning of the Binary Tree.
● General Tree
    ● Models a hierarchy, such as the organizational structure of a company or a family tree.
    ● A non-linear arrangement of entries; it generalizes the binary tree structure, hence the name.
● Binary Search Tree
    ● Same structural form as the Binary Tree, but each entry is self-contained: it does not contribute differently if its position in the tree is changed, nor does the tree as a whole carry a meaning that is tied to the relative arrangement of the entries.
    ● Entries are arranged (effectively) in sorted order: the tree analogue of the Ordered List.
● AVL Tree
    ● A height-balanced binary search tree.
    ● Derives its importance from the fact that it speeds up search in a binary search tree to a remarkable degree.
● Heap as a Priority Queue
    ● A priority queue is a specialization of the FIFO Queue.
    ● Entries are assigned priorities.
    ● The entry with the highest priority is the one to leave first.
● Updatable Heap
    ● Once an entry is added to a simple priority queue, i.e. a Heap, its priority may not change.
    ● An Updatable Heap is a heap with updatable priorities.
    ● There are safety and efficiency issues that can be resolved only with fairly complex solutions.
● Hash Table
    ● Stores entries with the sole aim of enabling efficient search.
    ● Requires a sound knowledge of certain mathematical properties of numbers, and of the so-called hash functions that manipulate numbers.
● Graphs
    ● A general tree is a special kind of graph, since a hierarchy is a special system of relationships among entities.
    ● Graphs may be used to model systems of physical connections such as computer networks and airline routes, as well as abstract relationships such as course prerequisite structures.
    ● Standard graph algorithms answer certain questions we may ask of the system.
    ● Two kinds of graphs:
        ● Directed Graph: an asymmetric relationship
        ● Undirected Graph: a symmetric relationship

2.3 What Are Abstract Data Types?
● Software developers struggle to write code that is robust, easy to maintain, and reusable.
● A data structure is an abstract data type.
● Consider, for example, the primitive data types built into a language: integer, real, character, and boolean.
    ● We do not worry at all about an integer's internal representation.
    ● We know we can perform certain operations on integers, such as “+”, “-”, “*”, and “/”, that are guaranteed to work the way they are intended to.
    ● Language designers and compiler writers were responsible for designing the interface for these data types, as well as implementing them.
● Exactly how these operations are implemented by the compiler in machine code is of no concern.
    ● On a different machine, the behavior of an integer does not change, even though its internal representation may change.
    ● As far as programmers are concerned, the primitive data types are abstract entities.
● Viewers see the TV in terms of its volume, contrast, brightness, and other controls.
    ● Going one level lower, the TV itself is built out of various components, one of which may be a programmable chip.
    ● The chip consists of transistors.
    ● A transistor itself is made of silicon, whose constituent atoms are the bottom line in this system of abstractions.
● “Abstraction” is a relative term, depending on where the interface line is drawn.
● A data structure may consist of several layers of abstraction.
    ● A Stack (of integers) may be built using a List (of integers), which in turn may be built using an array (of integers).
    ● Integer and array are language-defined types.
    ● Stack and List are user-defined.
● Since an ADT makes a clean separation between interface and implementation, the user sees only the interface and therefore does not need to tamper with the implementation.
    ● The responsibility of maintaining the implementation is separated from the responsibility of maintaining the code that uses an ADT. This makes the code easier to maintain.
● ADTs may be used many times in various contexts.
    ● A List ADT may be used directly in application code, or may be used to build another ADT, such as the Stack.
● Object-oriented programming is a paradigm that addresses exactly these issues.

2.4 Why OOP and Java for Data Structures?
● The OOP paradigm views the program as a system of interacting objects.
    ● Objects in a program may model physical objects or abstract entities.
    ● An object encapsulates state and behavior.
    ● For instance, the state of an integer is its current value; its behavior is the set of arithmetic operations that may be applied to integers.
● A class is analogous to an ADT; an object is analogous to a variable of an ADT.
● The user of an object is called its client.
● A stack may be built using a List ADT.
    ● A stack interface defines four operations: push, pop, getSize, and isEmpty.
    ● The Stack object contains a List object which implements its state, and the behavior of the Stack object is implemented in terms of the List object's behavior.
● Reuse by inheritance.
    ● Data structures may be built by inheriting from other data structures; if data structure A inherits from data structure B, then every A object is-a B object.
    ● An AVL Tree can inherit from a Binary Search Tree, since every AVL Tree is-a special, height-balanced binary search tree.
    ● The converse does not hold: not every binary search tree is an AVL tree.
    ● Inheriting means sharing the implementation code.
● A language that implements the OOP paradigm automatically ensures:
    ● Separation of interface from implementation, by encapsulation.
    ● Code reuse by inheritance, when permitted.

2.5 How Do I Choose the Right Data Structures?
● The interface of operations supported by a data structure is one factor to consider when choosing among several available data structures.
● The efficiency of the data structure is another:
    ● How much space does the data structure occupy?
    ● What are the running times of the operations in its interface?
● Examples
    ● Implementing a printer queue requires a queue data structure.
    ● Maintaining a collection of entries in no particular order calls for an unordered list.
● It is not too difficult to fit the requirements of the application to the operations supported by a data structure.
● It is more difficult to choose from a set of candidate data structures that all meet the operational requirements.
● An important, and sometimes contradictory, factor to consider: the running time of each operation in the interface.
    ● A data structure whose interface is the best fit may not be the best overall fit if the running times of its operations are not up to the mark.
    ● When more than one data structure implementation has an interface that satisfies our requirements, we may have to select one by comparing the running times of the interface operations.
    ● Time is traded off for space: either more space is consumed to increase speed, or a reduction in speed is accepted for a reduction in space consumption.
● Time-space tradeoff: we are looking to “buy” the best implementation of a stack.
    ● StackA. Does not provide a getSize operation, i.e. there is no single operation that a client can use to get the number of entries in StackA.
    ● StackB. Provides a getSize operation, implemented in the manner we discussed earlier, transferring entries back and forth between two stacks.
    ● StackC. Provides a getSize operation, implemented as follows: a variable called size is maintained that is incremented every time an entry is pushed, and decremented every time an entry is popped.
● Three situations:
    ● Situation 1: need to maintain a large number of stacks, with no need to find the number of entries.
    ● Situation 2: need to maintain only one stack, with frequent need to find the number of entries.
    ● Situation 3: need to maintain a large number of stacks, with infrequent need to find the number of entries.
● In Situation 1, StackA fits the bill.
    ● It is tempting to pick StackC, simply because we may want to play conservative: what if we need getSize in the future?
● Situation 2 calls for StackB or StackC, since we need to use getSize.
    ● getSize in StackB is more time-consuming than that in StackC.
    ● Since we need only one stack, the additional size variable used by StackC is not an issue.
    ● Since we need to use getSize frequently, it is better to go with StackC.
● Situation 3 presents a choice between StackB and StackC.
    ● If getSize calls are infrequent, we may choose to go with StackB and suffer a loss in speed.
    ● The faster getSize delivered by StackC comes at the expense of an extra variable per stack, which may add up to considerable space consumption since we plan to maintain a large number of stacks.
● getSize in StackB is more time-consuming than that in StackC. How can we quantify the time taken in either case?
    ● For each data structure we study, we present the running time of each operation in its interface.
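The three stack candidates can be sketched in Java to make the tradeoff concrete. This is a minimal sketch, not the chapter's own implementation: the class bodies, the use of `java.util.ArrayDeque` as hidden storage, and the inheritance arrangement are all assumptions, and StackB's `getSize` is one plausible reading of "transferring entries back and forth between two stacks".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// StackA: push, pop, and isEmpty only; no getSize.
// (ArrayDeque is an assumed storage choice, hidden from clients.)
class StackA<T> {
    private final Deque<T> entries = new ArrayDeque<>();

    void push(T entry) { entries.push(entry); }

    T pop() { return entries.pop(); }

    boolean isEmpty() { return entries.isEmpty(); }
}

// StackB: getSize counts entries by moving them to a temporary stack
// and back. No extra field per stack, but every call touches each entry twice.
class StackB<T> extends StackA<T> {
    int getSize() {
        StackA<T> temp = new StackA<>();
        int count = 0;
        while (!isEmpty()) {
            temp.push(pop());
            count++;
        }
        while (!temp.isEmpty()) {
            push(temp.pop()); // restores the original order
        }
        return count;
    }
}

// StackC: getSize reads a counter kept up to date by push and pop.
// Constant-time getSize, at the cost of one extra int per stack.
class StackC<T> extends StackA<T> {
    private int size = 0;

    @Override
    void push(T entry) {
        super.push(entry);
        size++;
    }

    @Override
    T pop() {
        T entry = super.pop(); // throws on an empty stack, before size changes
        size--;
        return entry;
    }

    int getSize() { return size; }
}

class StackDemo {
    public static void main(String[] args) {
        StackB<String> b = new StackB<>();
        StackC<String> c = new StackC<>();
        for (String s : new String[] {"x", "y", "z"}) {
            b.push(s);
            c.push(s);
        }
        System.out.println(b.getSize()); // 3, after visiting every entry
        System.out.println(c.getSize()); // 3, read directly from the counter
        System.out.println(b.pop());     // z: order survives the transfer
    }
}
```

In this sketch the time taken by StackB's getSize grows with the number of entries, while StackC's does not depend on it; StackC pays instead with one additional variable per stack, which is the cost that matters in Situation 3.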