Lecture 6: Intro to Data Structures and the Standard Template Library

COMS W3101: Programming Languages (C++)
Instructor: Austin Reiter
Lecture 6

Outline for Today (Last Lecture)
• Intro to Data Structures
• Standard Template Library (STL)

Last Homework
• HW 4 questions?
• Show example robot output

DISCLAIMER
• Today we are condensing what is usually a semester-long course into two hours!
• Take it with a grain of salt
  – I’m just trying to introduce the tools, what’s out there, and hope you play with them on your own
• We’ve spent the entire course on the “rules and practices” of C++
  – STL is an entire other area of study of C++
  – I wish we did an entire 6-week course on STL alone!
  – Now that you know templates and the rules of objects, hopefully you can appreciate the power of the library.

Data Structures
• Up to now we’ve studied fixed-size data structures (arrays)
• More useful are dynamically sized data structures: they grow and shrink during execution (size unknown at compile time)
• Also, the data structures are arranged (conceptually) differently from arrays
  – Ex: the data doesn’t need to be arranged contiguously in memory. This often helps speed up certain processes (sorting, searching, reordering, etc.)

Data Structures
• These data structures are implemented independent of type
  – Templates!
• The concept of how the data is arranged is independent of what is being stored
  – However, as usual, you must consider the operations being done to your data in the storage container.
    • Ex: many containers store things sorted in some way, so your type must have a concept of “less than”

Data Structures
• Vector: just like an array, but can grow and shrink dynamically
• Linked List: collection of data items logically “lined up in a row”
  – We can insert and remove anywhere in the list
• Stack: list of items arranged in a last-in, first-out ordering.
  – Insertions and removals are only made at the top of the stack
  – Very important for compilers and operating systems
    • Think about memory allocations: stack vs. heap
• Queue: opposite of a stack; arranged in a first-in, first-out ordering.
  – Insertions are made at the back and removals are made at the front
  – Like a “waiting line”

Data Structures
• Binary Tree: useful for high-speed searching and sorting of data.
  – Often useful for representation of file directories
• In the data structures we present today, we use classes, class templates, inheritance and many other concepts we’ve already learned to create and package reusable and maintainable data structures!

STL
• This prepares us for using the Standard Template Library (STL), which is a major part of the C++ Standard Library.
• Once we understand the structures and the concepts they represent, we can make more informed decisions about which are best for our applications
• They are all implemented as templates

Self-Referential Classes
• A self-referential class contains a pointer member that points to an object of the same class type:

    class Node {
    public:
        Node( int );                 // constructor
        void setData( int );         // set data member
        int getData() const;         // get data member
        void setNextPtr( Node * );   // set pointer to next Node
        Node* getNextPtr() const;    // get pointer to next Node
    private:
        int data;                    // data stored in this Node
        Node* nextPtr;               // pointer to another object of same type
    };

Self-Referential Classes
• The member nextPtr is a link. It can “tie” an object of type Node to another object of the same type.
• These types of objects can be linked together to form useful data structures such as lists, queues, stacks and trees

[Figure: 2 self-referential class objects (holding 15 and 10) linked together to form a list.]

Self-Referential Classes
• The member nextPtr is a link. It can “tie” an object of type Node to another object of the same type.
• These types of objects can be linked together to form useful data structures such as lists, queues, stacks and trees

[Figure: two linked Node objects holding 15 and 10. Annotation on the last link: this represents a NULL “next” Node pointer. It usually represents the end of a data structure.]

Pointers
• This should start to answer how pointers are useful beyond simple memory allocation and data passing

Memory Allocation
• Dynamic data structures mean dynamic memory allocations (both larger and smaller), which enable programs to hold different amounts of memory during run-time.
• The data structure must keep track of how many elements it currently has and how best to re-allocate to reduce calls to new and delete
  – For example, STL will often grow to 2x the current capacity when it needs more memory (the exact factor is implementation-defined), thereby reducing (over time) the number of times it needs to reallocate
    • However, this can be wasteful when it gets to larger and larger sizes!

Linked Lists
• A linear collection of self-referential class objects, called nodes, connected by pointer links (hence the term “linked list”)
• A linked list is accessed via a pointer to that list’s first node
  – Each subsequent node is accessed via the link-pointer member stored in the previous node
  – The link pointer of the last node is NULL, indicating the end of the list

Linked Lists
• They are dynamic in the sense that new nodes are created as needed
• A node can contain any type of data
• Linked lists, along with stacks and queues, are linear data structures, whereas trees are nonlinear data structures
  – More on these in a bit

Linked Lists
• Linked lists have an advantage over arrays when the number of data elements to be represented at one time is unpredictable
  – The length of the list can increase/decrease as necessary
  – C++ array lengths are fixed at compile time, and can become “full”
  – Linked lists only become full if the system runs out of memory

Linked Lists
• However, the data in a linked list is not stored contiguously
  – This means accessing arbitrary elements of a list is not as efficient as in a vector or array
  – They are accessed via pointers from the previous element (i.e., no indices)
• Logically, the nodes appear contiguous, even though they are not stored contiguously in memory

[Figure: a linked list with firstPtr at the first node (H), followed by D, …, and lastPtr at the last node (Q).]

Linked Lists
• Usually we provide functions to add elements to the front or to the back as well as remove from the front or back
• We provide pointers (referred to as iterators) to the beginning and end of the list and we can go through the nodes one-by-one
• This is called a singly linked list
  – Each node contains a pointer to the next node “in sequence”
• We can also construct a circular, singly linked list
  – The last node’s pointer is not NULL. It points back to the first element

Linked Lists
• A doubly linked list allows traversal both forwards and backwards
  – Each node has a pointer to both the “next” and “previous” nodes, separately
• And finally, we can construct a circular, doubly linked list
  – Same as a doubly linked list, but the forward pointer of the last node points to the first node and the backward pointer of the first node points to the last node

[Figure: a circular, doubly linked list with nodes 12, 7, …, 5, marked with firstPtr and lastPtr.]

Stacks
• We previously implemented a fixed-size stack using an array
• We can also do it using a pointer-based linked-list implementation
• A stack allows nodes to be added and removed only from the top. It is referred to as a LIFO data structure, for last-in, first-out.
• It can be thought of as a constrained version of a linked list
  – The link member in the last node of the stack is set to NULL to indicate the bottom of the stack

Stacks
• The push() method inserts a new node at the top
• The pop() method removes a node from the top
• By using a linked list as the implementation:
  – A push inserts data at the front of the list
  – A pop removes an element from the front of the list
  – Nothing else changes
  – Reusability!

Queues
• Similar to a stack, a queue is like a checkout line from a supermarket.
  The first person in line is the first person processed
• Queue nodes are removed from the head (front) of the queue and are inserted at the tail (back) of the queue
• It is referred to as a FIFO, for first-in, first-out ordering
• The insert operation is often referred to as enqueue. The remove operation is often referred to as dequeue.

Queues
• We can use a linked list to implement a queue also:
  – The enqueue inserts elements at the back of the list
  – The dequeue removes elements from the front of the list
  – Nothing else changes
  – Reusability!

Linear Data Structures
• Vectors are fairly straightforward, as they are simply resizable arrays
  – We’ll show some concrete examples in STL
• Let’s look at a non-linear data structure…

Trees
• A tree is a two-dimensional, nonlinear data structure; tree nodes contain 2 or more links
• In a binary tree, all nodes contain two links
  – None, one or both of which may be NULL

[Figure: binary tree with nodes A, B, C and D. A root node pointer refers to B; the left subtree and the right subtree of the node containing B are labeled.]

Trees
• Node B is the root of the tree
• Each link in the root node refers to a child (nodes A and D)
  – The children of a node are called siblings

[Figure: the same binary tree as on the previous slide.]

Binary Search Tree
• A binary search tree (BST) has the characteristic that the values in any left subtree of a node are less than the value in its parent.
  – Similarly, all values in any right subtree of a node are greater than the value in its parent
• The shape of a BST can vary depending on the order in which the data is inserted into the tree!

[Figure: a binary search tree containing the values 47, 25, 77, 11, 43, 31, 65, 44, 68.]

Binary Search Trees
• We could spend a few lectures on BSTs.
• They are very important for efficient searching of values
• With a balanced tree, a search takes O(log n) comparisons, which is (provably) the best possible for a comparison-based search!
• There are different ways to traverse a tree to achieve different goals, which we won’t go into here.
• TAKE THE DATA STRUCTURES COURSE (taught in Java)

STL
• We’ve repeatedly (hopefully!) reiterated the importance of software reuse
• STL defines powerful, template-based reusable software components that implement common data structures and algorithms
• It was developed by Alexander Stepanov and Meng Lee at Hewlett-Packard and is based on research in generic programming

STL
• There are 3 main components to STL:
  – Containers: popular templatized data structures
  – Iterators: like pointers
  – Algorithms

STL
• Let’s define a few terms:
  – A container is a holder which stores a collection of elements. Containers are implemented in STL as class templates.
  – An iterator is how we reference individual elements in containers; iterators are similar (in concept) to pointers
    • However, they are just another class with overloaded operators!
    • STL algorithms work on iterators; standard arrays can still be manipulated by STL algorithms by using pointers as iterators

STL Algorithms
• Functions that perform common data manipulations, such as:
  – Searching
  – Sorting
  – Comparing elements (or entire containers)
• There are approximately 70 algorithms available
  – Most of them use iterators

Containers

  Standard Library container class   Description
  SEQUENCE CONTAINERS
    vector                           Rapid insertions and deletions at back. Direct access to any element.
    deque                            Rapid insertions and deletions at front or back. Direct access to any element.
    list                             Doubly linked list, rapid insertion and deletion anywhere.

Containers

  Standard Library container class   Description
  ASSOCIATIVE CONTAINERS
    set                              Rapid lookup, no duplicates allowed.
    multiset                         Rapid lookup, duplicates allowed.
    map                              One-to-one mapping, no duplicates allowed, rapid key-based lookup.
    multimap                         One-to-many mapping, duplicates allowed, rapid key-based lookup.

Containers

  Standard Library container class   Description
  CONTAINER ADAPTORS
    stack                            Last-in, first-out (LIFO).
    queue                            First-in, first-out (FIFO).
    priority_queue                   Highest priority element is always the first element out.
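
To make the container classes listed above a little more concrete, here is a minimal sketch that touches one or two containers from each category. It is only an illustration, not code from the lecture: the function names (makeSquares, uniqueSorted, countWords, reverseWithStack, largest) are hypothetical, and it is written in the same pre-C++11 style as the rest of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <stack>
#include <string>
#include <vector>

// Sequence container: vector supports rapid insertion at the back
// and direct access to any element.
std::vector<int> makeSquares(int n) {
    assert(n >= 0);
    std::vector<int> v;
    for (int i = 1; i <= n; ++i) {
        v.push_back(i * i);
    }
    return v;
}

// Associative container: set keeps its elements sorted and drops duplicates.
std::set<int> uniqueSorted(const std::vector<int>& values) {
    return std::set<int>(values.begin(), values.end());
}

// Associative container: map gives rapid key-based lookup.
std::map<std::string, int> countWords(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i < words.size(); ++i) {
        ++counts[words[i]];  // operator[] inserts a zero count on first use
    }
    return counts;
}

// Container adaptor: stack is LIFO, so popping everything reverses the input.
std::vector<int> reverseWithStack(const std::vector<int>& values) {
    std::stack<int> s;
    for (std::size_t i = 0; i < values.size(); ++i) {
        s.push(values[i]);
    }
    std::vector<int> out;
    while (!s.empty()) {
        out.push_back(s.top());  // top() reads the element, pop() removes it
        s.pop();
    }
    return out;
}

// Container adaptor: by default, priority_queue hands back the largest element first.
int largest(const std::vector<int>& values) {
    assert(!values.empty());
    std::priority_queue<int> pq;
    for (std::size_t i = 0; i < values.size(); ++i) {
        pq.push(values[i]);
    }
    return pq.top();
}
```

For example, under these definitions reverseWithStack(makeSquares(3)) would produce the elements 9, 4, 1, and largest applied to the same vector would return 9.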

Containers Overview
• Sequence Containers: represent linear data structures, such as vectors and linked lists.
• Associative Containers: nonlinear containers that typically can locate elements stored in the containers quickly.
  – These usually store sets of key/value pairs
• Container Adaptors: constrained versions of sequence containers. STL implements these using the sequence containers, but more constrained in use.

Web Reference
• Good reference for STL online:
  – Containers: http://www.cplusplus.com/reference/stl/
  – Algorithms: http://www.cplusplus.com/reference/algorithm/

Common Functions
• Most STL containers provide common functionality
• Many generic operations, such as:
  – size(): how many elements are in the container
  – Constructors: can create empty containers or copies of containers (copy constructor)
  – empty(): whether the container holds no elements
  – insert(): add an item to the container (behavior changes according to data structure)
  – Assignment (=)
  – Comparison (<, <=, >, >=, ==, !=)
  – swap(): swap the elements of two containers

STL Headers
  <vector>
  <list>
  <deque>
  <queue>
  <stack>
  <map>
  <set>

Considerations
• When an element is inserted into a container, a copy of that element is made!
  – One of the biggest mistakes is not realizing this fact!!
  – The element type should provide a correct copy constructor and assignment operator
• Associative containers require a comparison for their elements (by default operator<); some algorithms also use operator==
  – Example: set orders elements using a binary tree. It must be able to say one object is “less-than” another object
  – Similar for the std::sort() function

Iterators
• Many features in common with pointers
• Hold state information sensitive to the particular containers on which they operate
  – Therefore, iterators are implemented appropriately for each container type
• Certain iterator operations are uniform across containers
  – Example: the dereferencing operator (*) dereferences an iterator like a pointer.
    Also the ++ operator moves it to the next element (again, this is specific to the container type).
• The -> operator is also overloaded

Iterators
• STL containers usually provide begin() and end() member functions
  – begin(): returns an iterator pointing to the first element of the container
  – end(): returns an iterator pointing one position past the last element of the container (i.e., an element that doesn’t exist)
• If iterator i points to a particular element, then ++i points to the “next element” in the container
  – Also, *i refers to the element pointed to by i.

Iterators
• The end() iterator is used to determine when you’ve reached the end of the container.
  – For example, to loop through the elements of a container, you’d like to do it just like an array, but not all containers access elements like this. So the analog is (Key and Value are placeholders for myMap’s key and mapped types, which the slide text omits):

    for (std::map<Key, Value>::iterator i = myMap.begin(); i != myMap.end(); ++i) {
        // process elements of std::map myMap
    }

Iterators
• There are two types of iterators:
  – We use an object of type iterator to refer to a container element that can be modified (read-write)
  – We use an object of type const_iterator to refer to a container element that cannot be modified (read-only)

Introduction to Algorithms
• STL algorithms are used generically across a variety of STL containers
• Some examples include: inserting, deleting, searching, sorting
  – The algorithms operate on container elements indirectly through iterators
• STL algorithms often return iterators that indicate the results of the algorithms
  – Example: std::find() locates an element and returns an iterator to that element, or the end iterator to indicate the element wasn’t found in the container

Algorithms
• Some common mutating-sequence algorithms, meaning algorithms that result in modifications of the containers to which the algorithms are applied:
  – Copy: copy elements of one container, element-by-element, to another container of the same type
  – Remove: remove an element from a container
  – Fill: fill all elements of the container with a single “value”
  – Swap: swap elements of two containers of the same type
  – Find: search for an element in a container (strictly speaking, a non-modifying algorithm)
  – Many, many, many more…
• You usually don’t have to think about memory allocations or sizes. The overloaded operators all work themselves out.

Sequence Containers
• vector, list and deque
  – vector and deque are based on arrays
• Vector is one of the most popular containers in STL
  – Changes size dynamically
  – Vectors can be assigned to one another (unlike “raw” arrays)
  – Insertion at the back is efficient, but insertion in the middle is expensive
• Applications that require frequent insertions and deletions at both ends normally use deque instead of vector (more efficient)
  – Applications with frequent insertions/deletions in the middle use a list

Sequence Containers
• The front() method returns a reference to the first element (not an iterator)
• The back() method returns a reference to the last element (not an iterator, and not one past the last element)
• The push_back() method adds an element to the back of the container
• The pop_back() method removes the last element of the container

vector
• See example code of basic operations
  – Size = the number of elements currently stored in the vector
  – Capacity = the number of elements that can be stored in the container without allocating more memory (capacity usually doubles when more memory is needed)
• resize() and reserve() let you manage this manually so it doesn’t get out of control
  – See example code of element manipulation functions

list
• Let’s look at some list code
  – sort arranges elements in ascending order (different from std::sort()).
    You can supply a binary predicate function to sort user-defined objects
  – splice removes elements from one container and places them into the other container before the iterator position specified as the first argument
  – merge removes all elements from one container and inserts them in sorted order into the other container (both lists must be sorted in the same order before this operation is performed!)
    • You can probably imagine this algorithm: it’s pretty straightforward

deque
• Let’s look at some deque code
  – Provides benefits of vector and list in one container

Associative Containers
• STL’s associative containers provide direct access to store and retrieve elements via keys
• The four associative containers are:
  – set
  – multiset
  – map
  – multimap
• Each container maintains keys in sorted order
  – set and multiset use the values as the keys (the object must have the comparison operator< overloaded)
  – map and multimap store std::pair<key,value> elements and sort them by key
    • Here, the key type must have operator< overloaded

Associative Containers
• Let’s look at some set code
• Let’s look at some map code

Container Adaptors
• stack: implemented with a deque underneath, by default
  – push() inserts elements at the top of the stack (calls push_back() of deque)
  – pop() removes elements from the top of the stack (calls pop_back() of deque)
  – top() gets a reference to the element at the top of the stack (calls back() of deque)
  – empty()
  – size()
• Can also choose a list or vector as the implementation:

    std::stack<int> s1;                      // stack using deque as implementation
    std::stack<int, std::vector<int> > s2;   // uses vector as implementation
    std::stack<int, std::list<int> > s3;     // uses list as implementation

Container Adaptors
• queue: implemented with deque, by default
  – push() inserts elements at the back of the queue
  – pop() removes elements from the front of the queue
  – back() retrieves a reference to the back of the queue and front() gets a reference to the front of the queue
• Can also choose a list as the implementation:

    std::queue<double> q1;                       // uses deque as implementation
    std::queue<double, std::list<double> > q2;   // uses list as implementation

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – fill/generate
    • fill, fill_n: set every element in a range of container elements to a specific value
    • generate, generate_n: use a generator function to create values for every element in a range of container elements

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – equal/mismatch
    • equal: compares two sequences of values for equality. If any value is different, false is returned (the four-iterator overload added in C++14 also returns false if the sequences are of different length). The operator== must be overloaded for user-defined types.
    • mismatch: compares two sequences of values and returns a std::pair of iterators indicating the location in each sequence of the mismatched elements. If all elements match, it returns the end iterators.

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – Mathematical Algorithms
    • random_shuffle: randomly reorders the elements in the range from v.begin() up to, but not including, v.end() in v (deprecated in C++14 and removed in C++17 in favor of std::shuffle).
    • count, count_if: counts the elements with a particular value in the container. The second variant takes an arbitrary function that checks each value against a condition (e.g., greater than 9)
    • accumulate: sums the values in the container. The third argument is the initial value of the total.
    • for_each: applies a general function to every element of the container, one-by-one. The function takes a single argument of the container’s element type (and may also modify it via reference)
    • transform: applies a general function to every element in a container and stores the result in another container

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – find/sort/binary_search
    • find, find_if: locates a particular value in a container and returns an iterator where it is located, or end if not found.
      If there are multiple copies, it returns the first occurrence.
    • sort: arranges the elements in a container in ascending order. You may also supply a binary predicate function which takes 2 arguments and returns a comparison between them, which determines the ordering.
    • binary_search: searches for a value in a sorted sequence (in ascending order). Returns a bool indicating whether the value was found.

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – Function Objects: most STL algorithms allow you to pass a function pointer into the algorithm to help the algorithm perform its task.
  – STL’s designers allowed for more flexibility by letting any algorithm that can receive a function pointer also receive an object of a class that overloads the function-call operator, operator():
    • Example: if using the binary_search algorithm, the function object must receive two arguments and return a bool.
  – One advantage over function pointers is that the standard function objects are implemented as class templates. A function object can also have data members that its operator() works with.

Algorithms
• Let’s look at some code for some algorithms (can’t cover them all)
  – Function Objects: see code example

  STL function objects   Type
    divides<T>           arithmetic
    equal_to<T>          relational
    greater<T>           relational
    greater_equal<T>     relational
    less<T>              relational
    less_equal<T>        relational
    logical_and<T>       logical
    logical_not<T>       logical
    logical_or<T>        logical
    minus<T>             arithmetic
    modulus<T>           arithmetic
    negate<T>            arithmetic
    not_equal_to<T>      relational
    plus<T>              arithmetic
    multiplies<T>        arithmetic

More Algorithms
• There are many more STL algorithms that we can’t cover.
• Some are mathematical, some manipulate the values
• Good textbook on STL containers and algorithms: “Effective STL: 50 Ways to Improve Your Use of the Standard Template Library” by Scott Meyers
• Lots of nuances to the containers and algorithms that are important to understand
  – Most are straightforward, but some use-cases require a complete understanding of the side-effects of certain operations.

Course Wrap-Up
• I hope you learned something about C++
• I know we went fast, but I wanted to present a full treatment of the language.
• Please go off and play and learn more on your own
• Hopefully you can appreciate the thinking that is involved with programming in C++
  – Very low-level memory concepts, real-time aspects, interactions with the OS, etc…
• C++ is a beautiful language, and even after years of using it you will continue to learn new things about it every day.
  – I certainly do!
• Thank you!
