Java Collection Classes
Eric Roberts
CS 106A
February 24, 2016

Once upon a time . . .

Extensible vs. Extended Languages
• As an undergraduate at Harvard, I worked for several years on the PPL (Polymorphic Programming Language) project under the direction of Professor Thomas Standish.
• In the early 1970s, PPL was widely used as a teaching language, including here in Stanford’s CS 106A.
• PPL is rarely remembered today, but it was one of the first languages to offer syntactic extensibility, paving the way for similar features in modern languages like C++.
• In a paper entitled “PPL: The Extensible Language That Failed,” Standish concluded that programmers are less interested in extensible languages than they are in extended languages that already provide the capabilities they need.

Java Collection Classes

The Java Collections Framework
• As you know from our discussions of abstract data types earlier in the quarter, Java’s collection types fall into three general categories:
  1. Lists are ordered collections of values that allow the client to add and remove elements.
  2. Sets are unordered collections of values in which a particular object can appear at most once.
  3. Maps are structures that define an association between keys and values.

The Collection Hierarchy
The following diagram shows the portion of the Java Collections Framework that implements the Collection interface. The dotted lines specify that a class implements a particular interface.

[Diagram: the Collection hierarchy. Interfaces: Collection, List, Set, SortedSet. Abstract classes: AbstractCollection, AbstractList, AbstractSet. Concrete classes: ArrayList, LinkedList, HashSet, TreeSet.]

ArrayList vs. LinkedList
• If you look at the left side of the collections hierarchy on the preceding slide, you will discover that there are two classes in the Java Collections Framework that implement the List interface: ArrayList and LinkedList.
• Because these classes implement the same interface, it is generally possible to substitute one for the other.
• The fact that these classes have the same effect, however, does not imply that they have the same performance characteristics.
  – The ArrayList class is more efficient if you are selecting a particular element by index or searching for an element in a sorted list.
  – The LinkedList class can be more efficient if you are adding or removing elements from a large list.
• Choosing which list implementation to use is therefore a matter of evaluating the performance tradeoffs.

The Set Interface
• The right side of the collections hierarchy diagram contains classes that implement the Set interface, which is used to represent an unordered collection of objects. The two concrete classes in this category are HashSet and TreeSet.
• Both sets and lists allow you to add and remove elements, but sets do not support the notion of an index position. All you can know is whether an object is present or absent from a set.
• The difference between the HashSet and TreeSet classes is primarily a difference in implementation. The HashSet class is built on a structure called a hash table, while the TreeSet class is based on a structure called a binary tree. In practice, the difference you are most likely to notice arises when you iterate over the elements of a set, as described on the next slide.
• You will have a chance to learn the details of these techniques if you take CS 106B.

Iteration in Collections
• One of the most useful operations for any collection is the ability to run through each of the elements in a loop. This process is called iteration.
• The programming pattern for iterating over a collection looks like this:

    for (type var : collection) {
        . . . statements that process the element stored in the variable . . .
    }

Iteration Order
• For a collection that implements the List interface, the order in which iteration proceeds through the elements of the list is defined by the underlying ordering of the list. The element at index 0 comes first, followed by the other elements in order.
• The ordering of iteration in a Set is more difficult to specify because a set is, by definition, an unordered collection. A set that implements only the Set interface, for example, is free to deliver up elements in any order, typically choosing an order that is convenient for the implementation.
• If, however, a Set also implements the SortedSet interface (as the TreeSet class does), the iterator delivers its elements in ascending order according to the compareTo method for the element class. An iterator for a TreeSet of strings therefore delivers its elements in lexicographic order, which is alphabetical order when the strings all use the same case.

The Map Hierarchy
The following diagram shows the portion of the Java Collections Framework that implements the Map interface. The structure matches that of the Set interface in the Collection hierarchy. The distinction between HashMap and TreeMap is the same as that between HashSet and TreeSet and affects the iteration order.

[Diagram: the Map hierarchy. Interfaces: Map, SortedMap. Abstract class: AbstractMap. Concrete classes: HashMap, TreeMap.]

Constructing Maps
• Even though it represents the root of the hierarchy on the previous slide, the Map type in Java is an interface rather than a class. One implication of this design decision is that you cannot use the identifier Map as a constructor. You must instead create either a HashMap or a TreeMap.
• The classes in the Map hierarchy are parameterized with two type names, one for the key type and one for the value type. For example, the type HashMap<String,Integer> indicates a HashMap that uses strings as keys to obtain integer values.
• The textbook goes to some lengths to describe how to use the HashMap in older versions of Java that do not support generic types. Although this strategy was necessary when the book first came out, generic types are supported in all modern versions of Java, so there is no need to learn the older style.

A Simple Application of Maps
• Suppose that you want to write a program that displays the name of a state given its two-letter postal abbreviation.
• This program is an ideal application for maps because you need to create an association between two-letter codes and state names. Each two-letter code uniquely identifies a particular state and therefore serves as a key for the map; the state names are the corresponding values.
• To implement this program in Java, you need to perform the following steps, which are illustrated on the following slide:
  1. Create a Map containing all 50 key/value pairs.
  2. Read in the two-letter abbreviation to translate.
  3. Call get on the Map to find the state name.
  4. Print out the name of the state.

The StateCodeLookup Application

    public void run() {
        Map<String,String> stateMap = new TreeMap<String,String>();
        initStateMap(stateMap);
        while (true) {
            String code = readLine("Enter two-letter state abbreviation: ");
            if (code.length() == 0) break;       // a blank line ends the program
            String state = stateMap.get(code);   // get returns null for an unknown key
            if (state == null) {
                println(code + " is not a known state abbreviation");
            } else {
                println(code + " is " + state);
            }
        }
    }

    private void initStateMap(Map<String,String> map) {
        map.put("AL", "Alabama");
        map.put("AK", "Alaska");
        map.put("AZ", "Arizona");
        ...
        map.put("FL", "Florida");
        map.put("GA", "Georgia");
        map.put("HI", "Hawaii");
        . . .
        map.put("WI", "Wisconsin");
        map.put("WY", "Wyoming");
    }

Contents of stateMap: AL=Alabama AK=Alaska AZ=Arizona ... FL=Florida GA=Georgia HI=Hawaii ... WI=Wisconsin WY=Wyoming

StateCodeLookup console session:

    Enter two-letter state abbreviation: HI
    HI is Hawaii
    Enter two-letter state abbreviation: WI
    WI is Wisconsin
    Enter two-letter state abbreviation: VE
    VE is not a known state abbreviation
    Enter two-letter state abbreviation:

Iteration and Maps
• In Java, the classes in the Map hierarchy don’t support iteration directly. If you want to go through all the keys in a map, you have to call the keySet method, which returns a set consisting of all the keys in the map. Once you have the set of keys, you can then iterate through that.
• The iteration order for a map depends on whether you have created a HashMap or a TreeMap. The keySet method for a HashMap returns a set that, like a HashSet, has an undefined iteration order. The keySet method for a TreeMap returns a set that, like a TreeSet, guarantees that the keys will be returned in increasing order.
• The HashMap class is slightly more efficient than TreeMap and is much more common in practice. As a Java programmer, you have both options.

Iteration Order in a HashMap
The following method iterates through the keys in a map:

    private void listKeys(Map<String,String> map) {
        // Strip the package prefix so that only the short class name is printed
        String className = map.getClass().getName();
        int lastDot = className.lastIndexOf(".");
        String shortName = className.substring(lastDot + 1);
        println("Using " + shortName + ", the keys are:");
        for (String key : map.keySet()) {
            print(key + " ");
        }
        println();
    }

If you call this method on a HashMap containing the two-letter state codes, you get:

MapIterationOrder console output:

    Using HashMap, the keys are:
    RI VT HI ME VA MI DE ID IA MD MA AR IL UT IN MN
    AZ MO MT MS NH NJ NM AK AL TX NC ND NE NY GA NV
    TN CA OK OH WY FL SD SC CT WV KY WI KS OR LA WA
    CO PA

Iteration Order in a TreeMap
If you instead call the same listKeys method on a TreeMap containing the same values, you get:

MapIterationOrder console output:

    Using TreeMap, the keys are:
    AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS
    KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM
    NV NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI
    WV WY

Exercise: Read the Map from a File
In the starter program for the application that expands two-letter state abbreviations, the state abbreviations are listed explicitly as part of the program.
How would you change the implementation of this application so that the program reads the state abbreviation table from a data file that looks like this?

TwoLetterStateCodes.txt

    AK=Alaska
    AL=Alabama
    AR=Arkansas
    AZ=Arizona
    CA=California
    CO=Colorado
    CT=Connecticut
    DE=Delaware
    FL=Florida
    . . .
    WV=West Virginia
    WY=Wyoming

The Collections Toolbox
• The Collections class (not the same as the Collection interface) exports several static methods that operate on lists, the most important of which appear in the following table:

    binarySearch(list, key)     Finds key in a sorted list using binary search.
    sort(list)                  Sorts a list into ascending order.
    min(list)                   Returns the smallest value in a list.
    max(list)                   Returns the largest value in a list.
    reverse(list)               Reverses the order of elements in a list.
    shuffle(list)               Randomly rearranges the elements in a list.
    swap(list, p1, p2)          Exchanges the elements at index positions p1 and p2.
    replaceAll(list, x1, x2)    Replaces all elements matching x1 with x2.

• The java.util package exports a similar Arrays class that provides comparable operations, such as sort and binarySearch, for arrays.
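
As a rough illustration of the Collections methods listed in the table above, the following sketch applies several of them to a short list of state codes. The class name CollectionsToolboxDemo and the helper method sortedCopy are hypothetical names chosen for this example and are not part of the lecture's starter code; the Collections calls themselves are standard java.util methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of the lecture's starter code.
public class CollectionsToolboxDemo {

    // Returns a new list holding the same elements in ascending order,
    // leaving the argument unchanged.
    public static List<String> sortedCopy(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> codes = sortedCopy(Arrays.asList("WI", "AK", "HI", "AL"));
        System.out.println(codes);                                  // [AK, AL, HI, WI]

        // binarySearch requires the list to be sorted already.
        System.out.println(Collections.binarySearch(codes, "HI"));  // 2
        System.out.println(Collections.min(codes));                 // AK
        System.out.println(Collections.max(codes));                 // WI

        Collections.reverse(codes);
        System.out.println(codes);                                  // [WI, HI, AL, AK]

        Collections.swap(codes, 0, 3);
        System.out.println(codes);                                  // [AK, HI, AL, WI]

        Collections.replaceAll(codes, "HI", "Hawaii");
        System.out.println(codes);                                  // [AK, Hawaii, AL, WI]

        // shuffle produces a random order, so no particular output is expected.
        Collections.shuffle(codes);
        System.out.println(codes);
    }
}
```

Note that most of these methods modify the list in place, which is why sortedCopy works on a copy. binarySearch is only meaningful on a list that has already been sorted.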