Building Java Programs
Chapter 6: File Processing
File input using Scanner
reading: 6.1 - 6.2

Token-Based File Processing

Almost every piece of software you use handles files. Eclipse handles .java and other files. Word handles .doc files. Music apps use .mp3 or other format files.

File Objects

• The File class in the java.io package represents files.
    import java.io.*;
• I/O stands for Input/Output.
• You can create a File object to get information about a file.
    File f = new File("example.txt");
    if (f.canRead())
        System.out.printf("found file, size is %d\n", f.length());
    else
        System.out.println("File named example.txt not found.");
• Creating a File object does not create a new file.
• Use canRead() rather than exists() to test whether the file can be opened for reading. Note that a directory name also exists (and may even report itself as readable) but cannot be read by a Scanner; isFile() tells the two apart.

File Object methods

    File f = new File("example.txt");

Creating a File object does not create a new file. The object contains information about the file. This is metadata.

    Method name            Description
    f.canRead()            returns whether the file is able to be read
    f.delete()             removes the file from disk
    f.exists()             whether this file exists on disk
    f.getName()            returns the file's name
    f.length()             returns the number of bytes in the file
    f.renameTo(file)       changes the name of the file
    f.isFile()             whether this is a normal file
    f.isDirectory()        whether this is a directory
    f.getAbsolutePath()    the full path name, e.g. C:\docs\bob\160\L20\example.txt

Relative vs. Absolute Paths

• When you specify just a file name, Java looks in your current directory (the Java project).
    "example.txt"
• You can specify file names with relative path names:
    "src/FileTest.java"
    "../Paintings/spiral.gif"    (Go up one directory, then down to Paintings/spiral.gif.)
• or absolute path names:
    "C:/docs/bob/160/L20/example.txt"
• Windows uses backslashes but Unix machines use forward slashes. Java accepts forward slashes on both, so they are the portable choice.

Use a Scanner to read a File

• To read a file, create a File object and pass it as a parameter when constructing a Scanner. This Scanner will have all the functionality we had before and more.
• General syntax:
    File <fname> = new File("<file name>");
    Scanner <sname> = new Scanner(<fname>);
• Example:
    File f = new File("example.txt");
    Scanner input = new Scanner(f);
  or just:
    Scanner input = new Scanner(new File("example.txt"));
• Read an int from the file:
    int val = input.nextInt();
• Scanner objects can connect to
  – System.in (the console, as done before)
  – a file (as we are seeing now)
  – a string (discussed later)

Scanner methods

• The methods of a Scanner for a file are the same as we have seen for console input:

    Method             Description
    nextInt()          reads and returns an int value
    nextDouble()       reads and returns a double value
    next()             reads and returns the next token* as a String
    nextLine()         reads and returns the next line of input as a String

    Method             Description
    hasNext()          whether any more tokens remain
    hasNextDouble()    whether the next token can be interpreted as type double
    hasNextInt()       whether the next token can be interpreted as type int
    hasNextLine()      whether any more lines remain in the file

* A token is any contiguous run of input data separated by white space. We will see examples later.

With a file, hasNext() and hasNextLine() can each return false (at the end of the file). With console input they normally wait for the user to type more instead.

Almost correct program to read a file

    import java.io.*;    // for File
    import java.util.*;  // for Scanner

    public class FileTest {
        public static void main(String[] args) {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);    // reported as line 8; explained in the next two slides
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }

    Exception in thread "main" java.lang.Error: Unresolved compilation problem:
        Unhandled exception type FileNotFoundException
        at FileTest.main(FileTest.java:8)

This message does not say that an exception occurred. It says that connecting a Scanner to a file can cause an exception to occur. The compiler wants you to recognize this fact and deal with the possible exception. There are several ways to take care of exceptions; we will use the easiest one.
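The File methods from the earlier slides only query metadata, and none of them declares a checked exception, so a sketch like the following compiles without the error shown above. The class name FileInfo and the method describe are made up for illustration; the File calls themselves (isDirectory(), canRead(), length(), getName()) are the ones listed in the method table.

```java
import java.io.File;

// Hypothetical helper (not from the slides): reports metadata about a path
// without opening it, so no FileNotFoundException can occur here.
class FileInfo {
    static String describe(File f) {
        if (f.isDirectory()) {
            // A directory exists on disk but cannot be read with a Scanner.
            return f.getName() + ": directory";
        } else if (f.canRead()) {
            return f.getName() + ": readable file, " + f.length() + " bytes";
        } else {
            return f.getName() + ": not found or not readable";
        }
    }

    public static void main(String[] args) {
        // Both names are only examples; the result depends on your current directory.
        System.out.println(describe(new File("example.txt")));
        System.out.println(describe(new File("src")));
    }
}
```

Checking isDirectory() first is a design choice of this sketch: it keeps a directory from being reported as a readable file.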
Checked Exceptions

• Earlier we saw some common exceptions:
    IllegalArgumentException
    ArithmeticException
    InputMismatchException
    StringIndexOutOfBoundsException
• The idea of these is: "A bad thing has happened. Kill the program and print an error message." Usually these are problems that a good programmer prevents by testing first. The compiler assumes that the programmer will guard against them and compiles the program as usual.
    if (n != 0 && h / n == 3)                       // do not divide if n is 0
    if (s1.length() >= 2 && s2.length() >= 2        // check the string lengths before indexing
            && s1.endsWith(s2.substring(s2.length() - 2))) {
• Java also has some exceptions designated as "checked exceptions":
    FileNotFoundException
• The idea of these is: "A good programmer should handle these situations using a catch clause or by throwing the exception up to the next level. Don't compile the program if they don't."

Easiest way to handle a checked exception

Throw the exception up to the next method to handle it if it occurs. The following compiles and will at least start running. An exception may then occur.

    import java.io.*;    // for File
    import java.util.*;  // for Scanner

    public class FileTest {
        public static void main(String[] args)
                throws FileNotFoundException {    // if an exception occurs, it is thrown up
                                                  // to the next level to be handled
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }

In the above, main() does not handle an exception if it occurs, so it throws the exception up to the Java runtime system, which prints the usual message if the error occurs.

A file named example.txt exists, so no exception occurs. The program runs.
    import java.io.*;    // for File
    import java.util.*;  // for Scanner

    public class FileTest {
        public static void main(String[] args)
                throws FileNotFoundException {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }

example.txt:
    10 20 30
    40 50

    60

Output:
    10
    20
    30
    40
    50
    60

Easiest way to handle a checked exception

Throw the exception up to the next method to handle it if it occurs. If no method handles it, we get the usual FileNotFoundException message. Here the file name has been changed to grab.txt, which does not exist.

    import java.io.*;    // for File
    import java.util.*;  // for Scanner

    public class FileTest {
        public static void main(String[] args)
                throws FileNotFoundException {
            File f = new File("grab.txt");
            Scanner input = new Scanner(f);    // error occurs here, when trying to attach
                                               // a Scanner to the nonexistent file
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }

In the above, main() does not handle the exception, so it throws the exception up to the Java runtime system, which prints the usual message:

    Exception in thread "main" java.io.FileNotFoundException: grab.txt (The system cannot find the file specified)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.util.Scanner.<init>(Unknown Source)
        at FileTest.main(FileTest.java:8)

Exceptions

• exception: An object that represents a program error.
  – Programs that contain invalid logic cause (or "throw") exceptions.
  – Trying to read a file that does not exist will throw an exception.
• checked exception: An error that Java forces us to handle or explicitly choose not to handle in our program (otherwise it will not compile).
  – We must specify what our program will do to handle any potential file I/O failures.
We must either:
• declare that our program will handle ("catch") the exception, using a try/catch block, or
• explicitly state that we choose not to handle the exception (and accept that our program will crash if an exception occurs) by adding a throws clause.
We will use the throws clause for a while.

• throws clause, general syntax for a method that could throw an exception:
    public static <type> <name>(<params>) throws <type> {
  – When doing file I/O, we use FileNotFoundException.
    public static void main(String[] args)
            throws FileNotFoundException {

• Finding these exceptions:
  – Read the exception text for line numbers in your code (the first line that mentions your method; often near the bottom). The list of "at" lines is called the runtime stack.

    Exception in thread "main" java.io.FileNotFoundException
        at java.util.Scanner.throwFor(Scanner.java:838)       <- Note 1
        at java.util.Scanner.next(Scanner.java:1347)          <- Note 2
        at MyProgram.myMethodName(MyProgram.java:19)          <- Note 3
        at MyProgram.main(MyProgram.java:6)                   <- Note 4

    Note 1: This is the Scanner method that originally found and threw the exception.
    Note 2: Then this method was returned to, and it threw the exception up to the next level.
    Note 3: Then this method did the same.
    Note 4: Finally, main() threw it to the Java runtime system, which printed the message.

Scanner exceptions

Invert the runtime stack for a bubble analogy. Read from the bottom up:

        Finally bubbles up to the Java runtime system, which prints the message.
    at MyProgram.main(MyProgram.java:6)
        Continues bubbling.
    at MyProgram.myMethodName(MyProgram.java:19)
    at java.util.Scanner.next(Scanner.java:1347)
    at java.util.Scanner.throwFor(Scanner.java:838)
        Exception occurs here. This method does not handle it, so it bubbles up to the next method.
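The bubbling described above can be seen in a few lines. In this sketch, BubbleDemo, openScanner and firstTokenOrDefault are hypothetical names (not from the slides): the first method declares throws FileNotFoundException and lets the exception bubble up, while the second catches it with a try/catch block, the alternative that these notes mention but do not use yet.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class BubbleDemo {
    // Does not handle the exception: the throws clause passes it to the caller.
    static Scanner openScanner(String filename) throws FileNotFoundException {
        return new Scanner(new File(filename));
    }

    // Handles the exception itself, so this method needs no throws clause.
    static String firstTokenOrDefault(String filename, String fallback) {
        try {
            Scanner input = openScanner(filename);
            String token = input.hasNext() ? input.next() : fallback;
            input.close();
            return token;
        } catch (FileNotFoundException e) {
            // The exception bubbled up from openScanner and stops here.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // The printed value depends on whether example.txt exists in the current directory.
        System.out.println(firstTokenOrDefault("example.txt", "(no file)"));
    }
}
```

Because firstTokenOrDefault catches the exception, main() compiles without a throws clause; removing the try/catch would force every caller up the chain to declare one.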
Throwing exception syntax

• throws clause, general syntax:
    public static <type> <name>(<params>) throws <type> {
  – When doing file I/O, we use FileNotFoundException.
    public static void main(String[] args)
            throws FileNotFoundException {
• This is like saying, "I know this method may cause the program to crash, but I am not going to handle the problem. Some method that called me should handle this if it happens, or the program will crash."
• In this case, main() is throwing the exception, which means the Java runtime system will handle it, and you will get this if the file is not found:
    Exception in thread "main" java.io.FileNotFoundException

Reading a whole file

    import java.io.*;    // for File
    import java.util.*;  // for Scanner

    public class FileTest {
        public static void main(String[] args)
                throws FileNotFoundException {
            File f = new File("example.txt");
            Scanner input = new Scanner(f);
            while (input.hasNext()) {
                System.out.println(input.next());
            }
        }
    }

Contents of file example.txt:
    16.2 23.5 19.1 7.4 22.8 18.5 -1.8 14.9

Output:
    16.2
    23.5
    19.1
    7.4
    22.8
    18.5
    -1.8
    14.9

Note: When input.hasNext() is false, the end of the file has been reached. Thus we have no need for a sentinel value: the end of file (EOF) acts as a built-in sentinel, and hasNext() returns false when it reaches it.

6.2 Details of Token-Based Processing

If the file consists only of tokens that can be processed individually, token-based processing is the best bet. If you find yourself using input.nextLine(), that is line-based processing (Section 6.3), and it should not be mixed with token-based processing. This will be explained in Section 6.3.
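As a small illustration of token-based processing, and of hasNext() acting as the built-in sentinel, the sketch below counts tokens. It assumes a Scanner connected to a String instead of a file (an option the notes mention) so that it can run without any data file; TokenCount and countTokens are illustrative names, not part of the slides.

```java
import java.util.Scanner;

class TokenCount {
    // Counts whitespace-separated tokens; the loop ends when hasNext() is false.
    static int countTokens(Scanner input) {
        int count = 0;
        while (input.hasNext()) {
            input.next();    // consume one token; line breaks are just whitespace
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Same eight values as example.txt above, spread over several lines.
        Scanner input = new Scanner("16.2 23.5\n19.1 7.4 22.8\n\n18.5 -1.8 14.9\n");
        System.out.println(countTokens(input));    // prints 8
    }
}
```

The count is the same however the values are split across lines, which is what makes token-based processing a good fit when line structure carries no meaning.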
Files and input cursor

• Consider a file numbers.txt that contains this text:
    308.2
     14.9 7.4 2.8


    3.9 4.7 -15.4
    2.8
• A Scanner views all input as a stream of characters, which it processes with its input cursor:
    308.2\n 14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8\n
    ^
  – When you call the methods of the Scanner such as next(), nextInt() or nextDouble(), the Scanner returns the next token.

Adding a data file to an Eclipse project

Just copy the file and paste it into the project. It will automatically appear at the top level of the project (listed near JRE System Library [jre6]) because its extension is not .java.

The way that seems to always work on any system is to save the file onto the Desktop (or another folder on the disk). Then go to the Desktop (or open Windows Explorer), find the file, right click on it and choose Copy. Then right click on the project in Eclipse and choose Paste.

Example file on a web page: hamlet.txt. Right click on it and save the file onto your disk. Then copy and paste it from your disk into the appropriate project, just as you do with .java files.

Do not open the data file and try to copy and paste the contents into your Java project. That will not work. Eclipse recognizes pasted text as a Java file when it contains a line such as public class FileTest. Your data file will not have that line, so Eclipse does not know what it is. It will probably treat it as a "snippet" and put it into the snippet folder.

Input tokens

• token: A unit of user input. Tokens are separated by whitespace (spaces, tabs, new lines).
• Example: If an input file contains the following:
    23 3.14 "John Smith"
  – The tokens in the input are the following, and can be interpreted as the given types:

         Token      Type(s)
    1.   23         int, double, String
    2.   3.14       double, String
    3.   "John      String
    4.   Smith"     String

Note: The double quotes are just input characters. Outside Java source code they have no meaning as string delimiters.

Consuming tokens

• Each call to next, nextInt, nextDouble, etc. advances the cursor to the position just after the end of the current token, skipping over any whitespace before it. We call this consuming input.

    308.2\n 14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8EOF
    ^
    double q = input.nextDouble();    // q contains 308.2
    308.2\n 14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8EOF
         ^
    double r = input.nextDouble();    // r contains 14.9
    308.2\n 14.9 7.4 2.8\n\n\n3.9 4.7 -15.4\n2.8EOF
                ^

EOF marks the end of the file. If you try to read a token when only EOF remains, you will get an error.

File input question

• Consider an input file named numbers.txt that contains the following text:
    308.2
     14.9 7.4 2.8


    3.9 4.7 -15.4
    2.8
• Write a program that reads the first 5 values from this file and prints them along with their sum. Its output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    Sum = 337.19999999999993

Note: Round-off error causes the Sum to look like this.

File input answer

    // Displays the first 5 numbers in the given file,
    // and displays their sum at the end.
    import java.io.*;    // for File, FileNotFoundException
    import java.util.*;  // for Scanner

    public class Echo {
        public static void main(String[] args)
                throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.txt"));
            double sum = 0.0;
            for (int i = 1; i <= 5; i++) {
                double next = input.nextDouble();
                System.out.println("number = " + next);
                sum += next;
            }
            System.out.println("Sum = " + sum);
        }
    }

What if the file only contained 3 values?

    308.2 14.9 7.4

The program prints the three values and then fails with the exception that follows. This basically means that we attempted to read beyond the end of the file. We hit the end of the file when there is no data left in it.
    number = 308.2
    number = 14.9
    number = 7.4
    Exception in thread "main" java.util.NoSuchElementException
        at java.util.Scanner.throwFor(Unknown Source)
        at java.util.Scanner.next(Unknown Source)
        at java.util.Scanner.nextDouble(Unknown Source)
        at Echo.main(Echo.java:13)

Testing before reading

• The preceding program is impractical because it processes exactly 5 values from the input file.
  – A better program would read the entire file, regardless of how many values it contains.
• Reminder: The Scanner has useful methods for testing what the next input token will be:

    Method name        Description
    hasNext()          whether any more tokens remain
    hasNextDouble()    whether the next token can be interpreted as type double
    hasNextInt()       whether the next token can be interpreted as type int
    hasNextLine()      whether any more lines remain

Test existence of value before reading it: question

• Rewrite the previous program so that it reads the entire file. Assume that the file contains only double values. Its output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    number = 4.7
    number = -15.4
    number = 2.8
    Sum = 329.29999999999995

Test before read: answer

    // Displays each number in the given file,
    // and displays their sum at the end.
    import java.io.*;
    import java.util.*;

    public class Echo2 {
        public static void main(String[] args)
                throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.dat"));
            double sum = 0.0;
            while (input.hasNextDouble()) {
                double next = input.nextDouble();
                System.out.println("number = " + next);
                sum += next;
            }
            System.out.println("Sum = " + sum);
        }
    }

File processing question

• Modify the preceding program again so that it will handle files that contain non-numeric tokens.
  – The program should skip any such tokens.
• For example, the program should produce the same output as before when given this input file:
    308.2 hello 14.9 7.4 bad stuff 2.8 3.9 4.7 oops -15.4 :-) 2.8 @#*($&

  Output:
    number = 308.2
    number = 14.9
    number = 7.4
    number = 2.8
    number = 3.9
    number = 4.7
    number = -15.4
    number = 2.8
    Sum = 329.29999999999995

File processing answer

    // Displays each number in the given file,
    // and displays their sum at the end.
    import java.io.*;
    import java.util.*;

    public class Echo3 {
        public static void main(String[] args)
                throws FileNotFoundException {
            Scanner input = new Scanner(new File("numbers.dat"));
            double sum = 0.0;
            while (input.hasNext()) {              // Is there another token?
                if (input.hasNextDouble()) {
                    double next = input.nextDouble();
                    System.out.println("number = " + next);
                    sum += next;
                } else {
                    input.next();                  // consume & throw away bad token
                }
            }
            System.out.println("Sum = " + sum);
        }
    }

File processing question

• Write a program that accepts an input file containing integers representing daily high temperatures.
  Example input file: weather.dat
    42 45 37 49 38 50 46 48 48 30 45 42 45 40 48
• Your program should print the difference between each adjacent pair of temperatures, such as the following:
    Temperature changed by 3 deg F
    Temperature changed by -8 deg F
    Temperature changed by 12 deg F
    Temperature changed by -11 deg F
    Temperature changed by 12 deg F
    Temperature changed by -4 deg F
    Temperature changed by 2 deg F
    Temperature changed by 0 deg F
    Temperature changed by -18 deg F
    Temperature changed by 15 deg F
    Temperature changed by -3 deg F
    Temperature changed by 3 deg F
    Temperature changed by -5 deg F
    Temperature changed by 8 deg F

File processing answer

    import java.io.*;
    import java.util.*;

    public class TemperaturesPrevCurrent {
        public static void main(String[] args)
                throws FileNotFoundException {
            Scanner input = new Scanner(new File("weather.dat"));
            int prevTemp = input.nextInt();
            while (input.hasNextInt()) {
                int currentTemp = input.nextInt();
                // current minus previous, so 42 -> 45 is reported as a change of 3
                System.out.println("Temperature changed by "
                        + (currentTemp - prevTemp) + " deg F");
                prevTemp = currentTemp;
            }
        }
    }

This pattern of keeping a current value and the previous value is common in programming.

Input Cursor

• A Scanner views all input as a stream of characters.
• The current position is called the input cursor.

    10 20 30\n40 50\n\n60\n
    ^
    input.next() -> 10
    10 20 30\n40 50\n\n60\n
      ^
    input.next() -> 20
    10 20 30\n40 50\n\n60\n
         ^
    input.next() -> 30
    10 20 30\n40 50\n\n60\n
            ^

Calling next() is called "consuming input".

Example: Reading in the name of the file to be processed. This program finds the average of the numbers in the file.
    // main() of a class that imports java.io.* and java.util.*
    public static void main(String[] args)
            throws FileNotFoundException {
        // get the filename
        Scanner console = new Scanner(System.in);
        System.out.print("File: ");              // prompt
        String filename = console.next();

        // read and process the file
        int sum = 0;
        int count = 0;
        File f = new File(filename);
        Scanner input = new Scanner(f);
        while (input.hasNextInt()) {
            int d = input.nextInt();
            sum += d;
            count++;
        }
        System.out.println("Sum is: " + sum);
        System.out.println("Count is: " + count);
        System.out.println("Average is: " + (double) sum / count);
    }

example.txt:
    10 20 30
    40 50

    60

Console:
    File: example.txt
    Sum is: 210
    Count is: 6
    Average is: 35.0

Review: reading the file

    public static void main(String[] args)
            throws FileNotFoundException {
        File f = new File("example.txt");
        Scanner input = new Scanner(f);
        while (input.hasNext()) {
            System.out.println(input.next());
        }
    }

example.txt:
    10 20 30
    40 50

    60

Output:
    10
    20
    30
    40
    50
    60

Notice: No need for a sentinel. hasNext() is false if there is no more data in the file.

Review: Working With Files

• Create a File object:
    File f = new File("example.txt");
• Open the file for reading with a Scanner object:
    Scanner input = new Scanner(f);
• For now, just throw the FileNotFoundException to whoever called you.
    ... method(...) throws FileNotFoundException {
        ...
        Scanner input = new Scanner(f);
        ...
    }

Mixing token and line processing: don't do it

    Token-based methods                      Line-based methods
    nextInt()     hasNextInt()               nextLine()
    nextDouble()  hasNextDouble()            hasNextLine()
    next()        hasNext()

Never the twain (two) shall meet, or at least almost never.

Skip the remainder of these notes. They deal with why you should not mix the two kinds of methods, but if you never mix them you do not need these notes.

Mixing tokens and lines (do not do this) (skip these notes)

• Using nextLine() in conjunction with the token-based methods (nextInt(), nextDouble(), next()) on the same Scanner can cause bad results.
    23	3.14
    Joe	"Hello" world
    		45.2 19

  – You'd think you could read 23 and 3.14 with nextInt and nextDouble, then read Joe "Hello" world with nextLine.
    System.out.println(input.nextInt());       // 23
    System.out.println(input.nextDouble());    // 3.14
    System.out.println(input.nextLine());      //
  – But the nextLine call prints only an empty line! Why? The first line is actually 23\t3.14\n. After the 3.14 is read, the input cursor is placed at the \n. input.nextLine() reads up to but not including the \n, so it gets the empty string and prints that. input.next() would get "Joe".

Mixing lines and tokens (skip these notes)

• Don't read both tokens and lines from the same Scanner:
    23	3.14
    Joe	"Hello" world
    		45.2 19

    input.nextInt()                                // 23
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
      ^
    input.nextDouble()                             // 3.14
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
            ^
    input.nextLine()                               // "" (empty!)
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
              ^
    input.nextLine()                               // "Joe\t\"Hello\" world"
    23\t3.14\nJoe\t"Hello" world\n\t\t45.2 19\n
                                  ^

A more complex example (skip these notes)

• Processing a file of names and hours worked each day to compute weekly totals.

    Input file: hours.txt           Desired output:
    Aaron Aardvark                  40 hours Aaron Aardvark
    8 8 8 8 8                        8 hours Bob Baboon
    Bob Baboon                      16 hours Chucky Cheetah Jr.
    4 4                             20 hours Donald Duck
    Chucky Cheetah Jr.
    6 6 2 2
    Donald Duck
    8 4 8

Plan: (skip these notes)

• Outer loop: for each employee, process 2 lines.
  1. Use nextLine() to read the name. nextLine() consumes and returns the string, and also consumes and throws away the following \n.
  2. Use an inner loop to read the line of numbers using nextInt().

    Aaron Aardvark          nextLine() -> "Aaron Aardvark"
    8 8 8 8 8               nextInt() -> 8
    Bob Baboon              nextInt() -> 8
    4 4                     nextInt() -> 8
    Chucky Cheetah Jr.      nextInt() -> 8
    6 6 2 2                 nextInt() -> 8
    Donald Duck
    8 4 8

Wait a minute: Welty said not to mix line-based processing (nextLine()) with token-based processing. We can't really handle this situation with the mix.
We must go to the next section, named, strangely enough, line-based processing.

Ignore the rest of these notes if you are not mixing token-based and line-based processing.

    public static void hoursWorkedV1(String filename)
            throws FileNotFoundException {
        // read and process the file
        File f = new File(filename);
        Scanner input = new Scanner(f);
        while (input.hasNextLine()) {
            String name = input.nextLine();      // read name, line based
            int sum = 0;
            // read and add data
            while (input.hasNextInt()) {         // token based
                sum += input.nextInt();          // token based
            }
            System.out.printf("%2d hours %s\n", sum, name);
        }
    }

What we intend:

    Aaron Aardvark          nextLine() -> "Aaron Aardvark"
    8 8 8 8 8               nextInt() -> 8
    Bob Baboon              nextInt() -> 8
    4 4                     nextInt() -> 8
    Chucky Cheetah Jr.      nextInt() -> 8
    6 6 2 2                 nextInt() -> 8
    Donald Duck
    8 4 8

Actual output of the same method:

    40 hours Aaron Aardvark
     0 hours
     8 hours Bob Baboon
     0 hours
    16 hours Chucky Cheetah Jr.
     0 hours
    20 hours Donald Duck
     0 hours
     0 hours
     0 hours

Don't mix line-based input, nextLine(), with token-based input: next(), nextInt(), nextDouble().

    Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\nChucky
    ^
    nextLine() -> "Aaron Aardvark"
    Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\nChucky
                    ^
    nextInt() -> 8    nextInt() -> 8    nextInt() -> 8    nextInt() -> 8    nextInt() -> 8
    Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\nChucky
                             ^
    nextLine() -> ""
Why do we get the above?

nextLine() consumes the line it is on through the end-of-line marker ("\n"). It returns the string containing everything up to but not including the end-of-line marker, and leaves the cursor just after the \n.

nextInt() consumes the next integer, including all preceding white space, and leaves the cursor at the character just following the integer.

Example: we have just consumed the last 8 of Aaron's hours.

    Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\nChucky
                             ^

The cursor sees nothing followed by a \n. nextLine() returns the empty string and consumes through the \n.

    nextLine() -> ""
    Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\nChucky
                               ^

The next token after the \n is "Bob", which is not an int, so hasNextInt() is false, the sum stays 0, and a line with 0 hours and an empty name is printed. Only on the following pass of the outer loop does nextLine() read Bob Baboon.
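The effect explained above can be reproduced without a data file. The sketch below is an assumption-laden illustration, not the author's code: it reuses the loop from hoursWorkedV1 but reads from a String and collects the report lines in a list (MixDemo and report are made-up names) so that the stray empty-name lines are easy to see.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class MixDemo {
    // Same mixed line/token loop as hoursWorkedV1, collecting lines instead of printing.
    static List<String> report(Scanner input) {
        List<String> lines = new ArrayList<>();
        while (input.hasNextLine()) {
            String name = input.nextLine();      // line based
            int sum = 0;
            while (input.hasNextInt()) {         // token based
                sum += input.nextInt();
            }
            lines.add(String.format("%2d hours %s", sum, name));
        }
        return lines;
    }

    public static void main(String[] args) {
        Scanner input = new Scanner("Aaron Aardvark\n8 8 8 8 8\nBob Baboon\n4 4\n");
        for (String line : report(input)) {
            // Brackets make the empty names visible.
            System.out.println("[" + line + "]");
        }
    }
}
```

Each employee line should be followed by a " 0 hours" entry with an empty name, because nextLine() picks up the empty remainder of the numbers line left behind by the last nextInt().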