COMP209 Object Oriented Programming
Java IO Part 3
Mark Hall
Department of Computer Science

• LZ77T unCompress()
• Sequential access of records
• Object Serialization
• Random Access of Records


LZ77T.java: The unCompress() method

• Last time we looked at the compress() method of LZ77T.java
• Now let's take a look at the unCompress() method
  – We will use the StreamTokenizer class to parse words, tokens and numbers


LZ77T.java: The unCompress() method

• Algorithm for our unCompress() method
  – While there are more tokens
    • Read a token
    • If it is a 'word'
      – Output it
      – Append it to the searchBuffer
    • Else, if it is a number
      – If the next token is a separator followed by another number
        • Output the substring from offset to offset+length of the searchBuffer
        • Append the outputted substring to the searchBuffer
      – Else, the first number read was part of a word so
        • Output the number and the subsequent word
        • Append the number and subsequent word to the searchBuffer
    • Else, if it is a separator token (i.e. '~')
      – Do nothing

    public void unCompress(String infile) throws IOException {
        mIn = new BufferedReader(new FileReader(infile+".lz77"));
        StreamTokenizer st = new StreamTokenizer(mIn);

        // Make these into "ordinary" chars so that they don't get
        // parsed as whitespace or parts of numbers
        st.ordinaryChar((int)' ');
        st.ordinaryChar((int)'.');
        st.ordinaryChar((int)'-');
        st.ordinaryChar((int)'\n');

        // Set all normal keyboard characters (+ '\n'), with the
        // exception of '~', to wordChars
        st.wordChars((int)'\n', (int)'\n');
        st.wordChars((int)' ', (int)'}');

        int offset, length;
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
            case StreamTokenizer.TT_WORD:
                mSearchBuffer.append(st.sval);
                System.out.print(st.sval);
                // Adjust search buffer size if necessary
                trimSearchBuffer();
                break;
            case StreamTokenizer.TT_NUMBER:
                offset = (int)st.nval; // set the offset
                st.nextToken();        // get the separator (hopefully)
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    // we got a word instead of the separator, therefore
                    // the first number read was actually part of a word
                    mSearchBuffer.append(offset+st.sval);
                    System.out.print(offset+st.sval);
                    break; // break out of the switch
                }
                // if we got this far then we must be reading a
                // substitution pointer
                st.nextToken(); // get the length
                length = (int)st.nval;
                // output substring from search buffer
                String output = mSearchBuffer.substring(offset, offset+length);
                System.out.print(output);
                mSearchBuffer.append(output);
                // Adjust search buffer size if necessary
                trimSearchBuffer();
                break;
            default:
                // consume a '~'
            } // close switch statement
        } // end while
    }


Sequential Access of Records

• So far we have seen sequential access of binary and textual data
• E.g. Scores class: encapsulates info about a sports score (name, score, country)


Sequential Access of Records

• With an array (or collection) of Scores we could iterate through sequentially, asking each Scores object to write its data to the file
  – Could add a "writeScores" method to write out the data as strings and numbers

    public void writeScores(PrintWriter os) throws IOException {
        os.println(mName + ',' + mScore + ',' + mCountry);
    }

  – Makes for a portable data file: as long as the record format is known, programs written in other languages could read the data
• Similarly, could add a "readScores" method to read and parse scores data to restore the state of a Scores object


Object Serialization

• In Java there is an even easier way to write sequential data: object serialization
  – Entire objects can be written to disk in binary form with almost no extra work on the part of the programmer
• Serialization is the ability to save the state of an object (or several objects) to a stream
  – The stream is typically associated with a file, but need not be (e.g. sending serialized objects over a network connection)
• Deserialization is the ability to restore the state of an object (or several objects) from a stream


Object Serialization

• If an object contains references to other objects, these are also saved
  – The process is automatic and recursive
  – Ensures that only a single copy of each referenced object is saved to the stream

[Figure: graph of objects A, B, C, D and E linked by references]


Object Serialization

• To save object data we need to use the ObjectOutputStream class

    Scores myScore = new Scores("Chris Harris", 135, "India");
    ObjectOutputStream os =
        new ObjectOutputStream(new FileOutputStream("scores.dat"));
    os.writeObject(myScore);

• The object output stream automatically saves all instance variables of the object


Object Serialization

• To read the object back in, use the readObject method of the ObjectInputStream class

    ObjectInputStream is =
        new ObjectInputStream(new FileInputStream("scores.dat"));
    Scores myScore = (Scores)is.readObject();

• readObject returns an Object reference, so we need to cast to the appropriate type
• readObject can throw a ClassNotFoundException as well as the normal IOException
  – ClassNotFoundException gets thrown if the virtual machine cannot find the class of the read object in the classpath


Object Serialization

• Now if we want to save a collection of Scores all we have to do is write out the collection object

    Vector myScoresList = new Vector();
    // add a whole bunch of scores into the Vector
    os.writeObject(myScoresList);


Object Serialization

• To place objects of a particular class into an object stream, the class must implement the java.io.Serializable interface
  – Is an indicator interface (i.e. has no methods)
  – A java.io.NotSerializableException is thrown if a class does not implement Serializable

    public class Scores implements Serializable {
        protected String mName;
        protected int mScore;
        protected String mCountry;

        public Scores(String name, int score, String country) {
            mName = name;
            mScore = score;
            mCountry = country;
        }
    }


Object Serialization

• Only nonstatic and nontransient parts of an object's state are saved by serialization
  – Static fields are considered part of the state of the class, not the state of an object
  – Transient fields are not saved, since they contain temporary data not needed to correctly restore the object later


Object Serialization

• Many of the classes provided with the JDK libraries have been designed to be serializable
• However, there are some that are not serializable
  – Almost none of the classes in java.io are serializable
    • Ridiculous to consider "freezing" info about file handles, read/write positions etc. and expect to use it later, even on the same machine
  – Objects of type Thread are not serializable
    • Implementation of threads is tightly coupled with the particular platform on which the JVM (Java virtual machine) is running


Random Access of Records

• Files are normally processed from start to end but this can be time consuming
• E.g. a file that contains a set of bank accounts
  – To update the balances of some of the accounts we have to read the whole lot into some collection, find the ones we want, make the changes, and write the whole lot out again
  – If the file is large we could end up doing a lot of reading and writing
• It would be better if we could locate the required records in the file and just change them


Random Access of Records

• This access pattern is called random access
  – There is nothing "random" about the access: the term just means that you can read and write any byte stored at any location
  – Only disk files support random access

[Figure: sequential access compared with random access]


Random Access of Records

• Each disk file has a special file pointer position
  – Normally, the file pointer is at the end of the file, and any output is appended to the end
  – If you move the file pointer to the middle of the file, the output overwrites what is already there
• java.io.RandomAccessFile allows you to access a file and move a file pointer
  – Note that RandomAccessFile is not a Stream or Reader/Writer


Random Access of Records

• To open a random access file, you supply a file name and a string to specify the open mode
• Can open a file either for reading only ("r") or for reading and writing ("rw")

    RandomAccessFile raf = new RandomAccessFile("bank.dat", "rw");


Random Access of Records

• Some useful methods in RandomAccessFile
  – seek(long pos): move the file pointer to byte number pos
  – long getFilePointer(): get the current position of the file pointer
  – long length(): get the number of bytes in the file


Random Access of Records

• If you want to manipulate a data set in a file, you have to pay special attention to the formatting of the data:
  – Suppose we store account balances and interest rates as text
    • Example record (balance, rate): 950,10
  – If the balance is increased by 10% ($95), the new balance (1045) has more digits than the old one
  – Placing the file pointer at the first character of the old value and then writing the new value gives:
    • 104510
  – Doh! Not good. The update has overwritten the comma that separates the fields


Random Access of Records

• In order to be able to update a file, we must give each field a fixed size that is sufficiently large
  – As a result, every record has the same size
  – This has the advantage that it is now easy to skip to record n: just set the file pointer to n × the record size
• When storing numbers in a file with fixed record sizes, it is easier to store them in binary format rather than text
  – For that reason, the RandomAccessFile class stores binary data


Random Access of Records

• The readInt and writeInt methods read and write integers as four byte quantities
• The readDouble and writeDouble methods process double precision floating point numbers (eight byte quantities)


Example: Random Access of Savings Accounts

• If we save the balance and interest rate as double values, then each savings account record consists of 16 bytes
• BankData.java
  – Translates between the random access file format and SavingsAccount objects
  – The size method determines the total number of accounts

    public int size() throws IOException {
        return (int)(mFile.length() / RECORD_SIZE);
    }


Example: Random Access of Savings Accounts

• To read the nth account in the file, the read method positions the file pointer at the offset n × RECORD_SIZE

    public SavingsAccount read(int n) throws IOException {
        mFile.seek(n * RECORD_SIZE);
        double balance = mFile.readDouble();
        double interestRate = mFile.readDouble();
        SavingsAccount account = new SavingsAccount(interestRate);
        account.deposit(balance);
        return account;
    }


Example: Random Access of Savings Accounts

• Writing an account works the same way

    public void write(int n, SavingsAccount account) throws IOException {
        mFile.seek(n * RECORD_SIZE);
        mFile.writeDouble(account.getBalance());
        mFile.writeDouble(account.getInterestRate());
    }


Example: Random Access of Savings Accounts

• There is a command line driven test program (BankDataTest.java) that allows the user to:
  – Choose an existing account number: details are printed and then interest is calculated and added to the balance
  – Add a new account to the database


Readings

• Chapter 14 of Budd
• javadocs for java.io.RandomAccessFile
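The substitution-pointer step of unCompress() (copy the substring from offset to offset+length out of the search buffer, output it, and append it back to the buffer) can be tried in isolation. The following is only a minimal sketch: the class PointerExpansionSketch and its methods are hypothetical names, not part of LZ77T.java, and it assumes the search buffer is never trimmed, so it leaves out trimSearchBuffer() and the StreamTokenizer parsing entirely.

```java
// Hypothetical demo class; not part of LZ77T.java.
class PointerExpansionSketch {

    // Expands one substitution pointer: copies length characters starting
    // at offset from the search buffer, appends the copy to the buffer,
    // and returns it so the caller can output it.
    static String expand(StringBuilder searchBuffer, int offset, int length) {
        String output = searchBuffer.substring(offset, offset + length);
        searchBuffer.append(output);
        return output;
    }

    // Decodes a literal followed by one pointer (offset 0, length 8).
    static String demo() {
        StringBuilder searchBuffer = new StringBuilder();
        StringBuilder decoded = new StringBuilder();

        // A 'word' token: output it and append it to the search buffer.
        String literal = "the cat sat on ";
        searchBuffer.append(literal);
        decoded.append(literal);

        // A pointer token: re-use the first 8 characters ("the cat ").
        decoded.append(expand(searchBuffer, 0, 8));
        return decoded.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "the cat sat on the cat "
    }
}
```

Because the copied text is appended to the search buffer as well as output, later pointers can refer to text that was itself produced by an earlier pointer.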
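The Object Serialization slides make three claims that are easy to check in a few lines: a whole collection can be written with one writeObject call, a shared reference is stored only once, and transient fields are not saved. The sketch below is an illustration under assumptions: SerializationSketch, Score, and the transient field mCachedLabel are hypothetical names invented for this example, and an in-memory byte array is used in place of scores.dat so that nothing is left on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the lecture code.
class SerializationSketch {

    // Stand-in for the Scores class, with one extra transient field.
    static class Score implements Serializable {
        private static final long serialVersionUID = 1L;
        String mName;
        int mScore;
        String mCountry;
        transient String mCachedLabel; // temporary data, not serialized

        Score(String name, int score, String country) {
            mName = name;
            mScore = score;
            mCountry = country;
            mCachedLabel = name + " (" + score + ")";
        }
    }

    // Serializes obj into a byte array and immediately deserializes it.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream os = new ObjectOutputStream(bytes)) {
                os.writeObject(obj);
            }
            try (ObjectInputStream is = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return is.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Round-trips a list that holds the SAME Score object twice.
    static List<Score> roundTripSharedList() {
        Score score = new Score("Chris Harris", 135, "India");
        List<Score> list = new ArrayList<>();
        list.add(score);
        list.add(score);
        @SuppressWarnings("unchecked")
        List<Score> copy = (List<Score>) roundTrip(list);
        return copy;
    }

    public static void main(String[] args) {
        List<Score> copy = roundTripSharedList();
        System.out.println(copy.get(0).mName + " " + copy.get(0).mScore);
        System.out.println(copy.get(0) == copy.get(1)); // shared reference kept
        System.out.println(copy.get(0).mCachedLabel);   // transient field is null
    }
}
```

After the round trip both list slots still point at one object, which is the "single copy of each referenced object" behaviour, and the transient label comes back as null because it was never written.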
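The text-record problem from the Random Access slides (the record 950,10 becoming 104510) can be reproduced directly with RandomAccessFile. This is a small sketch rather than lecture code: the class TextOverwriteSketch and the method overwriteAtStart are hypothetical names, and a temporary file stands in for the bank data file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; not part of the lecture code.
class TextOverwriteSketch {

    // Writes oldRecord as text, moves the file pointer back to byte 0,
    // writes newValue over the top, and returns the resulting file content.
    static String overwriteAtStart(String oldRecord, String newValue) {
        try {
            File file = File.createTempFile("bank", ".txt");
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
                raf.writeBytes(oldRecord); // one byte per character
                raf.seek(0);               // back to the first character
                raf.writeBytes(newValue);  // overwrites, does not insert
            }
            String content = new String(
                    Files.readAllBytes(file.toPath()), StandardCharsets.US_ASCII);
            file.delete();
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The new balance 1045 is one digit longer than 950,
        // so it swallows the comma.
        System.out.println(overwriteAtStart("950,10", "1045")); // prints 104510
    }
}
```

Writing at a file position replaces bytes one for one, so a longer value always spills into whatever follows it. That is the motivation for fixed-size fields.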
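The fixed-size record scheme used by BankData.java can be sketched without the SavingsAccount class. The code below is an assumption-laden illustration, not the actual BankData.java: FixedRecordSketch and its methods are hypothetical names, records hold just two doubles (balance, then interest rate), a temporary file replaces bank.dat, and the seek offset is widened to long, which the slide code does not do.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical demo class; not the lecture's BankData.java.
class FixedRecordSketch {

    // Two doubles per record (balance, interest rate), 8 bytes each.
    static final int RECORD_SIZE = 2 * Double.BYTES;

    // Writes record n by seeking to n * RECORD_SIZE first.
    static void write(RandomAccessFile file, int n, double balance, double rate)
            throws IOException {
        file.seek((long) n * RECORD_SIZE);
        file.writeDouble(balance);
        file.writeDouble(rate);
    }

    // Reads only the balance field of record n.
    static double readBalance(RandomAccessFile file, int n) throws IOException {
        file.seek((long) n * RECORD_SIZE);
        return file.readDouble();
    }

    // Number of records currently stored in the file.
    static int size(RandomAccessFile file) throws IOException {
        return (int) (file.length() / RECORD_SIZE);
    }

    // Stores three records, adds 10% interest to record 1 in place, and
    // returns {record count, balance 0, balance 1, balance 2}.
    static double[] demo() {
        try {
            File tmp = File.createTempFile("bank", ".dat");
            try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
                write(raf, 0, 100.0, 5.0);
                write(raf, 1, 950.0, 10.0);
                write(raf, 2, 200.0, 2.0);

                // Update record 1 only; its neighbours are never touched.
                double balance = readBalance(raf, 1);
                write(raf, 1, balance + balance * 10.0 / 100.0, 10.0);

                return new double[] {
                    size(raf), readBalance(raf, 0),
                    readBalance(raf, 1), readBalance(raf, 2)
                };
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] r = demo();
        System.out.println((int) r[0] + " records, balances: "
                + r[1] + ", " + r[2] + ", " + r[3]);
    }
}
```

Because every record is exactly 16 bytes, growing the balance from 950 to 1045 changes the value of the stored bytes but not their count, so the neighbouring records keep their positions.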