Streams
(Based on Chapter 19 (Binary I/O) of the Liang book and Chapter 4 (Streams) of the Elliotte Rusty Harold book)
Dr. Tatiana Balikhina

Motivations
Data stored in a text file is represented in human-readable form. Data stored in a binary file is represented in binary form. You cannot read binary files in a text editor; they are designed to be read by programs. For example, Java source programs are stored in text files and can be read by a text editor, but Java classes are stored in binary files and are read by the JVM. The advantage of binary files is that they are more efficient to process than text files.

Why do we recall Binary I/O?
A large part of what network programs do is simple input and output: moving bytes from one system to another. Bytes are bytes; to a large extent, reading data a server sends you is not all that different from reading a file. Sending text to a client is not that different from writing a file.

Objectives
- To discover how I/O is processed in Java (§19.2).
- To distinguish between text I/O and binary I/O (§19.3).
- To read and write bytes using FileInputStream and FileOutputStream (§19.4.1).
- To read and write primitive values and strings using DataInputStream/DataOutputStream (§19.4.3).
- To use Readers and Writers.

How is I/O Handled in Java?
A File object encapsulates the properties of a file or a path, but does not contain the methods for reading/writing data from/to a file. In order to perform I/O, you need to create objects using appropriate Java I/O classes.

    // Input object created from an input class: read one line of text
    Scanner input = new Scanner(new File("temp.txt"));
    System.out.println(input.nextLine());

(Figure: a program reads from a file through an input stream, using an input object created from an input class, and writes to a file through an output stream, using an output object created from an output class.)

    // Output object created from an output class: write formatted text
    Formatter output = new Formatter("temp.txt");
    output.format("%s", "Java 101");
    output.close();

Text File vs. Binary File
Data stored in a text file are represented in human-readable form. Data stored in a binary file are represented in binary form.
You cannot read binary files in a text editor. Binary files are designed to be read by programs. For example, Java source programs are stored in text files and can be read by a text editor, but Java classes are stored in binary files and are read by the JVM. The advantage of binary files is that they are more efficient to process than text files.

Although it is not technically precise, you can imagine that a text file consists of a sequence of characters and a binary file consists of a sequence of bits. For example, the decimal integer 199 is stored as the sequence of three characters '1', '9', '9' in a text file, while the same integer is stored as the byte-type value C7 in a binary file, because decimal 199 equals hex C7.

Binary I/O
Text I/O requires encoding and decoding. The JVM converts Unicode characters to a file-specific encoding when writing a character and converts the file-specific encoding back to Unicode when reading a character. Binary I/O does not require conversions. When you write a byte to a file, the original byte is copied into the file. When you read a byte from a file, the exact byte in the file is returned.

(a) Text I/O program: the Unicode characters of "199" are encoded, and the encoding of each character is stored in the file: 00110001 00111001 00111001 (0x31 0x39 0x39).
(b) Binary I/O program: a byte is read or written unchanged; the value 199 is stored in the file as the single byte 11000111 (0xC7).

A low-level output stream receives bytes and writes bytes to an output device. A high-level filter output stream receives general-format data and writes bytes to a low-level output stream or to another filter output stream. A writer is similar to a filter output stream but is specialized for writing Java strings in units of Unicode characters.

I/O in Java is built on streams
- Input streams read data.
- Output streams write data.
- Filter streams can be chained to either an input stream or an output stream. Filters can modify the data as it is read or written (e.g. encrypt or compress it).
- Readers and writers can be chained to input and output streams to allow programs to read and write text (that is, characters) rather than bytes.

Binary I/O Classes
Object
    InputStream
        FileInputStream
        FilterInputStream
            DataInputStream
            BufferedInputStream
        ObjectInputStream
    OutputStream
        FileOutputStream
        FilterOutputStream
            DataOutputStream
            BufferedOutputStream
            PrintStream
        ObjectOutputStream

Java's basic output class is java.io.OutputStream:
    public abstract class OutputStream
Subclasses:
- FileOutputStream
- TelnetOutputStream
- ByteArrayOutputStream

Java's basic input class is java.io.InputStream:
    public abstract class InputStream
Subclasses:
- FileInputStream
- TelnetInputStream
- ByteArrayInputStream

InputStream
The value returned is a byte as an int type.

java.io.InputStream
+read(): int
    Reads the next byte of data from the input stream. The byte is returned as an int value in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned.
+read(b: byte[]): int
    Reads up to b.length bytes into array b from the input stream and returns the actual number of bytes read. Returns -1 at the end of the stream.
+read(b: byte[], off: int, len: int): int
    Reads bytes from the input stream and stores them into b[off], b[off+1], ..., b[off+len-1]. The actual number of bytes read is returned. Returns -1 at the end of the stream.
+available(): int
    Returns an estimate of the number of bytes that can be read from the input stream without blocking.
+close(): void
    Closes this input stream and releases any system resources associated with the stream.
+skip(n: long): long
    Skips over and discards n bytes of data from this input stream. The actual number of bytes skipped is returned.
+markSupported(): boolean
    Tests if this input stream supports the mark and reset methods.
+mark(readlimit: int): void
    Marks the current position in this input stream.
+reset(): void
    Repositions this stream to the position at the time the mark method was last called on this input stream.

a) read() method example:

b) read(byte[] input, int offset, int length) example:
All 1,024 bytes might never arrive (as opposed to merely not being immediately available), so you must test the return value of read() before adding it to bytesRead. For example:

    int bytesRead = 0;
    int bytesToRead = 1024;
    byte[] input = new byte[bytesToRead];
    while (bytesRead < bytesToRead) {
        // Ask only for the bytes still missing, starting at the first free slot
        int result = in.read(input, bytesRead, bytesToRead - bytesRead);
        if (result == -1) break; // end of stream: stop instead of adding -1
        bytesRead += result;
    }

OutputStream
The value is a byte as an int type.

java.io.OutputStream
+write(b: int): void
    Writes the specified byte to this output stream. The parameter b is an int value; (byte)b is written to the output stream.
+write(b: byte[]): void
    Writes all the bytes in array b to the output stream.
+write(b: byte[], off: int, len: int): void
    Writes b[off], b[off+1], ..., b[off+len-1] to the output stream.
+close(): void
    Closes this output stream and releases any system resources associated with the stream.
+flush(): void
    Flushes this output stream and forces any buffered output bytes to be written out.

write(int b) method example
For example, the character generator protocol defines a server that sends out ASCII text. The output is an endless sequence of 72-character lines that cycle through the 94 printable ASCII characters, each line starting one character later than the line before it. Since ASCII is a 7-bit character set, each character is sent as a single byte. Consequently, this protocol is straightforward to implement using the basic write() methods.
    public static void generateCharacters(OutputStream out) throws IOException {
        int firstPrintableCharacter = 33;
        int numberOfPrintableCharacters = 94;
        int numberOfCharactersPerLine = 72;
        int start = firstPrintableCharacter;
        while (true) { // infinite loop
            for (int i = start; i < start + numberOfCharactersPerLine; i++) {
                // Wrap around so that only printable characters are sent
                out.write(((i - firstPrintableCharacter) % numberOfPrintableCharacters)
                        + firstPrintableCharacter);
            }
            out.write('\r'); // carriage return
            out.write('\n'); // linefeed
            // The next line starts one character later
            start = ((start + 1) - firstPrintableCharacter) % numberOfPrintableCharacters
                    + firstPrintableCharacter;
        }
    }

- Streams can be buffered in software and hardware.
- The flush() method breaks the deadlock that arises when a program waits for a reply to data that is still sitting in its own output buffer: it forces the buffered data to be sent even if the buffer is not yet full.
- It is important to flush your streams.
- When you are done with a stream, close it by invoking its close() method. This releases any resources associated with the stream.
- Streams should be flushed immediately before closing them.
- Writing to a closed output stream throws an IOException.

Data can get lost if you don't flush your streams.

FileInputStream/FileOutputStream
(Figure: the binary I/O class hierarchy.)
FileInputStream/FileOutputStream associates a binary input/output stream with an external file. All the methods in FileInputStream/FileOutputStream are inherited from their superclasses.

FileInputStream
To construct a FileInputStream, use the following constructors:

    public FileInputStream(String filename)
    public FileInputStream(File file)

A java.io.FileNotFoundException is thrown if you attempt to create a FileInputStream for a nonexistent file.
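The constructors and the read() loop described above can be tried out with a small sketch like the following. It is only an illustration: the class name FileStreamSketch and its helper methods are hypothetical, and a temporary file stands in for a real data file.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper class, used only for this illustration.
class FileStreamSketch {

    // Writes the given bytes to the file, replacing any existing content.
    static void writeBytes(File file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(data);
        }
    }

    // Reads the file one byte at a time and sums the values returned by read().
    // Each value is in the range 0 to 255; -1 signals the end of the file.
    static int sumBytes(File file) throws IOException {
        int sum = 0;
        try (FileInputStream in = new FileInputStream(file)) {
            int b;
            while ((b = in.read()) != -1) {
                sum += b;
            }
        }
        return sum;
    }

    // Returns true if opening the file fails with FileNotFoundException.
    static boolean openFails(File file) {
        try (FileInputStream in = new FileInputStream(file)) {
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } catch (IOException e) {
            return false; // the file was opened, but close() failed
        }
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("sketch", ".dat");
        temp.deleteOnExit();
        writeBytes(temp, new byte[] {1, 2, (byte) 199});
        System.out.println(sumBytes(temp)); // 1 + 2 + 199 = 202

        File missing = new File(temp.getParentFile(), "missing-" + System.nanoTime() + ".dat");
        System.out.println(openFails(missing)); // true
    }
}
```

The try-with-resources statements close each stream even if an exception is thrown, which is one common way to follow the advice about always closing streams.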
FileOutputStream
To construct a FileOutputStream, use the following constructors:

    public FileOutputStream(String filename)
    public FileOutputStream(File file)
    public FileOutputStream(String filename, boolean append)
    public FileOutputStream(File file, boolean append)

If the file does not exist, a new file is created. If the file already exists, the first two constructors delete the current contents of the file. To retain the current contents and append new data to the file, use the last two constructors and pass true for the append parameter.

FilterInputStream/FilterOutputStream
(Figure: the binary I/O class hierarchy.)
Filter streams are streams that filter bytes for some purpose. The basic byte input stream provides a read method that can only be used for reading bytes. If you want to read integers, doubles, or strings, you need a filter class to wrap the byte input stream. Using a filter class enables you to read integers, doubles, and strings instead of bytes and characters. FilterInputStream and FilterOutputStream are the base classes for filtering data. When you need to process primitive numeric types, use DataInputStream and DataOutputStream to filter bytes.

Filter Streams
The filters come in two versions: the filter streams, and the readers and writers.
- Filter streams work primarily with raw data as bytes.
- Readers and writers handle the special case of text in a variety of encodings such as UTF-8.

DataInputStream/DataOutputStream
DataInputStream reads bytes from the stream (or a socket) and converts them into appropriate primitive type values or strings.
(Figure: the binary I/O class hierarchy.)
DataOutputStream converts primitive type values or strings into bytes and outputs the bytes to the stream (or a socket).

DataInputStream
DataInputStream extends FilterInputStream and implements the DataInput interface.

Inheritance: InputStream > FilterInputStream > DataInputStream
Constructor: +DataInputStream(in: InputStream)

java.io.DataInput
+readBoolean(): boolean
    Reads a boolean from the input stream.
+readByte(): byte
    Reads a byte from the input stream.
+readChar(): char
    Reads a character from the input stream.
+readFloat(): float
    Reads a float from the input stream.
+readDouble(): double
    Reads a double from the input stream.
+readInt(): int
    Reads an int from the input stream.
+readLong(): long
    Reads a long from the input stream.
+readShort(): short
    Reads a short from the input stream.
+readLine(): String
    Reads a line of characters from input.
+readUTF(): String
    Reads a string in UTF format.

DataOutputStream
DataOutputStream extends FilterOutputStream and implements the DataOutput interface.

Inheritance: OutputStream > FilterOutputStream > DataOutputStream
Constructor: +DataOutputStream(out: OutputStream)

java.io.DataOutput
+writeBoolean(b: boolean): void
    Writes a boolean to the output stream.
+writeByte(v: int): void
    Writes the eight low-order bits of the argument v to the output stream.
+writeBytes(s: String): void
    Writes the lower byte of each character in the string to the output stream.
+writeChar(c: char): void
    Writes a character (composed of two bytes) to the output stream.
+writeChars(s: String): void
    Writes every character in the string s to the output stream, in order, two bytes per character.
+writeFloat(v: float): void
    Writes a float value to the output stream.
+writeDouble(v: double): void
    Writes a double value to the output stream.
+writeInt(v: int): void
    Writes an int value to the output stream.
+writeLong(v: long): void
    Writes a long value to the output stream.
+writeShort(v: short): void
    Writes a short value to the output stream.
+writeUTF(s: String): void
    Writes two bytes of length information to the output stream, followed by the modified UTF-8 representation of every character in the string s.

Characters and Strings in Binary I/O
A Unicode character in Java (a char) consists of two bytes. The writeChar(char c) method writes the Unicode of character c to the output. The writeChars(String s) method writes the Unicode for each character in the string s to the output.

Why UTF-8? What is UTF-8?
UTF-8 is a coding scheme that allows systems to operate with both ASCII and Unicode efficiently. Most operating systems use ASCII. Java uses Unicode. The ASCII character set is a subset of the Unicode character set. Since most applications need only the ASCII character set, it is a waste to represent a 7-bit ASCII character as a 16-bit Unicode character. UTF-8 is an alternative scheme that stores a 16-bit character using 1, 2, or 3 bytes. ASCII values (up to 0x7F) are coded in one byte. Unicode values up to 0x7FF are coded in two bytes. Other Unicode values are coded in three bytes.

Using DataInputStream/DataOutputStream
Data streams are used as wrappers on existing input and output streams to filter data in the original stream. They are created using the following constructors:

    public DataInputStream(InputStream instream)
    public DataOutputStream(OutputStream outstream)

The statements given below create data streams. The first statement creates an input stream for the file in.dat; the second statement creates an output stream for the file out.dat.

    DataInputStream infile = new DataInputStream(new FileInputStream("in.dat"));
    DataOutputStream outfile = new DataOutputStream(new FileOutputStream("out.dat"));

Order and Format
CAUTION: You have to read the data in the same order and same format in which they are stored.
For example, since names are written in UTF-8 using writeUTF, you must read names using readUTF.

Checking End of File
TIP: If you keep reading data at the end of a stream, an EOFException is thrown. So how do you check for the end of a file? You can use input.available() to check it: for a file, input.available() == 0 indicates the end of the file. (available() reports only the bytes that can be read without blocking, so this check is not reliable for network streams.)

BufferedInputStream/BufferedOutputStream
Using buffers to speed up I/O.
(Figure: the binary I/O class hierarchy.)
BufferedInputStream/BufferedOutputStream does not contain new methods. All the methods in BufferedInputStream/BufferedOutputStream are inherited from the InputStream/OutputStream classes.

Buffered Streams
BufferedOutputStream: stores written data in a buffer until the buffer is full or the stream is flushed. A single write of many bytes is almost always much faster than many small writes that add up to the same thing. This is especially true of network connections because each TCP segment or UDP packet carries a finite amount of overhead, generally about 40 bytes' worth. This means that sending 1 kilobyte of data 1 byte at a time actually requires sending roughly 40 kilobytes over the wire, whereas sending it all at once only requires sending a little more than 1K of data.
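The effect of buffering on the number of writes can be observed without a network. The sketch below is only an illustration under stated assumptions: CountingOutputStream is a hypothetical sink defined here (it is not part of java.io) that discards the data and counts how many write calls reach it, and the buffer size of 2,048 bytes is chosen simply to be larger than the 1,024 bytes written.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demonstration class, not taken from the slides.
class BufferingSketch {

    // Hypothetical sink: discards the data but counts the write calls it receives.
    static class CountingOutputStream extends OutputStream {
        int calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    // Writes n bytes one at a time straight to the sink.
    static int unbufferedCalls(int n) throws IOException {
        CountingOutputStream sink = new CountingOutputStream();
        OutputStream out = sink;
        for (int i = 0; i < n; i++) {
            out.write(i);
        }
        return sink.calls;
    }

    // Writes n bytes one at a time through a BufferedOutputStream.
    static int bufferedCalls(int n, int bufferSize) throws IOException {
        CountingOutputStream sink = new CountingOutputStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            for (int i = 0; i < n; i++) {
                out.write(i);
            }
        } // closing the buffered stream flushes the buffer to the sink
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(unbufferedCalls(1024));     // 1024
        System.out.println(bufferedCalls(1024, 2048)); // 1
    }
}
```

With a real socket underneath, each call that reaches the underlying stream may become a separate packet, which is where the per-packet overhead described above comes from.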
BufferedInputStream: provides the stream's read() methods with data from its buffer; the underlying stream is read only when the buffer runs out of data.

Constructing BufferedInputStream/BufferedOutputStream

    // Create a BufferedInputStream
    public BufferedInputStream(InputStream in)
    public BufferedInputStream(InputStream in, int bufferSize)

    // Create a BufferedOutputStream
    public BufferedOutputStream(OutputStream out)
    public BufferedOutputStream(OutputStream out, int bufferSize)

Chaining Filters Together
Filters are connected to streams by their constructors. Example:

    DataOutputStream dout = new DataOutputStream(
        new BufferedOutputStream(
            new FileOutputStream("data.txt")
        )
    );

BufferedOutputStream and BufferedInputStream Methods
The methods for BufferedInputStream are the same as the methods for InputStream. The methods for BufferedOutputStream are the same as the methods for OutputStream.

Readers and Writers
Two abstract superclasses define the basic API for reading and writing characters:
- The java.io.Reader class specifies the API by which characters are read.
- The java.io.Writer class specifies the API by which characters are written.
Wherever input and output streams use bytes, readers and writers use Unicode characters. Concrete subclasses of Reader and Writer allow particular sources to be read and targets to be written. Filter readers and writers can be attached to other readers and writers to provide additional services or interfaces.

Readers and Writers
The most important subclasses of Reader and Writer are InputStreamReader and OutputStreamWriter. InputStreamReader contains an underlying input stream from which it reads raw bytes. It translates these bytes into Unicode characters according to a specified encoding. OutputStreamWriter receives Unicode characters from a running program. It translates those characters into bytes using a specified encoding and writes the bytes onto an underlying output stream.
To learn more about java.io classes and subclasses go to http://docs.oracle.com/javase/6/docs/api/java/io/package-tree.html

Readers and Writers
The Writer class mirrors the java.io.OutputStream class. The Writer class is never used directly; instead, it is used polymorphically, through one of its subclasses. It has five write() methods and the flush() and close() methods.
OutputStreamWriter is a subclass of Writer; it receives chars and converts them into bytes according to a specified encoding.

    OutputStreamWriter(OutputStream out, String encoding) throws UnsupportedEncodingException
    OutputStreamWriter(OutputStream out)

Encoding example

    OutputStreamWriter w = new OutputStreamWriter(
        new FileOutputStream("OdysseyB.txt"), "Cp1253");
    w.write("..."); // the Greek text written here is missing from this copy

Cp1253 is the Windows Greek encoding.
Other than the constructors, OutputStreamWriter has only the usual Writer methods (which are used exactly as they are for any Writer class) and one method to return the encoding of the object:

    public String getEncoding()

Readers and Writers
InputStreamReader is a subclass of Reader; it reads bytes and converts them into chars according to a specified encoding. It has two constructors:

    InputStreamReader(InputStream in, String encoding) throws UnsupportedEncodingException
    InputStreamReader(InputStream in)

If no encoding is specified, the default encoding for the platform is used.
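To see how the choice of encoding affects the bytes that are written, the following sketch encodes a short string with OutputStreamWriter and decodes it again with InputStreamReader. It is an illustration only: the class EncodingSketch and its methods are hypothetical, in-memory byte array streams stand in for files, and UTF-8 and ISO-8859-1 are used because every Java platform is required to support them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper class, used only for this illustration.
class EncodingSketch {

    // Translates the characters of text into bytes using the named encoding.
    static byte[] encode(String text, String encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bytes, encoding)) {
            w.write(text);
        } // closing the writer flushes its buffered output into bytes
        return bytes.toByteArray();
    }

    // Translates bytes back into characters using the named encoding.
    static String decode(byte[] data, String encoding) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), encoding)) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // the last character is U+00E9
        byte[] utf8 = encode(text, "UTF-8");
        System.out.println(utf8.length);                         // 5
        System.out.println(encode(text, "ISO-8859-1").length);   // 4
        System.out.println(decode(utf8, "UTF-8").equals(text));  // true
        // Decoding with a different encoding than was used for writing garbles the text.
        System.out.println(decode(utf8, "ISO-8859-1").equals(text)); // false
    }
}
```

Naming the encoding explicitly on both sides avoids depending on the platform default, which may differ between the machine that wrote the data and the machine that reads it.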
InputStreamReader Example
For example, this method reads an input stream and converts it all to one Unicode string using the MacCyrillic encoding:

    public static String getMacCyrillicString(InputStream in) throws IOException {
        InputStreamReader r = new InputStreamReader(in, "MacCyrillic");
        StringBuffer sb = new StringBuffer();
        int c;
        // read() returns one decoded character, or -1 at the end of the stream
        while ((c = r.read()) != -1) sb.append((char) c);
        r.close();
        return sb.toString();
    }

Filter Readers and Writers
Once InputStreamReader and OutputStreamWriter have changed the interface from a byte-oriented interface to a character-oriented interface, additional character-oriented filters can be layered on top of the reader or writer using the java.io.FilterReader and java.io.FilterWriter classes. As with filter streams, there are a variety of subclasses that perform specific filtering, including:
- BufferedReader
- BufferedWriter
- LineNumberReader
- PushbackReader
- PrintWriter

Buffered readers and writers
The BufferedReader and BufferedWriter classes are the character-based equivalents of the byte-oriented BufferedInputStream and BufferedOutputStream classes. BufferedReader and BufferedWriter have the usual methods associated with readers and writers, like read(), ready(), write(), and close(). They each have two constructors that chain the BufferedReader or BufferedWriter to an underlying reader or writer and set the size of the buffer. If the size is not set, the default size of 8,192 characters is used:

    public BufferedReader(Reader in, int bufferSize)
    public BufferedReader(Reader in)
    public BufferedWriter(Writer out)
    public BufferedWriter(Writer out, int bufferSize)

BufferedReader example
Chaining a BufferedReader to the InputStreamReader makes the MacCyrillic algorithm run faster.
    public static String getMacCyrillicString(InputStream in) throws IOException {
        Reader r = new InputStreamReader(in, "MacCyrillic");
        // Buffer up to 1,024 characters so that most read() calls are served from memory
        r = new BufferedReader(r, 1024);
        StringBuffer sb = new StringBuffer();
        int c;
        while ((c = r.read()) != -1) sb.append((char) c);
        r.close();
        return sb.toString();
    }
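The same chaining pattern is often used to read text line by line. The sketch below is an illustration rather than part of the original example: the class LineReadingSketch and its readLines method are hypothetical, UTF-8 is assumed in place of MacCyrillic because it is available on every Java platform, and readLine() is a BufferedReader method that is not listed above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, used only for this illustration.
class LineReadingSketch {

    // Decodes the stream as UTF-8 and returns its lines without line terminators.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null at the end of the stream
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "first\r\nsecond\nthird".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data))); // [first, second, third]
    }
}
```

Because the BufferedReader is the outermost object in the chain, closing it also closes the InputStreamReader and the underlying input stream.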