Streams
(Based on Chapter 19 (Binary I/O) Liang book
and Chapter 4 (Streams) Elliotte Rusty book)
Dr. Tatiana Balikhina
Motivations
Data stored in a text file is represented in human-readable
form. Data stored in a binary file is represented in binary
form. Binary files are not meant to be read by people; they are
designed to be read by programs. For example, Java source programs
are stored in text files and can be read by a text editor, but
Java classes are stored in binary files and are read by the
JVM. The advantage of binary files is that they are more
efficient to process than text files.
Why do we recall Binary I/O?
A large part of what network programs do is simple input and
output: moving bytes from one system to another. Bytes are
bytes; to a large extent, reading data a server sends you is
not all that different from reading a file. Sending text to a
client is not that different from writing a file.
Objectives
To discover how I/O is processed in Java (§19.2).
To distinguish between text I/O and binary I/O (§19.3).
To read and write bytes using FileInputStream and FileOutputStream (§19.4.1).
To read and write primitive values and strings using DataInputStream/DataOutputStream (§19.4.3).
To use Readers and Writers.
How is I/O Handled in Java?
A File object encapsulates the properties of a file or a path, but does not
contain the methods for reading/writing data from/to a file. In order to
perform I/O, you need to create objects using appropriate Java I/O classes.
Scanner input = new Scanner(new File("temp.txt"));
System.out.println(input.nextLine());
[Figure: the program reads from a file through an input object created from an input class (the input stream, e.g. 01011…1001) and writes to a file through an output object created from an output class (the output stream, e.g. 11001…1011).]
Formatter output = new Formatter("temp.txt");
output.format("%s", "Java 101");
output.close();
Text File vs. Binary File

Data stored in a text file are represented in human-readable
form. Data stored in a binary file are represented in binary form.
Binary files are not meant to be read by people; they are designed to be
read by programs. For example, the Java source programs are
stored in text files and can be read by a text editor, but the Java
classes are stored in binary files and are read by the JVM. The
advantage of binary files is that they are more efficient to
process than text files.

Although it is not technically precise, you can imagine that a
text file consists of a sequence of characters and a binary file
consists of a sequence of bits. For example, the decimal integer
199 is stored as the sequence of three characters '1', '9', '9'
in a text file, and the same integer is stored as the byte-type
value C7 in a binary file, because decimal 199 equals hex C7.
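The 199 example can be checked in a few lines. The sketch below is only an illustration: the class name TextVsBinaryDemo and its helper methods are hypothetical, and the bytes are kept in memory instead of being written to a file.

```java
import java.nio.charset.StandardCharsets;

class TextVsBinaryDemo {
    // Text form: one byte per decimal digit character.
    static byte[] asText(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    // Binary form: the low-order eight bits of the value as a single raw byte.
    static byte[] asBinary(int value) {
        return new byte[] {(byte) value};
    }

    // Formats the bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("text:   " + hex(asText(199)));   // 31 39 39
        System.out.println("binary: " + hex(asBinary(199))); // C7
    }
}
```

The text form needs three bytes and the binary form needs one, which is the size difference the paragraph above describes.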
Binary I/O
Text I/O requires encoding and decoding. The JVM converts Unicode
to a file-specific encoding when writing a character and converts
a file-specific encoding to Unicode when reading a character.
Binary I/O does not require conversions. When you write a byte to a
file, the original byte is copied into the file. When you read a
byte from a file, the exact byte in the file is returned.
[Figure: (a) Text I/O program: the characters of "199" are encoded, and the encoding of each character is stored in the file as the bytes 0x31 0x39 0x39 (00110001 00111001 00111001). (b) Binary I/O program: the value 199 is written as one byte, and the same byte is stored in the file: 0xC7 (11000111).]
A low-level output stream receives bytes and writes bytes to an
output device.
A high-level filter output stream receives general-format data and
writes bytes to a low-level output stream or to another filter
output stream.
A writer is similar to a filter output stream but is specialized
for writing Java strings in units of Unicode characters.
I/O in Java is built on streams.
Input streams read data.
Output streams write data.
Filter streams can be chained to either an input stream or an
output stream. Filters can modify the data as it's read or written
(e.g. encrypt or compress).
Readers and writers can be chained to input and output streams to
allow programs to read and write text (that is, characters) rather
than bytes.
Binary I/O Classes
Object
  InputStream
    FileInputStream
    FilterInputStream
      DataInputStream
      BufferedInputStream
    ObjectInputStream
  OutputStream
    FileOutputStream
    FilterOutputStream
      DataOutputStream
      BufferedOutputStream
      PrintStream
    ObjectOutputStream
Java's basic output class is java.io.OutputStream:
public abstract class OutputStream
Subclasses:
– FileOutputStream
– TelnetOutputStream
– ByteArrayOutputStream
Java's basic input class is java.io.InputStream:
public abstract class InputStream
Subclasses:
– FileInputStream
– TelnetInputStream
– ByteArrayInputStream
InputStream
The value returned is a byte as an int type.
java.io.InputStream
+read(): int
    Reads the next byte of data from the input stream. The byte is returned as
    an int value in the range 0 to 255. If no byte is available because the end
    of the stream has been reached, the value -1 is returned.
+read(b: byte[]): int
    Reads up to b.length bytes into array b from the input stream and returns
    the actual number of bytes read. Returns -1 at the end of the stream.
+read(b: byte[], off: int, len: int): int
    Reads up to len bytes from the input stream and stores them into b[off],
    b[off+1], …, b[off+len-1]. The actual number of bytes read is returned.
    Returns -1 at the end of the stream.
+available(): int
    Returns an estimate of the number of bytes that can be read from the input
    stream without blocking.
+close(): void
    Closes this input stream and releases any system resources associated with
    the stream.
+skip(n: long): long
    Skips over and discards n bytes of data from this input stream. The actual
    number of bytes skipped is returned.
+markSupported(): boolean
    Tests if this input stream supports the mark and reset methods.
+mark(readlimit: int): void
    Marks the current position in this input stream.
+reset(): void
    Repositions this stream to the position at the time the mark method was
    last called on this input stream.
a) read() method example:
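The code for this slide did not survive in the transcript, so what follows is only a minimal sketch of a typical read() loop, not the original example. The class name ReadByteDemo and the method sumBytes are hypothetical, and a ByteArrayInputStream stands in for a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadByteDemo {
    // Reads the stream one byte at a time until read() reports
    // end of stream (-1) and returns the sum of the byte values.
    static long sumBytes(InputStream in) throws IOException {
        long sum = 0;
        int b; // int, not byte, so that -1 can be told apart from 0xFF
        while ((b = in.read()) != -1) {
            sum += b; // b is already in the range 0 to 255
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, (byte) 0xFF};
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println(sumBytes(in)); // 1 + 2 + 255 = 258
        }
    }
}
```

Storing the result of read() in an int is the important detail: casting it to byte before the comparison would make the byte 0xFF look like the end-of-stream marker.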
b) read(byte[] input, int offset, int length) example:
A single call may return fewer bytes than requested, and all 1,024 bytes
might never arrive (as opposed to not being immediately available).
The return value of read( ) must therefore be tested before it is added to
the number of bytes read. For example:
int bytesRead = 0;
int bytesToRead = 1024;
byte[] input = new byte[bytesToRead];
while (bytesRead < bytesToRead) {
  // Ask only for the bytes that are still missing.
  int result = in.read(input, bytesRead, bytesToRead - bytesRead);
  if (result == -1) break; // end of stream
  bytesRead += result;
}
OutputStream
The value is a byte as an int type.
java.io.OutputStream
+write(b: int): void
    Writes the specified byte to this output stream. The parameter b is an int
    value; (byte)b is written to the output stream.
+write(b: byte[]): void
    Writes all the bytes in array b to the output stream.
+write(b: byte[], off: int, len: int): void
    Writes b[off], b[off+1], …, b[off+len-1] into the output stream.
+close(): void
    Closes this output stream and releases any system resources associated
    with the stream.
+flush(): void
    Flushes this output stream and forces any buffered output bytes to be
    written out.
write(int b) method example
For example, the character generator protocol defines a
server that sends out ASCII text. The output looks like this:
Since ASCII is a 7-bit character set, each character is
sent as a single byte. Consequently, this protocol is
straightforward to implement using the basic write( )
methods.
public static void generateCharacters(OutputStream out)
    throws IOException {
  int firstPrintableCharacter = 33;
  int numberOfPrintableCharacters = 94;
  int numberOfCharactersPerLine = 72;
  int start = firstPrintableCharacter;
  while (true) { /* infinite loop */
    for (int i = start; i < start + numberOfCharactersPerLine; i++) {
      // Wrap around so every value stays in the printable range 33..126.
      out.write(((i - firstPrintableCharacter) % numberOfPrintableCharacters)
          + firstPrintableCharacter);
    }
    out.write('\r'); // carriage return
    out.write('\n'); // linefeed
    // Each line starts one character later than the previous one.
    start = ((start + 1) - firstPrintableCharacter)
        % numberOfPrintableCharacters + firstPrintableCharacter;
  }
}
Streams can be buffered in software and in hardware.
If one side waits for a response while its request is still sitting in
a buffer, the two sides are deadlocked. The flush() method breaks the
deadlock by forcing the buffered data to be sent.
It's important to flush your streams.
When a program is done with a stream, it should close the stream by
invoking its close( ) method. This releases any resources associated
with the stream.
Streams should be flushed immediately before they are closed.
Writing to a closed output stream will throw an IOException.
Data can get lost if you don't flush your streams.
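The effect of flush() can be observed with an in-memory sink. This is a minimal sketch under stated assumptions: the class FlushDemo and the method sizesAroundFlush are hypothetical names, and a ByteArrayOutputStream plays the role of the device behind the buffer.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {
    // Returns {bytes that reached the sink before flush(), bytes after flush()}.
    static int[] sizesAroundFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(sink, 1024);
        out.write(new byte[] {10, 20, 30}); // fits in the 1024-byte buffer
        int before = sink.size();           // nothing has been passed on yet
        out.flush();                        // force the buffered bytes out
        int after = sink.size();
        out.close();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
        // prints "before flush: 0, after flush: 3"
    }
}
```

Until flush() is called, the three bytes exist only in the buffer, which is why a peer waiting for them would wait forever.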
FileInputStream/FileOutputStream
FileInputStream/FileOutputStream associates a binary input/output
stream with an external file. All the methods in
FileInputStream/FileOutputStream are inherited from their
superclasses.
FileInputStream
To construct a FileInputStream, use the following
constructors:
public FileInputStream(String filename)
public FileInputStream(File file)
A java.io.FileNotFoundException would occur if you attempt to
create a FileInputStream with a nonexistent file.
FileOutputStream
To construct a FileOutputStream, use the following constructors:
public FileOutputStream(String filename)
public FileOutputStream(File file)
public FileOutputStream(String filename, boolean append)
public FileOutputStream(File file, boolean append)
If the file does not exist, a new file would be created. If the file already
exists, the first two constructors would delete the current contents in
the file. To retain the current content and append new data into the
file, use the last two constructors by passing true to the append
parameter.
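The difference between the truncating and the appending constructors can be sketched as follows. The class FileStreamDemo and the method writeAppendRead are hypothetical names chosen for this illustration; the file is a temporary file so that the sketch does not depend on any existing path.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class FileStreamDemo {
    static int[] writeAppendRead(File file) throws IOException {
        // Creates the file, or deletes its current contents, then writes 1..5.
        try (FileOutputStream out = new FileOutputStream(file)) {
            for (int i = 1; i <= 5; i++) out.write(i);
        }
        // append = true keeps the five existing bytes and adds 6..10.
        try (FileOutputStream out = new FileOutputStream(file, true)) {
            for (int i = 6; i <= 10; i++) out.write(i);
        }
        // Read every byte back until read() returns -1.
        int[] values = new int[(int) file.length()];
        try (FileInputStream in = new FileInputStream(file)) {
            int b;
            int n = 0;
            while ((b = in.read()) != -1) values[n++] = b;
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("temp", ".dat");
        file.deleteOnExit();
        for (int v : writeAppendRead(file)) System.out.print(v + " ");
        System.out.println(); // the values 1 through 10 were printed
    }
}
```

If the second stream were opened without the append flag, only the bytes 6 through 10 would remain in the file.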
FilterInputStream/FilterOutputStream
Filter streams are streams that filter bytes for some purpose. The basic byte input
stream provides a read method that can only be used for reading bytes. If you want to
read integers, doubles, or strings, you need a filter class to wrap the byte input stream.
Using a filter class enables you to read integers, doubles, and strings instead of bytes
and characters. FilterInputStream and FilterOutputStream are the base classes for
filtering data. When you need to process primitive numeric types, use DataInputStream
and DataOutputStream to filter bytes.
Filter Streams
The filters come in two versions: the filter streams, and the
readers and writers.
Filter streams work primarily with raw data as bytes.
The readers and writers handle the special case of text in a
variety of encodings such as UTF-8.
DataInputStream/DataOutputStream
DataInputStream reads bytes from the stream (or a socket) and
converts them into appropriate primitive type values or strings.
DataOutputStream converts primitive type values or strings into
bytes and outputs the bytes to the stream (or a socket).
DataInputStream
DataInputStream extends FilterInputStream and implements the
DataInput interface.
Constructor:
+DataInputStream(in: InputStream)
Methods declared in java.io.DataInput:
+readBoolean(): boolean
    Reads a boolean from the input stream.
+readByte(): byte
    Reads a byte from the input stream.
+readChar(): char
    Reads a character from the input stream.
+readFloat(): float
    Reads a float from the input stream.
+readDouble(): double
    Reads a double from the input stream.
+readInt(): int
    Reads an int from the input stream.
+readLong(): long
    Reads a long from the input stream.
+readShort(): short
    Reads a short from the input stream.
+readLine(): String
    Reads a line of characters from input (deprecated in DataInputStream).
+readUTF(): String
    Reads a string in UTF format.
DataOutputStream
DataOutputStream extends FilterOutputStream and implements the
DataOutput interface.
Constructor:
+DataOutputStream(out: OutputStream)
Methods declared in java.io.DataOutput:
+writeBoolean(b: boolean): void
    Writes a boolean to the output stream.
+writeByte(v: int): void
    Writes to the output stream the eight low-order bits of the argument v.
+writeBytes(s: String): void
    Writes the lower byte of the characters in a string to the output stream.
+writeChar(c: int): void
    Writes a character (composed of two bytes) to the output stream.
+writeChars(s: String): void
    Writes every character in the string s to the output stream, in order,
    two bytes per character.
+writeFloat(v: float): void
    Writes a float value to the output stream.
+writeDouble(v: double): void
    Writes a double value to the output stream.
+writeInt(v: int): void
    Writes an int value to the output stream.
+writeLong(v: long): void
    Writes a long value to the output stream.
+writeShort(v: int): void
    Writes a short value (the two low-order bytes of v) to the output stream.
+writeUTF(s: String): void
    Writes two bytes of length information to the output stream, followed by
    the UTF representation of every character in the string s.
Characters and Strings in Binary I/O
A Java char is a 16-bit (two-byte) Unicode value. The writeChar(char c)
method writes the two-byte Unicode value of character c to the output.
The writeChars(String s) method writes the two-byte Unicode value of
each character in the string s to the output.
Why UTF-8? What is UTF-8?
UTF-8 is a coding scheme that allows systems to operate with both
ASCII and Unicode efficiently. Many text files contain only ASCII
characters. Java uses Unicode. The ASCII character set is a subset of
the Unicode character set. Since many applications need only the ASCII
character set, it is a waste to represent a 7-bit ASCII character as a
16-bit Unicode character. The modified UTF-8 scheme used by writeUTF
is an alternative that stores a character using 1, 2, or 3 bytes.
ASCII values (0x01 to 0x7F) are coded in one byte. Unicode values up
to 0x7FF are coded in two bytes. Other Unicode values are coded in
three bytes.
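The one-, two-, and three-byte cases can be checked directly. This is an illustrative sketch: the class Utf8LengthDemo and its methods are hypothetical names, and the three sample characters (A, é, €) were chosen only because they fall into the three ranges described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes writeUTF produces: 2 length bytes plus the encoded characters.
    static int writeUtfLength(String s) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(s);
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(utf8Length("A"));        // 1 (U+0041, ASCII)
        System.out.println(utf8Length("\u00E9"));   // 2 (U+00E9, below 0x7FF)
        System.out.println(utf8Length("\u20AC"));   // 3 (U+20AC)
        System.out.println(writeUtfLength("ABC"));  // 5 (2 length bytes + 3)
    }
}
```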
Using DataInputStream/DataOutputStream
Data streams are used as wrappers on existing input and output
streams to filter data in the original stream. They are created using the
following constructors:
public DataInputStream(InputStream instream)
public DataOutputStream(OutputStream outstream)
The statements given below create data streams. The first statement
creates an input stream for file in.dat; the second statement creates an
output stream for file out.dat.
DataInputStream infile =
new DataInputStream(new FileInputStream("in.dat"));
DataOutputStream outfile =
new DataOutputStream(new FileOutputStream("out.dat"));
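A round trip through both data streams might look like the sketch below. The class DataStreamDemo, its methods, and the sample record ("John", 85.5) are assumptions made for illustration, and in-memory byte arrays replace in.dat and out.dat so that the example runs without any files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataStreamDemo {
    // Writes a name (UTF format) followed by a score (8-byte double).
    static byte[] writeRecord(String name, double score) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(name);
            out.writeDouble(score);
        }
        return sink.toByteArray();
    }

    // Reads the fields back in the same order and format they were written.
    static String readRecord(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            double score = in.readDouble();
            return name + " " + score;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecord("John", 85.5);
        System.out.println(data.length);      // 14 = 2 + 4 (name) + 8 (double)
        System.out.println(readRecord(data)); // John 85.5
    }
}
```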
Order and Format
CAUTION: You have to read the data in the same order and same
format in which they are stored. For example, since names are written
in UTF-8 using writeUTF, you must read names using readUTF.
Checking End of File
TIP: If you keep reading data at the end of a stream, an EOFException
would occur. So how do you check the end of a file? For a file stream
you can use input.available(): input.available() == 0 indicates that
no more bytes can be read from the file. On a network stream,
available() == 0 may only mean that no data has arrived yet, so
catching EOFException is the more reliable test there.
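As an alternative to polling available(), the end of the data can be detected by catching EOFException. The following is a minimal sketch of that approach, not the method used in the course examples; EndOfFileDemo, toBytes, and readAll are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class EndOfFileDemo {
    // Helper that produces test data: each double is written as 8 bytes.
    static byte[] toBytes(double... values) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            for (double v : values) out.writeDouble(v);
        }
        return sink.toByteArray();
    }

    // Reads doubles until readDouble() signals the end of the stream.
    static List<Double> readAll(InputStream source) throws IOException {
        List<Double> values = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(source)) {
            while (true) {
                values.add(in.readDouble());
            }
        } catch (EOFException endOfStream) {
            // Expected: this is how DataInputStream reports the end of the data.
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = toBytes(1.5, 2.5, 3.5);
        System.out.println(readAll(new ByteArrayInputStream(data))); // [1.5, 2.5, 3.5]
    }
}
```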
BufferedInputStream/BufferedOutputStream
Using buffers to speed up I/O
BufferedInputStream/BufferedOutputStream do not contain new
methods. All the methods in BufferedInputStream/BufferedOutputStream
are inherited from the InputStream/OutputStream classes.
Buffered Streams
BufferedOutputStream: stores written data in a buffer until the
buffer is full or the stream is flushed.
A single write of many bytes is almost always much faster
than many small writes that add up to the same thing. This is
especially true of network connections because each TCP
segment or UDP packet carries a finite amount of overhead,
generally about 40 bytes' worth. This means that sending 1
kilobyte of data 1 byte at a time actually requires sending about
41 kilobytes over the wire (1,024 packets of 1 data byte plus
40 bytes of overhead each), whereas sending it all at once only
requires sending a little more than 1K of data.
BufferedInputStream: serves the stream's read() method from an
internal buffer, which is refilled from the underlying stream
only when it runs empty.
Constructing BufferedInputStream/BufferedOutputStream
// Create a BufferedInputStream
public BufferedInputStream(InputStream in)
public BufferedInputStream(InputStream in, int bufferSize)
// Create a BufferedOutputStream
public BufferedOutputStream(OutputStream out)
public BufferedOutputStream(OutputStream out, int bufferSize)
Chaining Filters Together
Filters are connected to streams by their constructors.
Example:
DataOutputStream dout = new DataOutputStream(
    new BufferedOutputStream(
        new FileOutputStream("data.txt")));
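A complete round trip through such a chain could be sketched like this. The class ChainDemo and the method sumRoundTrip are hypothetical, and a temporary file is used in place of data.txt.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class ChainDemo {
    // Writes 1..count as 4-byte ints through Data -> Buffered -> File,
    // then reads them back through the mirror-image input chain.
    static int sumRoundTrip(File file, int count) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            for (int i = 1; i <= count; i++) out.writeInt(i);
        } // closing the outermost stream flushes the buffer and closes the file

        int sum = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            for (int i = 0; i < count; i++) sum += in.readInt();
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("chain", ".dat");
        file.deleteOnExit();
        System.out.println(sumRoundTrip(file, 100)); // 5050
        System.out.println(file.length());           // 400 = 100 ints * 4 bytes
    }
}
```

Only the outermost stream is closed explicitly; closing a filter stream also closes the streams it wraps.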
BufferedOutputStream and BufferedInputStream Methods
The methods for BufferedInputStream are the same as the methods
for InputStream.
The methods for BufferedOutputStream are the same as the methods
for OutputStream.
Readers and Writers
Two abstract superclasses define the basic API for reading
and writing characters:
The java.io.Reader class specifies the API by which
characters are read.
The java.io.Writer class specifies the API by which
characters are written.
Wherever input and output streams use bytes, readers and
writers use Unicode characters. Concrete subclasses of Reader
and Writer allow particular sources to be read and targets to be
written.
Filter readers and writers can be attached to other readers
and writers to provide additional services or interfaces.
Readers and Writers
The most important subclasses of Reader and Writer are
InputStreamReader and OutputStreamWriter.
InputStreamReader contains an underlying input stream
from which it reads raw bytes. It translates these bytes into
Unicode characters according to a specified encoding.
OutputStreamWriter receives Unicode characters from a
running program. It translates those characters into bytes
using a specified encoding and writes the bytes onto an
underlying output stream.
To learn more about java.io classes and subclasses go to
http://docs.oracle.com/javase/6/docs/api/java/io/package-tree.html
Readers and Writers
The Writer class mirrors the java.io.OutputStream class. The
Writer class is never used directly; instead, it is used
polymorphically, through one of its subclasses. It has
five write() methods and the flush() and close() methods.
OutputStreamWriter is a subclass of Writer; it receives
chars and converts them into bytes according to a specified
encoding. Its constructors are:
OutputStreamWriter(OutputStream out, String encoding) throws
UnsupportedEncodingException
OutputStreamWriter(OutputStream out)
Encoding example
OutputStreamWriter w = new OutputStreamWriter(
new FileOutputStream("OdysseyB.txt"), "Cp1253");
w.write(“
");
Cp1253 is Windows Greek encoding
Other than the constructors, OutputStreamWriter has only the usual
Writer methods (which are used exactly as they are for any Writer
class) and one method to return the encoding of the object:
public String getEncoding( )
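How the chosen encoding changes the bytes that reach the underlying stream can be seen in a small sketch. It is an illustration under stated assumptions: the class EncodingDemo and the method encode are hypothetical, and UTF-8 and ISO-8859-1 are used instead of Cp1253 because they are available on every Java platform.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Writes the text through an OutputStreamWriter and returns the bytes
    // that arrived at the underlying output stream.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(sink, charset)) {
            w.write(text);
        } // close() flushes the writer's internal buffer into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u00E9"; // the single character e with acute accent
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // 2
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // 1
    }
}
```

The same character becomes the two bytes C3 A9 in UTF-8 and the single byte E9 in ISO-8859-1, so a reader must use the same encoding as the writer.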
Readers and Writers
InputStreamReader is a subclass of Reader; it reads
bytes and converts them into chars according to a
specified encoding. It has two constructors:
InputStreamReader(InputStream in, String encoding) throws
UnsupportedEncodingException
InputStreamReader(InputStream in)
If no encoding is specified, the default encoding for the platform is
used.
InputStreamReader Example
For example, this method reads an input stream and converts it all
to one Unicode string using the MacCyrillic encoding:
public static String getMacCyrillicString(InputStream in)
    throws IOException {
  InputStreamReader r = new InputStreamReader(in, "MacCyrillic");
  StringBuilder sb = new StringBuilder();
  int c;
  // read() returns one decoded character, or -1 at the end of the stream
  while ((c = r.read()) != -1) sb.append((char) c);
  r.close();
  return sb.toString();
}
Filter Readers and Writers
InputStreamReader and OutputStreamWriter change the interface
from a byte-oriented interface to a character-oriented interface.
Once this is done, additional character-oriented filters can be
layered on top of the reader or writer using the
java.io.FilterReader and java.io.FilterWriter classes. As with
filter streams, there are a variety of classes that perform
specific filtering, including:
BufferedReader
BufferedWriter
LineNumberReader
PushbackReader
PrintWriter
Buffered readers and writers
The BufferedReader and BufferedWriter classes are the
character-based equivalents of the byte-oriented
BufferedInputStream and BufferedOutputStream classes.
BufferedReader and BufferedWriter have the usual methods
associated with readers and writers, like read( ), ready( ), write( ),
and close( ). They each have two constructors that chain the
BufferedReader or BufferedWriter to an underlying reader or
writer and set the size of the buffer. If the size is not set, the
default size of 8,192 characters is used:
public BufferedReader(Reader in, int bufferSize)
public BufferedReader(Reader in)
public BufferedWriter(Writer out)
public BufferedWriter(Writer out, int bufferSize)
BufferedReader example
Chaining a BufferedReader to the InputStreamReader makes the
MacCyrillic method run faster.
public static String getMacCyrillicString(InputStream in)
    throws IOException {
  Reader r = new InputStreamReader(in, "MacCyrillic");
  r = new BufferedReader(r, 1024); // decode in 1,024-character chunks
  StringBuilder sb = new StringBuilder();
  int c;
  while ((c = r.read()) != -1) sb.append((char) c);
  r.close(); // also closes the wrapped InputStreamReader
  return sb.toString();
}
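BufferedReader also adds a readLine() method, which is often more convenient than reading one character at a time. The sketch below shows that variant; it is not part of the original example, the names ReadLinesDemo and readLines are hypothetical, and UTF-8 is used because it is available on every Java platform.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    // Decodes the bytes with the given charset and splits them into lines.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            // readLine() strips the line terminator and returns null at the end
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "alpha\nbeta\r\ngamma".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
        // [alpha, beta, gamma]
    }
}
```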