Zacharewski Bioinformatics Group
Large Object (LOB) Data Insertion and Retrieval
Guidance
Prepared by: Lyle D. Burgoon
Version: 1.0
Date: February 17, 2004
Languages Specified: Java (JDBC)
Comments: All code was written and tested against an Oracle 9i v2.0 database,
using the Oracle JDBC driver for Java 1.4. The methods described here are
specific to Oracle, but the general principles are the same for any database.
Introduction
Storage and retrieval of large objects (LOBs) is critical in the management of biological
data, because raw data is typically captured in electronic formats such as images or
large reports. It is possible to store this data in a BFILE field, where only the path to
the data object is stored in the database, but this does not allow data to be transferred
between sites efficiently. It also requires back-up scripts to identify these paths
whenever the database is backed up. Furthermore, after a system crash each file would
have to be restored to the exact location recorded in the database.
It is far easier to store the data within the database itself as a LOB. LOBs are
cousins of the LONG datatype, and are essentially datatypes of “unlimited” size that
store bytes of data. LOBs come in two flavors: 1) BLOBs, binary large objects such as
images, and 2) CLOBs, character large objects such as raw-text manuscripts,
abstracts, or mark-up files such as HTML and XML.
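The difference between the two flavors can be tried without a database. The following is a minimal sketch, assuming the standard javax.sql.rowset.serial.SerialBlob and SerialClob classes as in-memory stand-ins for LOBs fetched from a table; the class name LobFlavorsSketch and its two helper methods are hypothetical and chosen only for illustration.

```java
import java.sql.Blob;
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialBlob;
import javax.sql.rowset.serial.SerialClob;

// Hypothetical illustration: a BLOB is read as bytes, a CLOB as characters.
class LobFlavorsSketch {

    // Returns the first n bytes of a BLOB; LOB positions start at 1, not 0.
    static byte[] firstBytes(Blob blob, int n) throws SQLException {
        return blob.getBytes(1, n);
    }

    // Returns the first n characters of a CLOB; positions also start at 1.
    static String firstChars(Clob clob, int n) throws SQLException {
        return clob.getSubString(1, n);
    }

    public static void main(String[] args) throws SQLException {
        // In-memory stand-ins for values that would normally come from a ResultSet
        Blob image = new SerialBlob(new byte[] {0x49, 0x49, 0x2A, 0x00});
        Clob markup = new SerialClob("<abstract>LOB storage</abstract>".toCharArray());
        System.out.println(firstBytes(image, 2).length + " bytes read");
        System.out.println("text starts with " + firstChars(markup, 10));
    }
}
```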
Storage of BLOBs in Oracle
First, a table must be created with a field to contain BLOB data (note that BLOB is an
SQL datatype available in most databases, including Oracle). The general algorithm for
inserting a new BLOB value follows:
1. Insert all other values into the table as usual, but use EMPTY_BLOB( ) for the
   BLOB field.
2. Select the newly inserted row for update.
3. Create an Oracle BLOB object.
4. Populate the new BLOB object with a BLOB locator from the ResultSet.
   a. Cast the ResultSet from the “SELECT…FOR UPDATE” to an OracleResultSet.
   b. The BLOB object now contains the locator to the empty BLOB in the
      database, allowing direct access to the database through the BLOB object.
5. Create a FileInputStream using the path of the file of interest (the file that will
   go into the BLOB).
6. Create an OutputStream object using the getBinaryOutputStream( ) method of the
   BLOB object.
7. Initialize an int to the database’s LOB buffer size using the BLOB object’s
   getBufferSize( ) method.
8. Create a byte[ ] of that size, using the int value from step 7.
9. Initialize an int to -1.
   a. This value holds the number of bytes read on each pass and signals, in the
      while loop, when the file is out of bytes.
10. Construct a while loop that reads as many bytes from the FileInputStream as the
    buffer can hold.
11. Write these bytes to the OutputStream.
12. Repeat the reads in step 10 until the file is out of bytes.
13. Close both streams so that any buffered bytes are written, then commit the
    transaction.
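The read/write loop in steps 7 through 12 does not depend on Oracle and can be tried without a database. The following is a minimal sketch, assuming in-memory streams stand in for the file and for the BLOB's output stream; the class name LobCopySketch, the method copy, and the buffer size of 4 are illustrative choices, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the buffered copy loop.
class LobCopySketch {

    // Copies everything from in to out using a buffer of bufferSize bytes
    // and returns the total number of bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];          // step 8
        int length = -1;                               // step 9
        long total = 0;
        while ((length = in.read(buffer)) != -1) {     // step 10: read() returns -1 at end of input
            out.write(buffer, 0, length);              // step 11: write only the bytes actually read
            total += length;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4);
        System.out.println(copied + " bytes copied");  // prints "10 bytes copied"
    }
}
```

Writing buffer, 0, length rather than the whole buffer matters on the last pass, when the read usually fills only part of the array.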
Code Sample:
Notes:
psPathologyImage2 is a PreparedStatement
rsPathologyImage2 is a ResultSet
BLOB is an Oracle BLOB (from oracle.sql) not an SQL Blob (from java.sql)
import java.sql.*;
import java.io.*;
import oracle.sql.*;
import oracle.jdbc.*;
. . .
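// Execute the SELECT ... FOR UPDATE that returns the empty BLOB
rsPathologyImage2 = psPathologyImage2.executeQuery();
BLOB orBLOB;
if (rsPathologyImage2.next()) {
    // Get the locator for the empty BLOB in the database
    orBLOB = ((OracleResultSet) rsPathologyImage2).getBLOB(1);
    File pathologyImageTIFF = new File(pathologyImage);
    FileInputStream fis = new FileInputStream(pathologyImageTIFF);
    OutputStream os = orBLOB.getBinaryOutputStream();
    // Size the buffer to match the database's LOB buffer
    int size = orBLOB.getBufferSize();
    byte[] buffer = new byte[size];
    int length = -1;
    // read() returns -1 once the file is out of bytes
    while ((length = fis.read(buffer)) != -1) {
        os.write(buffer, 0, length);
    }
    // Close the streams so that any buffered bytes reach the database
    os.close();
    fis.close();
}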
. . .
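The sample above begins at the point where the SELECT…FOR UPDATE is executed. Steps 1 and 2 of the algorithm might look like the following sketch. The table PATHOLOGY_IMAGE and its columns IMAGE_ID and IMAGE_DATA are hypothetical names chosen for illustration, not the dbZach schema. The sketch also assumes that auto-commit is switched off, so that the row lock taken by FOR UPDATE is still held while the BLOB is written.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of steps 1 and 2. Table and column names are hypothetical.
class EmptyBlobInsertSketch {

    // Step 1: insert the ordinary columns plus an empty BLOB placeholder.
    static final String INSERT_SQL =
        "INSERT INTO PATHOLOGY_IMAGE (IMAGE_ID, IMAGE_DATA) VALUES (?, EMPTY_BLOB())";

    // Step 2: re-select the new row and lock it so the BLOB can be written.
    static final String SELECT_SQL =
        "SELECT IMAGE_DATA FROM PATHOLOGY_IMAGE WHERE IMAGE_ID = ? FOR UPDATE";

    // Inserts the placeholder row and returns the prepared SELECT ... FOR UPDATE.
    // The caller runs executeQuery() on it, writes the BLOB, and then commits.
    static PreparedStatement insertPlaceholder(Connection conn, long imageId)
            throws SQLException {
        conn.setAutoCommit(false);  // assumption: keep the row lock until commit()
        PreparedStatement insert = conn.prepareStatement(INSERT_SQL);
        try {
            insert.setLong(1, imageId);
            insert.executeUpdate();
        } finally {
            insert.close();
        }
        PreparedStatement select = conn.prepareStatement(SELECT_SQL);
        select.setLong(1, imageId);
        return select;
    }

    public static void main(String[] args) {
        // No database is contacted here; the statements are only printed.
        System.out.println(INSERT_SQL);
        System.out.println(SELECT_SQL);
    }
}
```

Under these assumptions, the statement returned by insertPlaceholder plays the role of psPathologyImage2, and conn.commit() would be called once the streams have been closed.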
Reading BLOB values out of the database
The dbZach database makes extensive use of BLOB values, especially in the management of
microarray, real-time PCR, and pathology data. Reading BLOB data back out of the
database is far easier than inserting it.
The algorithm for reading BLOB values out of the database follows:
1. Perform a regular query to get the data out of the database, so that a
   ResultSet is constructed from the execution of a Statement or PreparedStatement.
   a. This ResultSet must be cast to an OracleResultSet to use the Oracle-specific
      getBLOB( ) method.
2. Create a BLOB object.
3. Create an InputStream object (the sample below declares it but reads the BLOB
   with getBytes( ) instead).
4. Create an int that will hold the BLOB’s size.
5. Create a byte[ ] that will hold the bytes from the BLOB object.
6. Use a while loop to step through the ResultSet one row at a time (regular
   ResultSet handling).
7. Within the while loop, assign the BLOB from the OracleResultSet to the BLOB
   object.
8. Set the int to the size of the BLOB using the BLOB object’s length( ) method
   (length( ) returns a long, so it must be cast to int).
9. Populate the byte[ ] using the BLOB object’s getBytes(long, int) method, where the
   first argument is the position at which to start (typically 1, because LOB
   positions are 1-based) and the second is the number of bytes to read (the int
   created in step 4 and set in step 8).
10. Create an OutputStream object that will populate a file through the
    FileOutputStream class (the constructor takes the file path as a parameter).
11. Create a for loop that steps through the byte[ ] one byte at a time, writing each
    byte out to the file through the write( bytes[i] ) method, and close the
    OutputStream when the loop finishes.
Code Sample:
import oracle.jdbc.*;
import oracle.sql.*;
import java.sql.*;
import java.io.*;
. . .
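// Run the query and cast the result to use Oracle's getBLOB()
OracleResultSet orsPathologyImage =
    (OracleResultSet) psPathologyImage.executeQuery();
BLOB blob;
InputStream is;   // declared per step 3; unused because getBytes() is used below
int length;
byte[] bytes;
while (orsPathologyImage.next()) {
    blob = orsPathologyImage.getBLOB(1);
    // length() returns a long; the cast assumes the BLOB is smaller than 2 GB
    length = (int) blob.length();
    // LOB positions are 1-based
    bytes = blob.getBytes(1, length);
    // Every row is written to the same path, so later rows overwrite earlier ones
    OutputStream fos = new BufferedOutputStream(new FileOutputStream("C:\\foo.txt"));
    for (int i = 0; i < bytes.length; i++) {
        fos.write(bytes[i]);
    }
    // Close the stream to flush the buffer and release the file
    fos.close();
}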
. . .
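Reading the whole BLOB into a byte[ ] with getBytes( ) is simple, but it holds the entire object in memory. As an alternative sketch, the standard java.sql.Blob interface also offers getBinaryStream( ), whose contents can be copied to a file in chunks. The example below assumes a Java 7 or later runtime (for java.nio.file.Files) and uses javax.sql.rowset.serial.SerialBlob as an in-memory stand-in for a BLOB fetched from a ResultSet; the class name BlobToFileSketch and the method writeBlobToFile are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.sql.Blob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialBlob;

// Hypothetical streaming alternative to getBytes().
class BlobToFileSketch {

    // Streams the BLOB's contents to target, replacing any existing file,
    // and returns the number of bytes written.
    static long writeBlobToFile(Blob blob, Path target) throws SQLException, IOException {
        try (InputStream in = blob.getBinaryStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for a BLOB that would normally come from a ResultSet
        Blob blob = new SerialBlob(new byte[] {73, 73, 42, 0});
        Path target = Files.createTempFile("blob-sketch", ".bin");
        long written = writeBlobToFile(blob, target);
        System.out.println(written + " bytes written to " + target);
        Files.delete(target);
    }
}
```

When a query returns several rows, passing a different target path for each row avoids one image overwriting another.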