MULTIMEDIA DATABASES
MULTI TERABYTE PERFORMANCE
Jim Steiner, Senior Director, Oracle Corporation
Joseph Mauro, Principal Product Manager, Oracle Corporation
INTRODUCTION
The use of multimedia – images, audio, and video – has exploded over the last few years as internet,
intranet and other web-based applications have become the norm. Media has also emerged in many
mainstream business applications such as in the financial industry. This multimedia explosion is
driven by several factors: the communications power of the medium, the commoditization of media
capture devices, the standardization of data formats, and the emergence of media capable web page
authoring tools and browsers.
Certainly multimedia adds value but it also brings challenges. Multimedia objects are large,
unstructured and complex in nature, very different from traditional business data. Mainstream
applications are now being faced with managing multi terabyte media stores. Consequently both
storage and bandwidth costs can be prohibitive. Multimedia objects such as video can be very large in
size, and can often overwhelm storage capacities of small to medium-sized systems. Additionally,
substantial bandwidth is required to deliver these large objects to a client in real time. Multimedia
also comes in a wide array of formats that are continually evolving. Keeping up with new, standard
formats is a challenge in its own right. Also, applications need to manage multimedia data in an
integral fashion with “business” data – for example, a picture of a car is associated with information
on its model and price.
Oracle interMedia meets these challenges by adding support that enables Oracle to manage and
deliver image, audio, and video data in an integrated fashion with other enterprise data. interMedia
provides the means to add audio, image, and video columns or objects to existing database tables,
insert and retrieve multimedia data, perform image processing on popular image formats, and
convert (transcode) between image formats. Oracle interMedia adds the native data
type services, metadata management facilities and operators to support a number of flexible storage
options to access media data including internet URLs, operating system files, and specialized servers
for streaming media. With Oracle10g, interMedia’s new features underscore Oracle’s continued
strategic commitment to multimedia data management.
Applications that make extensive use of multimedia face the same challenges as most business
applications: performance, scalability, and high availability at the lowest possible cost. Multimedia
applications often have greater storage, distribution, security, and demand-peak requirements.
Oracle 10g Enterprise Grid Computing benefits multimedia applications through dynamic
provisioning of resources.
This paper examines several Oracle interMedia customer applications including a medical information
firm’s online web publishing service, a state’s road inventory system, a central bank’s customer
records handling system, and a prestigious museum’s inventory system. All have incorporated
multimedia successfully in their mainstream business applications.
WHAT IS ORACLE INTERMEDIA?
The foundation for Oracle interMedia is the Oracle extensibility framework, a set of unique services
that enable application developers to model complex logic and extend the core database services,
including query optimization, indexing, type system, and SQL, to meet the specific needs of an
application. Oracle uses these unique services to provide a consistent architecture for the rich data
types supported by interMedia. interMedia uses object types, similar to Java or C++ classes, to
describe multimedia data. These object types are called ORDImage, ORDAudio, ORDVideo, and
ORDDoc and have attributes and methods associated with them. With Oracle, the data type services
available in these object types are also available through a relational interface and a SQL multimedia
ISO standard interface, so application developers can now choose to store media data in BLOB
columns and use the full range of interMedia functionality through PL/SQL and Java API calls.
Oracle interMedia supports multimedia storage, retrieval, and management of:

Binary large objects (BLOBs) stored locally in Oracle and containing audio, image, or video
data

File-based large objects, or BFILEs, stored locally in operating system-specific file systems
and containing audio, image, or video data

URLs that point to audio, image, or video data stored on any HTTP server.

Streaming audio or video data retrieved and delivered via specialized media streaming
servers, such as those from RealNetworks and Microsoft.

Any user-defined sources on other specialty servers.

interMedia objects are tightly integrated with SQL and the Oracle database engine, and are
easily accessible through various thick and thin client interfaces including Java.
In addition, Oracle offers a content-based retrieval feature so that images stored in the database can
be easily searched using image matching technology. Given an image, content-based retrieval
provides the ability to search images stored in a database table for other images using specific visual
attributes such as color, texture, shape, and location. Examples of database applications where this
content-based retrieval is useful span a range from business trademarks, copyrights, and logos to
artistic works in art galleries and museums.
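As a rough illustration of matching on one visual attribute, the sketch below ranks candidate images by color similarity using a coarse color histogram. It is a generic technique and not Oracle's matching algorithm; the function names (`color_signature`, `distance`, `rank_by_similarity`) and the pixel-list representation are assumptions made for this example, and texture, shape, and location are not modeled.

```python
from collections import Counter
from math import sqrt

Pixel = tuple[int, int, int]  # (red, green, blue), each 0..255


def color_signature(pixels: list[Pixel], levels: int = 4) -> dict[Pixel, float]:
    """Quantize each RGB channel into `levels` buckets and return a normalized histogram."""
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


def distance(a: dict[Pixel, float], b: dict[Pixel, float]) -> float:
    """Euclidean distance between two histograms; 0.0 means identical color mix."""
    buckets = set(a) | set(b)
    return sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in buckets))


def rank_by_similarity(
    query: list[Pixel], candidates: dict[str, list[Pixel]]
) -> list[tuple[str, float]]:
    """Return (name, distance) pairs, most similar candidate first."""
    q = color_signature(query)
    scored = [(name, distance(q, color_signature(px))) for name, px in candidates.items()]
    return sorted(scored, key=lambda item: item[1])
```

A production system would combine several such attribute scores with weights; this sketch only shows the ranking idea for color.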
Various application development and web authoring tools are also tightly integrated with Oracle
interMedia so that it is much easier to develop and implement media-rich applications. Java
developers can use Oracle JDeveloper, since the Business Components for Java (BC4J) framework
now automatically recognizes the interMedia object types and integrates seamlessly with interMedia at
multiple levels. This approach provides great flexibility as well as granting all the productivity benefits
of BC4J. Web developers can use Oracle Portal for Internet and intranet delivery. Since interMedia
objects, including source location information, are stored in Oracle tables, they can be included in the
types of data available to Oracle Portal components. Additionally, third-party web authoring tools
such as Macromedia UltraDev, the leading content creation environment for the design of web sites,
are now integrated for the dynamic display of rich media objects. This integration can keep a
web site current while reducing site maintenance.
In its strategic commitment to multimedia, Oracle10g interMedia further commits to standards and
adds several new features. In the standards area, interMedia now supports the SQL/MultiMedia Still
Image standard. This makes it possible for imaging applications to be portable across various
vendors’ databases. interMedia is now integrated with the latest version of the Java Advanced Imaging
package. This provides support for more image processing and object methods, including arbitrary
image rotate, flip and mirror, gamma correction, contrast enhancement, quantization methods, page
selection, and alpha channel. Oracle10g interMedia also has new media format support including
MPEG-2, MPEG-4, and Microsoft ASF. Oracle10g also brings a database plug-in for both the RealNetworks
Helix Server and the Microsoft Windows Media Format Server. This ensures the ability to stream the
most popular streaming media formats. Additionally, interMedia now adds support for wireless
devices in the middle tier. This allows for media to be adapted to the wireless network bandwidth
and client device characteristics.
ORACLE INTERMEDIA AT WORK
Now let’s look at some examples of Oracle interMedia at work. These customer applications
demonstrate an array of media management tasks including:

Media-rich, mission critical production application support;

Global distributed media access - anytime/anywhere;

Amortization of media objects across multiple applications;

Media-rich, web-based workflow in support of business processes;

Storage and delivery of a diversity of media types from media repositories larger than 1 TB.
Note that these tasks are not specific to a particular application but are common to many types of
applications. They also demonstrate how the Oracle server provides for the management of
multimedia content and helps in overcoming the challenges listed previously.
BioMed Central
BioMed Central is a publisher of original, peer-reviewed scientific research that includes 70 online
journals. Many world class research institutions use BioMed Central’s publications -- Dana-Farber
Cancer Institute, Harvard University, National Institutes of Health, and the World Health
Organization.
The fundamental objective of BioMed Central is to change the model of scientific publishing. The
key challenge to meet this goal is the development of easy to use web tools that allow scientists to
personally perform publication tasks that traditionally introduce significant administrative costs.
During the past two years, BioMed Central has built a fully web-based system that covers the
publication cycle of manuscript submission, peer review, editorial acceptance/rejection and final
online delivery of the completed material. To do this, BioMed Central addressed many technical
challenges including:

Handling a wide variety of formats that carry document and media data;

Mechanisms that allow multiple roles such as author, editor, and publisher to perform their
functions in a distributed fashion;

Security so that malicious access and modification cannot occur;

Management of a considerable amount of data with multiple versions;

Workflow that moves the data through the publication process;

Web access by all of the roles in the system including consumers of the published materials.
When manuscript documents, figures, and supplementary files are submitted, the original files and the
respective converted PDFs are stored in the database using Oracle interMedia. When new versions of files
are uploaded, and when new versions of the manuscript as a whole are submitted to the editors, each change is
recorded relationally in the Oracle database, while the associated files are stored directly in the database
using interMedia. Oracle 9i handles the various media formats and renders them into GIF and JPG for use on
the web.
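[Figure: BioMed Central publishing workflow. Over the web, the distributed scientific community (authors, peer reviewers, editors, and readership) takes part in an HTML workflow covering the submission of documents, graphs, and images in multiple versions, all held in Oracle with interMedia.]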
Next, editors and invited reviewers receive emails containing web links that allow them to download
and view PDFs and original files as needed. Web based tools expedite the peer review process. Peer
reviewers simply “Agree” or “Decline” online to examine a submitted manuscript. Manuscript PDFs
are automatically sent to the reviewers who have accepted. When they finish, peer reviewers submit
reports via a structured online form. The final accepted manuscript is stored in the database as
searchable XML, with the figures, figure thumbnails, PDFs, and other files all being delivered from the Oracle 9i
database via interMedia.
The database server is currently a dual CPU Sun E420, with 2 gigabytes of RAM running Solaris 7.5.
The system software is on internal mirrored drives, and the data files are stored on approximately 300
gigabytes of external RAID devices.
The Web servers are 1U Dell PowerApp and PowerEdge machines, containing single or dual Intel
processors and 512 megabytes of RAM, and running Windows 2000/IIS/ASP. Currently, half a dozen
web servers run a large number of different web sites using the common backend Oracle database.
Intelligent load sharing between pairs of front end web servers is used to give high availability.
The data architecture is key to the BioMed Central system and consists of a number of Oracle-resident
relational tables. The table hierarchy defines the journals managed by BioMed Central as well as
information on all types of system users and the roles they play in the publishing process. The system
also maintains information on the composition and versions of each manuscript including web
compatible format article files delivered via Oracle interMedia services. The system also maintains the
workflow status of a manuscript through the phases of submission, peer review, acceptance, and
rejection. Finally, an access log is kept for institutional reporting on system utilization.
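The workflow status tracking described above can be pictured as a small state machine. The sketch below is illustrative only: the phase names come from the text, but the allowed transitions, the `Manuscript` class, and its `advance` method are assumptions, not BioMed Central's actual schema or rules.

```python
from enum import Enum


class Phase(Enum):
    SUBMISSION = "submission"
    PEER_REVIEW = "peer review"
    ACCEPTANCE = "acceptance"
    REJECTION = "rejection"


# Assumed transition rules: the paper names the phases but not how they connect.
ALLOWED = {
    Phase.SUBMISSION: {Phase.PEER_REVIEW, Phase.REJECTION},
    Phase.PEER_REVIEW: {Phase.ACCEPTANCE, Phase.REJECTION},
    Phase.ACCEPTANCE: set(),
    Phase.REJECTION: set(),
}


class Manuscript:
    """Hypothetical record of one manuscript's workflow status and history."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.phase = Phase.SUBMISSION
        self.history = [Phase.SUBMISSION]

    def advance(self, new_phase: Phase) -> None:
        """Move to `new_phase`, rejecting transitions the rules do not allow."""
        if new_phase not in ALLOWED[self.phase]:
            raise ValueError(
                f"cannot move from {self.phase.value} to {new_phase.value}"
            )
        self.phase = new_phase
        self.history.append(new_phase)
```

Keeping the history alongside the current phase mirrors the idea of recording workflow status relationally while the files themselves live elsewhere in the database.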
New Mexico Department of Transportation
The State of New Mexico Road Feature Inventory (RFI) application maintains information on the
entire roadway system throughout the state, including pictures taken along the highways every 50 feet.
The application is designed to enable the state to make better road / asset maintenance decisions and
to help the state comply with the federal mandate which requires state Departments of
Transportation to provide detailed inventories of assets in order to receive federal funds. This
application is a clear example of a multi-terabyte multimedia database that performs well.
Patrol personnel are equipped with a GPS-enabled PDA and a digital camera. They are able to update,
insert, and delete assets on their PDA, synchronizing with the database when they return to the office.
Because so much of New Mexico is rural, wireless access to the application is not an option. An
image is associated with each new asset added to the database.
The RFI customers are the six districts around the state that are responsible for the everyday
maintenance of the State of New Mexico’s roadways. The RFI application allows them to prioritize
their maintenance. The RFI application interfaces to the Highway Maintenance Management System
(HMMS) that the Districts use to enter Daily Work Reports and to track stockpiles and
similar information. The RFI application will tie into the HMMS so that any maintenance done
on the road will be reflected in updates to the RFI database, keeping the inventory current.
The database server consists of Oracle 9.0.1.4 running on Windows 2000 on a Compaq ProLiant
with 8 GB of RAM, 500 GB of local disk storage, and 3 TB of IBM Shark storage. The application
server consists of Oracle 9iAS 9.0.1.2.2a running on Windows 2000 on a Compaq ProLiant with 8 GB
of RAM and 500 GB of local disk storage.
The RFI application makes use of an array of development languages and tools including: Java,
XML, HTML, JavaScript, PL/SQL, Oracle Portal Forms and Reports, Discoverer, and JDeveloper.
It also makes use of some key database technologies including partitioning, RMAN, materialized
views, web cache, and virtual private database.
The RFI application data is made up of traditional assets and media assets. There are approximately
one million traditional (non-media) assets, which account for approximately 100 GB of storage. There
are approximately five million media assets, typically JPEG images, which account for approximately 5 TB of
storage.
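A quick back-of-the-envelope check of these figures shows the average size per asset and how heavily media dominates the store. The sketch assumes decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and treats the rounded counts in the text as exact.

```python
# Figures quoted in the text, treated as exact for this estimate.
TRADITIONAL_ASSETS = 1_000_000
TRADITIONAL_BYTES = 100 * 10**9   # about 100 GB, decimal units assumed
MEDIA_ASSETS = 5_000_000
MEDIA_BYTES = 5 * 10**12          # about 5 TB, decimal units assumed


def average_bytes(total_bytes: int, count: int) -> float:
    """Average storage per asset in bytes."""
    return total_bytes / count


avg_traditional_kb = average_bytes(TRADITIONAL_BYTES, TRADITIONAL_ASSETS) / 1000
avg_media_mb = average_bytes(MEDIA_BYTES, MEDIA_ASSETS) / 10**6
media_share = MEDIA_BYTES / (MEDIA_BYTES + TRADITIONAL_BYTES)

print(f"average traditional asset: {avg_traditional_kb:.0f} KB")
print(f"average media asset:       {avg_media_mb:.1f} MB")
print(f"media share of storage:    {media_share:.1%}")
```

Under these assumptions a media asset averages about 1 MB, and media accounts for roughly 98 percent of the stored bytes.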
Despite the enormous size of the database, a single DBA designed, built, deployed, and maintains the
database. This is possible because for Oracle, data is data, whether it is a four byte integer or 5 TB of
digital images.
There are two main parts to the application: reporting and the virtual drive.
Reporting is done using Portal 3.0.9. Currently, Oracle Reports Builder and Discoverer are used to
generate these reports. JSPs are being written in JDeveloper to generate reports that have the
interMedia images embedded.
The virtual drive is probably the most widely used piece of the application. A user can choose a
Route, a direction, and a start and end mile marker and take a “virtual drive” of the Route. Each
route has an image taken every 50 feet. The image has an associated mile point to the route. The
user can then press a play button which will, essentially, drive them down the route that they have
chosen, displaying the images one after another.
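The frame selection behind a virtual drive can be sketched as follows. This is a minimal illustration, not the RFI application's code: the function name `virtual_drive_mile_points` is hypothetical, and it assumes images sit at exact 50-foot multiples measured from the start of the route.

```python
import math

FEET_PER_MILE = 5280
IMAGE_SPACING_FEET = 50  # one image every 50 feet, per the text


def virtual_drive_mile_points(
    start_mile: float, end_mile: float, spacing_feet: int = IMAGE_SPACING_FEET
) -> list[float]:
    """Return the mile points of the images to show, in driving order.

    Assumes images were captured at exact multiples of `spacing_feet`
    from the route origin. If start_mile > end_mile, the drive runs in
    the opposite direction and the frames are returned in reverse.
    """
    low_ft = min(start_mile, end_mile) * FEET_PER_MILE
    high_ft = max(start_mile, end_mile) * FEET_PER_MILE
    first = math.ceil(low_ft / spacing_feet) * spacing_feet
    last = math.floor(high_ft / spacing_feet) * spacing_feet
    offsets = list(range(first, last + 1, spacing_feet))
    if start_mile > end_mile:
        offsets.reverse()
    return [feet / FEET_PER_MILE for feet in offsets]


frames = virtual_drive_mile_points(10, 11)
print(f"{len(frames)} frames between mile 10 and mile 11")
```

Each returned mile point would then be used to look up the image stored for that location, and pressing play simply steps through the list.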
The virtual drive is intended for use by the whole highway department. For instance, the
maintenance bureau can use this data to pinpoint an area of highway that needs maintenance or
lacks appropriate signage. Legal staff can use the data to verify whether the route was safely signed or whether a
guardrail was where it needed to be.
THE UNITED STATES CENTRAL BANKING SYSTEM
United States Central Bank branches are located around the
country, and each branch in turn services various member (commercial) banks constituting the system
as a whole. The Central Banking System acts as a lender of money to these commercial banks, and as
a clearinghouse for checks. In its clearinghouse capacity, the Central Bank often faces handling of
problem checks including checks that cannot be cleared for payment or those that have been
damaged. The process for handling these checks involves:

A Central Bank branch receives faxes of problem checks and cover letters containing
metadata from member commercial banks.

The Central Bank performs optical character recognition (OCR) on the cover letters at a fax
receiving server. The fax image of the check and the now text-based metadata are entered
into the database for efficient indexing. The bitmap image of the faxed check can be used
to reproduce a high fidelity image when needed.

By using a web-based application for their queries, investigators can then search the database
tables containing the checks and associated cover letters and view them via a web browser as
required.
[Figure: Problem check handling. Member bank branches fax check images and cover sheets to the Central Bank, where fax and OCR servers turn each submission into a check image plus cover text. These are stored in Oracle alongside documents, images, customer data, and transaction data, and are accessed through web browsers.]
This approach has major benefits. The capture process for problem checks is decoupled from the
resolution process for optimal performance. Additionally, the database keeps a permanent record of
the checks and problem cover sheets for legal purposes and does so in a secure fashion.
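The decoupled capture and resolution steps described above can be sketched with a single table that holds the check image next to its OCR text. This is an illustration under stated assumptions: Python's built-in `sqlite3` stands in for the Oracle database, no interMedia calls are shown, and the table, column, and function names are hypothetical.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the problem check repository."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE problem_checks (
            id          INTEGER PRIMARY KEY,
            member_bank TEXT NOT NULL,
            cover_text  TEXT NOT NULL,  -- OCR output from the fax cover letter
            check_image BLOB NOT NULL   -- faxed check image bytes
        )
        """
    )
    return conn


def capture(conn: sqlite3.Connection, member_bank: str,
            cover_text: str, check_image: bytes) -> int:
    """Capture step: store the image together with its text metadata."""
    cur = conn.execute(
        "INSERT INTO problem_checks (member_bank, cover_text, check_image) "
        "VALUES (?, ?, ?)",
        (member_bank, cover_text, check_image),
    )
    conn.commit()
    return cur.lastrowid


def search(conn: sqlite3.Connection, term: str) -> list[tuple]:
    """Resolution step: find checks whose cover text mentions `term`."""
    rows = conn.execute(
        "SELECT id, member_bank, check_image FROM problem_checks "
        "WHERE cover_text LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

Because capture only inserts and resolution only queries, the two sides can be scaled independently, which is the point the text makes about decoupling.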
The check images are stored in TIFF with the average check size ranging from 17 to 37 KB. Check
images and cover sheets are kept online for a total of 13 months before being archived. As fax cover
letters and check images are added to the system, the database is expected to grow to upwards of
1TB.
The resolution of problem checks between banks is a mission critical problem because substantial
financial resources or float may be involved. This adds up, given the system takes in approximately
26,000 checks per day! The ability to ‘scale up’ this application with more front end problem check
and cover letter capture, and to scale up the back end processing with more problem resolution staff
is vital to keeping the float that is held up to a minimum.
This is a perfect example of how an Oracle 10g application can save money.
Banks and financial institutions host a number of mission critical media applications. A second
example comes from the Brazilian Federal Savings Institution (Caixa Economica Federal do Brasil), a
private, nationwide savings bank that is the largest in the country. Their example is a customer currency
slip application. This application takes and stores the image of currency adjustment slips showing
weekly deposits, withdrawals, and currency adjustments. Bank customers can query their account
records online through a web application. This bank is running a Sun 10000 UltraSPARC system with 34 x
400 MHz CPUs and 14 GB of RAM. They have used this equipment to upload 140 million bitonal deposit
receipt images of 25 KB each and to convert them from TIFF to GIF, resulting in approximately 4 TB of data. The
upload rate of document images is approximately 3,000 per minute. The use of Oracle 10g interMedia to
facilitate automatic image file transcoding saves the bank time and money. This is yet another
example of a mainstream business application that fits the mold of a multimedia, multi-terabyte
application.
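The Brazilian bank's figures can be sanity-checked with simple arithmetic. The sketch assumes decimal units (1 KB = 1,000 bytes) and a sustained upload rate; it gives about 3.5 TB for the uploaded images alone, while the text reports roughly 4 TB without breaking down the difference.

```python
IMAGES = 140_000_000          # deposit receipt images, per the text
BYTES_PER_IMAGE = 25 * 1000   # 25 KB each, decimal units assumed
UPLOADS_PER_MINUTE = 3000     # quoted upload rate, assumed sustained

total_tb = IMAGES * BYTES_PER_IMAGE / 10**12
upload_minutes = IMAGES / UPLOADS_PER_MINUTE
upload_days = upload_minutes / (60 * 24)

print(f"uploaded image volume: {total_tb:.1f} TB")
print(f"time to upload at the quoted rate: {upload_days:.1f} days")
```

At the quoted rate, loading the full set would take a little over a month of continuous uploading.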
A third example is UBS PaineWebber, which has a 1 TB check image database that allows its customers
to see images of checks (both sides) that they’ve written and that have cleared.
Oracle interMedia brings several advantages to these mission critical solutions:

Fast upload of images to the database;

Image format conversion capacity;

Performance and scalability for reduced costs and financial risk;

Secure management of sensitive financial information;

Web-based media access for ease of use;

Inherent support for business-to-business operations.
THE PALAZZO BRASCHI MUSEUM
The Palazzo Braschi Museum is a prestigious, public Rome museum that hosts over 40,000 works of
art. The museum has several applications requiring use of photographic images of the artwork.
These include:

Presentation of the art works to the public via an internet site;

Query access for restorers and students to information related to the art works;

Cataloging support to add new images, ancillary data, and descriptions (historical and
technical).
Storage and bandwidth challenges certainly prevail in this example. 40,000 pieces of art and the
associated images can result in an archive of significant size. Size can be a more serious problem if a
solution involves replicating media data for each of the applications. Furthermore, replication holds
the potential for duplicate images becoming out of synchronization, adding yet another manageability
challenge. Related to this is the synchronization of the images with the historical and technical
metadata that can change over time. Bandwidth limitations of the internet pose additional
requirements – the timely delivery of the original digital image or some facsimile (thumbnail) to the
client must be an important characteristic of the system. And finally, security must be addressed, as
controlling access to the media and metadata based upon class of user is necessary.
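[Figure: Palazzo Braschi Museum architecture. Public, student, and restorer users reach the Oracle Application Server and an Oracle9i database with interMedia, which holds the catalog, the historical and technical metadata, and the image variants: 72 x 72 JPG, 150 x 150 JPG, 300 x 300 JPG, 1200 x 1200 JPG, and 1200 x 1200 TIFF.]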
An Italian systems integration firm with the help of Oracle Consulting developed the solution that
the museum now runs. The base platform consists of an Intel system running NT. The applications
are based completely on Oracle technology – the database, Application Server, Oracle Forms and
Reports Servers, and Oracle Portal.
The system stores all the images and metadata related to the art works in the Oracle database so that
they can be amortized across all of the applications, resulting in a saving of both storage and money.
The database also provides appropriate security for the different classes of users, and solves the
synchronization issues for images and metadata. The textual descriptions of the art works are indexed
by Oracle Text to let Internet users, restorers, museum employees, and museum visitors search for
art works of interest. The client was developed with Oracle Portal, saving both time and effort as
Oracle Portal is integrated with Oracle interMedia. interMedia objects can be incorporated easily and
transparently into portlets by using the standard Oracle Portal interfaces and components.
The images presented a different problem. For each art piece, one high-resolution picture is
captured, scanned and processed to derive variants for different purposes. These derivations include
an icon size of 72x72 pixels, a caption size of 150x150 pixels, an intranet size 1200x1200 pixels, and
an Internet size 300x300 pixels for each art piece. These images are stored in the Oracle database in a
single table with multiple columns as interMedia objects. Generating these derivative images using
desktop tools would be costly and slow. Instead, they are generated via interMedia in the following
manner:

The pictures are scanned at 1200 dpi and TIFF files are produced.

A PL/SQL procedure is then used to load the TIFF files into the database.

Another PL/SQL procedure processes the TIFF image and through interMedia’s automatic
thumbnail generation capabilities, produces the four different sizes described above. The
TIFF file is also transformed to JPG format.

When completed, the original TIFF image is dropped from the database to free space.
The TIFF images range from 15 to 20 MB and allocated process time for each is 30 seconds. The
BLOB fields in the database have been tuned to gain optimal performance -- for example, the insert
of one TIFF image into the database takes 4-5 seconds. In the remaining 25 seconds, four "process"
calls are made to resize the original image, four more "process" calls change the format from TIFF to
JPG, and the additional four BLOB fields are loaded into the target database table.
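To make the derivation step concrete, the sketch below computes the pixel dimensions of each variant from an original image's size. It only plans the target sizes and performs no image processing; the function names are hypothetical, and preserving the aspect ratio inside each square bounding box is an assumption, since the text gives only the box sizes.

```python
# Bounding-box edge lengths in pixels, taken from the sizes named in the text.
VARIANTS = {
    "icon": 72,
    "caption": 150,
    "internet": 300,
    "intranet": 1200,
}


def fit_within(width: int, height: int, box: int) -> tuple[int, int]:
    """Scale (width, height) to fit inside a box x box square.

    Keeps the aspect ratio (an assumption) and never enlarges the image.
    """
    scale = min(box / width, box / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def plan_variants(width: int, height: int) -> dict[str, tuple[int, int]]:
    """Return the target dimensions of every derived image for one original."""
    return {name: fit_within(width, height, box) for name, box in VARIANTS.items()}


for name, size in plan_variants(4800, 3600).items():
    print(f"{name:>8}: {size[0]} x {size[1]}")
```

In the museum's system the resizing and the TIFF to JPG conversion are done inside the database by the "process" calls the text mentions; this sketch covers only the size calculation that such calls would be given.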
The Museum’s content will triple over the next two years to 120,000 works of art. The size of the
image set for each art piece is approximately 600 KB (the sum of the four formats). Considering
60,000 art works per year, we have 60,000 x 600 KB = 36,000,000 KB (~36 GB) per year. The Museum credits
Oracle interMedia with the following benefits:

Consistent storage of image objects in the database;

Easy creation of derivative images;

Shared access to image objects across multiple applications;

Synchronization of the associated metadata and images;

Secure access to image objects by all users;

Integrated, simplified image and metadata management.
Their successful management of images in the Oracle database for use by several applications has led
the Museum to consider the addition of streaming audio and video.
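The museum's storage estimate (60,000 works per year at about 600 KB per image set) can be reproduced with a few lines of arithmetic. The sketch assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) and uses only figures quoted in this section.

```python
BYTES_PER_IMAGE_SET = 600 * 1000   # about 600 KB per art work, decimal units assumed
NEW_WORKS_PER_YEAR = 60_000        # yearly intake used in the text's estimate
CURRENT_WORKS = 40_000             # works hosted today, per the text
PROJECTED_WORKS = 120_000          # works expected after the collection triples


def storage_gb(works: int, bytes_per_set: int = BYTES_PER_IMAGE_SET) -> float:
    """Storage needed for the derived image sets of `works` art works, in GB."""
    return works * bytes_per_set / 10**9


yearly_gb = storage_gb(NEW_WORKS_PER_YEAR)
current_gb = storage_gb(CURRENT_WORKS)
projected_gb = storage_gb(PROJECTED_WORKS)

print(f"new storage per year: {yearly_gb:.0f} GB")
print(f"current collection:   {current_gb:.0f} GB")
print(f"projected collection: {projected_gb:.0f} GB")
```

These totals cover only the four derived images per work; the original 15 to 20 MB TIFF files are dropped after processing and so are not counted.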
CONCLUSION
Customers have found that Oracle is well suited for managing multimedia, multi-terabyte databases.
Oracle performs well for these databases - a 1TB image repository renders images in a web browser
in less than 0.4 seconds. It also loads media content at device speeds. They have also found that
Oracle scales. Customers already have 5 TB databases with over 140 million images. Bulk loading and
associated processing also scales – parallel processing has scaled to loading 300,000 images per hour
while scaling (thumbnail) and transcoding the images. Oracle is also easier to manage using tools
such as RMAN for backup. Oracle is also more secure as multimedia data inherits all of the built in
security of the Oracle database (authentication, auditing, encryption, access control…) – even banks
use it.
Digital media has become a central component of today’s online applications and information
services in a broad array of situations ranging from e-commerce and B2B, to traditional mission
critical business applications. The increase in volume and valuation of various multimedia types has
highlighted the importance of managing media objects. Complexity is a factor as well because the
management and use of media presents unique problems to each application area.
Oracle with interMedia and the Oracle Application Server provide integrated services for the
management, access, and multi channel (internet and wireless) distribution of media content. This
includes both static and streaming media delivery and management in most popular formats. With
Oracle, media data is managed along with other application specific data making both overall
management and application development easier. Together, these data management and distribution
services are proving invaluable in diverse established enterprise application areas as well as the
emerging wireless space.