Session id: 40236

Strategies for All Your Data
Sandeepan Banerjee, Vishu Krishnamurthy
Oracle Corporation

Where are you spending your money?
[Chart labels: Data Management Labor Software Integration Hardware and System Integration]

Too much information in too many places
Relational · Documents · Multimedia · Specialized … · Location · Messages · XML
Specialty Servers For Different Kinds Of Data
• Data Isolation
• High Systems Admin And Management Costs
• Scalability Problems
• High Training Costs
• Complex Support Problems

One Management System for All Your Data
Complete, Integrated, Robust, Scalable, Secure, Available on all platforms
• Relational: Characters, Numbers and Dates
• XML DB: Integrated Native XML Database
• Oracle Text & Ultra Search: Text management and search
• Oracle Locator & Spatial: Location and Proximity Searching
• Oracle interMedia: Multimedia management
• Oracle Collaboration Suite: Unified Messaging and Files
• Extensibility Framework: Chemical, Genetic, Engineering, …

What is Oracle XML DB?
• Database support for the XML data model
  – XMLType, XMLSchema, DOM Fidelity, XPath, …
• Hierarchical organization of the data
  – WebDAV compliant with indexing for fast access
• Transparent storage optimizations
• Query Language: SQLX and XQuery

Classes of XML DB Applications
• Exchanging Structured Documents
  – Well-formed templated business documents, e.g. Purchase Orders, Phone Bills, …
• Managing Unstructured Documents
  – Documents, Messages, Instructions
• Integrating and normalizing data from diverse sources

Structured Document Exchange
• Relational storage remains the “right” way to store highly structured data
• As an XML programmer, you do not want to think about “tables”
  – A hierarchical data model is what you want to manipulate
• XML DB’s XMLType is about preserving the XML paradigm while getting the benefits of relational performance and scalability

Structured Document Exchange with Oracle XML DB
• XML data model and APIs familiar to XML programmers
  – XML Schema, Schema Validation, DOM Fidelity
  – JNDI, DOM, XPath, SQLX, XQuery
• Enterprise Class Performance & Scalability
  – Piecewise updates
  – Schema caching
  – Lazy materialization
  – Server-based XSL transformations

Structured Data: Temenos GLOBUS
• Banking platform: #1 selling platform, major banks worldwide
• Contract-based system, deeply nested data model, user-customizable
• 80+ major subsystems, 6000 Tables, 100s of GB
“Using Oracle XML DB, we successfully benchmarked 22 million banking transactions per day, which translated to 2500 database transactions per second, for Temenos' GLOBUS banking platform. Oracle XML DB’s performance assured us that powerful XML innovations can be operationalized and deployed without sacrificing enterprise-class scalability.” - TEMENOS

Managing Unstructured Data
• More and more content is being produced as XML (Microsoft Word, Corel XMetal, Arbortext Epic, …)
  – Markup improves search, processing, organization, …
• XML DB’s Repository enables XML document content to be stored as ‘files’ in ‘folders’ without losing strong management, queryability, unbreakable security, etc.
• XML is doing for unstructured data what Relational did for structured: it creates a standard way to store, query and manage unstructured data

Managing Unstructured Data with Oracle XML DB
• XML data model and APIs familiar to Content Developers
• Integrated Repository
  – WebDAV compliant
  – XPath index for fast traversal of foldering hierarchies
  – SQL Queryable
• Integrated Text Processing
  – Optimizations such as “tag aware” search

Reed Elsevier
• Large technical publishing conglomerate
• More than 1700 scientific, technical & medical peer-reviewed journals
• Over 59 million abstracts
• Over two million full-text scientific journal articles, another one million full-text articles via CrossRef (http://www.crossref.org/) to other publishers' platforms
• XML DB chosen as Repository Database

10g: What’s new in XML DB
• Broad Performance Improvements
  – SQLX query rewrites
  – XSLT optimizations
  – Repository Access and Query optimizations
  – Direct loader support, loading large XML documents
  – Storage optimizations
• I18N: support for differing character sets on client and server
• Schema Evolution
  – Transparently achieves data load/reload
• Unified XML API between XDK and XML DB
  – Unified C interfaces

XML-based Integration: XQuery
• Why XQuery?
  – Declarative way to query XML documents
• Why Java?
  – Run in mid-tier or database
  – Future server implementation in C
• Why XML Database?
  – Native XML storage
  – XML data management
  – Performance optimizations
  – SQL/XML or XQuery depending on data
• Status
  – OTN downloads (pending W3C standard finalization in ’04)

XQuery Engine
[Diagram labels: XQuery Engine, iAS J2EE(TM) Platform; XQuery Engine, Server JVM, XML DB]

XQuery Example
Assume a document – emp.xml

    <empset>
      <emp empno="21" ename="SCOTT" salary="120000"/>
      <emp empno="22" ename="JONES" salary="344000"/>
    </empset>

To get the names of employees with salary > 200000

    for $i in document('emp.xml')/empset/emp
    let $j := 200000
    where $i/@salary > $j
    return $i/@ename

Result (attribute node)

    JONES

Differences from SQL
• Navigation-oriented (using XPath expressions)
• Different type system (XMLSchema based simple types)
• Identity-based (XML Node identities and document order)
• Namespace aware name-resolution (functions, variables, element creation)
• Row based versus Item based
• Results are heterogeneous sequences
• Does not have all SQL extensions (e.g. OLAP, FullText, ...)

Oracle XQuery API
• JXQI – Java API (ongoing standards discussions)

    import java.io.FileReader;
    import java.io.Reader;
    import oracle.xquery.*;

    // Prepare a statement from the reader, execute it, and print each result node
    XQueryContext ctx = new XQueryContext();
    Reader strm = new FileReader("exmpl1.xml");
    XQueryPreparedStatement xq = ctx.prepareStatement(strm);
    XQueryResultSet rset = xq.executeQuery();
    while (rset.next())
        rset.getNode().print(System.out);

• XQLPlus tool! (like SQLPlus)

Datasources
• Enables arbitrary input sources
  – files, cache, JCA datasources
• xmldatasrc – Oracle language addition
• Datasource API
  – initialize
  – describe
  – execute
  – Fetch
• Bind (an existing DOM)

Rewrite to SQL
• XQuery over Oracle databases – Rewrite!

    for $i in view("scott.emp")/ROW
    where $i/SALARY > 200000
    return $i/ENAME

    -- is translated to --

    select "$i".ename
    from scott.emp "$i"
    where "$i".salary > 200000;

More SQL rewrite

    for $i in view('purchaseOrder')/ROW/PurchaseOrder
    where $i/ShipAddr/City = 'San Francisco'
    return <PO ponum="{$i/@Poid}">{$i/ShipAddr}</PO>

    select xmlelement("PO",
             XMLAttributes(extractvalue("$i", '/PurchaseOrder/@Poid') as "ponum"),
             extract("$i", '/PurchaseOrder/ShipAddr'))
    from scott.purchaseorder "$i"
    where extractvalue("$i", '/PurchaseOrder/ShipAddr/City') = 'San Francisco'

DEMONSTRATION
XQuery

Oracle Text
• Rich Full-Text Capabilities built into the Oracle database
• Integrated Search support for Applications
  – OCS, Portal, Ebusiness Suite
• Catalog Search
• Document Archives and Warehouses
• Infrastructure for Intranet and Extranet Search (via Ultra Search)

Oracle Text: Rich Full-Text

10g: What’s new in Oracle Text?
• Supervised Classification
  – Rule-based and SVM
• Unsupervised Classification (Clustering)
  – KMeans and Hierarchical
• Query-Log Analysis
• Query-Templating for Progressive-Relaxation, Query-rewriting, Alternative scoring etc.
• Index creation improvements: Real-time synchronization
• Better Partitioning: Create local-partitioned indexes in parallel
• Filtering enhancements
  – Filter and index RFC-822 email messages
• Language Enhancements
  – Japanese stemming, Customization of Japanese & Chinese Lexicons
• Information Visualization
  – Stretch viewer

Oracle Ultra Search
• Out-of-the-box heterogeneous search-and-locate capabilities
  – DB, Web Servers, Files, E-Mail, Apps
• High performance threaded Java crawlers
• Web-style interface
• Extensible, customizable (Java API)
  – Customizable metadata search
  – Custom crawling
  – Custom rendering
• Integrated administration
• Fully multilingual and globalized
• Integrated with Oracle Portal (repository, portlet) and Oracle Collaboration Suite

10g: What’s new in Ultra Search?
• Enhanced Security
  – Secure Crawling (HTTPS support)
  – Better Authentication: HTTP Digest and Forms
  – ACL-secured search hitlist: Role-based ACLs per datasource, or custom ACLs stamped by crawler
• Federated Search
  – JCA-compliant Searchlet API
• Unified Search
  – Secure Crawler API
• OID Integration

DEMONSTRATION
Information Visualization

The Media-enabled Oracle Platform
• Oracle Database 10g
  – Storage, management, & retrieval of image, audio, video data
  – Native format understanding, metadata extraction, methods for image processing
  – Support for leading streaming media servers
• Oracle Application Server 10g
  – JSP, servlet and PL/SQL application development support
  – Media Adaptation Services for Wireless
  – JDeveloper (BC4J/UIX) and Portal integration
• Oracle Collaboration Suite
  – Metadata extraction for OCS Files

New Oracle10g Multimedia Features
• Standards Support
  – SQL/MM Still Image
• New version of Java Advanced Imaging (JAI 1.1.1_01) and additional image processing operators
• Support for additional media formats
  – Microsoft ASF, MPEG2 & MPEG4
• Microsoft Windows Media Server Plugin
• Real Server Plugin for Helix Server
• XML DB integration

How Oracle’s Multimedia capabilities are better
Only Oracle10g:
• Supports media content natively
  – No manual initiation of separate processes to enable database tablespace to accept media data.
  – No need for DBAs to initiate these processes for each table where they wish to store media data
• Stores all media and its metadata in the same table as the associated relational data
  – No triggers on each and every media object created to update the separate “administration” tables that contain media objects and metadata.
  – No added processing and I/O overhead for access and retrieval
• Provides Java class libraries and JSP Tag libraries for application development and media access.
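
The XQuery slides earlier in the deck (the emp.xml example and its rewrite to SQL) can be illustrated outside Oracle with a minimal Python sketch. It is not Oracle's rewrite engine: it assumes Python's standard xml.etree.ElementTree and sqlite3 modules as stand-ins, and the function names names_via_xml and names_via_sql are hypothetical. The first function answers the query by navigating the XML; the second loads the same data into a relational table and answers it with a single SQL predicate, which is the idea behind the rewrite.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The emp.xml document from the XQuery Example slide.
EMP_XML = """
<empset>
  <emp empno="21" ename="SCOTT" salary="120000"/>
  <emp empno="22" ename="JONES" salary="344000"/>
</empset>
"""


def names_via_xml(xml_text, threshold):
    """Navigation-style evaluation: walk the emp elements, filter on @salary."""
    root = ET.fromstring(xml_text)
    return [
        emp.get("ename")
        for emp in root.findall("emp")
        if int(emp.get("salary")) > threshold
    ]


def names_via_sql(xml_text, threshold):
    """Rewrite-style evaluation: store the rows relationally, then use SQL."""
    root = ET.fromstring(xml_text)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [
            (int(e.get("empno")), e.get("ename"), int(e.get("salary")))
            for e in root.findall("emp")
        ],
    )
    rows = conn.execute(
        "SELECT ename FROM emp WHERE salary > ? ORDER BY empno", (threshold,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


xml_names = names_via_xml(EMP_XML, 200000)
sql_names = names_via_sql(EMP_XML, 200000)
print(xml_names, sql_names)
```

Both paths return the same names for the same threshold; the relational path lets the database apply the predicate with its own optimizer and indexes instead of walking a document tree.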
Oracle is the Leading Spatial Database
“In repeated surveys, IDC has found that Oracle is used in an 80%-90% share of Spatial Information Management oriented database installations.” IDC, December 2002
• Oracle 10g Locator feature: Beginning with Oracle9i, LOCATION capabilities have been part of EVERY database at NO ADDITIONAL COST
  – Enables business, web and LBS applications
• Oracle Spatial 10g: Enterprise Edition Option
  – Supports advanced Land Management, GIS, Transportation, Energy / Utilities, Remote Sensing, Defense and Intelligence applications

Oracle10g Location Features

Locator
• Points, lines, polygons
• 2D, 3D, 4D data
• Spatial Operators
  – Distance
  – Relationships
• Coordinate Systems
• Long Transactions
• Table Partitioning*
• Object Replication**
• Parallel Query* – NEW!
• Deferred Spatial Indexes – NEW!
* Requires Enterprise Edition with Partitioning Option
** Some replication features on Enterprise Ed. only

Spatial (Enterprise Option)
• All Locator features
• Spatial functions
  – area/length calculation
  – buffer, centroid, intersection, union, etc.
• Linear Referencing
• Spatial Aggregates
• Coordinate Transforms
• GeoRaster – NEW!
• Topology Data Model – NEW!
• Network Data Model – NEW!
• GeoCoder – NEW!
• Spatial Data Analysis & Mining – NEW!
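
To make the "Distance" operator and "Location and Proximity Searching" concrete, here is a minimal Python sketch of a within-distance search. It is an illustration only, not how Oracle Locator or Spatial is implemented: it assumes points given as latitude/longitude in degrees, a spherical Earth (haversine formula), and a brute-force scan with no spatial index. The names haversine_km and within_distance and the sample STORES list are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(points, center, radius_km):
    """Return names of points within radius_km of center, nearest first."""
    hits = [
        (haversine_km(center[0], center[1], lat, lon), name)
        for name, lat, lon in points
    ]
    return [name for dist, name in sorted(hits) if dist <= radius_km]


# Hypothetical sample data: (name, latitude, longitude), approximate values.
STORES = [
    ("San Francisco", 37.7749, -122.4194),
    ("Oakland", 37.8044, -122.2712),
    ("San Jose", 37.3382, -121.8863),
]

nearby = within_distance(STORES, (37.7749, -122.4194), 20.0)
print(nearby)
```

A database with a spatial index would avoid computing the distance to every row; the brute-force scan here only shows what the predicate means.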
Location features in the Oracle “Stack”
[Diagram labels, by layer:]
• Any device
• Applications: e-Business Suite (CRM & ERP, TCA schema); B2B, B2E, B2C
• Application Server: Oracle Application Server 10g (iAS MapViewer / JDeveloper, iAS LBS Components); Web Services (SOAP, WSDL)
• Data Server: Oracle Database 10g (Spatial, Locator)
• Oracle Location Technology Online Service
• Oracle core technologies

Oracle’s Extensibility Framework
• Open API to plug in new data types and access methods
• Specialty Data Types
  – Chemical
  – Genetic
  – Engineering
  – Biometric
  – Multimedia
• Driven by specialized-domain ISVs: MDL, NetGene, Informax, Protegrity, …

Extensibility: In Silico Chemistry
• Chemistry searching requires special techniques
  – Chemical name is not unique (“Viagra®”, “sildenafil citrate”)
  – Chemists think graphically
[Figure: chemical structure diagram]
• The solution:
  – A graphical search engine
  – Specialized operators such as substructure search (“sss”) = a chemical “contains”

Oracle Collaboration Suite
• Consolidate management of unstructured data (email, shared documents and other collaborative content)
• Before grid computing, resources such as storage and CPUs had to be managed separately for each component of the suite (e.g. email vs files vs web conferencing).
• OCS 10g takes advantage of grid infrastructure for greater efficiency, reduced cost and easier management

Extended Data Management
• Oracle Collaboration Suite, Oracle Portal, eBusiness Suite provide solutions
• Ultra Search crawls and (where desirable) federates non-Oracle or legacy sources, and brings these into the ambit of uniform access
  – Search, Interchange, Visualization
  – Analytics and Mining

Oracle provides the most robust open and extensible platform and the important services for all your data
• Storage and Management
• Search, Interchange, Visualization
• Analytics and Mining

• Structured data will stay Relational
• Documents & Messages will move to XML
• Multimedia will be in BLOBs, with metadata annotated in XML

QUESTIONS
ANSWERS