Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Entity–attribute–value model wikipedia , lookup
Data analysis wikipedia , lookup
Versant Object Database wikipedia , lookup
Concurrency control wikipedia , lookup
Information privacy law wikipedia , lookup
Data vault modeling wikipedia , lookup
Relational model wikipedia , lookup
Business intelligence wikipedia , lookup
Open data in the United Kingdom wikipedia , lookup
A Hybrid Row-column OLTP Database Architecture for Operational Reporting Jan Schaffner, Anja Bog, Jens Krüger, Alexander Zeier Agenda 2 ■ Operational Reporting ■ Related Work ■ Architecture of Hybrid System ■ Virtual Cube ■ Outlook and Discussion August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Operational Reporting 3 Dinstinction according to Inmon: ■ Informational Reporting □ Supports long-term, strategic decisions □ Summarized data □ Long-term horizons Typically done using a data warehouse (DW) ■ Operational Reporting □ Supports day-to-day decisions □ Data on a more detailed level □ Takes up-to-the-minute data into account Done using a DW or an OLTP system? August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Operational Reporting (contd.) 4 ■ Using a DW for Operational Reporting □ DW must be designed to the same level of granularity as the OLTP systems huge data volumes □ Updates are required to frequently be replicated into the DW endless optimization ■ Using an OLTP Store for Operational Reporting □ Operational reporting queries are relatively long-running in comparison to pure OLTP workloads □ Resource contention: Locks of long-running queries block the short-running ones □ Different data model: Not optimized for reporting (i.e. no star-schema) August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Common Data Warehouse Architecture 5 ■ DW contains ETL processor which □ ...extracts data from various OLTP sources into a staging area □ ...applies transformations for cleansing and integration □ ...stores data in a dimensional layout ■ OLAP engine runs queries against dimensional data store August 24, 2008 | A Hybrid Row-column OLTP Database Architecture “Real-Time” DW Architectures 6 ■ Microbatch □ Configure ETL process to run in very short intervals □ Up-to-date data but very resource intensive ■ Push Architectures □ Handling of deltas on a business or database transaction level □ Up-to-date data but still resource intensive ■ Operational Data Store (ODS) □ Store copy of the OLTP data using an integrated schema □ High data granularity but no up-to-date data August 24, 2008 | A Hybrid Row-column OLTP Database Architecture “Real-Time” DW Architectures (contd.) 7 ■ ELT □ Data is extracted from the OLTP sources and loaded into the ODS □ Transformations are done in the warehouse at query-runtime □ High granularity (transactional data) but no up-to-date data ■ Virtual ODS □ Virtual in the sense that queries are redirected against OLTP system □ High granularity (transactional data) and up-to-date data □ Performs ETL on-the-fly □ Affects performance of OLTP system August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Column-Stores: New “Trend” for OLAP 8 □ Column-store databases: ◊ Vertical fragmentation ◊ Fast aggregations (sum, min, max, avg, …) more flexibility for ad-hoc reporting ◊ Each column can be compressed individually □ Both disk-based … ◊ Vertica c1 ◊ Greenplum □ … and in-memory: c2 c3 sID c1 sID c2 sID c3 v11 v12 v13 1 v11 1 v12 1 v13 v21 v22 v23 2 v21 2 v22 2 v23 v31 v32 v33 3 v31 3 v32 3 v33 row-oriented column-oriented ◊ SAP BIA ◊ MonetDB ◊ Exasol August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Encoding Schemes 9 Few distinct values Ordered Unordered Sequence of triples: • value • offset position • # occurrences Sequence of tuples: • value • bitmap for positional occurence Many distinct values Delta representation ? August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Architecture of Hybrid System 10 ■ Essentially integration between row- and column store DBs ■ MaxDB is used as the row store □ Database underlying SAP Business ByDesign □ Supports ACID transactions ■ TREX is used as the column store □ Main memory □ Engine underlying SAP BIA □ Has a copy of (some of) the OLTP data □ Primary OLTP system and main-memory database (MMDB) are governed using a single resource manager August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Architecture of Hybrid System (contd.) 11 August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Virtual Cube ■ Similar architecture as virtual ODS □ Virtual Cube provides the same interface as a typical cube (slice, dice, drill-down, …) □ Virtual Cube rewrites queries and issues them against the MMDB (TREX in our case) □ TREX has a copy of the OLTP data □ Primary OLTP system and MMDB are tied together as described above August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Outlook 13 ■ Build a “real” hybrid database in-memory as part of ChunkyStore ■ Data can be stored as either: □ Rows □ Columns □ Chunks (adjacent fragments of rows and columns) ■ DB decides which physical storage alternative is most suitable ■ Main-memory implementation will cater for fast updates as well as fast operational reporting capabilities August 24, 2008 | A Hybrid Row-column OLTP Database Architecture Thank you 14 ■ Questions? August 24, 2008 | A Hybrid Row-column OLTP Database Architecture