V LDB - University of Wisconsin–Madison
V²LDB

Boris Gelman
Vice President Architecture Information Services
VISA
[email protected]


V²LDB: The Concept

V²LDB = Very Very Large Database: new concept or change to the VLDB concept?

Data Structure:
    Petabyte tables with 100s of billions of rows
    Complex table structures
    Non-uniform physical data representation of petabyte tables

Query:
    Well-defined subsets (index and/or partition) on tables: small (~10,000) -> medium (~300,000) -> large (~1,000,000)
    Undefined subsets: very large (~1,000,000,000) -> very very large (~100,000,000,000)
    Complex joins
    Complex group-bys and sorts

Workload:
    Multiple categories of queries running concurrently (transaction research, analytics, data mining)
    Inserts and selects concurrently against the same tables
    24x7 operation with very limited maintenance windows
    SLAs are very strict


V²LDB: Problems

Data Partitioning:
    Smart partitioning: hash, expression, … -> hybrid multi-level partitioning
    Smart partition manipulation: detach / attach partition online

Query Execution:
    Hash join on petabyte tables?

Performance Tuning does not work:
    Adaptive and buffer-pool aware query optimization?
    System-category aware query optimization?
    Optimizer efficiency?

Backup/Restore does not work:
    Data replication is not a substitute for backup: data corruption, application errors, human errors
    Smart backup/restore related to smart data partitioning!


V²LDB: Problems

Database Federation:
    A single database system cannot hold a combination of ODS (> 1 PB) and cross-functional multi-subject DW (> 200 TB); it is impractical
    Data Abstraction Layer: federated tables partitioned across multiple database systems!
    Federated Database is easier to maintain and back up, and availability is higher!
    Federated Database Performance = Single Database System Performance!!!
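
Illustration (not from the original slides): the slides name hybrid multi-level partitioning and a data abstraction layer over federated tables but do not say how either would be built. The Python sketch below is one possible reading, under stated assumptions: level 1 partitions by an expression on the transaction date (the month), level 2 by a stable hash of an account key, and a small routing function maps each partition to a federation member. The system names, the bucket count, and the function names `partition_for`, `system_for`, and `systems_for_query` are all hypothetical.

```python
import zlib
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set

# Hypothetical federation members; the slides do not name any systems.
SYSTEMS = ["db_a", "db_b", "db_c"]
# Assumed number of hash sub-partitions inside each month partition.
HASH_BUCKETS = 8


@dataclass(frozen=True)
class PartitionId:
    month: str   # level 1: expression partition, e.g. "2008-03"
    bucket: int  # level 2: hash sub-partition within that month


def _bucket(account_id: str) -> int:
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(account_id.encode("utf-8")) % HASH_BUCKETS


def partition_for(account_id: str, txn_date: date) -> PartitionId:
    """Two-level placement: expression on the date, then hash of the key."""
    month = f"{txn_date.year:04d}-{txn_date.month:02d}"
    return PartitionId(month, _bucket(account_id))


def system_for(pid: PartitionId) -> str:
    """Abstraction layer: map a partition to the federation member holding it."""
    return SYSTEMS[pid.bucket % len(SYSTEMS)]


def systems_for_query(months: List[str],
                      account_id: Optional[str] = None) -> Set[str]:
    """Partition pruning: a known key touches one bucket per month,
    an undefined subset has to visit every bucket."""
    buckets = [_bucket(account_id)] if account_id is not None else range(HASH_BUCKETS)
    return {system_for(PartitionId(m, b)) for m in months for b in buckets}


if __name__ == "__main__":
    pid = partition_for("acct-42", date(2008, 3, 15))
    print(pid, "->", system_for(pid))
    print(systems_for_query(["2008-03"], "acct-42"))  # one member
    print(systems_for_query(["2008-03"]))             # every member
```

In this sketch a query over a well-defined subset (known key and month) is routed to a single member, while an undefined subset fans out to all of them. That is the case where the slides' goal of federated performance matching a single database system is hardest to meet. Because a month partition is a self-contained unit here, it is also a natural unit for online detach/attach and for partition-level backup.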