
Columbia Computer Science
Faculty Candidate Colloquium
Query Execution in Column-Oriented Database Systems
Daniel Abadi, MIT
Wed., February 7, 11 AM, 535 Mudd
Recent research on column-oriented database systems (DBMSs) has shown that these
systems can outperform existing row-oriented DBMSs by one to two orders of magnitude
on read-mostly query workloads like those found in data warehouses, decision support,
and customer relationship management systems. In this talk, I will discuss this exciting
new class of database systems and will provide an overview of the C-Store system that
we have developed over the past two years at MIT. I will then focus on the design of the
column-oriented query execution engine I have developed. In particular, I will discuss the
impact on query performance of tuple construction (stitching together attributes from
multiple columns into a row-oriented "tuple") and operation on compressed data. Tuple
construction allows column-oriented DBMSs to offer a standards-compliant relational
database interface (e.g., ODBC or JDBC); however, if done at the wrong point in a
query plan, a significant performance penalty is paid. Similarly, data compression can
improve query performance by an order of magnitude by trading cheap CPU cycles for
expensive I/O bandwidth.
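
The two ideas in the abstract can be illustrated with a minimal sketch. This is not C-Store code; the column layout, the function names (late_materialize, rle_encode, rle_sum), and the sample data are all hypothetical, and run-length encoding is assumed here as just one example of a compression scheme that can be operated on directly. The sketch shows (1) tuple construction deferred until after a filter has run on a single column, so that only matching positions are stitched into rows, and (2) an aggregate computed over compressed runs without decompressing them first.

```python
# Hypothetical in-memory column store: each attribute is its own list,
# and position i across the lists forms one logical row.
columns = {
    "region": ["east", "east", "west", "west", "west", "east"],
    "qty":    [5, 3, 7, 7, 2, 9],
    "price":  [10, 10, 20, 20, 20, 10],
}


def late_materialize(cols, predicate_col, predicate, out_cols):
    """Filter on one column, then stitch tuples only for matching positions."""
    # Step 1: scan a single column and keep the positions that qualify.
    positions = [i for i, v in enumerate(cols[predicate_col]) if predicate(v)]
    # Step 2: construct row-oriented tuples for those positions only.
    return [tuple(cols[c][i] for c in out_cols) for i in positions]


def rle_encode(values):
    """Run-length encode a list into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_sum(runs):
    """Sum a numeric column directly on its runs, without decompressing."""
    return sum(value * length for value, length in runs)


# Tuples are built only for the rows where region == "west".
rows = late_materialize(columns, "region", lambda r: r == "west", ["qty", "price"])

# The price column compresses to three runs; the sum touches three pairs
# instead of six individual values.
price_runs = rle_encode(columns["price"])
total_price = rle_sum(price_runs)
```

In this toy example the filter reads one column and builds three tuples instead of six, and the sum visits three runs instead of six values. Whether deferring tuple construction actually pays off in a real system depends on the query plan and the selectivity of the predicates, which is the trade-off the talk addresses.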