Column Oriented Database
By:
Deepak Sood
Garima Chhikara
Neha Rani
Vijayita Gumber
Columnar Database Systems
• Stores data by column.
• Keeps all values of an attribute together.
• Handles fixed length data.
• 2-D data represented at the conceptual level is mapped to a 1-D data structure at the physical level.
In a row store, data is stored on disk tuple by tuple.
In a column store, data is stored on disk column by column.
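The 2-D to 1-D mapping above can be sketched in a few lines. This is only an illustration: the table contents and the names `rows`, `row_layout`, and `column_layout` are hypothetical, and a Python list stands in for the on-disk byte sequence.

```python
# Hypothetical 3-row table (id, name, region); values are illustrative only.
rows = [
    (1, "Asha", "Mumbai"),
    (2, "Ravi", "Delhi"),
    (3, "Meera", "Mumbai"),
]

# Row store: tuples are laid out one after another.
row_layout = [value for row in rows for value in row]

# Column store: all values of one attribute are laid out together.
column_layout = [value for column in zip(*rows) for value in column]

print(row_layout)     # starts with 1, 'Asha', 'Mumbai'
print(column_layout)  # starts with 1, 2, 3
```

Both layouts hold the same nine values; only the order in the 1-D sequence differs.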
Row Store
(+) Easy to add/modify a record
(-) Might read in unnecessary data
Column Store
(+) Only need to read in relevant data
(-) Tuple writes require multiple accesses
Row Store and Column Store
• Most queries do not process all the attributes of a relation.
• For example, the query
SELECT c.name, c.address
FROM CUSTOMER AS c
WHERE c.region = 'Mumbai';
processes only three attributes of the relation CUSTOMER (name, address, and region).
• The CUSTOMER relation can have many more than three attributes.
• Column stores are more I/O efficient for read-only queries because they read only the attributes that a query accesses.
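A rough sketch of why this matters for I/O, assuming a hypothetical five-attribute CUSTOMER table and a deliberately simple cost model that counts values scanned (real systems read pages or blocks, not single values). The function name `run_query` and all data are assumptions made for illustration.

```python
# Hypothetical CUSTOMER table with five attributes, stored column by column.
customer = {
    "id":      [1, 2, 3, 4],
    "name":    ["Asha", "Ravi", "Meera", "John"],
    "address": ["12 Hill Rd", "4 Lake St", "9 Park Ave", "7 Main St"],
    "region":  ["Mumbai", "Delhi", "Mumbai", "Pune"],
    "phone":   ["111", "222", "333", "444"],
}

def run_query(table, region):
    """SELECT name, address FROM CUSTOMER WHERE region = <region>."""
    touched = ["region", "name", "address"]
    n_rows = len(table["region"])
    # Positions of the rows that satisfy the predicate.
    positions = [i for i, r in enumerate(table["region"]) if r == region]
    result = [(table["name"][i], table["address"][i]) for i in positions]
    # Simplified cost model: a column store scans only the touched columns,
    # while a row store scans every attribute of every tuple.
    column_store_reads = len(touched) * n_rows
    row_store_reads = len(table) * n_rows
    return result, column_store_reads, row_store_reads

result, col_reads, row_reads = run_query(customer, "Mumbai")
print(result)                # [('Asha', '12 Hill Rd'), ('Meera', '9 Park Ave')]
print(col_reads, row_reads)  # 12 20
```

The gap widens as the relation gains attributes that the query never touches.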
Why Column Store?
• Faster for read-mostly analytical queries.
• Fetches only the columns a query requires.
• Better cache effects.
• Better compression.
• Data warehousing applications perform mostly read operations.
• Row-oriented stores have the overhead of seeking through all columns.
• Column stores can be slower for some applications, such as OLTP with many row inserts.
Query Execution - Operators
• Select: Same as relational algebra, but produces a bit string
• Project: Same as relational algebra
• Join: Joins projections according to predicates
• Aggregation: SQL-like aggregates
• Sort: Sorts all columns of a projection
• Decompress: Converts a compressed column to its uncompressed representation
Query Execution - Operators
• Mask(Bitstring B, Projection Cs): Emits only those values whose corresponding bits are 1
• Concat: Combines one or more projections sorted in the same order into a single projection
• Permute: Permutes a projection according to the ordering defined by a join index
• Bitstring operators: Band (bitwise AND), Bor (bitwise OR), Bnot (complement)
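A minimal sketch of how Select, the bitstring operators, and Mask might compose. It assumes bit strings are plain Python lists of 0/1 and that Mask is applied to a single column rather than a full projection; the function signatures and the sample columns are hypothetical, not taken from any particular system.

```python
def select(column, predicate):
    """Select: evaluate a predicate and return a bit string, one bit per value."""
    return [1 if predicate(v) else 0 for v in column]

def band(a, b):
    """Band: bitwise AND of two bit strings of equal length."""
    return [x & y for x, y in zip(a, b)]

def bor(a, b):
    """Bor: bitwise OR of two bit strings of equal length."""
    return [x | y for x, y in zip(a, b)]

def bnot(a):
    """Bnot: complement of a bit string."""
    return [1 - x for x in a]

def mask(bits, column):
    """Mask: emit only the values whose corresponding bit is 1."""
    return [v for bit, v in zip(bits, column) if bit == 1]

# Hypothetical columns of one projection, all in the same order.
region = ["Mumbai", "Delhi", "Mumbai", "Pune"]
year = [1995, 1996, 1999, 1993]
revenue = [10, 20, 30, 40]

in_mumbai = select(region, lambda r: r == "Mumbai")  # [1, 0, 1, 0]
up_to_1997 = select(year, lambda y: y <= 1997)       # [1, 1, 0, 1]
both = band(in_mumbai, up_to_1997)                   # [1, 0, 0, 0]
print(mask(both, revenue))                           # [10]
```

Each predicate touches only its own column, and the revenue column is read only at the positions that survive both predicates.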
Column-store simulation in a row-store
1. Vertical Partitioning: Each column is a relation.
2. Index-Only: A B+ tree on each column.
3. Materialized Views: An optimal set of views for every query.
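Vertical partitioning can be pictured as follows. The sketch assumes each single-column relation carries a (position, value) pair so that tuples can be stitched back together by joining on position; that representation, the relation names, and `join_on_position` are assumptions for illustration, not something the slides specify.

```python
# Each column becomes its own two-attribute relation: (position, value).
name_rel = [(0, "Asha"), (1, "Ravi"), (2, "Meera")]
region_rel = [(0, "Mumbai"), (1, "Delhi"), (2, "Mumbai")]

def join_on_position(left, right):
    """Rebuild wider tuples by joining two column relations on position."""
    right_by_pos = dict(right)
    return [
        (pos, value, right_by_pos[pos])
        for pos, value in left
        if pos in right_by_pos
    ]

print(join_on_position(name_rel, region_rel))
# [(0, 'Asha', 'Mumbai'), (1, 'Ravi', 'Delhi'), (2, 'Meera', 'Mumbai')]
```

A query that needs k columns pays for k-1 such joins, which is one reason this simulation tends to cost more than a native column store.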
Column-Oriented Execution
Four optimization techniques improve the performance of column stores:
• Compression
• Late Materialization
• Block Iteration
• Invisible Join
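As one illustration of the compression technique, the sketch below uses run-length encoding (RLE), a scheme that suits sorted columns with repeated values. The slides do not say which scheme is meant, so RLE is an assumption here, and the helper names `rle_encode`, `rle_decode`, and `count_equal` are hypothetical.

```python
from itertools import groupby

def rle_encode(column):
    """Compress a column into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]

def rle_decode(runs):
    """Decompress: expand (value, run_length) pairs back into a column."""
    return [value for value, count in runs for _ in range(count)]

def count_equal(runs, target):
    """Count matching values directly on the compressed form."""
    return sum(count for value, count in runs if value == target)

# Hypothetical sorted column.
region = ["Delhi", "Delhi", "Mumbai", "Mumbai", "Mumbai", "Pune"]

runs = rle_encode(region)
print(runs)                         # [('Delhi', 2), ('Mumbai', 3), ('Pune', 1)]
print(count_equal(runs, "Mumbai"))  # 3
print(rle_decode(runs) == region)   # True
```

`count_equal` answers the query without decompressing, which is where a column store can gain beyond the saved disk space.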
Invisible Join
Example query: find the total revenue from Asian customers who purchased a product supplied by an Asian supplier between 1992 and 1997, grouped by customer nation, supplier nation, and year of transaction.
[Figures: Invisible Join, Phase 1, Phase 2, Phase 3]
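The three phases can be sketched on a tiny star schema. The phase descriptions in the comments follow the invisible join as it is commonly described and are an assumption about what the phase slides show. All table contents and names (`customer`, `supplier`, `date_dim`, `fact`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical dimension tables: key -> attributes.
customer = {1: ("Asia", "India"), 2: ("Europe", "France"), 3: ("Asia", "China")}
supplier = {1: ("Asia", "Japan"), 2: ("America", "Brazil")}
date_dim = {10: 1993, 11: 1998, 12: 1996}

# Hypothetical fact table, stored column by column.
fact = {
    "custkey": [1, 2, 3, 1, 3],
    "suppkey": [1, 1, 2, 1, 1],
    "datekey": [10, 10, 12, 11, 12],
    "revenue": [100, 200, 300, 400, 500],
}

# Phase 1: apply each predicate to its dimension table and keep matching keys.
cust_keys = {k for k, (region, _) in customer.items() if region == "Asia"}
supp_keys = {k for k, (region, _) in supplier.items() if region == "Asia"}
date_keys = {k for k, year in date_dim.items() if 1992 <= year <= 1997}

# Phase 2: probe each foreign-key column to build a bit vector, then AND them.
cust_bits = [int(k in cust_keys) for k in fact["custkey"]]
supp_bits = [int(k in supp_keys) for k in fact["suppkey"]]
date_bits = [int(k in date_keys) for k in fact["datekey"]]
bits = [c & s & d for c, s, d in zip(cust_bits, supp_bits, date_bits)]

# Phase 3: only for surviving positions, fetch dimension values and aggregate.
totals = defaultdict(int)
for pos, bit in enumerate(bits):
    if bit:
        group = (
            customer[fact["custkey"][pos]][1],  # customer nation
            supplier[fact["suppkey"][pos]][1],  # supplier nation
            date_dim[fact["datekey"][pos]],     # year of transaction
        )
        totals[group] += fact["revenue"][pos]

print(bits)          # [1, 0, 0, 0, 1]
print(dict(totals))  # {('India', 'Japan', 1993): 100, ('China', 'Japan', 1996): 500}
```

Dimension attributes are looked up only in Phase 3, for the two fact rows that pass every predicate, rather than for all five.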
Applications
• Analyzing unorganized big data with improved granularity.
• Data Warehouses and Business Intelligence.
• Online Analytical Processing.
• Data Marts Development.
• Data Mining.
THANK YOU !!