C-Store: The Life of a Query
Jianlin Feng
School of Software
SUN YAT-SEN UNIVERSITY
Mar 6, 2009
Main Components of a DBMS (1)

A typical DBMS has 5 main components:
    - Client Communication Manager
    - Process Manager
    - Relational Query Processor
    - Transactional Storage Manager
    - Shared Components and Utilities
Main Components of a DBMS (2)
A Single-Query Transaction

At an airport, a gate agent clicks on a form to
request the passenger list for a flight.

Example of a possible SQL statement:

    SELECT name
    FROM Passenger
    WHERE flight = '510275'
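
A minimal sketch of how a client program might issue this statement, using Python's built-in sqlite3 module as a stand-in for the airport DBMS. The table layout, the sample rows, and the helper name passenger_names are assumptions made for illustration; the ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

def passenger_names(conn, flight):
    """Return the passenger names for one flight, sorted for stable output."""
    # A parameterized query keeps the flight value out of the SQL text.
    rows = conn.execute(
        "SELECT name FROM Passenger WHERE flight = ? ORDER BY name", (flight,)
    )
    return [name for (name,) in rows]

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Passenger (name TEXT, flight TEXT)")
conn.executemany(
    "INSERT INTO Passenger VALUES (?, ?)",
    [("Alice", "510275"), ("Bob", "510275"), ("Carol", "620001")],
)

names = passenger_names(conn, "510275")
print(names)
```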
Stage 1: Submit the SQL Statement

The Client: the personal computer at the airport gate
    - Calls an API to create a connection with the Client Communication Manager

The Client Communication Manager
    - Performs the security check of the Client
    - Sets up state to remember the connection
    - Also remembers the SQL statement
    - Forwards the Client’s request deeper into the DBMS for processing
Different Connection Arrangements

Two-tier or Client-server:
    - Client -> Database Server
    - Via ODBC or JDBC

Three-tier:
    - Client -> Web Server -> Database Server

Four-tier:
    - Client -> Web Server -> App Server -> Database Server
Stage 2: Assign a Thread of Computation

Upon receiving the SQL statement from the Client Communication Manager:

The Process Manager first does Admission Control
    - Decides whether the system should begin processing the query at once
    - Or defer it until the query can have enough resources for execution

The Process Manager then allocates a thread of control for the query
    - If the query should be executed at once
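
The admission decision above can be sketched as a small queueing policy. This is only an illustration: the class name, the max_active limit (standing in for "enough resources"), and the FIFO order of deferred queries are assumptions, not something the slides specify.

```python
from collections import deque

class ProcessManager:
    """Toy admission control: run a query now or defer it (hypothetical policy)."""

    def __init__(self, max_active):
        self.max_active = max_active   # assumed stand-in for available resources
        self.active = set()            # queries that currently own a thread
        self.waiting = deque()         # deferred queries, oldest first

    def submit(self, query_id):
        # Admit immediately if there is spare capacity, otherwise defer.
        if len(self.active) < self.max_active:
            self.active.add(query_id)
            return "run"
        self.waiting.append(query_id)
        return "deferred"

    def finish(self, query_id):
        # Free the slot and, if possible, admit the oldest deferred query.
        self.active.discard(query_id)
        if self.waiting and len(self.active) < self.max_active:
            admitted = self.waiting.popleft()
            self.active.add(admitted)
            return admitted
        return None

pm = ProcessManager(max_active=1)
first = pm.submit("q1")
second = pm.submit("q2")
resumed = pm.finish("q1")
```

With a limit of one active query, q1 runs at once, q2 is deferred, and q2 is admitted when q1 finishes.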
Stage 3: Query Processing

The Relational Query Processor executes the gate agent’s query
    - Checks whether the agent is authorized to run the query
    - If authorized, compiles the SQL query text into an internal query plan
        - Query Parsing, Query Rewrite / Optimization

Once compiled, the Plan Executor handles the query plan
    - Invokes relational operators
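
As a rough illustration of a plan executor invoking relational operators, the gate agent's query can be written as a chain of Python generators: scan, then select, then project. The pull-based generator style, the operator function names, and the sample rows are all assumptions chosen for this sketch; the slides do not prescribe an execution model.

```python
def scan(table):
    # Produce every row of the stored table.
    for row in table:
        yield row

def select(rows, predicate):
    # Keep only the rows that satisfy the predicate.
    for row in rows:
        if predicate(row):
            yield row

def project(rows, columns):
    # Keep only the requested attributes of each row.
    for row in rows:
        yield tuple(row[c] for c in columns)

# Hypothetical table contents, for illustration only.
PASSENGER = [
    {"name": "Alice", "flight": "510275"},
    {"name": "Carol", "flight": "620001"},
    {"name": "Bob", "flight": "510275"},
]

# Plan for: SELECT name FROM Passenger WHERE flight = '510275'
plan = project(
    select(scan(PASSENGER), lambda r: r["flight"] == "510275"),
    ["name"],
)
result = list(plan)
print(result)
```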
Relational Operators (1)

Selection
    - File scan, B-Tree, Hash Index (Equality Selection)

Projection
    - Remove unwanted attributes
    - Eliminate any duplicates
    - Implement via sorting or hashing

Join
    - Nested Loops Join, Sort-Merge Join, Hash Join
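
Two of the join algorithms named above can be sketched in a few lines. This is an in-memory simplification under stated assumptions: rows are plain tuples, the join is an equality join on one column from each side, and the function names and sample data are hypothetical.

```python
from collections import defaultdict

def nested_loops_join(left, right, left_col, right_col):
    # Compare every left row with every right row.
    return [l + r for l in left for r in right if l[left_col] == r[right_col]]

def hash_join(left, right, left_col, right_col):
    # Build phase: hash the left input on its join column.
    buckets = defaultdict(list)
    for l in left:
        buckets[l[left_col]].append(l)
    # Probe phase: look up each right row's key in the hash table.
    return [l + r for r in right for l in buckets.get(r[right_col], [])]

# Hypothetical inputs: (name, flight) and (flight, gate).
passengers = [("Alice", "510275"), ("Bob", "510275"), ("Carol", "620001")]
flights = [("510275", "A3"), ("777777", "B1")]

nl_rows = nested_loops_join(passengers, flights, 1, 0)
hj_rows = hash_join(passengers, flights, 1, 0)
```

Both functions return the same set of joined rows; the hash join avoids comparing every pair by probing a hash table instead.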
Relational Operators (2)

Set
    - Union, Intersection, Difference, Cross product
    - Sorting or Hashing

Aggregation
    - SUM, MIN, MAX, COUNT, AVG
    - Data cube
    - Sorting or Hashing

Sorting
    - To give the data useful properties (such as sort order) that speed up the query
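
Hash-based aggregation can be sketched as a single pass that keeps one running entry per group. The function name, the choice of COUNT and SUM as the aggregates, and the (flight, bags) sample rows are assumptions made for illustration.

```python
def hash_aggregate(rows, group_col, value_col):
    """Return {group: (COUNT, SUM)} computed in one pass over the rows."""
    groups = {}
    for row in rows:
        count, total = groups.get(row[group_col], (0, 0))
        groups[row[group_col]] = (count + 1, total + row[value_col])
    return groups

# Hypothetical rows: (flight, checked bags).
bags = [("510275", 2), ("510275", 1), ("620001", 3)]
summary = hash_aggregate(bags, 0, 1)
```

AVG follows from the two stored values as SUM / COUNT for each group.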
Stage 4: Fetch Data from Transactional Storage Manager

Plan Executor’s operators request data

Transactional Storage Manager manages calls for
    - All data access (READ)
    - All data manipulation (CREATE, UPDATE, DELETE)

And ensures ACID properties of transactions
    - Gets locks from the Lock Manager
    - Interacts with the Log Manager for recovery preparation
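
To make the Lock Manager's role concrete, here is a deliberately small sketch of a lock table with shared ("S") and exclusive ("X") modes. Everything in it is an assumption for illustration: the class and method names, the two-mode scheme, and the choice to refuse a conflicting request instead of queueing it. It also ignores lock upgrades and repeated requests by the same transaction.

```python
class LockManager:
    """Toy lock table: item -> (mode, set of holding transactions)."""

    def __init__(self):
        self.locks = {}

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            # Nobody holds the item, so any mode is granted.
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            # Shared locks are compatible with each other.
            holders.add(txn)
            return True
        # Any combination involving an exclusive lock conflicts.
        return False

    def release_all(self, txn):
        # Called when the transaction completes.
        for item in list(self.locks):
            _, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lm = LockManager()
r1 = lm.acquire("t1", "Passenger", "S")
r2 = lm.acquire("t2", "Passenger", "S")
w_blocked = lm.acquire("t3", "Passenger", "X")
lm.release_all("t1")
lm.release_all("t2")
w_granted = lm.acquire("t3", "Passenger", "X")
```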
Access Methods and Buffer Management

Access Methods
    - Algorithms and data structures for organizing and accessing data on disk
    - Such as B-Tree, Hash, Bitmap Index

Buffer Management
    - Decides when and what data to transfer between disk and memory buffers
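
One way a buffer manager might decide which page to drop is a least-recently-used (LRU) policy, sketched below. LRU is an assumed policy chosen for illustration; the slides do not name one. The class name, the read_page callback that stands in for a disk read, and the disk_reads counter are likewise hypothetical, and the sketch ignores dirty pages and pinning.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool with LRU replacement (assumed policy)."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page     # stand-in for reading a page from disk
        self.frames = OrderedDict()    # page_id -> page, least recently used first
        self.disk_reads = 0

    def get(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Miss with a full pool: evict the least recently used page.
            self.frames.popitem(last=False)
        self.disk_reads += 1
        page = self.read_page(page_id)
        self.frames[page_id] = page
        return page

pool = BufferPool(capacity=2, read_page=lambda pid: f"page-{pid}")
pool.get(1)
pool.get(2)
pool.get(1)   # hit, no disk read
pool.get(3)   # miss, evicts page 2
```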
Stage 5: Unwinding the Stack

After data access, access methods return control to the query executor’s operators.
Operators generate result tuples.
Result tuples are placed in a buffer for the Client Communication Manager.
The Client Communication Manager ships the result tuples back to the Client.
At the end of the query, the transaction is completed.
    - Clean-up jobs are done in each involved component.
References

Joseph M. Hellerstein, M. Stonebraker and J. Hamilton. Architecture of a Database System. Foundations and Trends in Databases 1(2). 2007.
Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. Second Edition. McGraw-Hill Science. 2000.