C-Store: The Life of a Query
Jianlin Feng
School of Software, SUN YAT-SEN UNIVERSITY
Mar 6, 2009

Main Components of a DBMS (1)
A typical DBMS has 5 main components:
  - Client Communication Manager
  - Process Manager
  - Relational Query Processor
  - Transactional Storage Manager
  - Shared Components and Utilities

Main Components of a DBMS (2)

A Single-Query Transaction
  - At an airport, a gate agent clicks on a form to request the passenger list for a flight.
  - Example of a possible SQL statement:
      SELECT name FROM Passenger WHERE flight = '510275'

Stage 1: Submit the SQL Statement
  - The Client: the personal computer at the airport gate
    - Calls an API to create a connection with the Client Communication Manager.
  - The Client Communication Manager
    - Performs the security check of the Client.
    - Sets up state to remember the connection and the SQL statement.
    - Forwards the Client’s request deeper into the DBMS for processing.

Different Connection Arrangements
  - Two-tier or client-server: Client -> Database Server (via ODBC or JDBC)
  - Three-tier: Client -> Web Server -> Database Server
  - Four-tier: Client -> Web Server -> App Server -> Database Server

Stage 2: Assign a Thread of Computation
  - The Process Manager receives the SQL statement from the Client Communication Manager.
  - The Process Manager first performs admission control:
    - Should the system begin processing the query at once,
    - or defer it until enough resources are available for its execution?
  - If the query is to be executed at once, the Process Manager allocates a thread of control for it.

Stage 3: Query Processing
  - The Relational Query Processor executes the gate agent's query.
  - It checks whether the agent is authorized to run the query.
  - If authorized, it compiles the SQL query text into an internal query plan.
    - Query Parsing, Query Rewrite / Optimization
  - Once compiled, the Plan Executor handles the query plan.
    - Invokes relational operators.

Relational Operators (1)
  - Selection
    - File scan, B-Tree, Hash Index (equality selection)
  - Projection
    - Removes unwanted attributes.
    - Eliminates any duplicates.
    - Implemented via sorting or hashing.
  - Join
    - Nested Loops Join, Sort-Merge Join, Hash Join

Relational Operators (2)
  - Set
    - Union, Intersection, Difference, Cross product
    - Sorting or hashing
  - Aggregation
    - SUM, MIN, MAX, COUNT, AVG
    - Data cube
    - Sorting or hashing
  - Sorting
    - Gives the data an order that can speed up query processing.

Stage 4: Fetch Data from Transactional Storage Manager
  - The Plan Executor’s operators request data.
  - The Transactional Storage Manager manages calls for
    - all data access (READ)
    - all data manipulation (CREATE, UPDATE, DELETE)
  - It also ensures the ACID properties of transactions:
    - Gets locks from the Lock Manager.
    - Interacts with the Log Manager for recovery preparation.

Access Methods and Buffer Management
  - Access Methods
    - Algorithms and data structures for organizing and accessing data on disk.
    - Such as B-Tree, Hash, Bitmap Index
  - Buffer Management
    - Decides when and what data to transfer between disk and memory buffers.

Stage 5: Unwinding the Stack
  - After data access, the access methods return control to the query executor’s operators.
  - The operators generate result tuples.
  - Result tuples are placed in a buffer for the Client Communication Manager.
  - The Client Communication Manager ships the result tuples back to the Client.
  - At the end of the query, the transaction is completed.
    - Clean-up jobs are done in each involved component.

References
  - Joseph M. Hellerstein, M. Stonebraker and J. Hamilton. Architecture of a Database System. Foundations and Trends in Databases 1(2), 2007.
  - Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. Second Edition. McGraw-Hill Science, 2000.
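
Worked sketch: the gate agent's query end to end
To tie the stages together, here is a minimal in-memory sketch of how the example query might flow through admission control, a hand-built operator plan, and an access method. It is an illustration under stated assumptions, not how any particular DBMS is implemented: the sample rows, the toy admission rule (max_active), and every function name (build_hash_index, admit, select_eq, project, run_query) are hypothetical, and locking, logging, and buffer management are left out.

```python
from collections import defaultdict

# Hypothetical sample data; the slides give only the table and column names.
PASSENGER = [
    {"name": "Ann", "flight": "510275"},
    {"name": "Bob", "flight": "510275"},
    {"name": "Cho", "flight": "620118"},
    {"name": "Ann", "flight": "510275"},  # repeated name, to exercise projection
]


def build_hash_index(rows, column):
    """Access method: map each value of `column` to the positions of its rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index


FLIGHT_INDEX = build_hash_index(PASSENGER, "flight")


def admit(active_queries, max_active=8):
    """Stage 2: toy admission control. Run now only if below an assumed limit."""
    return active_queries < max_active


def select_eq(rows, index, value):
    """Selection operator: equality selection through the hash index."""
    for pos in index.get(value, []):
        yield rows[pos]


def project(rows, column, distinct=False):
    """Projection operator: keep one attribute; optionally drop duplicates by hashing."""
    seen = set()
    for row in rows:
        value = row[column]
        if distinct:
            if value in seen:
                continue
            seen.add(value)
        yield value


def run_query(flight, active_queries=0, distinct=False):
    """Stages 2 to 5 for: SELECT name FROM Passenger WHERE flight = <flight>."""
    if not admit(active_queries):
        raise RuntimeError("query deferred by admission control")
    # Stage 3: the "plan" is a pipeline of operators, written by hand here
    # instead of being produced by a parser and optimizer.
    plan = project(select_eq(PASSENGER, FLIGHT_INDEX, flight), "name", distinct)
    # Stages 4 and 5: pulling from the plan fetches rows through the index
    # and collects the result tuples into a buffer for the client.
    return list(plan)


print(run_query("510275"))
print(run_query("510275", distinct=True))
```

The operators are generators, so each one pulls rows from the operator beneath it only when asked. The distinct flag separates SQL's default behaviour (repeated names are kept) from the duplicate-eliminating projection listed under Relational Operators (1).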