* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Storage Manager
Microsoft SQL Server wikipedia , lookup
Relational algebra wikipedia , lookup
Concurrency control wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Functional Database Model wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Clusterpoint wikipedia , lookup
CSC 370 – Database Systems Introduction What’s a database? • In essence a database is nothing more than a collection of information that exists over a long period of time. • Databases are empowered by a body of knowledge and technology embodied in specialized software called a database management system, or DBMS. • A DBMS is a powerful tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely. • Among the most complex types of software available. The database [management] system 1. Allows users to create new databases and specify their schema (logical structure of the data), using a data-definition language. 2. Enables users to query and modify the data, using a query language and datamanipulation language. 3. Supports intelligent storage of very large amounts of data. – Protects data from accident or not proper use. Example: We can require from the DBMS to not allow the insertion of two different employees with the same SIN. – Allows efficient access to the data for queries and modifications. Example: Indexes over a specified fields 4. Controls access to data from many users at once (concurrency), without allowing “bad” interactions that can corrupt the data accidentally. 5. Recovers from software failures and crashes. Early database systems and file syst. • First commercial database systems evolved from file systems. • File systems allow the storage of big amounts of data (not safely though). • But file systems do not provide a query language for the data in files. People had to write programs in order to extract even the most elementary information from a set of files. Example: Suppose we have stored in a file called Employees records having the fields (emp_code, name, dept_code) and in another file called Departments records having the fields: (dept_code, dept_name) Suppose now that given an employee, for instance with name “Smith”, we want to find out what department is he working for. Continued In the absence of a query language we have to write a program which will: 1. open the file Employees 2. declare a variable of the same type as the records stored in the file 3. scan the file: while the end of the file is not yet encountered, assign the current record to above variable. 4. If the value of the name field is “Smith” then get the value of the dept_code field. Suppose it is “100” 5. Search in a similar way for a record with “100” for the dept_code in the Department file. 6. Print the dept_name when successfully finding the dept_code. Very painful procedure even for the simplest queries. Compare it to the short and elegant SQL query SELECT dept_name FROM Employees, Department WHERE Employees.name="Smith" AND Employees.dept_code = Department.dept_code First important applications of DBMS’s • The ones where the data was composed of – – many small items, and many queries or modifications were made. Examples • • • Airline reservation systems Banking systems Corporate records Airline Reservation Systems • Here the items of data include: – – – • Typical queries ask for: – • • Flights leaving about a certain time from one given city to another, what seats are available, and at what prices. Typical data modifications include: – • Reservations by a single customer on a single flight, including such information as assigned seat… Flights information – the airport they fly from and to, their departure and arrival times… Ticket information – prices, requirements, and availability. Making a reservation in a flight for a customer, assign a seat etc. Many agents will be accessing parts of the data at any given time. – The DBMS must allow concurrent accesses preventing problems such as two agents assigning the same seat simultaneously. Also, the DBMS should protect against loss of records if the system suddenly fails. Banking Systems • Data items include: – – – – Customers, their names, addresses etc. Accounts, and their balances Loans, and their balances Connections between customers and their accounts and loans. • Typical queries are those for account and loan balances. • Typical modifications are those representing a payment from or deposit to an account. • In banking systems failures cannot be tolerated. • – E.g, once the money has been ejected from an ATM machine, the bank must record the debit, even if the power immediately fails. – On the other hand, it is not permissible for the bank to record the debit and then not to deliver the money because the power fails. The proper way to handle this operation is far from obvious and is one of the significant achievements in DBMS architecture. Early DBMS’s (1960’s) • They encouraged the user to view the data much as it was stored. • The chief models were the Hierarchical and Network. • The main characteristic of these models was the possibility of easy jumping or navigating from one object to another through pointers. – • However these models didn’t provide a high-level query language for the data. – • E.g. From one employee to his department. So, one had still to write programs for querying the data. Also they didn’t allow on-line schema modifications. Relational databases Codd (1970) • A database system should present the user with a view of data organized as tables (also called relations). • Behind the scene there could be a complex data structure that allows rapid response to a variety of queries. – • But the user would not be concerned with the storage structure. Queries could be expressed in a very high-level language, which greatly increases the efficiency of database programmers. – This high-level query language for relational databases is called: Structured Query Language (SQL) Example of a Relational DB • • • Relations = Tables. Columns are “headed” by attribute names. Rows = Tuples A relation Accounts might be: accountNo 12345 67890 … • type savings checking … SQL Examples 1. What’s the balance of account “67890” ? 2. Which are the savings accounts with negative balances? 1 SELECT balance FROM Accounts WHERE accountNo = 67890; 1. 2. 3. balance 1000.00 2846.92 … 2 SELECT accountNo FROM Accounts WHERE type = ‘savings’ AND balance < 0; Examine all tuples of the relation Accounts in FROM-clause. Pick out those tuples that satisfy some criterion in the WHERE-clause, Output the attributes of those tuples which are asked for by SELECT-clause. Architecture of a DBMS • The “cylindrical” component contains not only data, but also metadata, i.e. info about the structure of data. • If DBMS is relational, metadata includes: – names of relations, – names of attributes of those relations, and – data types for those attributes (e.g., integer or character string). • A database also maintains indexes for the data. – Indexes are part of the stored data. – Description of which attributes have indexes is part of the metadata. Few words about indexes • Similar to book indexes. – – • • A book index associates words with page numbers where they appear. A database index associates values of some object field(s) with the physical address of the corresponding objects in the disk. Main property of an index: Its size is much smaller than the record set being indexed. Hence, searching in an index is much faster than searching in the corresponding record set. • • Storage Manager The job of the Storage Manager is to – obtain data from the data storage, and – modify the data to the data storage when requested. Storage Manager has two components: – File Manager handles files. – Buffer Manager handles main memory. • Keeps track of the location of files Obtains block(s) of a file on request from the buffer manager. Obtains and returns blocks of data from/to the file manager Stores blocks temporarily in main memory pages. 1 block = 1 page = 4,000 to 16,000 bytes. – Smallest unit of data that is read/written from/to disk. Query Processor • Query Processor handles: queries and modifications to the data. – Finds the best way to carry out a requested operation and – Issues commands to the storage manager which will carry them out. • E.g. A bank has a DB with two relat.: Customers (name, SIN, address), Accounts (accountNo, balance, SIN) Query: “Find the balances of all accounts of which Sally is the owner.” SELECT Accounts.balance FROM Customers, Accounts WHERE Customers.SIN = Accounts.SIN AND Customers.name = 'Sally'; Query Processor (Cont.) • What this query logically says is: 1. 2. 3. • If answer this query as it says the performance would be terrible. – • • Make Cartesian product R of tables specified in the FROM-clause, Choose from R the tuples satisfying the condition in the WHERE clause. Produce as answer only the values of attributes in SELECT-clause. Because of the usually enormous Cartesian product. Suppose we have – Index on name of Customer and – Index on SIN of Accounts. Then, query processor will cleverly create a plan which inexpensively: – – Retrieves the tuple for “Sally” and gets the SIN number. Retrieves the account tuples for this SIN number. Transaction Manager • Transaction Manager is responsible for the integrity of the system. It must assure that: – several queries running simultaneously do not interfere with each other and that, – the system will not lose data even if there is a power failure. • Transaction Manager interacts with: • Query Manager, – Because it may need to delay certain query operations to avoid conflicts. • Storage Manager – Because schemes for protecting data involve storing a log of changes to the data. Database Studies • Design of databases. – What kinds of information go into the database? – How is the information structured? – How do data items connect? • Database programming. – How does one express queries on the database? – How does one use other capabilities of a DBMS, such as transactions or constraints, in an application? – How is database programming combined with conventional programming? • Database system implementation. – How does one build a DBMS, including such matters as query processing, transaction processing and organizing storage for efficient access?