Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Serializability wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Functional Database Model wikipedia , lookup
Clusterpoint wikipedia , lookup
Introduction to Relational Databases S. Sudarshan Comp. Sci. and Engg. Dept. I.I.T., Bombay [email protected] http://www.cse.iitb.ernet.in/~sudarsha 1 OVERVIEW Files and Index Structures Relational Databases and SQL Relational Database Features System Architectures CPU Memory Computer System Disk 2 Storing Data on Disks Byte: unit of information one character File: sequence of bytes File system: stores multiple files organizes files into directories/folders /users/sudarsha/Talks/rbi-dbms.ppt /users/sudarsha/Talks/rbi-wh.ppt 3 File Structure Typical database information large number of small pieces of data -records records stored within large files Sequential files sorted on a key (e.g., account number) Index needed for efficient access e.g. find information of account 2345 similar to library card catalogs 4 Relational Model: Tabular Data Account Number 23456 23457 23458 Account Name Address H. Mehta Bill Gates S.Sudarshan Date 1/4/99 1/4/99 Transactions 4/4/99 4/5/99 4/5/99 Balance Madhuli, Mumbai 40,000,000 Microsoft Corp 9999,999,999 C-148, IIT Powai 1,000 Acct Num 23458 23456 23456 23456 23458 Debit Credit Balance 2000 3000 2000 3000 9000 12000 200000 212000 4000 7000 5 Tabular Data - more examples Emp-num Name Address Vajpayee Bill Gates S.Sudarshan 1 2 3 Employee Date 1/4/99 1/4/99 4/4/99 4/5/99 4/5/99 Salary PMO, Delhi Microsoft Corp C-148, IIT Powai 10,000 9999,999 10,000 Overdue-Loan Loan-Num State 123 456 998 765 666 TN MH MH GJ AP Balance Category Repaid 3000 3000 12000 212000 7000 Farm Smallbiz Coop Bigbiz Govtbiz 0 100 50 25 10 6 Querying/Updating the Database using SQL SQL Query: select name, balance from account where acct-num = 23456 Result is: H. Mehta 40,000,000 insert into transactions values (3/5/99, 23456, 5000, -, 14000) 7 More Queries select acct-num, avg(balance) from transactions where date between 1/4/99 and 30/4/99 groupby acct-num 23456 23457 23458 41,000,000 99,000,000,000 1,000 8 More Queries: Joins select name, balance from account, transactions where acct.acct-no = transactions.acct-no H. Mehta 41,000,000 Bill Gates 99,000,000,000 Sudarshan 1,000 9 Terminology Relation = Table e.g. Account Tuple = Record = Row of table e.g. record for Sudarshan’s account Attribute = Column of table e.g. balance and account-number attributes of Account Primary key: uniquely identifies row e.g. account number 10 Schema Design Schema = set of tables and their columns Bad schema design may make it impossible to store some records result in duplication of data e.g. address of customer stored with each deposit account Good schema design via Entity-relationship model, and via normalization 11 Relational Database Features 12 Relational Databases Provide: Tabular Data model: simple, yet powerful A Standard easy-to-use query Language: SQL Mature Products with Reliable, Fault-Tolerant Operations available Good Performance High number of transactions per second Parallel operation for scalability (handle growth) Distributed and Replicated Data Bases Interoperation, High availability 13 In Contrast: Traditional File Processing COBOL: Common Business Oriented language Files contain sequence of records, e.g. Record per account Complex program for each task e.g. withdrawal, deposit, average balance, … File structures often very complex motivated by efficiency, but become hard to understand 14 SQL SQL is Intergalactic Dataspeak Data Definition Language (DDL) Define table column types and names Integrity Constraint Data Manipulation Language query the data update the data Security & Access Control Views, Permissions Data Base Technologies 15 SQL Standards Open Data Base Connectivity (ODBC): Standard for client server interconnectivity, using C language JDBC: Like ODBC, for Java language SQL Standards: 86, 89, 92, SQL-3 Draft 16 Transactions Transactions: e.g. Debit/credit Problems: Failures (e.g., power, disk storage) Concurrent transactions Solution: Support for ACID properties 17 ACID Properties Atomicity: Transaction appears to either run completely or not at all -- no partial state Consistency: Integrity checks (e.g., balance >= 0) Isolation: Locks on data so that transactions do not step on each others toes Durability: Data/updates are never lost 18 Integrity Checks Data value checks balance >= 0 Foreign key constraints If account number appears in Transactions relation, it must also appear in Accounts relation Transactions.acct-no is a foreign key referencing Accounts.acct-no 19 Authorization Mechanisms Privileges e.g., read table, update table, insert row in table, delete row from table, privilege to grant privileges Each user given specific set of privileges he/she needs Roles privileges given to roles (e.g., teller, manager) users authorized to play roles 20 User Interfaces to Databases User Interfaces Forms & Menus Reports Graphical user interfaces Lots of Tools Available Native to Product (eg Oracle, Microsoft) also Independent Vendors (Powerbuilder) no standards 21 Database Systems 22 Storage Devices Main memory volatile, lost on power failure expensive and relatively small Hard disk non-volatile, reasonably fast access relatively cheap, and large main storage system for databases Mean time to Failure: ~5 years 23 RAID Systems Goal: improve storage reliability Data stored on multiple disks if one disk fails, data still available on others Essential for safety of data Hardware RAID expensive, very high availability for 24x7 applications (24 hrs X 7 days/wk) Software RAID cheaper, use if some downtime is allowable 24 DATABASE SERVERS Major players Oracle, IBM DB2, Microsoft SQL Server, Informix, Sybase, Ingress Wide range of performance, features, and price 25 Database Application Classes OLTP: Online Transaction Processing supports many small transactions Decision Support Summaries/aggregates OLAP: Online Analytical Processing 26 Database System Architectures 27 Database Architectures Centralized Dumb terminals connected to single server Client Server Smarter client machines connect to server Main work still done at server Parallel Servers Work divided between multiple CPUs Distributed Multiple independent databases in cooperation 28 CLIENT - SERVER TERMINOLOGY Service: Provided by the Server Each Client Is a Consumer Shared Resources : Managed by Server Client : Initiator of a Request 29 Client vs Server Functionality Server Functions Wait for Requests and process them when they come Handle Concurrent Transactions Authentication, Authorization Audit trails Client Functions Provide User Interface Support Graphics, Multimedia 30 TWO TIER PARTITIONING Distributed Presentation Remote Presentation Distributed Application Logic Remote Data Management Data Management Data Management Data Management Data Management Application Logic Application Logic Application Logic Distributed Data Management Data Management Data Management Presentation Logic Presentation Logic Presentation Logic Application Logic Application Logic Application Logic Presentation Logic Presentation Logic Presentation Logic 31 Three Tier Applications Tier One : Client e.g., web browser Tier Two : Application Server e.g. enhanced web server Tier Three: Database Server 32 Further Reading Silberschatz, Korth and Sudarshan, Database System Concepts, McGraw-Hill, 1997 33 Relational Databases Motivation: simplify storage structures easy to use language for queries/updates efficiency is job of system automatic optimization Legacy systems Systems built using COBOL and older data models Still in wide use, but declining usage 34