Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Relational Algebra and Calculus University of California, Berkeley School of Information Management and Systems SIMS 257: Database Management Spring 1998 SIMS 257: Database Management -- Ray Larson Review • Physical Database Design • RAID • Data Integrity Spring 1998 SIMS 257: Database Management -- Ray Larson Physical Design Decisions • There are several critical decisions that will affect the integrity and performance of the system. – – – – – Spring 1998 Storage Format Physical record composition Data arrangement Indexes Query optimization and performance tuning SIMS 257: Database Management -- Ray Larson When to Index • Tradeoff between time and space: – Indexes permit faster processing for searching – But they take up space for the index – They also slow processing for insertions, deletions, and updates, because both the table and the index must be modified • Thus they SHOULD be used for databases where search is the main mode of interaction • The might be skipped if high rates of updating and insertions are expected Spring 1998 SIMS 257: Database Management -- Ray Larson When to Use Indexes • Rules of thumb – Indexes are most useful on larger tables – Specify a unique index for the primary key of each table – Indexes are most useful for attributes used as search criteria or for joining tables – Indexes are useful if sorting is often done on the attribute – Most useful when there are many different values for an attribute – Some DBMS limit the number of indexes and the size of the index key values – Some indexed will not retrieve NULL values Spring 1998 SIMS 257: Database Management -- Ray Larson RAID Technology One logical disk drive Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 5 2 6 3 7 4 8 9 * * * 10 * * * 11 * * * 12 * * * Stripe Stripe Stripe Parallel Reads Spring 1998 SIMS 257: Database Management -- Ray Larson Raid 0 One logical disk drive Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 5 2 6 3 7 4 8 9 * * * 10 * * * 11 * * * 12 * * * Stripe Stripe Stripe Parallel Reads Spring 1998 SIMS 257: Database Management -- Ray Larson RAID-1 Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 3 1 3 2 4 2 4 5 * * * 5 * * * 6 * * * 6 * * * Stripe Stripe Stripe Parallel Reads Spring 1998 SIMS 257: Database Management -- Ray Larson RAID-2 Writes span all drives Disk 1 Disk 2 Disk 3 Disk 4 1a 1b ecc ecc Stripe 2a 3a * * * 2b 3b * * * ecc ecc ecc ecc * * * * * * Stripe Stripe Reads span all drives Spring 1998 SIMS 257: Database Management -- Ray Larson RAID-3 Writes span all drives Disk 1 Disk 2 Disk 3 Disk 4 1a 1b 1c ecc Stripe 2a 3a * * * 2b 3b * * * 2c 3c * * * ecc ecc * * * Stripe Stripe Reads span all drives Spring 1998 SIMS 257: Database Management -- Ray Larson Raid-4 Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 2 3 ecc Stripe 4 7 * * * 5 8 * * * 6 9 * * * ecc ecc * * * Stripe Stripe Parallel Reads Spring 1998 SIMS 257: Database Management -- Ray Larson RAID-5 Parallel Writes Disk 1 Disk 2 Disk 3 Disk 4 1 5 2 6 3 7 4 8 9 * * ecc 10 * * ecc 11 * * ecc 12 * * ecc Stripe Stripe Stripe Parallel Reads Spring 1998 SIMS 257: Database Management -- Ray Larson Integrity Constraints • The constraints we wish to impose in order to protect the database from becoming inconsistent. • Five types – – – – – Spring 1998 Required data attribute domain constraints entity integrity referential integrity enterprise constraints SIMS 257: Database Management -- Ray Larson Today • • • • Relational Operations Relational Algebra Relational Calculus Introduction to SQL via Access Spring 1998 SIMS 257: Database Management -- Ray Larson Relational Algebra Operations • • • • • • • • Select Project Product Union Intersect Difference Join Divide Spring 1998 SIMS 257: Database Management -- Ray Larson Select • Extracts specified tuples (rows) from a specified relation (table). Spring 1998 SIMS 257: Database Management -- Ray Larson Project • Extracts specified attributes(columns) from a specified relation. Spring 1998 SIMS 257: Database Management -- Ray Larson Product • Builds a relation from two specified relations consisting of all possible concatenated pairs of tuples, one from each of the two relations. Product a b c Spring 1998 x y a a b b c c x y x y x y SIMS 257: Database Management -- Ray Larson Union • Builds a relation consisting of all tuples appearing in either or both of two specified relations. Spring 1998 SIMS 257: Database Management -- Ray Larson Intersect • Builds a relation consisting of all tuples appearing in both of two specified relations Spring 1998 SIMS 257: Database Management -- Ray Larson Difference • Builds a relation consisting of all tuples appearing in first relation but not the second. Spring 1998 SIMS 257: Database Management -- Ray Larson Join • Builds a relation from two specified relations consisting of all possible concatenated pairs of, one from each of the two relations, such that in each pair the two tuples satisfy some condition. A1 B1 A2 B1 A3 B2 Spring 1998 B1 C1 B2 C2 B3 C3 (Natural or Inner) Join A1 B1 C1 A2 B1 C1 A3 B2 C2 SIMS 257: Database Management -- Ray Larson Outer Join • Outer Joins are similar to PRODUCT -- but will leave NULLs for any row in the first table with no corresponding rows in the second. Outer Join A1 A2 A3 A4 Spring 1998 B1 B1 B2 B7 B1 C1 B2 C2 B3 C3 SIMS 257: Database Management -- Ray Larson A1 B1 C1 A2 B1 C1 A3 B2 C2 A4 * * Divide • Takes two relations, one binary and one unary, and builds a relation consisting of all values of one attribute of the binary relation that match (in the other attribute) all values in the unary relation. Divide a a a b c Spring 1998 x y z x y a x y SIMS 257: Database Management -- Ray Larson ER Diagram: Acme Widget Co. Emp# Wage ISA Hourly Sales Cust# Customer Employee Sales-Rep Writes Orders Invoice Part# Invoice# Quantity Contains Line-Item Contains Invoice# Rep# Part# Count Price Cust# Spring 1998 Part SIMS 257: Database Management -- Ray Larson Employee Firstname Janet Thomas Charles Roberto Spring 1998 Middlename Birthdate Mary 6/25/63 Frederick 8/4/70 Robert 9/23/61 Garcia 7/8/58 Address City Zip 234 State Berkeley 92654 12 Lambert Oakland 93587 44 Central Berkeley 94720 76 Highland Berkeley 94654 SIMS 257: Database Management -- Ray Larson Part Part # 1 2 3 4 5 6 7 8 9 Spring 1998 Name Price Count Big blue widget 3.76 2 Small blue Widget 7.35 4 Tiny red widget 5.25 7 large red widget 157.23 23 double widget rack 10.44 12 Small green Widget 30.45 58 Big yellow widget 7.96 1 Tiny orange widget 81.75 42 Big purple widget 55.99 9 SIMS 257: Database Management -- Ray Larson Sales-Rep SSN Rep # Sales 123-76-3423 1 $12,345.45 843-36-7659 2 $231,456.75 Hourly SSN Wage 342-88-7865 $12.75 486-87-6543 $20.50 Spring 1998 SIMS 257: Database Management -- Ray Larson Customer Cust # COMPANY STREET1 Integrated Standards 1 Ltd. 35 Broadway STREET2 STATE ZIPCODE New York NY 02111 34 Bureaucracy Plaza Floors 1-172 3 Control Elevation Cyber Assicates Place Center Phildelphia PA 03756 Cyberoid NY 08645 35 Libra Plaza Nashua NH 09242 1 Broadway Middletown IN 32467 88 Oligopoly Place 3 Independence Parkway Sagrado TX 78798 Rivendell CA 93456 8 Little Mighty Micro 34 Last One Drive Orinda CA 94563 9 SportLine Ltd. 38 Champion Place Compton CA 95328 2 MegaInt Inc. 3 Cyber Associates General 4 Consolidated Consolidated 5 MultiCorp Internet Behometh 6 Ltd. Consolidated 7 Brands, Inc. Spring 1998 Floor 12 CITY Suite 882 SIMS 257: Database Management -- Ray Larson Invoice Invoice # Cust # Rep # 93774 3 84747 4 88367 5 88647 9 776879 2 65689 6 Spring 1998 SIMS 257: Database Management -- Ray Larson 1 1 2 1 2 2 Line-Item Invoice # Part # 93774 84747 88367 88647 776879 65689 93774 88367 Spring 1998 Quantity 3 10 23 1 75 2 4 3 22 5 76 12 23 10 34 2 SIMS 257: Database Management -- Ray Larson Join Items Part # Invoice # Part # Quantity 93774 3 10 84747 23 1 88367 75 2 88647 4 3 776879 22 5 65689 76 12 93774 23 10 88367 34 2 1 2 3 4 5 6 7 8 9 Cust # COMPANY STREET1 Integrated Standards 1 Ltd. 35 Broadway Spring 1998 Rep # 3 4 5 9 2 6 1 1 2 1 2 2 STREET2 STATE ZIPCODE NY 02111 34 Bureaucracy Plaza Floors 1-172 3 Control Elevation Cyber Assicates Place Center Phildelphia PA 03756 Cyberoid NY 08645 35 Libra Plaza Nashua NH 09242 1 Broadway Middletown IN 32467 88 Oligopoly Place 3 Independence Parkway Sagrado TX 78798 Rivendell CA 93456 8 Little Mighty Micro 34 Last One Drive Orinda CA 94563 9 SportLine Ltd. 38 Champion Place Compton CA 95328 3 Cyber Associates General 4 Consolidated Consolidated 5 MultiCorp Internet Behometh 6 Ltd. Consolidated 7 Brands, Inc. SIMS 257: Database Management -- Ray Larson Floor 12 CITY New York 2 MegaInt Inc. Invoice # Cust # 93774 84747 88367 88647 776879 65689 Name Price Count Big blue widget 3.76 2 Small blue Widget 7.35 4 Tiny red widget 5.25 7 large red widget 157.23 23 double widget rack 10.44 12 Small green Widget 30.45 58 Big yellow widget 7.96 1 Tiny orange widget 81.75 42 Big purple widget 55.99 9 Suite 882 Relational Algebra • What is the name of the customer who ordered Large Red Widgets? – – – – – Spring 1998 Select “large Red Widgets” from Part as temp1 Join temp1 with Line-item on Part # as temp2 Join temp2 with Invoice on Invoice # as temp3 Join temp3 with customer on cust # as temp4 Project Name from temp4 SIMS 257: Database Management -- Ray Larson Relational Calculus • Relational Algebra provides a set of explicit operations (select, project, join, etc) that can be used to build some desired relation from the database. • Relational Calculus provides a notation for formulating the definition of that desired relation in terms of the relations in the database without explicitly stating the operations to be performed • SQL is based on the relational calculus. Spring 1998 SIMS 257: Database Management -- Ray Larson SQL • Structured Query Language • Database Definition and Querying • Basic language is standardized across relational DBMSs. Each system may have proprietary extensions to standard. • Relational Calculus combines Select, Project and Join operations in a single command. SELECT. Spring 1998 SIMS 257: Database Management -- Ray Larson SELECT • Syntax: – SELECT attr1, attr2,…, attr3 FROM rel1 r1, rel2 r2,… rel3 r3 WHERE condition1 {AND | OR} condition2 ORDER BY attr1, attr3 • Examples in Access... Spring 1998 SIMS 257: Database Management -- Ray Larson