INTRODUCTION TO DATABASES
CS 260 Database Systems

Course Outline
• SQL queries
    • DML Select (SELECT)
    • DDL (CREATE, ALTER, DROP)
    • DML Action (INSERT, UPDATE, DELETE)
• User authentication
• Data modeling and normalization
• Transaction processing
• Stored DBMS programs
• Scaling and performance
• Distributed databases

Why Study Database Systems?
• Important: the back end of many important systems
    • Web-based e-commerce (e.g. amazon.com, imdb.com)
    • Scientific sites (e.g. protein sequences)
    • Real-time systems (e.g. air-traffic control)
    • Business systems (e.g. banking)
• Good job experience
    • Many jobs require database systems experience
    • Oracle DBAs hired at >$80K, often make >$100K
• A lot of data out there
    • Facebook: over 100 PB
    • Amazon: over 90 PB
    • Google: processes over 20 PB per day
    • AT&T: 323 TB (2010)
    • World Data Center for Climate: 6+ PB (2010)

Why Study Database Systems? As it relates to UWEC courses
• CS268: use SQL and JDBC
• CS355 & CS485 (SE I and II): project(s) might use databases
• Any CS course might use a database!

Overview
• Data management
• Database system architectures
• Database vocabulary & history

Data Management
"Real" computer applications tend to have:
• Large volumes of data
• Different types of data
• Data with complex relationships

"Old" Approach to Data Management
• File-based
• Decentralized
• Each application has its own data files
[Diagram: Application 1 uses Data File 1; Application 2 uses Data File 2]

Problems with the File-Based Approach?
• Duplicate (redundant) data
    • Becomes inconsistent over time
• Requires custom programs to create, update, and retrieve data from files
• Transaction atomicity
• Concurrent access issues
• Security

DBMS Approach to Data Management
• Data is viewed as a shared organizational resource
• Centralized
• All applications interact with a single centralized database
[Diagram: Application 1 and Application 2 both connect to one shared database]

What is a Database?
Components:
• Data files
• Database Management System (DBMS)
[Diagram: Application 1 and Application 2 connect to the database, which consists of the DBMS and the data files]

What is the DBMS?
Set of programs that perform…
• Basic data handling tasks
    • Insert, Update, Delete, Retrieve
    • Creating atomic transactions
    • Enabling/disabling concurrency
• Database administration tasks
    • Creating users, creating objects, enforcing security, creating backups, …
Applications interact only with the DBMS
• They never interact directly with the data files
Major players today: Oracle, SQL Server, DB2, MySQL

Database System Architectures
• Single-tier
• 2-tier
• N-tier

Single-Tier Databases: Personal Computer-Based
• Applications and database run on the same local computer
• Example: Access, FileMaker
• What are the limitations?
[Diagram: a host computer (microcomputer) holding both the database and the database applications]

Single-Tier Databases: Mainframe-Based (Host)
• Applications and database run on the same host computer
• Mainly used in legacy systems
[Diagram: a host computer holding the database and the database applications, reached by terminals over a network]

2-Tier (Client/Server)
• DBMS runs on a server
• Client applications run on the clients
• Example: Oracle (server) + SQL Developer (client)
[Diagram: client workstations running database applications, connected over a network to a database server holding the database]

N-Tier Database Systems
• An N-tier system places each service type on a different host
[Diagram: client workstations (browser) provide user services; middle-tier server(s) (Web server) provide business services; the database server holds the database]

Sample Database (CANDY)

CANDY_CUSTOMER
CUST_ID  CUST_NAME         CUST_TYPE  CUST_ADDR        CUST_ZIP  CUST_PHONE  USERNAME   PASSWORD
1        Jones, Joe        P          1234 Main St.    91212     434-1231    jonesj     1234
2        Armstrong, Inc.   R          231 Globe Blvd.  91212     434-7664    armstrong  3333
3        Swedish Burgers   R          1889 20th N.E.   91213     434-9090    swedburg   2353
4        Pickled Pickles   R          194 CityView     91289     324-8909    pickpick   5333
5        The Candy Kid     W          2121 Main St.    91212     563-4545    kidcandy   2351
6        Waterman, Al      P          23 Yankee Blvd.  91234                 wateral    8900
7        Bobby Bon Bons    R          12 Nichi Cres.   91212     434-9045    bobbybon   3011
8        Crowsh, Elias     P          7 77th Ave.      91211     434-0007    crowel     1033
9        Montag, Susie     P          981 Montview     91213     456-2091    montags    9633
10       Columberg Sweets  W          239 East Falls   91209     874-9092    columswe   8399

CANDY_CUST_TYPE
CUST_TYPE_ID  CUST_TYPE_DESC
P             Private
R             Retail
W             Wholesale

CANDY_PRODUCT
PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $ 7.45     $ 10.00
2        Unbrittle Peanut Paradise    $ 5.75     $ 9.00
3        Mystery Melange              $ 7.75     $ 10.50
4        Millionaire’s Macadamia Mix  $ 12.50    $ 16.00
5        Nuts Not Nachos              $ 6.25     $ 9.50

CANDY_PURCHASE
PURCH_ID  PROD_ID  CUST_ID  PURCH_DATE  DELIVERY_DATE  POUNDS  STATUS
1         1        5        28-Oct-04   28-Oct-04      3.5     PAID
2         2        6        28-Oct-04   30-Oct-04      15      PAID
3         1        9        28-Oct-04   28-Oct-04      2       PAID
3         3        9        28-Oct-04   28-Oct-04      3.7     PAID
4         3        2        28-Oct-04                  3.7     PAID
5         1        7        29-Oct-04   29-Oct-04      3.7     NOT PAID
5         2        7        29-Oct-04   29-Oct-04      1.2     NOT PAID
5         3        7        29-Oct-04   29-Oct-04      4.4     NOT PAID
6         2        7        29-Oct-04                  3       PAID
7         2        10       29-Oct-04                  14      NOT PAID
7         5        10       29-Oct-04                  4.8     NOT PAID
8         1        4        29-Oct-04                  1       PAID
8         5        4        29-Oct-04                  7.6     PAID
9         5        4        29-Oct-04   29-Oct-04      3.5     NOT PAID
(In the source, POUNDS and STATUS for purchases 4, 6, 8, and 9 are detached from their rows and are matched here in order of appearance; blank DELIVERY_DATE cells have no recoverable value.)

Basic Database Vocabulary

PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $ 7.45     $ 10.00
2        Unbrittle Peanut Paradise    $ 5.75     $ 9.00
3        Mystery Melange              $ 7.75     $ 10.50
4        Millionaire’s Macadamia Mix  $ 12.50    $ 16.00
5        Nuts Not Nachos              $ 6.25     $ 9.50

• Field: column of similar data values
• Record*: row of related fields
• Table*: set of related rows
* Mathematical terms for record and table: tuple and relation

Database History
• All pre-1960s systems used file-based data
• First database: Apollo project
    • Goal: no duplicate data in multiple locations
    • Used a hierarchical structure
    • Created relationships using pointers
    • Pointer: hardware address

Example Hierarchical Database

UNIVERSITY_STUDENT
StudentID  Student LastName  Student FirstName  Student MI  Pointers to Course Data*
5000       Nelson            Amber              S
5001       Hernandez         Joseph             P
5002       Myers             Stephen            R

UNIVERSITY_COURSE
CourseID  Course Name  Course Title
100       MIS 290      Intro. to Database Applications
101       MIS 304      Fundamentals of Business Programming
102       MIS 310      Systems Analysis & Design

* Hex number referencing the data’s physical location on the hard drive

Problems with Hierarchical Databases
• Relationships are all one-way; to go the other way, you must create a new set of pointers
• Pointers are hardware-specific
    • VERY hard to move to new hardware
• Applications must be custom-written
    • Usually in COBOL

Relational Databases
• Circa 1972
• E. F. Codd
• "Normalizing" relations
• Goal: no redundant data
• Stores data in a tabular format
• Creates relationships by sharing key fields

Key Fields
• Primary key: field that uniquely identifies a record
• Often abbreviated "PK"

UNIVERSITY_INSTRUCTOR (primary key: InstructorID)
InstructorID  Instructor LastName  Instructor FirstName
1             Black                Greg
2             McIntyre             Karen
3             Sarin                Naj

Class Discussion
• What is the primary key of each table in the CANDY database?
• How can you tell if a field is a primary key?

Special Types of Primary Keys
• Composite PK: made by combining 2 or more fields to create a unique identifier
    • Consider the CANDY_PURCHASE table, where PURCH_ID repeats across rows and no single field identifies a record
• Surrogate PK: ID generated by the DBMS solely as a unique identifier (not done in the CANDY_PURCHASE example, but likely in CANDY_CUSTOMER and CANDY_PRODUCT)

Key Fields (continued)
• Foreign key
    • Field that is a primary key in another table
    • Serves to create a relationship

UNIVERSITY_INSTRUCTOR (primary key: InstructorID)
InstructorID  Instructor LastName  Instructor FirstName
1             Black                Greg
2             McIntyre             Karen
3             Sarin                Naj

UNIVERSITY_STUDENT (foreign key: AdvisorID)
StudentID  Student LastName  Student FirstName  StudentMI  AdvisorID
5000       Nelson            Amber              S          1
5001       Hernandez         Joseph             P          1
5002       Myers             Stephen            R          3

Alternative to Foreign Keys
• Repeat data values for every record
• Problems
    • Takes extra space
    • Redundant data becomes inconsistent over time (note "Greg" vs. "Gregory" below)

UNIVERSITY_STUDENT
StudentID  Student LastName  Student FirstName  StudentMI  AdvisorLast Name  AdvisorFirst Name
5000       Nelson            Amber              S          Black             Greg
5001       Hernandez         Joseph             P          Black             Gregory
5002       Myers             Stephen            R          Sarin             Naj

Class Discussion
• What are the foreign keys in the CANDY database?
• Does a table HAVE to have foreign keys?
• How can you tell if a field is a foreign key?

Rules for Relational Database Tables
• Every record must have a non-NULL (or "nonempty") and unique PK value
• Every FK value must be defined as a PK in its parent table
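
The two rules above can be tried out in a small sketch. It is an illustration under stated assumptions, not the course's setup: it uses SQLite through Python's standard sqlite3 module rather than Oracle, it keeps only a few columns of CANDY_PRODUCT and CANDY_PURCHASE, and it assumes the composite PK of CANDY_PURCHASE is (PURCH_ID, PROD_ID), which is inferred from the sample rows and not stated on the slides. The helper try_insert is a hypothetical name introduced here.

```python
import sqlite3

# In-memory database; SQLite stands in for a server DBMS (assumption).
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE candy_product (
    prod_id   INTEGER PRIMARY KEY,
    prod_desc TEXT NOT NULL
);
CREATE TABLE candy_purchase (
    purch_id INTEGER NOT NULL,
    prod_id  INTEGER NOT NULL REFERENCES candy_product (prod_id),
    pounds   REAL,
    -- Composite PK (assumed from the sample data): one row per
    -- product within a purchase.
    PRIMARY KEY (purch_id, prod_id)
);
""")


def try_insert(sql, params):
    """Run one INSERT; return 'ok', or 'rejected' on a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


PRODUCT = "INSERT INTO candy_product (prod_id, prod_desc) VALUES (?, ?)"
PURCHASE = "INSERT INTO candy_purchase (purch_id, prod_id, pounds) VALUES (?, ?, ?)"

results = [
    try_insert(PRODUCT, (1, "Celestial Cashew Crunch")),  # new PK value
    try_insert(PRODUCT, (1, "Duplicate product")),        # PK not unique
    try_insert(PURCHASE, (1, 1, 3.5)),                    # valid PK and FK
    try_insert(PURCHASE, (3, 1, 2.0)),                    # same product, new purchase
    try_insert(PURCHASE, (1, 1, 9.9)),                    # duplicate composite PK
    try_insert(PURCHASE, (None, 1, 1.0)),                 # NULL in a PK field
    try_insert(PURCHASE, (2, 99, 1.0)),                   # FK with no parent row
]

for outcome in results:
    print(outcome)
```

Each rejected insert corresponds to one of the rules: a repeated or NULL key value breaks the PK rule, and product 99 breaks the FK rule because no such row exists in the parent table. One SQLite-specific detail: a single-column INTEGER PRIMARY KEY accepts NULL and substitutes a generated ID, so the NULL case is shown on the composite key, whose fields are declared NOT NULL.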