INTRODUCTION TO DATABASES
CS 260
Database Systems

Course Outline
• SQL queries
  - DML Select (SELECT)
  - DDL (CREATE, ALTER, DROP)
  - DML Action (INSERT, UPDATE, DELETE)
• User authentication
• Data modeling and normalization
• Transaction processing
• Stored DBMS programs
• Scaling and performance
• Distributed databases

Why Study Database Systems?
• Important
  - The back end of many important systems
    Web-based e-commerce (e.g. amazon.com, imdb.com)
    Scientific sites (e.g. protein sequences)
    Real-time systems (e.g. air-traffic control)
    Business systems (e.g. banking)
• Good job experience
  - Many jobs require database systems experience
  - Oracle DBAs hired at >$80K, often make >$100K
• A lot of data out there
  - Facebook: over 100 PB
  - Amazon: over 90 PB
  - Google: processes over 20 PB per day
  - AT&T (323 TB, 2010)
  - World Data Center for Climate (6+ PB, 2010)

Why Study Database Systems?
• As it relates to UWEC courses
  - CS268
    Use SQL and JDBC
  - CS355 & CS485 (SE I and II)
    Project(s) might use databases
  - Any CS course might use a database!

Overview
• Data management
• Database system architectures
• Database vocabulary & history

Data Management
• "Real" computer applications tend to have:
  - Large volumes of data
  - Different types of data
  - Data with complex relationships

“Old” Approach to Data Management
• File-based
• Decentralized
• Each application has its own data files
[Diagram: Application 1 uses Data File 1; Application 2 uses Data File 2]

Problems with the file-based approach?
• Duplicate data
• Redundant data
  - Becomes inconsistent over time
• Requires custom programs to create, update, and retrieve data from files
• Transaction atomicity
• Concurrent access issues
• Security

DBMS Approach to Data Management
• Data is viewed as a shared organizational resource
• Centralized
• All applications interact with a single centralized database
[Diagram: Application 1 and Application 2 both use one shared Database]

What is a Database?
• Components
  - Data files
  - Database Management System (DBMS)
[Diagram: Application 1 and Application 2 connect to the DBMS, which manages the Data Files; the DBMS and Data Files together form the Database]

What is the DBMS?
• Set of programs that perform…
  - Basic data handling tasks
    Insert, Update, Delete, Retrieve
  - Creating atomic transactions
  - Enabling/disabling concurrency
  - Database administration tasks
    Creating users, creating objects, enforcing security, creating backups, …
• Applications interact only with the DBMS
  - Never interact directly with the data files
• Major players today:
  - Oracle, SQL Server, DB2, MySQL
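
The basic data handling tasks and the idea of an atomic transaction can be sketched in a few lines. This is only an illustration: it assumes Python's built-in sqlite3 module standing in for the DBMS (the slides name Oracle, SQL Server, DB2, and MySQL, not SQLite), and the lower-case table and column names are a trimmed, hypothetical version of CANDY_PRODUCT.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candy_product ("
    " prod_id INTEGER PRIMARY KEY,"
    " prod_desc TEXT NOT NULL,"
    " prod_price REAL NOT NULL)"
)

# Insert
conn.execute("INSERT INTO candy_product VALUES (1, 'Celestial Cashew Crunch', 10.00)")
conn.execute("INSERT INTO candy_product VALUES (5, 'Nuts Not Nachos', 9.50)")
# Update (9.75 is a made-up price for illustration)
conn.execute("UPDATE candy_product SET prod_price = 9.75 WHERE prod_id = 5")
# Delete
conn.execute("DELETE FROM candy_product WHERE prod_id = 1")
conn.commit()
# Retrieve
rows = conn.execute(
    "SELECT prod_id, prod_desc, prod_price FROM candy_product"
).fetchall()

# Atomic transaction: either both statements take effect or neither does.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE candy_product SET prod_price = 11.00 WHERE prod_id = 5")
        # Duplicate primary key raises IntegrityError, which also undoes the UPDATE.
        conn.execute("INSERT INTO candy_product VALUES (5, 'Duplicate', 1.00)")
except sqlite3.IntegrityError:
    pass

price_after_failed_txn = conn.execute(
    "SELECT prod_price FROM candy_product WHERE prod_id = 5"
).fetchone()[0]
```

The application never opens a data file itself; every request goes through the DBMS, which is what lets it undo the half-finished transaction.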

Database System Architectures
• Single-tier
• 2-tier
• N-tier

Single-Tier Databases
• Personal computer-based
• Applications and database run on the same local computer
• Examples: Access, FileMaker
• What are the limitations?
[Diagram: a host computer (microcomputer) holding both the database and the database applications]

Single-Tier Databases
• Mainframe-based (host)
• Applications & database run on the same host computer
• Mainly used in legacy systems
[Diagram: terminals connect over a network to a host computer holding both the database and the database applications]

2-Tier (Client/Server)
• DBMS runs on a server
• Client applications run on the clients
  - Example: Oracle (server) + SQL Developer (client)
[Diagram: client workstations, each running database applications, connect over a network to a database server hosting the database]

N-Tier Database Systems
• N-tier system places each service type on a different host
[Diagram: client workstations (browser) provide user services; middle-tier server(s) (web server) provide business services; the database server hosts the database]

Sample Database (CANDY)

CANDY_CUSTOMER
CUST_ID  CUST_NAME         CUST_TYPE  CUST_ADDR        CUST_ZIP  CUST_PHONE  USERNAME   PASSWORD
1        Jones, Joe        P          1234 Main St.    91212     434-1231    jonesj     1234
2        Armstrong,Inc.    R          231 Globe Blvd.  91212     434-7664    armstrong  3333
3        Swedish Burgers   R          1889 20th N.E.   91213     434-9090    swedburg   2353
4        Pickled Pickles   R          194 CityView     91289     324-8909    pickpick   5333
5        The Candy Kid     W          2121 Main St.    91212     563-4545    kidcandy   2351
6        Waterman, Al      P          23 Yankee Blvd.  91234                 wateral    8900
7        Bobby Bon Bons    R          12 Nichi Cres.   91212     434-9045    bobbybon   3011
8        Crowsh, Elias     P          7 77th Ave.      91211     434-0007    crowel     1033
9        Montag, Susie     P          981 Montview     91213     456-2091    montags    9633
10       Columberg Sweets  W          239 East Falls   91209     874-9092    columswe   8399

CANDY_CUST_TYPE
CUST_TYPE_ID  CUST_TYPE_DESC
P             Private
R             Retail
W             Wholesale

CANDY_PRODUCT
PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $7.45      $10.00
2        Unbrittle Peanut Paradise    $5.75      $9.00
3        Mystery Melange              $7.75      $10.50
4        Millionaire’s Macadamia Mix  $12.50     $16.00
5        Nuts Not Nachos              $6.25      $9.50

CANDY_PURCHASE
PURCH_ID  PROD_ID  CUST_ID  PURCH_DATE  DELIVERY_DATE  POUNDS  STATUS
1         1        5        28-Oct-04   28-Oct-04      3.5     PAID
2         2        6        28-Oct-04   30-Oct-04      15      PAID
3         1        9        28-Oct-04   28-Oct-04      2       PAID
3         3        9        28-Oct-04   28-Oct-04      3.7     PAID
4         3        2        28-Oct-04                  3.7     PAID
5         1        7        29-Oct-04   29-Oct-04      3.7     NOT PAID
5         2        7        29-Oct-04   29-Oct-04      1.2     NOT PAID
5         3        7        29-Oct-04   29-Oct-04      4.4     NOT PAID
6         2        7        29-Oct-04                  3       PAID
7         2        10       29-Oct-04                  14      NOT PAID
7         5        10       29-Oct-04                  4.8     NOT PAID
8         1        4        29-Oct-04   29-Oct-04      1       PAID
8         5        4        29-Oct-04                  7.6     PAID
9         5        4        29-Oct-04   29-Oct-04      3.5     NOT PAID

Basic Database Vocabulary

PROD_ID  PROD_DESC                    PROD_COST  PROD_PRICE
1        Celestial Cashew Crunch      $7.45      $10.00
2        Unbrittle Peanut Paradise    $5.75      $9.00
3        Mystery Melange              $7.75      $10.50
4        Millionaire’s Macadamia Mix  $12.50     $16.00
5        Nuts Not Nachos              $6.25      $9.50

• Field: column of similar data values
• Record*: row of related fields
• Table*: set of related rows
* Mathematical terms for record and table: tuple and relation
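
To make the vocabulary concrete, here is a small sketch that loads the product table and pulls out a field, a record, and the whole table. It assumes Python's built-in sqlite3 module; the lower-case column names are simply a rendering of the slide's headers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE candy_product ("
    " prod_id INTEGER, prod_desc TEXT, prod_cost REAL, prod_price REAL)"
)
conn.executemany(
    "INSERT INTO candy_product VALUES (?, ?, ?, ?)",
    [
        (1, "Celestial Cashew Crunch", 7.45, 10.00),
        (2, "Unbrittle Peanut Paradise", 5.75, 9.00),
        (3, "Mystery Melange", 7.75, 10.50),
        (4, "Millionaire's Macadamia Mix", 12.50, 16.00),
        (5, "Nuts Not Nachos", 6.25, 9.50),
    ],
)

cursor = conn.execute("SELECT * FROM candy_product ORDER BY prod_id")
field_names = [col[0] for col in cursor.description]  # the column headings

table = cursor.fetchall()                # table (relation): a set of related rows
record = table[0]                        # record (tuple): one row of related fields
prod_desc_field = [row[1] for row in table]  # field: one column of similar values
```

Each row comes back as a Python tuple, which lines up with the mathematical term for a record.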

Database History
• All pre-1960s systems used file-based data
• First database: Apollo project
  - Goal: No duplicate data in multiple locations
  - Used a hierarchical structure
  - Created relationships using pointers
  - Pointer: hardware address

Example Hierarchical Database

UNIVERSITY_STUDENT
StudentID  StudentLastName  StudentFirstName  StudentMI  Pointers to Course Data*
5000       Nelson           Amber             S
5001       Hernandez        Joseph            P
5002       Myers            Stephen           R

UNIVERSITY_COURSE
CourseID  CourseName  CourseTitle
100       MIS 290     Intro. to Database Applications
101       MIS 304     Fundamentals of Business Programming
102       MIS 310     Systems Analysis & Design

* Hex number referencing data’s physical location on hard drive

Problems with Hierarchical Databases
• Relationships are all one-way; to go the other way, you must create a new set of pointers
• Pointers are hardware-specific
  - VERY hard to move to new hardware
• Applications must be custom-written
  - Usually in COBOL

Relational Databases
• Circa 1972
• E. F. Codd
  - “Normalizing” relations
• Goal: No redundant data
• Stores data in a tabular format
• Creates relationships by sharing key fields

Key Fields
• Primary key: field that uniquely identifies a record
  - Often abbreviated “PK”

UNIVERSITY_INSTRUCTOR
InstructorID (PK)  InstructorLastName  InstructorFirstName
1                  Black               Greg
2                  McIntyre            Karen
3                  Sarin               Naj
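
A quick way to see what "uniquely identifies a record" means is to let a DBMS enforce it. The sketch below assumes Python's built-in sqlite3 module; the lower-case names are a rendering of the slide's headers, and the duplicate row ('Smith', 'Pat') is made up purely to trigger the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university_instructor ("
    " instructor_id INTEGER PRIMARY KEY,"
    " instructor_last_name TEXT,"
    " instructor_first_name TEXT)"
)
conn.executemany(
    "INSERT INTO university_instructor VALUES (?, ?, ?)",
    [(1, "Black", "Greg"), (2, "McIntyre", "Karen"), (3, "Sarin", "Naj")],
)

# Looking up a primary key value can never match more than one record.
match = conn.execute(
    "SELECT instructor_last_name FROM university_instructor WHERE instructor_id = ?",
    (2,),
).fetchall()

# A second record with an existing PK value is refused by the DBMS.
duplicate_rejected = False
try:
    # Hypothetical row, used only to demonstrate the uniqueness check.
    conn.execute("INSERT INTO university_instructor VALUES (2, 'Smith', 'Pat')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
```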

Class Discussion
• What is the primary key of each table in the CANDY database?
• How can you tell if a field is a primary key?

Special Types of Primary Keys
• Composite PK: made by combining 2 or more fields to create a unique identifier
  - Consider the CANDY_PURCHASE table…
• Surrogate PK: ID generated by the DBMS solely as a unique identifier (not done in above example, but likely in CANDY_CUSTOMER and CANDY_PRODUCT)
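
Both kinds of key can be sketched with Python's built-in sqlite3 module. Two assumptions are baked in: the slide does not spell out which fields make up the composite key, so the pair (PURCH_ID, PROD_ID) used here is an inference from the sample rows, and the tables are trimmed to a few columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite PK (assumed to be PURCH_ID + PROD_ID; columns trimmed for brevity).
conn.execute(
    "CREATE TABLE candy_purchase ("
    " purch_id INTEGER NOT NULL,"
    " prod_id INTEGER NOT NULL,"
    " cust_id INTEGER NOT NULL,"
    " pounds REAL,"
    " status TEXT,"
    " PRIMARY KEY (purch_id, prod_id))"
)
# PURCH_ID 3 appears twice, which is fine because PROD_ID differs.
conn.executemany(
    "INSERT INTO candy_purchase VALUES (?, ?, ?, ?, ?)",
    [(3, 1, 9, 2.0, "PAID"), (3, 3, 9, 3.7, "PAID"), (5, 1, 7, 3.7, "NOT PAID")],
)

# Repeating the same (purch_id, prod_id) pair violates the composite key.
composite_duplicate_rejected = False
try:
    conn.execute("INSERT INTO candy_purchase VALUES (3, 1, 9, 1.0, 'PAID')")
except sqlite3.IntegrityError:
    composite_duplicate_rejected = True

# Surrogate PK: the DBMS generates the ID; the application never supplies it.
conn.execute(
    "CREATE TABLE candy_customer ("
    " cust_id INTEGER PRIMARY KEY,"  # SQLite assigns the next integer automatically
    " cust_name TEXT NOT NULL)"
)
first_id = conn.execute(
    "INSERT INTO candy_customer (cust_name) VALUES ('Jones, Joe')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO candy_customer (cust_name) VALUES ('Armstrong,Inc.')"
).lastrowid
```

Other systems generate surrogate keys differently (sequences or identity columns, for example); the SQLite behavior shown is just one instance of the idea.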

Key Fields (continued)
• Foreign key
  - Field that is a primary key in another table
  - Serves to create a relationship

UNIVERSITY_INSTRUCTOR
InstructorID (PK)  InstructorLastName  InstructorFirstName
1                  Black               Greg
2                  McIntyre            Karen
3                  Sarin               Naj

UNIVERSITY_STUDENT
StudentID (PK)  StudentLastName  StudentFirstName  StudentMI  AdvisorID (FK)
5000            Nelson           Amber             S          1
5001            Hernandez        Joseph            P          1
5002            Myers            Stephen           R          3
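
The relationship created by the shared key can be followed with a join. This is a sketch assuming Python's built-in sqlite3 module; the lower-case names are a rendering of the slide's headers, and the REFERENCES clause is how SQLite happens to declare a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when this is on

conn.execute(
    "CREATE TABLE university_instructor ("
    " instructor_id INTEGER PRIMARY KEY,"
    " instructor_last_name TEXT,"
    " instructor_first_name TEXT)"
)
conn.execute(
    "CREATE TABLE university_student ("
    " student_id INTEGER PRIMARY KEY,"
    " student_last_name TEXT,"
    " student_first_name TEXT,"
    " student_mi TEXT,"
    " advisor_id INTEGER REFERENCES university_instructor(instructor_id))"
)
conn.executemany(
    "INSERT INTO university_instructor VALUES (?, ?, ?)",
    [(1, "Black", "Greg"), (2, "McIntyre", "Karen"), (3, "Sarin", "Naj")],
)
conn.executemany(
    "INSERT INTO university_student VALUES (?, ?, ?, ?, ?)",
    [
        (5000, "Nelson", "Amber", "S", 1),
        (5001, "Hernandez", "Joseph", "P", 1),
        (5002, "Myers", "Stephen", "R", 3),
    ],
)

# Follow each student's advisor_id to the matching instructor record.
advisors = conn.execute(
    "SELECT s.student_last_name, i.instructor_last_name"
    " FROM university_student AS s"
    " JOIN university_instructor AS i ON s.advisor_id = i.instructor_id"
    " ORDER BY s.student_id"
).fetchall()
```

The advisor's name is stored once, in UNIVERSITY_INSTRUCTOR; students carry only the key value.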

Alternative to Foreign Keys
• Repeat data values for every record
• Problems
  - Takes extra space
  - Redundant data becomes inconsistent over time

UNIVERSITY_STUDENT
StudentID  StudentLastName  StudentFirstName  StudentMI  AdvisorLastName  AdvisorFirstName
5000       Nelson           Amber             S          Black            Greg
5001       Hernandez        Joseph            P          Black            Gregory
5002       Myers            Stephen           R          Sarin            Naj
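
The inconsistency in the table above ("Greg" versus "Gregory") is the kind of drift a query can surface. The sketch assumes Python's built-in sqlite3 module and, for illustration only, treats two rows with the same advisor last name as referring to the same advisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE university_student ("
    " student_id INTEGER PRIMARY KEY,"
    " student_last_name TEXT,"
    " advisor_last_name TEXT,"
    " advisor_first_name TEXT)"
)
conn.executemany(
    "INSERT INTO university_student VALUES (?, ?, ?, ?)",
    [
        (5000, "Nelson", "Black", "Greg"),
        (5001, "Hernandez", "Black", "Gregory"),  # same advisor, spelled differently
        (5002, "Myers", "Sarin", "Naj"),
    ],
)

# Advisors whose last name is paired with more than one first name
# (assumption: same last name means same advisor).
inconsistent = conn.execute(
    "SELECT advisor_last_name, COUNT(DISTINCT advisor_first_name)"
    " FROM university_student"
    " GROUP BY advisor_last_name"
    " HAVING COUNT(DISTINCT advisor_first_name) > 1"
).fetchall()
```

With a foreign key, the advisor's name lives in a single instructor record, so there is only one place where it can be changed or misspelled.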

Class Discussion
• What are the foreign keys in the CANDY database?
• Does a table HAVE to have foreign keys?
• How can you tell if a field is a foreign key?

Rules for Relational Database Tables
• Every record must have a non-NULL (or “nonempty”) and unique PK value
• Every FK value must match an existing PK value in its parent table
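
Both rules can be handed to the DBMS to enforce. The sketch below assumes Python's built-in sqlite3 module and trimmed versions of CANDY_CUST_TYPE and CANDY_CUSTOMER; the helper is_rejected, the type code 'X', and the descriptions 'Unknown' and 'Personal' are hypothetical values used only to provoke errors. Note two SQLite-specific details: a text PK needs an explicit NOT NULL, and foreign key checks must be switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when this is on

conn.execute(
    "CREATE TABLE candy_cust_type ("
    " cust_type_id TEXT PRIMARY KEY NOT NULL,"
    " cust_type_desc TEXT)"
)
conn.execute(
    "CREATE TABLE candy_customer ("
    " cust_id INTEGER PRIMARY KEY,"
    " cust_name TEXT,"
    " cust_type TEXT REFERENCES candy_cust_type(cust_type_id))"
)
conn.executemany(
    "INSERT INTO candy_cust_type VALUES (?, ?)",
    [("P", "Private"), ("R", "Retail"), ("W", "Wholesale")],
)
conn.execute("INSERT INTO candy_customer VALUES (1, 'Jones, Joe', 'P')")


def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Rule 1: PK values must be non-NULL and unique.
null_pk_rejected = is_rejected("INSERT INTO candy_cust_type VALUES (NULL, 'Unknown')")
duplicate_pk_rejected = is_rejected("INSERT INTO candy_cust_type VALUES ('P', 'Personal')")

# Rule 2: an FK value must exist as a PK value in the parent table.
orphan_fk_rejected = is_rejected(
    "INSERT INTO candy_customer VALUES (2, 'Armstrong,Inc.', 'X')"
)
valid_fk_accepted = not is_rejected(
    "INSERT INTO candy_customer VALUES (2, 'Armstrong,Inc.', 'R')"
)
```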