Download Slides

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Microsoft Jet Database Engine wikipedia , lookup

Relational algebra wikipedia , lookup

Database wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Functional Database Model wikipedia , lookup

Clusterpoint wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

PL/SQL wikipedia , lookup

Join (SQL) wikipedia , lookup

SQL wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
INFORMATION
TECHNOLOGY IN
BUSINESS AND
SOCIETY
SESSION 16 – SQL
SEAN J. TAYLOR
ADMINISTRATIVIA
• Assignment 3:
New drop for any updates related to
AdSense only
• Database tutorial led by Varun:
Tuesday 3/27 12:30pm-1:45
• Assignment 4:
Posted on the web due Friday 3/30
• Midterm: end of class
ADMINISTRATIVIA II
• Groups:
Please fill out your forms by SUNDAY 3/25
• 2-way feedback:
1. Please fill out your surveys!
2. I will send you a brief summary of your
current grade.
LEARNING OBJECTIVES
1. Be able to query single tables using SQL.
2. Be able to perform 1:1, 1:M joins to query
relational information.
3. Be able to compute aggregates information
using where and having clauses.
4. Be able to perform N:M joins using a join table.
REVIEW:
RELATIONAL DATABASES
• Information is stored in tables
• Each table contains information about a real
word “entity” (e.g., a book, a customer)
• Each table contains fields (e.g., BookName,
Author, Price)
• Each row of the table contains a unique
identifier, a.k.a. primary key (e.g., ISBN)
REVIEW: NORMALIZATION
• A technique for designing relational database tables to
minimize duplication of information and to safeguard the
database against certain types of logical or structural problems,
namely data anomalies.
• For example, when multiple instances of a given piece of
information occur in a table, the possibility exists that these
instances will not be kept consistent when the data within the
table is updated, leading to a loss of data integrity. Prevents
insertion, deletion and update anomalies.
• Rule of thumb: If you find yourself typing the same
information in a field again and again, then the design is bad
• You will encounter anomalies in the future
SQL: INTRODUCTION
What is SQL?
• Review: Database and DBMS (chapter 1)
• SQL is a standard command language use by relational
DBMS to perform database operations
Some facts about SQL
• SQL 92 is the most commonly supported version
• English-like (not programming)
• Case insensitive
• Venders have different implementations
SQL STATEMENTS
DDL - data definition language
• Defining and modifying data structures (metadata): database,
tables, views, etc.
DML - data manipulation language
• Manipulating user data: insert, update, search, etc.
DCL - data control language
• Control data access: permissions, etc.
SELECT STATEMENT
SELECT statement retrieves data from database
(query)
• The result is usually another table (but not necessarily a
relation)
We will learn
• Defining selection criteria
• Sorting
• Calculation
• Grouping
SELECT STATEMENT SYNTAX
SELECT Column(s) or other expressions
FROM Table(s)
[WHERE …]
[ORDER BY ColumnName]
SELECT COLUMNS
Syntax
• SELECT * (or a list of columns)
FROM TableName
• Wild card: *
Example
• SELECT * FROM Book
• SELECT BookName, Price FROM Books
SELECT ROWS
Use WHERE clause to specify selection criteria
Example
• SELECT * FROM Book
WHERE Price = 29.99
• SELECT BookName, Price FROM Books
WHERE Price < 20
2.1 Comparison Operators
• “=“, “>”, “<“, “>=“, “<=“, “<>”
DATA TYPES IN COMPARISON
Text, String, Memo, etc. – '…'
• BookName='database'
• BookName>'2008'
//alphabetical order
Number, integer, decimal, currency, etc.
• Price > 20.99
Data/Time
• Access 2007: >#08/30/2008# //after the date
• SQL Server: <'08/30/2008'
//before the date
MORE COMPARISON
OPERATORS
IN (value list)
• SELECT * FROM Book
WHERE Price IN (19.99, 29.99, 39.99)
BETWEEN min AND max
• SELECT * FROM Book
WHERE Price BETWEEN 9.99 AND 19.99
IS NULL
• SELECT * FROM Book
WHERE Author IS NULL
STRING PATTERN MATCH
Fuzzy query using LIKE
• _ (underscore): single character wildcard
• ? in Access 2007
• % (percentage): multiple character wildcard
• * in Access 2007
Example
• SELECT * FROM Book
WHERE BookName LIKE '*information systems*'
NOT
Reversal criteria
• NOT (expression)
Examples:
• NOT Price > 20
• NOT BookName LIKE '*information*'
• NOT IS NULL or IS NOT NULL
COMPOUND CONDITIONS
Use logical operators to connect multiple conditions
• AND: an intersection of the data sets (higher precedence)
• OR: a union of the data sets
• Best practice: use parentheses () to explicitly mark comparison
order
Examples
• SELECT * FROM Book
WHERE Price <= 19.99 AND Price >= 9.99
• SELECT * FROM Book
WHERE PubDate=#10/1/2003# OR PubDate=#10/1/2004#
• SELECT * from Book
WHERE (Publisher = 'Que' OR Publisher = 'Alpha') AND Price = 29.99
SORTING
Syntax
• ORDER BY Column(s) [ASC/DESC]
• Examples
SELECT * FROM Books
ORDER BY PubDate
Multiple columns
SELECT * FROM Book
ORDER BY Publisher DESC, PubDate
COLUMN BASED CALCULATION
Column based calculation
• Calculated (derived) columns are not designed directly into the
table
• Using +, -, *, / with columns and numbers
Example
• SELECT BookName, Price, Price/numberofPages AS PricePerPage
FROM Books
• SELECT BookName, Price, Price * 0.1 AS Discount
FROM Book
WHERE Price * 0.1 >= 15
A COMPLETE QUERY EXAMPLE
SELECT ISBN, BookName, Price, Publisher
FROM Book
WHERE
BookName like '*Information Systems*'
AND PubDate > #1/1/2002#
AND Price < 100
ORDER BY Price
ROW BASED CALCULATION
Using aggregate functions based on rows
• MIN
• MAX
• COUNT
• AVG
• SUM
(minimum)
(maximum)
(The number of)
(Average)
(Sum of)
Example
• SELECT COUNT(*) FROM Book
• SELECT AVG(Price) FROM Book
WHERE Publisher = 'Prentice Hall'
GROUPING
GROUP BY: doing math with groups
• SELECT COUNT(*) FROM Book
WHERE Publisher = 'Prentice Hall';
SELECT COUNT(*) FROM Book
WHERE Publisher = ‘Springer';
…
• Or:
SELECT Publisher, COUNT(*) FROM Book
GROUP BY Publisher
GROUP BY … HAVING
Use “Having” clause to filter aggregation result
SELECT Publisher, COUNT(*) FROM Book
GROUP BY Publisher
Having Count(*) > 2
Use “where” clause to filter records to be aggregated
SELECT Publisher, COUNT(*) as total FROM Book
Where Price < 100
GROUP BY Publisher
Having Count(*) > 10
Order by Count(*)
UNIQUENESS
Using the keyword “DISTINCT”
Example:
• SELECT DISTINCT Publisher FROM Book
TABLE JOIN
Querying data from multiple tables
• The query result consists of columns from more
than one table
How do rows match?
• Specifying matching/joining criteria: usually by
the pair of primary key/foreign key, or candidate
key/foreign key
ISBN is a
Foreign Key here
TABLE JOIN EXAMPLE
SELECT BookName, Date FROM Book, Order
Where Book.ISBN = Order.ISBN
Order By Book.ISBN,
Order.Date
Use
“table.column”
format to avoid
ambiguity
Joining/matching
criteria: very
important, don’t
forget!
JOINING MORE TABLES
SELECT
Book.BookName, Order.Date, Customer.LastName
FROM Book, Order, Customer
WHERE
Books.ISBN = Order.ISBN and
Order.CustomerID = Customer.CustomerID
ORDER BY
Books.ISBN, Rating
NEXT CLASS:
DATA MINING
• Read “Diamond in the Data
Mine”
• Work on A4
• Groups form
MIDTERM REVIEW PROCESS
• Consult the answer key.
• Photocopy the page(s) of your exam that
you wish to dispute.
• Write why you think you deserve points.
• Submit to my mailbox on the 8th floor by
Thursday 3/29.