Structured Query Language (SQL)
Advanced Features
Instructor: Sharma Chakravarthy
[email protected]
The University of Texas at Arlington
Database Management Systems: Sharma Chakravarthy

Explicit Joins
SELECT select-list
FROM table_1
    {LEFT|RIGHT|FULL} [OUTER] JOIN table_2
        ON join_condition_1
    [{LEFT|RIGHT|FULL} [OUTER] JOIN table_3
        ON join_condition_2] …
• Joins can be specified explicitly as part of the FROM clause (theta join)
• Joins specified as part of the WHERE clause are called implicit joins. In that notation *= or =* is used to express left and right outer joins; it has no support for a full outer join
• CROSS JOIN can also be used for a Cartesian product
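The difference between explicit and implicit joins can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the tables `employee` and `dept` and their rows are made up for illustration, and SQLite does not accept the `*=` / `=*` notation, so only the explicit outer join is shown.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_number INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE employee (name TEXT, dept_number INTEGER);
    INSERT INTO dept VALUES (1, 'Denver'), (2, 'Austin');
    INSERT INTO employee VALUES ('Ann', 1), ('Bob', 2), ('Cy', NULL);
""")

# Explicit join: the join condition is part of the FROM clause.
explicit_rows = conn.execute("""
    SELECT e.name, d.location
    FROM employee AS e
        JOIN dept AS d ON e.dept_number = d.dept_number
    ORDER BY e.name
""").fetchall()

# Implicit join: the same condition is written in the WHERE clause.
implicit_rows = conn.execute("""
    SELECT e.name, d.location
    FROM employee AS e, dept AS d
    WHERE e.dept_number = d.dept_number
    ORDER BY e.name
""").fetchall()

# Left outer join: employees without a matching department are kept,
# with NULL (None in Python) in the department columns.
left_rows = conn.execute("""
    SELECT e.name, d.location
    FROM employee AS e
        LEFT OUTER JOIN dept AS d ON e.dept_number = d.dept_number
    ORDER BY e.name
""").fetchall()

print(explicit_rows)
print(left_rows)
```

The inner join drops 'Cy' because no department matches, while the left outer join keeps that row.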
Example
26. Find the numbers of the players who are older than R. Parmenter
SELECT A.playerno
FROM PLAYERS as A
    JOIN PLAYERS as P
        on A.yob < P.yob
WHERE P.name = 'Parmenter'
    AND P.initials = 'R'

Union
Syntax
SELECT_statement_1
UNION [ALL]
SELECT_statement_2
[UNION [ALL]
SELECT_statement_3] …
[ORDER BY order_by_list]
Aggregate Functions
• AVG([ALL|DISTINCT] expression)
  – The average of the non-null values in the expression
• SUM([ALL|DISTINCT] expression)
  – The total of the non-null values in the expression
• MIN([ALL|DISTINCT] expression)
  – The lowest non-null value in the expression
• MAX([ALL|DISTINCT] expression)
  – The highest non-null value in the expression
• COUNT([ALL|DISTINCT] expression)
  – The number of non-null values in the expression
• COUNT(*)
  – The number of rows selected by a query

Aggregate Functions
• Aggregate functions (or column functions) perform a calculation on the values in a set of selected rows
• The expression specified for the AVG and SUM functions must result in a numeric value. The expression for MIN, MAX, and COUNT can result in a numeric, date, or string value
• By default all values are included in the calculation regardless of whether they are duplicates. Use the DISTINCT keyword to omit duplicates
• If you use an aggregate function in the SELECT clause, that clause cannot include non-aggregate columns from the base table unless they appear in the GROUP BY clause

Aggregate Functions
• Can a WHERE clause have aggregate functions? No!
• Can a HAVING clause have aggregate functions? Yes!
• Can a SELECT clause have aggregate functions? Yes!
• A HAVING clause can refer only to grouping columns and aggregates, whereas a WHERE clause can refer to any column in the base tables
• Can use AND and OR in the HAVING clause
• Can use simple predicates in the HAVING clause, but aggregates cannot be used in the WHERE clause
• A SELECT can only include columns in the GROUP BY clause (+ aggregate functions)!
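The rules above about NULL handling and about WHERE versus HAVING can be checked with a short sketch. It assumes Python's sqlite3 module as the DBMS and a hypothetical `invoices` table with made-up rows; other systems report the misuse of an aggregate in WHERE with a different error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (vendorid INTEGER, invoicetotal REAL);
    INSERT INTO invoices VALUES
        (1, 100.0), (1, 300.0), (2, 50.0), (2, NULL), (3, 20.0);
""")

# COUNT(*) counts rows; COUNT(expr) and AVG(expr) skip NULL values.
counts = conn.execute("""
    SELECT COUNT(*), COUNT(invoicetotal), AVG(invoicetotal)
    FROM invoices
""").fetchone()

# WHERE filters individual rows before grouping;
# HAVING filters whole groups and may use aggregates.
big_vendors = conn.execute("""
    SELECT vendorid, SUM(invoicetotal) AS total
    FROM invoices
    WHERE invoicetotal IS NOT NULL
    GROUP BY vendorid
    HAVING SUM(invoicetotal) > 60
    ORDER BY vendorid
""").fetchall()

# An aggregate in the WHERE clause is rejected by the DBMS.
try:
    conn.execute("SELECT vendorid FROM invoices WHERE SUM(invoicetotal) > 60")
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

print(counts, big_vendors, where_error)
```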
Aggregate Functions
• Several RDBMSs support OLAP (on-line analytical processing) used in data warehouses
• ROLLUP
  – A summary row is added for each group and the total
• CUBE
  – A summary row is added for each dimension (or distinct combination of the group) and the total
• TOP n [PERCENT] – output the top n tuples or the top n% of tuples
• TOP n [PERCENT] [WITH TIES] – if the last row has ties (same values), all are presented
ROLLUP
Select vendorid, count(*) as invcount,
    sum(invoicetotal) as invtotal
From invoices
Group by vendorid with ROLLUP

vendorid  invcount  invtotal
117       1         16.6
121       8         6940.2
123       47        4378.2
NULL      56        9335.0      Summary row

ROLLUP
Select venstate, vencity, count(*) as qtyven
From vendors
Group by venstate, vencity with ROLLUP
Order by venstate DESC, vencity DESC

venstate  vencity         qtyven
NJ        washington      1
NJ        fairfield       1
NJ        east brunswick  2
NJ        NULL            4     Summary row
IA        washington      1
IA        fairfield       1
IA        NULL            2     Summary row
NULL      NULL            6     Summary row

CUBE
Select venstate, vencity, count(*) as qtyven
From vendors
Where venstate in ('IA', 'NJ')
Group by venstate, vencity with CUBE
Order by venstate DESC, vencity DESC

venstate  vencity         qtyven
NJ        washington      1
NJ        fairfield       1
NJ        east brunswick  2
NJ        NULL            4     Summary row
IA        washington      1
IA        fairfield       1
IA        NULL            2     Summary row
NULL      washington      2     Summary row
NULL      fairfield       2     Summary row
NULL      east brunswick  2     Summary row
NULL      NULL            6     Summary row

Joins Vs. subselects
• Advantages of implicit/explicit joins
  – The result of a join operation can include columns from both tables
  – Only outer table columns can be included when using a subquery
  – A query with an implicit join typically performs faster than the same query with a subquery
• Advantages of subqueries
  – You can pass an aggregate value to the outer query
  – A subquery tends to be more intuitive
  – Long, complex queries can be easier to code using subqueries
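To see what the summary rows of the ROLLUP and CUBE examples above contain, the following sketch rebuilds them by hand. It assumes SQLite (via Python's sqlite3), which has no WITH ROLLUP or WITH CUBE, so each summary level is an extra grouped query glued on with UNION ALL; the `vendors` rows are invented to match the counts in the vendors result tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendors (venstate TEXT, vencity TEXT);
    INSERT INTO vendors VALUES
        ('NJ', 'washington'), ('NJ', 'fairfield'),
        ('NJ', 'east brunswick'), ('NJ', 'east brunswick'),
        ('IA', 'washington'), ('IA', 'fairfield');
""")

# ROLLUP over (venstate, vencity): detail rows, one summary row per
# state, and one grand-total row.
rollup_sql = """
    SELECT venstate, vencity, COUNT(*) AS qtyven
    FROM vendors GROUP BY venstate, vencity
    UNION ALL
    SELECT venstate, NULL, COUNT(*) FROM vendors GROUP BY venstate
    UNION ALL
    SELECT NULL, NULL, COUNT(*) FROM vendors
"""

# CUBE adds the remaining combination: one summary row per city.
cube_extra_sql = """
    UNION ALL
    SELECT NULL, vencity, COUNT(*) FROM vendors GROUP BY vencity
"""

# SQLite sorts NULL below every other value, so DESC puts summary rows last.
order_sql = " ORDER BY 1 DESC, 2 DESC"

rollup_rows = conn.execute(rollup_sql + order_sql).fetchall()
cube_rows = conn.execute(rollup_sql + cube_extra_sql + order_sql).fetchall()

for row in cube_rows:
    print(row)
```

Systems that support the OLAP operators compute all of these groupings in a single statement.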
Subqueries
• A subquery is a SELECT statement that is used within another SQL statement
• A subquery can return a single value, a result set that contains a single column, or a result set that contains one or more columns
• A subquery that returns a single value can be used anywhere an expression is allowed
• A subquery that returns a single column can be used in place of a list of values
• A subquery that returns one or more columns can be used in place of a table in the FROM clause
• Some DBMSs allow multiple attributes to be compared

Subqueries
• Subqueries can be used in 4 ways:
  – In a WHERE clause as part of a condition
  – In a HAVING clause as part of a condition
  – In a FROM clause as a table specification //table expression
  – In a SELECT clause as a column specification
Subqueries
Select distinct vendorname,
    (select MAX(invoicedate) from Invoices
     where Invoices.vendorID = Vendors.vendorID) as LatestInv
from Vendors
order by LatestInv DESC

Subqueries
Alternatively,
Select distinct vendorname, MAX(invoicedate) as LatestInv
From Vendors, Invoices
Where Invoices.vendorID = Vendors.vendorID
Group by VendorName
order by LatestInv DESC

Alternatively,
Select distinct vendorname, MAX(invoicedate) as LatestInv
From Vendors join Invoices
    on Invoices.vendorID = Vendors.vendorID
Group by VendorName
order by LatestInv DESC
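The subquery and join formulations of the latest-invoice query can be compared on a small data set. This is a sketch on SQLite through Python's sqlite3 module; the vendor names and dates are invented. One difference worth noticing on this data: the inner-join form returns no row for a vendor that has no invoices, whereas the correlated subquery returns that vendor with a NULL date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vendors (vendorID INTEGER PRIMARY KEY, vendorname TEXT);
    CREATE TABLE Invoices (vendorID INTEGER, invoicedate TEXT);
    INSERT INTO Vendors VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crate');
    INSERT INTO Invoices VALUES
        (1, '2002-01-10'), (1, '2002-03-05'), (2, '2002-02-20');
""")

# Correlated subquery in the SELECT clause: evaluated once per vendor.
subquery_rows = conn.execute("""
    SELECT DISTINCT vendorname,
        (SELECT MAX(invoicedate) FROM Invoices
         WHERE Invoices.vendorID = Vendors.vendorID) AS LatestInv
    FROM Vendors
    ORDER BY LatestInv DESC
""").fetchall()

# Join plus GROUP BY: only vendors with at least one invoice appear.
join_rows = conn.execute("""
    SELECT vendorname, MAX(invoicedate) AS LatestInv
    FROM Vendors JOIN Invoices
        ON Invoices.vendorID = Vendors.vendorID
    GROUP BY vendorname
    ORDER BY LatestInv DESC
""").fetchall()

print(subquery_rows)
print(join_rows)
```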
Subselect
• A subquery in a SELECT clause used for a column specification must return a single value
• Typically a correlated subquery
• Can be rewritten using a join instead of the subquery
• A join is faster and more readable, and hence subqueries are seldom used in a SELECT clause

Subselect in a FROM Clause
Select Invoices.VendorID, MAX(InvoiceDate) as LatestInv
FROM Invoices JOIN
    (SELECT TOP 5 VendorID,
            AVG(invoiceTotal) as AvgInvoice
     FROM Invoices
     GROUP BY VendorID
     ORDER BY AvgInvoice DESC) AS TopVendor
    ON Invoices.VendorID = TopVendor.VendorID
Group By Invoices.VendorID
Order BY LatestInv DESC

Nested Queries
• Classification
  – single-shot evaluation of sub/nested queries
  – Example: Select employees whose salary is equal to the average salary
    Select Name
    From Employee
    Where Salary = (Select AVG (salary)
                    From Employee)
  – The subquery returns a single value
  – Hence the predicate can be modified as Salary = K, where K is the result of the subquery

Nested Queries (Contd.)
• Sub/nested query can return a set
  – Example: select employees whose department is located in Denver
    Select Name
    From Employee
    Where Dept_number IN
        (Select Dept_Number
         From Dept
         Where location = 'Denver')
  – Modify the predicate appropriately at run time (as an IN clause which has an explicit set as part of the query)
Nested Queries (Contd.)
• The previous query can be re-written without nesting
    Select Name
    From Employee e, Dept d
    Where e.dept_number = d.dept_number
        and d.location = 'Denver'
  – The optimizer may not do this transformation

Nested Queries (Contd.)
• Correlated sub query
  – a sub query that contains a reference to a value obtained from a candidate tuple of a higher-level query block
  – In principle, a correlated query must be re-evaluated for each candidate tuple from the referenced query block
  – Example: "Select names of Employees who earn more than their managers"
    Select Name
    From Employee e
    Where Salary > (Select Salary
                    From Employee
                    where Employee.Num = e.Manager)
• The subquery has to be evaluated for each tuple of the outer query

Nested Queries (Contd.)
    Select Name
    From Employee
    Where Dept_number IN
        (Select Dept_Number
         From Dept
         Where location = 'Denver')
• The above query is not correlated
Nested Queries (Contd.)
• The previous query (employees who earn more than their managers) can be written without using subqueries
  – Example
    Select e.Name
    From Employee e, Employee m
    where e.Salary > m.Salary AND
        e.Manager = m.Num

Multi level correlated subqueries
Select names of employees who earn more than their manager's manager.
L1  select name
    from Employee e
    where salary >
L2      (select salary
         from Employee
         where Employee.num =
L3          (select manager
             From Employee
             where Employee.num = e.Manager))
Since the L3 query references an L1 value, L3 needs to be evaluated for each L1 tuple.
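The manager queries can be run side by side to confirm that the correlated form and the self-join agree. The sketch below assumes SQLite through Python's sqlite3 module and a made-up `Employee` table in which `Manager` holds the `Num` of the employee's manager; the inner tables get the hypothetical aliases `m1` and `m2` to keep the levels apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Num INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER, Manager INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Ada', 90, NULL),
        (2, 'Ben', 100, 1),
        (3, 'Cal', 80, 1),
        (4, 'Dee', 120, 2);
""")

# Correlated subquery: the inner block refers to e.Manager of the outer tuple.
correlated = conn.execute("""
    SELECT Name FROM Employee e
    WHERE Salary > (SELECT Salary FROM Employee
                    WHERE Employee.Num = e.Manager)
    ORDER BY Name
""").fetchall()

# Same question as a self-join: m is the manager row of e.
joined = conn.execute("""
    SELECT e.Name
    FROM Employee e, Employee m
    WHERE e.Manager = m.Num AND e.Salary > m.Salary
    ORDER BY e.Name
""").fetchall()

# Two levels: compare with the salary of the manager's manager.
two_level = conn.execute("""
    SELECT Name FROM Employee e
    WHERE Salary > (SELECT Salary FROM Employee m2
                    WHERE m2.Num = (SELECT Manager FROM Employee m1
                                    WHERE m1.Num = e.Manager))
    ORDER BY Name
""").fetchall()

print(correlated, joined, two_level)
```

An employee without a manager makes the subquery return no row, so the comparison is unknown and the employee is left out of the result.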
Correlated subqueries
• Some correlated subqueries CANNOT be expressed in a simpler manner (as non-correlated subqueries)
• Example: Division queries

Nested Queries
• Nested block is optimized independently, with the outer tuple considered as providing a selection condition.
• Outer block is optimized with the cost of 'calling' nested block computation taken into account.
• Implicit ordering of these blocks means that some good strategies are not considered. The non-nested version of the query is typically optimized better.

SELECT S.sname
FROM Sailors S
WHERE EXISTS
    (SELECT *
     FROM Reserves R
     WHERE R.bid = 103
       AND R.sid = S.sid)

Nested block to optimize:
SELECT *
FROM Reserves R
WHERE R.bid = 103
  AND R.sid = outer value

Equivalent non-nested query:
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid
  AND R.bid = 103
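The nested EXISTS query and its non-nested counterpart can be compared directly. This sketch uses SQLite via Python's sqlite3 module; the sailor names, the `day` column, and all rows are assumptions made for the example. On data where one sailor reserved boat 103 more than once, the join repeats that sailor's name, so the two forms agree only after duplicates are removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
    INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
    INSERT INTO Reserves VALUES
        (22, 103, '1998-10-08'),
        (22, 103, '1998-11-10'),
        (31, 102, '1998-11-10'),
        (58, 103, '1998-11-12');
""")

# Nested form: one output row per sailor that has some reservation of 103.
nested_rows = conn.execute("""
    SELECT S.sname
    FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)
    ORDER BY S.sname
""").fetchall()

# Non-nested form: one output row per matching (sailor, reservation) pair.
flat_rows = conn.execute("""
    SELECT S.sname
    FROM Sailors S, Reserves R
    WHERE S.sid = R.sid AND R.bid = 103
    ORDER BY S.sname
""").fetchall()

print(nested_rows)
print(flat_rows)
```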
DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
• Get supplier names for suppliers who supply ALL parts
    (SP[s#, p#] ÷ PARTS[p#]) join SUPPLIER on s#
• There is NO division operator or (logical) for-all quantifier in SQL
• θ [any|all] is not adequate to express division!
• How do we translate the above query into SQL?

DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
• It is based on the equivalence of the logical expression
    ∀x (∃y (P(x, y))) ≡ ¬(∃x (¬∃y (P(x, y))))
• The following 2 statements are equivalent:
  1. For all x (parts) there exists a y (supplier) such that P(x, y) is true (y supplies x)
  2. It is not the case (false) that there exists an x (part) for which no y (supplier) exists such that P(x, y) is true (y supplies part x)

DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
• Get supplier names for suppliers who supply ALL parts
• The following 2 statements are equivalent:
  1. For all parts p there exists a supplier s such that supplies(s, p) is true
  2. It is not the case (false) that there exists a part for which no supplier exists such that supplies(s, p) is true
    ∀ part (∃ supplier (supplies(supplier, part))) ≡
    ¬(∃ part (¬∃ supplier (supplies(supplier, part))))

DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
SELECT DISTINCT sname            /*needed for output*/
FROM supplier s
WHERE NOT EXISTS
    (SELECT p.p#
     FROM parts p                /*denominator*/
     minus
     (SELECT sp.p#
      FROM SP sp                 /*numerator*/
      WHERE sp.s# = s.s#)
    )
Distinct is important!
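The division query can be exercised on a toy database to see both SQL formulations return the same suppliers: the set-difference form and the doubly nested NOT EXISTS form. The sketch assumes SQLite through Python's sqlite3 module, so MINUS is spelled EXCEPT and the columns s# and p# are renamed to the hypothetical `sno` and `pno`; all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (sno TEXT PRIMARY KEY, sname TEXT);
    CREATE TABLE parts (pno TEXT PRIMARY KEY, color TEXT);
    CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER);
    INSERT INTO supplier VALUES ('s1', 'Smith'), ('s2', 'Jones'), ('s3', 'Blake');
    INSERT INTO parts VALUES ('p1', 'red'), ('p2', 'green'), ('p3', 'red');
    INSERT INTO sp VALUES
        ('s1', 'p1', 10), ('s1', 'p2', 20), ('s1', 'p3', 30),
        ('s2', 'p1', 5), ('s2', 'p3', 5),
        ('s3', 'p2', 7);
""")

# Set difference: no part is left over after removing the parts s supplies.
all_parts_except = conn.execute("""
    SELECT DISTINCT sname FROM supplier s
    WHERE NOT EXISTS (
        SELECT pno FROM parts
        EXCEPT
        SELECT pno FROM sp WHERE sp.sno = s.sno)
    ORDER BY sname
""").fetchall()

# Double negation: there is no part for which no SP tuple of s exists.
all_parts_not_exists = conn.execute("""
    SELECT DISTINCT sname FROM supplier s
    WHERE NOT EXISTS (
        SELECT * FROM parts p
        WHERE NOT EXISTS (
            SELECT * FROM sp
            WHERE sp.sno = s.sno AND sp.pno = p.pno))
    ORDER BY sname
""").fetchall()

# Restricting the "for all" variable: suppliers of ALL red parts.
all_red_parts = conn.execute("""
    SELECT DISTINCT sname FROM supplier s
    WHERE NOT EXISTS (
        SELECT * FROM parts p
        WHERE p.color = 'red'
          AND NOT EXISTS (
            SELECT * FROM sp
            WHERE sp.sno = s.sno AND sp.pno = p.pno))
    ORDER BY sname
""").fetchall()

print(all_parts_except, all_parts_not_exists, all_red_parts)
```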
DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
SELECT DISTINCT sname
FROM supplier s
WHERE NOT EXISTS
    (SELECT *
     FROM parts p                /*all variable: denominator*/
     WHERE NOT EXISTS
        (SELECT *
         FROM SP sp              /*there exists: numerator*/
         WHERE sp.s# = s.s# AND
               sp.p# = p.p#)
    )

DIVISION
SUPPLIER (s#, sname, status, city)
PARTS (p#, pname, color, weight, city)
SP (s#, p#, qty)
Get suppliers who supply ALL red parts
SELECT DISTINCT sname
FROM supplier s
WHERE NOT EXISTS
    (SELECT *
     FROM parts p                /*all variable comes here*/
     WHERE p.color = 'red'
       AND NOT EXISTS
        (SELECT *
         FROM SP sp              /*there exists var*/
         WHERE sp.s# = s.s# AND
               sp.p# = p.p#)
    )

Other operators
• INTERSECT
• EXCEPT
• MINUS

DDL
• Create table
• Example:
Create table employee (
    fname    varchar(15) not null default 'fnu',
    lname    varchar(20) not null default 'lnu',
    sex      char check (sex = 'f' or sex = 'm'),
    ssn      char(9),
    superssn char(9),
    primary key (ssn),
    foreign key (superssn) references employee(ssn)
        on delete set default on update cascade,
    unique (fname, lname)
)
DDL
• Can create domains and use them in create table statements
• Can give CHECK constraints in a domain definition
• Can define constraints with a name inside the create table command
• Examples:
    create domain D_NUM as integer
        check (D_num > 0 and D_num < 10);

DDL
• Can create assertions
    "the salary of an employee must not be greater than the salary of the manager of the department that the employee works for"
    create assertion salary_constraint
    check (NOT EXISTS
        (select *
         from employee e, employee m, department d
         where e.salary > m.salary and
               e.dno = d.dnumber and
               d.mgrssn = m.ssn))

Constraint statement
[constraint <constraint_name>]
    [not] null |
    Check (<condition>) |
    Unique |
    Primary key |
    References <table_name> [(column_name)] [on delete cascade]

INSERT Statement
SELECT select_list
INTO table_name
From table_source
….
• Creates a NEW table and inserts tuples into it
• The table name must not exist
• All columns in the select_list must be named (especially if they compute a value)
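Named CHECK constraints, defaults, and creating a table from a query can be tried in a small sketch. It assumes SQLite through Python's sqlite3 module, which supports neither CREATE DOMAIN nor CREATE ASSERTION, so the D_NUM range is written as a named column constraint instead, and SELECT ... INTO is replaced by SQLite's CREATE TABLE ... AS SELECT. The constraint names, the table `d_num_copy`, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        ssn   TEXT PRIMARY KEY,
        fname TEXT NOT NULL DEFAULT 'fnu',
        sex   TEXT CONSTRAINT sex_values CHECK (sex = 'f' OR sex = 'm'),
        d_num INTEGER CONSTRAINT d_num_range CHECK (d_num > 0 AND d_num < 10)
    );
""")

# A valid row; fname is omitted, so the default value is stored.
conn.execute("INSERT INTO employee (ssn, sex, d_num) VALUES ('111', 'f', 3)")

# A row that violates the d_num range is rejected by the DBMS.
try:
    conn.execute("INSERT INTO employee (ssn, sex, d_num) VALUES ('222', 'm', 42)")
    violation = None
except sqlite3.IntegrityError as exc:
    violation = str(exc)

default_name = conn.execute(
    "SELECT fname FROM employee WHERE ssn = '111'").fetchone()[0]

# New table built from a query; the computed column needs a name.
conn.execute("""
    CREATE TABLE d_num_copy AS
    SELECT ssn, d_num * 2 AS doubled FROM employee
""")
copied = conn.execute("SELECT ssn, doubled FROM d_num_copy").fetchall()

print(violation, default_name, copied)
```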
INSERT Statement
INSERT [INTO] Table_name [(column_list)]
[DEFAULT] VALUES (expression1 [, expression2] …);

INSERT INTO PENALTIES
VALUES (10, 20, '1999', 100);

UPDATE statement
UPDATE table_name
SET col_name1 = expr1 [, col_name2 = expr2] …
[FROM table_source [[AS] table_alias]]
[WHERE search_condition]
• If you omit the WHERE clause, all rows in the table will be updated!

DELETE statement
DELETE [FROM] table_name
[FROM table_source]
[WHERE search_condition]
• If you omit the WHERE clause, all rows in the table will be deleted!
• Drop table table_name

Other Features
• Grant and revoke
• Create views
• Rollback and commit
• The catalog is maintained as a table
• Operations on metadata are done using the same language!
Embedded SQL
• As the name implies, SQL is embedded into a programming language
• Provides features that are not supported in SQL directly (if-then-else, looping, …)
• Can write canned packages to accept parameters from users
• Combines the capability of SQL with that of a programming language
• Arbitrary computations can be performed on data retrieved and stored in tables

Embedded SQL
Main(){
    ….
    exec sql begin declare section;
        declare variables
    exec sql end declare section;
    exec sql include sqlca;
    ……
    exec sql
        select cname, discnt
        into :cust_name, :cust_discnt
        from customers
        where cid = :cust_id;
}
• A single row is extracted in the above! What if it is a set?

Embedded SQL
• Cursors
  – Declare a cursor with a name
  – Open a cursor
  – Fetch tuples one at a time into program variables
  – Close the cursor
  – Cursors can be used for read or update; need to be declared
• Compiled by the pre-processor
• Compiled and linked by the language processor
• Executed!
• Median and mode can be computed!
• Transitive closure can be computed!

Dynamic SQL
• So far, we discussed static SQL. That is, program variables held values but NOT sql statements!
• In other words, the embedded SQL statements cannot change at run time!
• SQL statements cannot be generated or parameterized at run time based on user input!
• The above is needed for writing interfaces to accept inputs and generate appropriate SQL statements based on the input!
• Dynamic sql solves this problem!
• You need to use this in phase 5 of the project
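The cursor steps listed above (declare, open, fetch one tuple at a time, close) look roughly like this in a host language. The sketch uses Python's DB-API with the built-in sqlite3 module in place of a precompiled embedded-SQL program, so there is no `exec sql` syntax; the `customers` rows are invented. It also shows a computation done in the host language on fetched values, here the median.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cid TEXT PRIMARY KEY, cname TEXT, discnt REAL);
    INSERT INTO customers VALUES
        ('c1', 'Tip', 10.0), ('c2', 'Basics', 12.0), ('c3', 'Allied', 8.0),
        ('c4', 'ACME', 8.0), ('c5', 'Zed', 20.0);
""")

cursor = conn.cursor()                         # declare a cursor
cursor.execute(                                # open it on a query
    "SELECT cname, discnt FROM customers ORDER BY discnt, cname")

names = []
discounts = []
while True:
    row = cursor.fetchone()                    # fetch one tuple at a time
    if row is None:                            # no more tuples
        break
    cust_name, cust_discnt = row               # copy into program variables
    names.append(cust_name)
    discounts.append(cust_discnt)

cursor.close()                                 # close the cursor

# Computation performed in the host language on the retrieved data.
median_discount = statistics.median(discounts)
print(names, median_discount)
```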
Dynamic SQL
• Execute immediate
  – Exec sql execute immediate :host_variable
  – host_variable contains an sql statement as a string
  – The statement is compiled once
  – The statement cannot change
  – If you want to change it, the code needs recompilation
• Example:
    char sqltext[] = "delete from players where playerno=24";
    exec sql execute immediate :sqltext;

Dynamic SQL (Contd.)
• How do you overcome the above?
  – Provide a mechanism to compile dynamically
  – Provide a mechanism to execute dynamically
• Prepare, execute, and using
• Execute immediate

Prepare and execute
• prepare
  – Exec sql prepare delete_playerno from :host_Variable
  – Exec sql execute delete_playerno using :var1, :var2, …, :vark;
• Example:
    char sqltext[] = "delete from players where playerno= :pno";
    exec sql prepare delete_playerno from :sqltext;
  – Parses and generates a compiled form which is parameterized
    exec sql execute delete_playerno using :pno;
  – Executes the compiled form using the variable

• In the previous example, there were NO values returned
• If the sql stmt has a select, there is a problem: we cannot include an arbitrary number of attributes to be returned
• In order to overcome the above problem, a descriptor area has been introduced
• Describe statement needs to be used
• Cursor is used to extract values iteratively
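The ideas behind execute immediate, prepare/execute with parameters, and the descriptor area have close analogues in most database APIs. The sketch below uses Python's sqlite3 module rather than embedded SQL: a statement held in a string is run as is, a parameterized statement is run with a value bound to `:pno`, and `cursor.description` plays the role of the descriptor by reporting the result columns of a query that is only known at run time. The `players` rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (playerno INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO players VALUES (24, 'Parmenter'), (44, 'Baker'), (83, 'Hope');
""")

# Analogue of execute immediate: the whole statement is a string value.
sqltext = "DELETE FROM players WHERE playerno = 24"
conn.execute(sqltext)

# Analogue of prepare + execute ... using: one statement text with a
# named parameter, executed with a value supplied at run time.
delete_playerno = "DELETE FROM players WHERE playerno = :pno"
pno = 44
conn.execute(delete_playerno, {"pno": pno})

remaining = conn.execute("SELECT playerno FROM players").fetchall()

# Analogue of the descriptor area: discover the returned attributes of a
# dynamically supplied SELECT, then iterate over the rows with the cursor.
dynamic_query = "SELECT * FROM players"
cursor = conn.execute(dynamic_query)
column_names = [column[0] for column in cursor.description]
rows = [dict(zip(column_names, row)) for row in cursor]

print(remaining, column_names, rows)
```

Binding values through parameters, instead of pasting user input into the statement string, also avoids SQL injection.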
PL/SQL, TransactSQL
• In addition to embedding sql in various programming languages, each vendor has a language in which programming language features have been included
• PL/SQL is Oracle's procedural extension of SQL
• TransactSQL is Sybase's procedural extension of sql
• Seamless integration of sql and other language constructs
• Can compile and store PL/SQL code, which can then be invoked

Procedures, Triggers and user-defined functions
• A stored procedure is one or more sql statements that have been compiled and stored with the database
• Stored procedures are compiled and optimized the first time they are executed. In contrast, SQL statements that are sent from the client to the server have to be compiled and optimized every time they are executed
• In addition to select statements, it can contain other sql statements. It can also contain control-flow language for conditional processing
• A trigger is a special type of procedure executed automatically when a table is modified. Triggers are typically used to check the validity of data in a row that is being updated
• A user-defined function (UDF) is a special type of procedure that can return a value or a table

Procedures, Triggers and user-defined functions
• The three types of procedural programs – stored procedures, triggers and user-defined functions – are executable database objects. These are stored within the database
• Stored procedures are frequently written by SQL programmers for use by end users or application programmers. You can provide access to the database exclusively through stored procedures
• Both user-defined functions (UDFs) and triggers are used more often by sql programmers than by application programmers or end users. A UDF is a special type of procedure that can return a value or a table
• They differ in whether or not they can use parameters. Both stored procedures and user-defined functions can use parameters, but triggers cannot

Triggers
Create trigger trigger_name
On {table_name | view_name}
{for | after | instead of} [insert] [,] [update] [,] [delete]
As sql_statements
• Within a trigger, you can refer to two tables: inserted and deleted
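A trigger that validates and logs updates can be sketched as follows. The example assumes SQLite through Python's sqlite3 module, whose trigger syntax differs from the Create trigger form shown above: the body sits between BEGIN and END, and the old and new row are referenced as OLD and NEW instead of the deleted and inserted tables. Table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, salary INTEGER);
    CREATE TABLE salary_audit (ssn TEXT, old_salary INTEGER, new_salary INTEGER);

    -- Runs automatically after each salary update and records the change.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employee
    BEGIN
        INSERT INTO salary_audit VALUES (OLD.ssn, OLD.salary, NEW.salary);
    END;

    -- Checks the validity of the row being updated and aborts if it is bad.
    CREATE TRIGGER reject_negative_salary
    BEFORE UPDATE OF salary ON employee
    WHEN NEW.salary < 0
    BEGIN
        SELECT RAISE(ABORT, 'salary must not be negative');
    END;

    INSERT INTO employee VALUES ('111', 50);
""")

# A valid update fires the logging trigger.
conn.execute("UPDATE employee SET salary = 60 WHERE ssn = '111'")

# An invalid update is aborted by the validating trigger.
try:
    conn.execute("UPDATE employee SET salary = -1 WHERE ssn = '111'")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

audit = conn.execute("SELECT * FROM salary_audit").fetchall()
salary = conn.execute("SELECT salary FROM employee").fetchone()[0]
print(rejected, audit, salary)
```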
SQL Summary
• Although not Turing-complete, SQL is a powerful non-procedural language for relational data management
• SQL has been optimized extensively
• SQL as a standard evolves every 5 to 6 years and a number of additional features are added to accommodate growing needs
  – Triggers, abstract data types, and object-oriented features have been added in addition to OLAP operators (cube, rollup etc.)

SQL Summary (Contd.)
• An English query can be translated into more than one SQL equivalent. Some may be easier to understand (by humans). Some are better optimized by the underlying system
• Mining and warehousing have exposed a number of limitations of relational query optimizers
  – They were developed assuming not more than 8 to 10 joins
• B+ trees are the bread-and-butter of optimization
• Tuners or wizards are being developed to simplify (or replace) the job of an administrator