Real SQL Programming
Embedded SQL
Java Database Connectivity
Stored Procedures
Three-Tier Architecture
 A common environment for using a
database has three tiers of
processors:
1. Web servers --- talk to the user.
2. Application servers --- execute the
business logic.
3. Database servers --- get what the app
servers need from the database.
Example: Amazon
Database holds the information about
products, customers, etc.
Business logic includes things like “what
do I do after someone clicks
‘checkout’?”
 Answer: Show the “how will you pay for
this?” screen.
Environments, Connections, Queries
The database is, in many DB-access
languages, an environment.
Database servers maintain some number
of connections, so app servers can ask
queries or perform modifications.
The app server issues statements :
queries and modifications, usually.
Diagram to Remember
Environment
Connection
Statement
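A minimal sketch of the environment, connection, statement layering, written in Python with the standard-library sqlite3 module standing in for a database server (an assumption; the slides do not use SQLite). The in-memory database plays the environment, `conn` is the connection, and each `execute` call issues one statement.

```python
import sqlite3

# Environment: the database itself (an in-memory SQLite database here).
# Connection: the handle through which statements are issued.
conn = sqlite3.connect(":memory:")

# Statements: two modifications followed by a query.
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 2.50)")
rows = conn.execute("SELECT beer, price FROM Sells").fetchall()

print(rows)
```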
Real SQL Programming
Embedded SQL (write SQL with “markup”
and mixed with host language)
Call Level Interface (host level
programming via SQL API)
 Java, C++, Python, Ruby, etc
Stored Procedures
 User defined procedures/function that
become part of the schema (server level)
Embedded SQL
Key idea: A preprocessor turns SQL
statements into procedure calls that fit
with the surrounding host-language
code.
All embedded SQL statements begin
with EXEC SQL, so the preprocessor can
find them easily.
Shared Variables
To connect SQL and the host-language
program, the two parts must share
some variables.
Declarations of shared variables are
bracketed by:
EXEC SQL BEGIN DECLARE SECTION;
<host-language declarations>
EXEC SQL END DECLARE SECTION;
(The two EXEC SQL lines are always needed.)
Use of Shared Variables
In SQL, the shared variables must be
preceded by a colon.
 They may be used as constants provided
by the host-language program.
 They may get values from SQL statements
and pass those values to the host-language program.
In the host language, shared variables
behave like any other variable.
Example: Looking Up Prices
We’ll use C with embedded SQL to
sketch the important parts of a function
that obtains a beer and a bar, and looks
up the price of that beer at that bar.
Assumes database has our usual
Sells(bar, beer, price) relation.
Example: C Plus SQL
EXEC SQL BEGIN DECLARE SECTION;
/* note: 21-char arrays needed for 20 chars + endmarker */
char theBar[21], theBeer[21];
float thePrice;
EXEC SQL END DECLARE SECTION;
/* obtain values for theBar and theBeer */
/* SELECT-INTO, as in PSM */
EXEC SQL SELECT price INTO :thePrice
FROM Sells
WHERE bar = :theBar AND beer = :theBeer;
/* do something with thePrice */
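The shared-variable idea carries over to library-style interfaces. The sketch below is a Python analogue using the standard-library sqlite3 module (an assumption; the slides use C with embedded SQL), where named placeholders such as `:theBar` are bound to host-language values. The function name `lookup_price` and the sample rows are hypothetical.

```python
import sqlite3

def lookup_price(conn, the_bar, the_beer):
    """Return the price of the_beer at the_bar, or None if no tuple matches."""
    row = conn.execute(
        "SELECT price FROM Sells WHERE bar = :theBar AND beer = :theBeer",
        {"theBar": the_bar, "theBeer": the_beer},  # host values bound to :names
    ).fetchone()
    return None if row is None else row[0]

# Hypothetical sample data for the usual Sells(bar, beer, price) relation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 2.50)")

price = lookup_price(conn, "Joe's Bar", "Bud")
missing = lookup_price(conn, "Joe's Bar", "Miller")
```

Returning `None` for a missing tuple is a design choice of this sketch; the slides leave that case to the cursor mechanism.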
Embedded Queries
Embedded SQL has some
limitations regarding queries:
 SELECT-INTO for a query should produce a
single tuple.
 Otherwise, you have to use a cursor.
Cursor Statements
Declare a cursor c with:
EXEC SQL DECLARE c CURSOR FOR <query>;
Open and close cursor c with:
EXEC SQL OPEN c;
EXEC SQL CLOSE c;
Fetch from c by:
EXEC SQL FETCH c INTO <variable(s)>;
 Macro NOT FOUND is true if and only if the FETCH
fails to find a tuple.
Example: Print Joe’s Menu
Let’s write C + SQL to print Joe’s menu
– the list of beer-price pairs that we
find in Sells(bar, beer, price) with bar =
Joe’s Bar.
A cursor will visit each Sells tuple that
has bar = Joe’s Bar.
Example: Declarations
EXEC SQL BEGIN DECLARE SECTION;
char theBeer[21]; float thePrice;
EXEC SQL END DECLARE SECTION;
/* the cursor declaration goes outside the declare-section */
EXEC SQL DECLARE c CURSOR FOR
SELECT beer, price FROM Sells
WHERE bar = 'Joe''s Bar';
Example: Executable Part
EXEC SQL OPEN c;
while(1) {
    EXEC SQL FETCH c
        INTO :theBeer, :thePrice;
    if (NOT FOUND) break;  /* the C style of breaking loops */
    /* format and print theBeer and thePrice */
}
EXEC SQL CLOSE c;
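The same fetch-until-not-found loop can be sketched in Python with the standard-library sqlite3 module (an assumption; the slides use C). Here `fetchone()` returning `None` plays the role of NOT FOUND. The function name `joes_menu` and the sample rows are hypothetical.

```python
import sqlite3

def joes_menu(conn):
    """Collect formatted beer-price pairs for Joe's Bar, one fetch at a time."""
    cur = conn.cursor()
    cur.execute(
        "SELECT beer, price FROM Sells WHERE bar = 'Joe''s Bar' ORDER BY beer"
    )
    menu = []
    while True:
        row = cur.fetchone()
        if row is None:  # no more tuples: the analogue of NOT FOUND
            break
        the_beer, the_price = row
        menu.append(f"{the_beer}: {the_price:.2f}")
    cur.close()
    return menu

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's Bar", "Bud", 2.50), ("Joe's Bar", "Miller", 3.00),
     ("Sue's Bar", "Bud", 2.75)],
)
menu_lines = joes_menu(conn)
```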
Need for Dynamic SQL
Most applications use specific queries
and modification statements to interact
with the database.
 The DBMS compiles EXEC SQL … statements
into specific procedure calls and produces an
ordinary host-language program that uses a
library.
What about programs that do not know the text of a query until run time?
Dynamic SQL
Preparing a query:
EXEC SQL PREPARE <query-name>
FROM <text of the query>;
Executing a query:
EXEC SQL EXECUTE <query-name>;
“Prepare” = optimize query.
Prepare once, execute many times.
Example: A Generic Interface
EXEC SQL BEGIN DECLARE SECTION;
char query[MAX_LENGTH];
EXEC SQL END DECLARE SECTION;
while(1) {
    /* issue SQL> prompt */
    /* read user's query into array query */
    /* q is an SQL variable representing the optimized
       form of whatever statement is typed into :query */
    EXEC SQL PREPARE q FROM :query;
    EXEC SQL EXECUTE q;
}
Execute-Immediate
If we are only going to execute the
query once, we can combine the
PREPARE and EXECUTE steps into one.
Use:
EXEC SQL EXECUTE IMMEDIATE <text>;
Example: Generic Interface Again
EXEC SQL BEGIN DECLARE SECTION;
char query[MAX_LENGTH];
EXEC SQL END DECLARE SECTION;
while(1) {
/* issue SQL> prompt */
/* read user’s query into array
query */
EXEC SQL EXECUTE IMMEDIATE :query;
}
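A rough Python counterpart of the generic interface, using the standard-library sqlite3 module (an assumption; the slides use C with EXEC SQL). The statement text is only known at run time and is passed to the database as a string. The helper name `run_statement` and the three sample statements are hypothetical, and a real prompt loop would read the text from the user instead of a list.

```python
import sqlite3

def run_statement(conn, text):
    """Execute SQL text that is only known at run time and return any rows."""
    cur = conn.execute(text)
    rows = cur.fetchall()  # empty list for statements that produce no rows
    conn.commit()
    return rows

conn = sqlite3.connect(":memory:")
script = [
    "CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)",
    "INSERT INTO Sells VALUES ('Brass Rail', 'Bud', 3.00)",
    "SELECT bar, beer, price FROM Sells",
]
results = [run_statement(conn, text) for text in script]
```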
Host/SQL Interfaces Via
Libraries
 Another approach to connecting
databases to conventional languages
is to use library calls.
1. C + CLI
2. Java + JDBC
3. PHP + PEAR/DB
4. Python + PyGreSQL
JDBC
Java Database Connectivity (JDBC) is a
library with Java as the host language.
Making a Connection
import java.sql.*;  // the JDBC classes
// the driver for Postgres (others exist); the driver class is loaded by forName
Class.forName("org.postgresql.Driver");
Connection myCon =
    DriverManager.getConnection(…);  // URL of the database, your name, and password go here
Statements
 JDBC provides two classes:
1. Statement = an object that can accept a
string that is a SQL statement and can
execute such a string.
2. PreparedStatement = an object that has
an associated SQL statement ready to
execute.
Creating Statements
The Connection class has methods to create
Statements and PreparedStatements.
Statement stat1 = myCon.createStatement();
PreparedStatement stat2 =
    myCon.prepareStatement(
        "SELECT beer, price FROM Sells " +
        "WHERE bar = 'Joe''s Bar' "
    );
createStatement (no argument) returns a Statement;
prepareStatement (one argument, the SQL text) returns
a PreparedStatement.
Executing SQL Statements
JDBC distinguishes queries from
modifications, which it calls “updates.”
Statement and PreparedStatement each
have methods executeQuery and
executeUpdate.
 For Statements: one argument: the query or
modification to be executed.
 For PreparedStatements: no argument.
Example: Update
stat1 is a Statement.
We can use it to insert a tuple as:
stat1.executeUpdate(
    "INSERT INTO Sells " +
    "VALUES('Brass Rail','Bud',3.00)"
);
Example: Query
stat2 is a PreparedStatement holding
the query ”SELECT beer, price FROM
Sells WHERE bar = ’Joe’’s Bar’ ”.
executeQuery returns an object of class
ResultSet – we’ll examine it later.
The query:
ResultSet menu = stat2.executeQuery();
Accessing the ResultSet
An object of type ResultSet is
something like a cursor.
Method next() advances the “cursor” to
the next tuple.
 The first time next() is applied, it gets the
first tuple.
 If there are no more tuples, next() returns
the value false.
Accessing Components of Tuples
When a ResultSet is referring to a
tuple, we can get the components of
that tuple by applying certain methods
to the ResultSet.
Method getX (i ), where X is some
type, and i is the component number,
returns the value of that component.
 The value must have type X.
Example: Accessing Components
menu = ResultSet for query “SELECT beer,
price FROM Sells WHERE bar = 'Joe''s Bar'”.
Access beer and price from each tuple by:
while ( menu.next() ) {
    theBeer = menu.getString(1);
    thePrice = menu.getFloat(2);
    /* something with theBeer and thePrice */
}
Stored Procedures
PSM, or “persistent stored modules,”
allows us to store procedures as
database schema elements.
PSM = a mixture of conventional
statements (if, while, etc.) and SQL.
Lets us do things we cannot do in SQL
alone.
Basic PSM Form: PL/PgSQL
CREATE FUNCTION <name>
(<parameter list> )
RETURNS <type> AS $$
[ DECLARE declarations ]
BEGIN
statements
END;
$$ language plpgsql;
Example: Stored Procedure
Let’s write a procedure that takes two
arguments b and p, and adds a tuple
to Sells(bar, beer, price) that has bar =
’Joe’’s Bar’, beer = b, and price = p.
 Used by Joe to add to his menu more
easily.
Kinds of PSM statements – (1)
RETURN <expression> sets the return
value of a function.
 In standard PSM, unlike C, etc., RETURN does not terminate
function execution. (In PL/pgSQL, RETURN does end the function.)
DECLARE <name> <type> used to
declare local variables.
BEGIN . . . END for groups of statements.
 Separate statements by semicolons.
Kinds of PSM Statements – (2)
Assignment statements:
<variable> := <expression>;
 Example: b := ’Bud’;
Statement labels: give a statement a
label by prefixing a name and a colon.
Example: IF
Let’s rate bars by how many customers
they have, based on Frequents(drinker,bar).
 <100 customers: ‘unpopular’.
 100-199 customers: ‘average’.
 >= 200 customers: ‘popular’.
Function Rate(b) rates bar b.
Example: IF (continued)
CREATE FUNCTION Rate (b CHAR(20) )
RETURNS CHAR(10) AS $$
DECLARE cust INTEGER;  -- number of customers of bar b
BEGIN
    SELECT COUNT(*) INTO cust FROM Frequents
    WHERE bar = b;
    -- nested IF statement
    IF cust < 100 THEN RETURN 'unpopular';
    ELSEIF cust < 200 THEN RETURN 'average';
    ELSE RETURN 'popular';
    END IF;
END;  -- PSM: return occurs here, not at one of the RETURN statements
$$ LANGUAGE plpgsql;
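For comparison, the rating rule can be sketched on the application side in Python with the standard-library sqlite3 module (an assumption; the slides keep this logic inside the database). The function name `rate` and the generated drinker names are hypothetical. Note that a Python `return` ends the function immediately.

```python
import sqlite3

def rate(conn, bar):
    """Rate a bar by its number of customers in Frequents(drinker, bar)."""
    (cust,) = conn.execute(
        "SELECT COUNT(*) FROM Frequents WHERE bar = ?", (bar,)
    ).fetchone()
    if cust < 100:
        return "unpopular"
    elif cust < 200:
        return "average"
    else:
        return "popular"

# Hypothetical data: 150 customers at Joe's Bar, 200 at Sue's Bar.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Frequents (drinker TEXT, bar TEXT)")
conn.executemany(
    "INSERT INTO Frequents VALUES (?, ?)",
    [(f"d{i}", "Joe's Bar") for i in range(150)]
    + [(f"d{i}", "Sue's Bar") for i in range(200)],
)
ratings = {b: rate(conn, b) for b in ("Joe's Bar", "Sue's Bar", "Empty Bar")}
```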
Loops
Basic form:
<loop name>: LOOP <statements>
END LOOP;
Exit from a loop by:
LEAVE <loop name>
Example: Exiting a Loop
loop1: LOOP
    ...
    LEAVE loop1;   -- if this statement is executed . . .
    ...
END LOOP;
                   -- . . . control winds up here
Breaking Cursor Loops – (4)
The structure of a cursor loop is thus:
cursorLoop: LOOP
…
FETCH c INTO … ;
IF NotFound THEN LEAVE cursorLoop;
END IF;
…
END LOOP;
Example: Cursor
Let’s write a function that examines
Sells(bar, beer, price), and raises by $1
the price of all beers at Joe’s Bar that
are under $3. Returns a count of
number of changes made.
The Needed Declarations
CREATE FUNCTION JoeGouge( )
RETURNS int AS $$
-- theBeer and thePrice hold beer-price pairs when fetching through cursor c
DECLARE theBeer CHAR(20);
DECLARE thePrice REAL;
DECLARE cnt INT;
-- the cursor's query returns Joe's menu
DECLARE c CURSOR FOR
    (SELECT beer, price FROM Sells
     WHERE bar = 'Joe''s Bar');
The Function Body
BEGIN
    OPEN c;
    cnt := 0;
    <<menuLoop>> LOOP
        FETCH c INTO theBeer, thePrice;
        -- check if the recent FETCH failed to get a tuple
        -- (PL/pgSQL spells PSM's LEAVE as EXIT)
        IF NOT FOUND THEN EXIT menuLoop; END IF;
        -- if Joe charges less than $3 for the beer, raise its price at Joe's Bar by $1
        IF thePrice < 3.00 THEN
            UPDATE Sells SET price = thePrice + 1.00
            WHERE bar = 'Joe''s Bar' AND beer = theBeer;
            cnt := cnt+1;
        END IF;
    END LOOP;
    CLOSE c;
    RETURN cnt;
END;
$$ LANGUAGE plpgsql;
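The same scan-and-update logic can be sketched from the client side in Python with the standard-library sqlite3 module (an assumption; the slides run it as a stored function). The name `joe_gouge` and the sample rows are hypothetical. The menu is fetched in full before updating so that the updates cannot disturb the scan, which is a choice of this sketch.

```python
import sqlite3

def joe_gouge(conn):
    """Raise by $1 every beer under $3 at Joe's Bar; return the number of changes."""
    cnt = 0
    menu = conn.execute(
        "SELECT beer, price FROM Sells WHERE bar = 'Joe''s Bar'"
    ).fetchall()
    for the_beer, the_price in menu:
        if the_price < 3.00:
            conn.execute(
                "UPDATE Sells SET price = ? "
                "WHERE bar = 'Joe''s Bar' AND beer = ?",
                (the_price + 1.00, the_beer),
            )
            cnt += 1
    conn.commit()
    return cnt

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's Bar", "Bud", 2.50), ("Joe's Bar", "Miller", 3.00),
     ("Joe's Bar", "Coors", 2.00), ("Sue's Bar", "Bud", 2.50)],
)
changed = joe_gouge(conn)
prices = conn.execute(
    "SELECT bar, beer, price FROM Sells ORDER BY bar, beer"
).fetchall()
```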
Disks and Files
Record Formats, Files & Indexing
Disks, Memory, and Files
The BIG picture…
Query Optimization
and Execution
Relational Operators
Files and Access Methods
Buffer Management
Disk Space Management
DB
Hierarchy of Storage
 Cache is faster and more expensive than...
 RAM is faster and more expensive than...
 DISK is faster and more expensive than...
 DVD is faster and more expensive than...
 Tape is faster and more expensive than...
 If more information than will fit at one level,
have to store some of it at the next level
Why Not Store Everything in Main
Memory?
Costs too much. $100 will buy you either
4GB of RAM or 1TB of disk today.
Main memory is volatile. We want data to
be saved between runs. (Obviously!)
Typical storage hierarchy:
 Main memory (RAM) for currently used data.
 Disk for the main database (secondary storage).
 Tapes for archiving older versions of the data
(tertiary storage).
Disks and Files
DBMS stores information on (“hard”) disks.
This has major implications for DBMS
design!
 READ: transfer data from disk to main memory
(RAM).
 WRITE: transfer data from RAM to disk.
 Both are high-cost operations, relative to in-memory
operations, so must be planned carefully!
Disks
Secondary storage device of choice.
Main advantage over tapes: random
access vs. sequential.
Data is stored and retrieved in units
called disk blocks or pages.
Unlike RAM, time to retrieve a disk
page varies depending upon location on
disk.
 Therefore, relative placement of pages on
disk has major impact on DBMS performance!
Pages and Blocks
Data files decomposed into pages
 Fixed size piece of contiguous information in the file
 Unit of exchange between disk and main memory
Disk divided into page size blocks of storage
 Page can be stored in any block
Application’s request for data satisfied by
 Read data page to DBMS buffer in main memory
 Transfer requested data from buffer to application
Application’s request to change data satisfied by
 Update DBMS buffer
 (Eventually) copy buffer page to page on disk
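The read and update paths above can be sketched as a toy buffer pool in Python. Everything here is an illustrative assumption: a dict stands in for the disk, the class name `BufferPool` and its methods are hypothetical, and there is no eviction policy.

```python
PAGE_SIZE = 4096  # bytes per page (assumed)

class BufferPool:
    """Toy DBMS buffer: pages are read from 'disk' once and written back lazily."""

    def __init__(self, disk):
        self.disk = disk      # dict: block number -> page contents (stands in for disk)
        self.frames = {}      # pages currently held in main memory
        self.dirty = set()    # pages changed in memory but not yet on disk
        self.reads = 0
        self.writes = 0

    def get_page(self, block_no):
        """Read the page into the buffer if needed, then serve it from memory."""
        if block_no not in self.frames:
            self.frames[block_no] = bytearray(self.disk[block_no])
            self.reads += 1
        return self.frames[block_no]

    def update(self, block_no, offset, data):
        """Change data in the buffered page only; the disk copy is now stale."""
        page = self.get_page(block_no)
        page[offset:offset + len(data)] = data
        self.dirty.add(block_no)

    def flush(self):
        """(Eventually) copy dirty buffer pages back to their disk blocks."""
        for block_no in sorted(self.dirty):
            self.disk[block_no] = bytes(self.frames[block_no])
            self.writes += 1
        self.dirty.clear()

disk = {0: bytes(PAGE_SIZE), 1: bytes(PAGE_SIZE)}
pool = BufferPool(disk)
pool.update(0, 0, b"Bud")
pool.update(0, 3, b"!")
in_buffer = bytes(pool.get_page(0)[:4])
on_disk_before = disk[0][:4]
pool.flush()
on_disk_after = disk[0][:4]
```

Two updates to the same page cost one disk read and one disk write here, which is the point of buffering.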
Components of a Disk
The platters spin (7200rpm).
The arm assembly is moved in or out to position
a head on a desired track. Tracks under heads make
a cylinder (imaginary!).
Only one head reads/writes at any one time.
Block size is a multiple of sector size (which is fixed).
[Figure: disk components, labeled spindle, platters, tracks, sector, disk head, arm assembly, arm movement]
I/O Time to Access a Page
Seek latency- time to position heads over
cylinder containing page (~ 10 - 20 ms)
Rotational latency - additional time for
platters to rotate so that start of page is
under head (~ 5 - 10 ms)
Transfer time - time for platter to rotate
over page (depends on size of page)
Latency = seek latency + rotational
latency
Goal - minimize average latency, reduce
number of page transfers
Accessing a Disk Page
Time to access (read/write) a disk block:
 seek time (moving arms to position disk head on track)
 rotational delay (waiting for block to rotate under head)
 transfer time (actually moving data to/from disk surface)
Seek time and rotational delay dominate.
 Seek time varies from about 1 to 20msec
 Rotational delay varies from 0 to 10msec
 Transfer time is about 1msec per 4KB page
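A small worked calculation of the access-time formula, in Python. The averages are assumptions: they are simply the midpoints of the ranges quoted above (1 to 20 ms seek, 0 to 10 ms rotational delay), with 1 ms transfer per 4KB page. The function name `access_time_ms` is hypothetical.

```python
def access_time_ms(seek_ms, rotational_ms, pages=1, transfer_ms_per_page=1.0):
    """Seek time + rotational delay + transfer time for `pages` contiguous pages."""
    return seek_ms + rotational_ms + pages * transfer_ms_per_page

# Assumed averages: midpoints of the quoted ranges.
AVG_SEEK_MS = (1 + 20) / 2        # 10.5
AVG_ROTATIONAL_MS = (0 + 10) / 2  # 5.0

one_page_ms = access_time_ms(AVG_SEEK_MS, AVG_ROTATIONAL_MS)
latency_share = (AVG_SEEK_MS + AVG_ROTATIONAL_MS) / one_page_ms

print(f"one 4KB page: {one_page_ms} ms, latency share: {latency_share:.0%}")
```

Under these assumptions roughly 94% of a single-page access is latency rather than transfer, which is why seek time and rotational delay are said to dominate.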
Reducing Latency
Store pages containing related information
close together on disk
 Justification: If application accesses x, it will next
access data related to x with high probability
Page size tradeoff:
 Large page size - data related to x stored in same
page; hence additional page transfer can be
avoided
 Small page size - reduce transfer time, reduce
buffer size in main memory
 Typical page size - 4096 bytes
Arranging Pages on Disk
`Next’ block concept:
 blocks on same track, followed by
 blocks on same cylinder, followed by
 blocks on adjacent cylinder
Blocks in a file should be arranged
sequentially on disk (by `next’), to
minimize seek and rotational delay.
For a sequential scan, pre-fetching
several pages at a time is a big win!
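To see why sequential arrangement pays off, the sketch below compares reading 100 pages scattered at random against reading them in `next' order. The numbers are assumptions (10.5 ms average seek, 5 ms average rotational delay, 1 ms transfer per page), and the sequential case is idealized: one seek and one rotational delay, then pure transfer.

```python
SEEK_MS = 10.5        # assumed average seek time
ROTATIONAL_MS = 5.0   # assumed average rotational delay
TRANSFER_MS = 1.0     # assumed transfer time per 4KB page

def random_read_ms(n_pages):
    """Every page pays its own seek and rotational delay."""
    return n_pages * (SEEK_MS + ROTATIONAL_MS + TRANSFER_MS)

def sequential_read_ms(n_pages):
    """Idealized: position the head once, then transfer page after page."""
    return SEEK_MS + ROTATIONAL_MS + n_pages * TRANSFER_MS

n = 100
random_ms = random_read_ms(n)
sequential_ms = sequential_read_ms(n)
speedup = random_ms / sequential_ms

print(f"random: {random_ms} ms, sequential: {sequential_ms} ms, "
      f"speedup: {speedup:.1f}x")
```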
RAID
Disk Array: Arrangement of several disks
that gives abstraction of a single, large
disk.
Goals: Increase performance and
reliability.
Two main techniques:
 Data striping: Data is partitioned; size of a
partition is called the striping unit. Partitions
are distributed over several disks.
 Redundancy: More disks => more failures.
Redundant information allows reconstruction
of data if a disk fails.
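Both techniques can be sketched in a few lines of Python. Two assumptions go beyond the slide: partitions are dealt to disks round-robin, and the redundant information is an XOR parity block (one common scheme; the slide does not name one). The function names `stripe` and `xor_blocks` are hypothetical.

```python
from functools import reduce

def stripe(data, unit, n_disks):
    """Partition data into striping units and deal them round-robin to disks."""
    disks = [[] for _ in range(n_disks)]
    units = [data[i:i + unit] for i in range(0, len(data), unit)]
    for k, u in enumerate(units):
        disks[k % n_disks].append(u)
    return disks

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (used here as a parity block)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = b"ABCDEFGHIJKL"
disks = stripe(data, unit=4, n_disks=3)

# One parity block protects the first stripe (one unit from each data disk).
first_stripe = [d[0] for d in disks]
parity = xor_blocks(first_stripe)

# Simulate losing disk 1 and rebuilding its unit from parity and the survivors.
rebuilt = xor_blocks([parity, first_stripe[0], first_stripe[2]])
```

XOR works for reconstruction because XOR-ing the parity with the surviving units cancels them out and leaves the missing unit.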