Download PPT - NYU Stern School of Business

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
C20.0046: Database
Management Systems
Lecture #26
Matthew P. Johnson
Stern School of Business, NYU
Spring, 2004
M.P. Johnson, DBMS, Stern/NYU, Sp2004
1
Agenda


Previously: Indices
Next:




Finish Indices, advanced indices
Failure/recovery
Data warehousing & mining
 Websearch
Hw3 due today

no extensions!

1-minute responses

Review: clustered, dense, primary, #/tbl, syntax
M.P. Johnson, DBMS, Stern/NYU, Sp2004
2
User/
Application
Transaction
commands
Let’s get physical
Query
update
Query compiler/optimizer
Record,
index
requests
Transaction manager:
•Concurrency control
•Logging/recovery
Read/write
pages
Execution engine
Query execution
plan
Index/record mgr.
Page
commands
Buffer manager
Storage manager
storage
M.P. Johnson, DBMS, Stern/NYU, Sp2004
3
BSTs

Very simple data structure in CS: BSTs




Each node has two children:



Binary Search Trees
Keep balanced
Each node ~ one item
Left subtree: <
Right subtree: >=
Can search, insert, delete in log time

log2(1MB = 220) = 20
M.P. Johnson, DBMS, Stern/NYU, Sp2004
4
Search for DBMS

Big improvement: log2(1MB) = 20

Each op divides remaining range in half!

But recall: all that matters is #disk accesses

20 is better than 220 but:
Can we do better?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
5
BSTs  B-trees


Like BSTs except each node ~ one block
Branching factor is >> 2




Data stored only in leaves



Each access divides remaining range by, say, 300
B-trees = BSTs + blocks
B+ trees are a variant of B-trees
Leaves form a (sorted) linked list
Better supports range queries
Consequences:



Much shorter depth  Many fewer disk reads
Must find element within node
Trades CPU/RAM time for disk time
M.P. Johnson, DBMS, Stern/NYU, Sp2004
6
B+ Trees

Parameter n  branching factor is n+1

Largest number s.t. one block can contain n
search-key values and n+1 pointers
30
Keys k < 30

120
Keys 30<=k<120
240
Keys 120<=k<240
Keys 240<=k
Each node (except root) has at least n/2 keys
40
50
60
Next leaf
40
50
60
M.P. Johnson, DBMS, Stern/NYU, Sp2004
7
Searching a B+ Tree

Exact key values:



Start at the root
If we’re in leaf, walk through its key values;
If not, look at keys K1..Kn


If Ki <= K <= Ki+1, look in child i
Range queries:


Select name
From people
Where age = 25
As above
Then walk left until test fails
M.P. Johnson, DBMS, Stern/NYU, Sp2004
Select name
From people
Where 20 <= age
and age <= 30
8
B+ Tree Example
Find the key 40
n=4
80
40  80
20
60
100
12
0
140
20 < 40  60
10
15
18
20
30
40
50
60
65
80
85
90
30 < 40  40
10
15
18
20
30
40
50
60
65
80
85
90
NB: Leaf keys are sorted; data pointed to is only if clustered
M.P. Johnson, DBMS, Stern/NYU, Sp2004
9
Clustered & unclustered B-trees

Data entries
Data entries
(Index
File)
(Data file)
Data Records
CLUSTERED
Data Records
UNCLUSTERED
B+ trees, and, or

Assume index on a,b,c

Intuition: phone book

WHERE a = ‘x’ and b = ‘y’

WHERE b = ‘y’ and c = ‘z’

WHERE a = ‘a’ and c = ‘z’

WHERE a = ‘x’ or b = ‘y’ or c = ‘z’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
11
B+ trees and LIKE

Supports only hard-coded prefix LIKE checks

Intuition: phone book

Select * from T where a like ‘xyz%’

Select * from T where a like ‘%xyz’

Select * from T where a like ‘xyz%zyx%’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
12
B-tree search efficiency

With params:




the largest n satisfying 4n+8(n+1) <= 4096 is n=340



block=4k
integer = 4b,
pointer = 8b
Each node has 170..340 keys
assume on avg has (170+340)/2=255
Then:




255 rows  depth = 1
2552 = 64k rows  depth = 2
2553 = 16M rows  depth = 3
2554 = 4G rows  depth = 4
M.P. Johnson, DBMS, Stern/NYU, Sp2004
13
B-trees in practice

Most DBMSs use B-trees for most indices



Speeds up





Default in MySQL
Default in Oracle
where clauses
Some like checks
Min or max functions
joins
Limitation: fields used must


Be a prefix of indexed fields
Be ANDed together
M.P. Johnson, DBMS, Stern/NYU, Sp2004
14
Next topic: Advanced types of indices

Spatial indices based on R-trees (R = region)

Support multi-dimensional searches on
“geometry” fields
2-d not 1-d ranges

Oracle:
CREATE INDEX geology_rtree_idx ON
geology_tab(geometry) INDEXTYPE IS
MDSYS.SPATIAL_INDEX;

MySQL:
CREATE TABLE geom (g GEOMETRY NOT
NULL, SPATIAL INDEX(g));

M.P. Johnson, DBMS, Stern/NYU, Sp2004
15
Advanced types of indices

Inverted indices for web doc search

First, think of each webpage as a tuple




One column for every possible word
True means the word appears on the page
Index on all columns
Now can search: you’re fired

 select * from T where youre=T and fired=T
M.P. Johnson, DBMS, Stern/NYU, Sp2004
16
Advanced types of indices
Can simplify somewhat:

1.
2.
For each field index, delete False entries
True entries for each index become a bucket
Create “inverted index”:





One entry for each search word
Search word entry points to corresponding
bucket
Bucket points to pages with its word
Amazon
M.P. Johnson, DBMS, Stern/NYU, Sp2004
17
Advanced types of indices

Function-based indices

Speeds up WHERE upper(name)=‘BUSH’, etc.
create index on T(my_soundex(name));
create index on T(substr(DOB),4,5));


Now supported in Oracle 8, not MySQL
Bitmap indices

Speeds up arbitrary combination of reqs


Not limited to prefixes or conjunctions
Now supported in Oracle 9, not MySQL
M.P. Johnson, DBMS, Stern/NYU, Sp2004
18
Bitmap indices



Assume table has n records
Assume F is a field with m different values
Bitmap index on F: m length-n bitstrings





One bitstring for each value of F
Each one says which rows have that value for F
Example:
n = , mF =
1
, mG =
2
3
Q: find rows where
4
F=50 or (F=30 and G=‘Baz’)
5
6
M.P. Johnson, DBMS, Stern/NYU, Sp2004
F
30
30
40
50
40
30
G
Foo
Bar
Baz
Foo
Bar
Baz
19
Bitmap index search


Larger example: (age,salary) of jewelry buyers:
Sal.
Age
Sal.
1
25
60
2
45
3
4
Age
Sal.
5
50
120
9
25
400
60
6
70
110
10
45
350
50
75
7
85
140
11
50
275
50
100
8
30
260
12
50
260
Bitmaps for age:


Age
25:100000001000, 30:000000010000, 45:01000000100,
50:001110000010, 60:000000000001, 70:000001000000,
85:000000100000
Bitmaps for salary:

60:110000000000, 75:001000000000, 100:000100000000,
110:000001000000, 120:000010000000, 140:000000100000,
260:000000010001, 275:000000000010,
20
M.P. Johnson, DBMS, Stern/NYU, Sp2004
350:000000000100, 400:000000001000
Bitmap index search





Query: find buyers of age 45-55 with salary
100-200
Age range: 010000000100 (45) |
001110000010 (50) = 011110000110
Bitwise or of Salary range: 000111100000
AND together: 011110000110 &
000111100000 = 000110000000
What does this mean?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
21
Bitmap index search

Once we have row numbers, then what?


Get rows with those numbers (How?)
Bitmap indices in Oracle:
CREATE BITMAP INDEX ON T(F,G);

Best for low-cardinality fields



Boolean, enum, gender
 lots of 0s in our bitmaps
Compress: 000000100001  6141

“run-length encoding”
M.P. Johnson, DBMS, Stern/NYU, Sp2004
22
New topic: Recovery
Type of Crash
Prevention
Wrong data entry
Constraints and
Data cleaning
Disk crashes
Redundancy:
e.g. RAID, archive
Fire, theft,
bankruptcy…
Buy insurance,
Change jobs…
System failures:
e.g. blackout
DATABASE
RECOVERY
M.P. Johnson, DBMS, Stern/NYU, Sp2004
23
System Failures


Each transaction has internal state
When system crashes, internal state is lost


Don’t know which parts executed and which didn’t
Remedy: use a log


A file that records each action of each xact
Trail of breadcrumbs
M.P. Johnson, DBMS, Stern/NYU, Sp2004
24
Media Failures


Rule of thumb: Pr(hard drive has head crash
within 10 years) = 50%
Simpler rule of thumb: Pr(hard drive has head
crash within 1 years) = 10%



Serious problem
Soln: different RAID strategies
RAID: Redundant Arrays of Independent
Disks
M.P. Johnson, DBMS, Stern/NYU, Sp2004
25
RAID levels


RAID level 1: each disk gets a mirror
RAID level 4: one disk is xor of all others


E.g.:





Each bit is sum mod 2 of corresponding bits
Disk 1: 11110000
Disk 2: 10101010
Disk 3: 00111000
Disk 4:
How to recover?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
26
Transactions


Transaction: unit of code to be executed
atomically
In ad-hoc SQL


one command = one transaction
In embedded SQL


Transaction starts = first SQL command issued
Transaction ends =



COMMIT
ROLLBACK (=abort)
Can turn off/on autocommit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
27
Primitive operations of transactions

Each xact reads/writes rows or blocks: elms

INPUT(X)


READ(X,t)


copy transaction local variable t to element X
OUTPUT(X)


copy element X to transaction local variable t
WRITE(X,t)


read element X to memory buffer
write element X to disk
LOG RECORD
M.P. Johnson, DBMS, Stern/NYU, Sp2004
28
Transaction example

Xact: Transfer $100 from savings to checking

A = A+100;
B = B-100;

READ(A,t);
t
:= t+100;
WRITE(A,t);
READ(B,t);
t
:= t-100;
WRITE(B,t)
M.P. Johnson, DBMS, Stern/NYU, Sp2004
29
Transaction example

READ(A,t); t := t+100;WRITE(A,t); READ(B,t); t := t-100;WRITE(B,t)
Action
t
INPUT(A)
Mem A
Mem B
Disk A
Disk B
1000
1000
1000
READ(A,t)
1000
1000
1000
1000
t:=t+100
1100
1000
1000
1000
WRITE(A,t)
1100
1100
1000
1000
INPUT(B)
1100
1100
1000
1000
1000
READ(B,t)
1000
1100
1000
1000
1000
t:=t-100
900
1100
1000
1000
1000
WRITE(B,t)
900
1100
900
1000
1000
OUTPUT(A)
900
1100
900
1100
1000
OUTPUT(B)
900
1100
900
1100
900
M.P. Johnson, DBMS, Stern/NYU, Sp2004
30
The log



An append-only file containing log records
Note: multiple transactions run concurrently,
log records are interleaved
After a system crash, use log to:



Redo some transaction that didn’t commit
Undo other transactions that didn’t commit
Three kinds of logs: undo, redo, undo/redo

We’ll discuss only Undo
M.P. Johnson, DBMS, Stern/NYU, Sp2004
31
Undo Logging


Log records
<START T>


<COMMIT T>


T has committed
<ABORT T>


transaction T has begun
T has aborted
<T,X,v>

T has updated element X, and its old value was v
M.P. Johnson, DBMS, Stern/NYU, Sp2004
32
Undo-Logging Rules



U1: Changes logged (<T,X,v>) before being
written to disk
U2: Commits logged (<COMMIT T>) after being
written to disk
Results:



May forget we did whole xact (and so wrongly undo)
Will never forget did partial xact (and so leave)
Log-change, change, log-change, change,
Commit, log-commit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
33
Undo-Logging e.g. (inputs omitted)
Action
T
Mem A
Mem B
Disk A
Disk B
Log
<START T>
READ(A,t)
1000
1000
1000
1000
t:=t+100
1100
1000
1000
1000
WRITE(A,t)
1100
1100
1000
1000
READ(B,t)
1000
1100
1000
1000
1000
t:=t-100
900
1100
1000
1000
1000
WRITE(B,t)
900
1100
900
1000
1000
OUTPUT(A)
900
1100
900
1100
900
OUTPUT(B)
900
1100
900
1100
900
<T,A,8>
<T,B,8>
COMMIT
<COMMIT T>
M.P. Johnson, DBMS, Stern/NYU, Sp2004
34
Recovery with Undo Log
After system’s crash, run recovery manager

1.
Decide for each xact T whether it was completed
<START T>….<COMMIT T>  yes
<START T>….<ABORT T>
 yes
<START T>……………………  no
2.
Undo all modifications from incomplete xacts, in
reverse order (why?) and abort each
M.P. Johnson, DBMS, Stern/NYU, Sp2004
35
Recovery with Undo Log

Read log from the end; cases:

<COMMIT T>: mark T as completed
<ABORT T>: mark T as completed
<T,X,v>:

if T is not completed then
write X=v to disk
else
ignore
<START T>: ignore


M.P. Johnson, DBMS, Stern/NYU, Sp2004
36
Recovery with Undo Log
Start:
…
…
<T2,X2,v2>
…
…
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
Q: Which updates are
undone?
Crash!
M.P. Johnson, DBMS, Stern/NYU, Sp2004
37
Recovery with Undo Log

Note: undo commands are idempotent



How far back in the log do we go?



No harm done if we repeat them
Q: What if system crashes during recovery?
Don’t go all the way back to the start
May be very large
Better idea: use checkpointing
M.P. Johnson, DBMS, Stern/NYU, Sp2004
38
Checkpointing

Checkpoint the database periodically





Stop accepting new transactions
Wait until all current xacts complete
Flush log to disk
Write a <CKPT> log record, flush log
Resume accepting new xacts
M.P. Johnson, DBMS, Stern/NYU, Sp2004
39
Undo Recovery with Checkpointing
During recovery,
can stop at first
<CKPT>
…
…
<T1,X1,v1>
…
…
(all completed)
<CKPT>
<START T2>
<START T3
<START T5>
<START T4>
<T4,X4,v4>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
other xacts
xacts T2,T3,T4,T5
M.P. Johnson, DBMS, Stern/NYU, Sp2004
40
Non-quiescent Checkpointing



Problem: database must freeze during
checkpoint
Would like to checkpoint while database is
operational
Idea: non-quiescent checkpointing

Quiescent: quiet, still, at rest; inactive
M.P. Johnson, DBMS, Stern/NYU, Sp2004
41
Next time

Next: Data warehousing mining!
For next time: reading online

Proj5 due next Thursday



no extensions!
Now: one-minute responses


Relative weight: warehousing, mining, websearch
Data mining techniques




NNs
GAs
kNN
Decision Trees
M.P. Johnson, DBMS, Stern/NYU, Sp2004
42