Fusheng Wang, Xin Zhou, Carlo Zaniolo
Using XML to Build Efficient Transaction-Time Temporal Database Systems on
Relational Databases
In TimeCenter, 2005
Graduate Institute of Computer Science and Information Engineering, 莊政道, D95922014
Contents
- Introduction
- Viewing Relation History in XML
- Temporal Queries using XQuery
- The ArchIS System
- Temporal Clustering and Indexing
- Performance Study
- Database History Compression
- Conclusion

Introduction
- The additional complexity of going from standard queries to temporal ones
  is smaller with XML/XQuery than with relational tables and SQL.
- The evolution history of a relational database can be:
  - viewed naturally using XML
  - queried effectively using XQuery
- Temporal data and temporal queries can be supported efficiently using
  data-compression, clustering, indexing, and query-mapping techniques.
- Data model: temporally ungrouped vs. temporally grouped

Viewing Relation History in XML

Traditional transaction-time databases
(temporally ungrouped)

Viewing Relation History in XML (cont.)

Traditional transaction-time databases (temporally ungrouped):
- Changing an attribute value adds a new history tuple
- Redundant information
- Temporal queries frequently have to coalesce tuples
- Coalescing is complex and hard to scale in an RDBMS

Overcome using a temporally grouped representation:
- The time-stamped history of each attribute is grouped under the attribute
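
To make the coalescing cost concrete, here is a minimal Python sketch. The employee rows, the day granularity, and the helper name `coalesce` are illustrative assumptions, not taken from the paper. Projecting an ungrouped history onto (id, salary) leaves value-equal rows with adjacent periods, and these have to be merged.

```python
from datetime import date, timedelta

# Hypothetical ungrouped history: every attribute change adds a full tuple.
# Columns: id, name, salary, title, tstart, tend (closed intervals, day granularity).
ungrouped = [
    (1001, "Bob", 60000, "Engineer", date(1995, 1, 1), date(1995, 5, 31)),
    (1001, "Bob", 70000, "Engineer", date(1995, 6, 1), date(1995, 9, 30)),
    (1001, "Bob", 70000, "Sr Engineer", date(1995, 10, 1), date(1996, 1, 31)),
]

def coalesce(rows):
    """Merge rows with equal values whose periods overlap or are adjacent."""
    merged = []
    for value, start, end in sorted(rows):
        last = merged[-1] if merged else None
        if last and last[0] == value and start <= last[2] + timedelta(days=1):
            merged[-1] = (value, last[1], max(last[2], end))
        else:
            merged.append((value, start, end))
    return merged

# Temporal projection on (id, salary): drop name and title, then coalesce.
projected = [((emp_id, salary), start, end)
             for emp_id, _name, salary, _title, start, end in ungrouped]
salary_history = coalesce(projected)
for (emp_id, salary), start, end in salary_history:
    print(emp_id, salary, start, end)
```

In a grouped representation the salary history is already stored as two periods, so this merge step is not needed for the projection.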

Viewing Relation History in XML (cont.)

H-documents (or H-views when these are virtual representations):
- Nested representations are hard to represent in flat tables, but they can
  be represented naturally by XML-based hierarchical views.
- They greatly reduce the need for coalescing.
- Complex temporal queries can be expressed effectively with XQuery.
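
As an illustration of a temporally grouped view, the following Python sketch builds one employee element in which each attribute value carries its own tstart and tend. The element names, the sample values, and the helpers `h_element` and `build_employee` are assumptions made for this example; the actual H-document layout is defined in the paper's figures.

```python
import xml.etree.ElementTree as ET

# Hypothetical grouped history: attribute -> list of (value, tstart, tend).
history = {
    "id": [("1001", "1995-01-01", "1996-12-31")],
    "name": [("Bob", "1995-01-01", "1996-12-31")],
    "salary": [("60000", "1995-01-01", "1995-05-31"),
               ("70000", "1995-06-01", "1996-12-31")],
}

def h_element(tag, tstart, tend, text=None):
    """Create an element stamped with its validity period."""
    elem = ET.Element(tag, {"tstart": tstart, "tend": tend})
    elem.text = text
    return elem

def build_employee(history):
    """Group every attribute's time-stamped versions under one employee."""
    starts = [s for versions in history.values() for _, s, _ in versions]
    ends = [e for versions in history.values() for _, _, e in versions]
    employee = h_element("employee", min(starts), max(ends))
    for attr, versions in history.items():
        for value, tstart, tend in versions:
            employee.append(h_element(attr, tstart, tend, value))
    return employee

employee = build_employee(history)
print(ET.tostring(employee, encoding="unicode"))
```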

Temporal Queries using XQuery
- temporal projection
- temporal snapshot
- temporal slicing
- temporal join
- temporal aggregate
- restructuring

Temporal Queries using XQuery (cont.)
- Grouped data model: the result is already coalesced
- Ungrouped data model: the results still have to be coalesced

Temporal Queries using XQuery (cont.)
- xs: the namespace of XML Schema
- tstart($e): the start date of a node
- tend($e): the end date of a node

Temporal Queries using XQuery (cont.)
- toverlaps($a, $b): returns true if one node overlaps with the other, and false otherwise
- telement($a, $b): constructs an element with $a and $b as its attributes

Temporal Queries using XQuery (cont.)
- overlapinterval($a, $b): as in Ch. 3

Temporal Queries using XQuery (cont.)
- restructure($a, $b): takes two lists and returns all the overlapped intervals

Temporal Queries using XQuery (cont.)
- tcontains($a, $b): checks whether one interval covers the other

Temporal Queries using XQuery (cont.)
- tequals($d1, $d2): checks whether two nodes have equal intervals

Temporal Queries using XQuery (cont.)

Temporal Functions

Restructuring functions:
- coalesce($l): coalesces a list of nodes
- restructure($a, $b): returns all the overlapped intervals on two sets of nodes

Interval functions (as in Ch. 3):
- toverlaps($a, $b), tprecedes($a, $b), tcontains($a, $b), tequals($a, $b), tmeets($a, $b)
- overlapinterval($a, $b)
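
The interval functions can be sketched in Python as plain predicates over closed date intervals. This is an illustration only: the `Interval` type, the day granularity, and the exact reading of tmeets (one interval ends the day before the other starts) are assumptions, and the real functions operate on XML nodes rather than tuples.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional

class Interval(NamedTuple):
    tstart: date
    tend: date  # inclusive end; date.max (9999-12-31) can stand for 'now'

def toverlaps(a: Interval, b: Interval) -> bool:
    return a.tstart <= b.tend and b.tstart <= a.tend

def tprecedes(a: Interval, b: Interval) -> bool:
    return a.tend < b.tstart

def tcontains(a: Interval, b: Interval) -> bool:
    return a.tstart <= b.tstart and b.tend <= a.tend

def tequals(a: Interval, b: Interval) -> bool:
    return a.tstart == b.tstart and a.tend == b.tend

def tmeets(a: Interval, b: Interval) -> bool:
    # Assumed meaning: a ends exactly one day before b starts.
    return a.tend != date.max and a.tend + timedelta(days=1) == b.tstart

def overlapinterval(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the common sub-interval, or None when there is no overlap."""
    if not toverlaps(a, b):
        return None
    return Interval(max(a.tstart, b.tstart), min(a.tend, b.tend))

first_half = Interval(date(1995, 1, 1), date(1995, 5, 31))
second_half = Interval(date(1995, 6, 1), date(1995, 12, 31))
spring = Interval(date(1995, 3, 1), date(1995, 7, 31))
print(overlapinterval(first_half, spring))
```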

Temporal Queries using XQuery (cont.)

Temporal Functions

Duration and date/time functions:
- timespan($e): the time span of a node
- tstart($e): the start time of a node
- tend($e): the end time of a node
- tinterval($e): the interval of a node
- telement($Ts, $Te): constructs an empty element telement with attributes tstart and tend
- rtend($e): recursively replaces all occurrences of "9999-12-31" with the value of the current date
- externalnow($e): recursively replaces all occurrences of "9999-12-31" with the string "now"
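
A rough Python analogue of rtend and externalnow, working on an ElementTree copy of a node, is sketched here. It assumes that the end-of-time value only appears in attribute values, and the optional `today` parameter is added purely to make the example reproducible.

```python
import copy
import datetime
import xml.etree.ElementTree as ET

END_OF_TIME = "9999-12-31"

def _replace_end_of_time(elem, replacement):
    """Return a copy of elem with every end-of-time attribute value replaced."""
    result = copy.deepcopy(elem)
    for node in result.iter():  # iter() visits elem and all its descendants
        for name, value in list(node.attrib.items()):
            if value == END_OF_TIME:
                node.set(name, replacement)
    return result

def rtend(elem, today=None):
    today = today or datetime.date.today()
    return _replace_end_of_time(elem, today.isoformat())

def externalnow(elem):
    return _replace_end_of_time(elem, "now")

doc = ET.fromstring(
    '<employee tstart="1995-01-01" tend="9999-12-31">'
    '<salary tstart="1995-06-01" tend="9999-12-31">70000</salary>'
    '</employee>'
)
print(ET.tostring(externalnow(doc), encoding="unicode"))
```

Working on a copy keeps the internal end-of-time value intact, so the original node can still be passed to further queries.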

Temporal Queries using XQuery (cont.)

Support for 'now':
- now → current_timestamp, current_date
- value(now) → end-of-time (stored as 9999-12-31)

- Access with tstart($e), tend($e)
- Search based on indexes
- Temporal ordering can be used without any change

- query-1 → output (9999-12-31 for 'now') → input of query-2

The ArchIS System
- Underlying database systems: RDBMSs
- Design issues:
  - how to map (shred) the XML views representing the H-documents into tables (which we call H-tables)
  - how to translate queries from the XML views to the H-tables, and
  - which indexing, clustering, and query-mapping techniques should be used for high performance
The ArchIS System (cont.)
[Figure: a table in the relational DB and its H-tables. Labels: global relation table; key table (+tstart, tend); attribute history tables (+tstart, tend).]

The ArchIS System (cont.)

Relation in the current database:
employee(id, name, salary, title, deptno)
- id: the key attribute

The Key Table:
employee_id(id, tstart, tend)
- id does not change

Composite keys, e.g. (supplierno, itemno):
lineitem_id(id, supplierno, itemno, tstart, tend)
- id: a unique value generated from (supplierno, itemno)

The ArchIS System (cont.)

Attribute History Tables:
employee_name(id, name, tstart, tend)
employee_deptno(id, deptno, tstart, tend)
employee_salary(id, salary, tstart, tend)
employee_title(id, title, tstart, tend)

An index on id makes joins efficient.
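
The H-table layout can be tried out with Python's built-in sqlite3 module. The sketch below is only an approximation: the column types, index names, sample rows, and the `snapshot` helper are assumptions, and dates are stored as ISO strings so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_id     (id INTEGER, tstart TEXT, tend TEXT);
CREATE TABLE employee_name   (id INTEGER, name TEXT, tstart TEXT, tend TEXT);
CREATE TABLE employee_salary (id INTEGER, salary INTEGER, tstart TEXT, tend TEXT);
-- Indexes on id support the joins between attribute history tables.
CREATE INDEX employee_name_id   ON employee_name(id);
CREATE INDEX employee_salary_id ON employee_salary(id);

INSERT INTO employee_id     VALUES (1001, '1995-01-01', '9999-12-31');
INSERT INTO employee_name   VALUES (1001, 'Bob', '1995-01-01', '9999-12-31');
INSERT INTO employee_salary VALUES (1001, 60000, '1995-01-01', '1995-05-31');
INSERT INTO employee_salary VALUES (1001, 70000, '1995-06-01', '9999-12-31');
""")

def snapshot(conn, day):
    """Name and salary of every employee as of the given day."""
    return conn.execute(
        """
        SELECT N.name, S.salary
        FROM employee_name N JOIN employee_salary S ON N.id = S.id
        WHERE N.tstart <= :d AND :d <= N.tend
          AND S.tstart <= :d AND :d <= S.tend
        """,
        {"d": day},
    ).fetchall()

print(snapshot(conn, "1995-03-15"))
print(snapshot(conn, "1996-01-01"))
```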

The ArchIS System (cont.)

Insert a new tuple:
- TSTART = current timestamp
- TEND = now

Delete a current tuple:
- TEND = current timestamp

Update a current tuple:
- Delete + Insert
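
These three archiving rules can be sketched against a single attribute history table using sqlite3. The function names are hypothetical, 'now' is represented by the end-of-time value 9999-12-31, and the timestamp is passed in explicitly instead of being read from the clock so that the example is reproducible.

```python
import sqlite3

NOW = "9999-12-31"  # end-of-time value standing in for 'now'

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_salary (id INTEGER, salary INTEGER, tstart TEXT, tend TEXT)"
)

def archive_insert(conn, emp_id, salary, ts):
    # Insert: TSTART = current timestamp, TEND = now.
    conn.execute(
        "INSERT INTO employee_salary VALUES (?, ?, ?, ?)", (emp_id, salary, ts, NOW)
    )

def archive_delete(conn, emp_id, ts):
    # Delete: close the live version by setting TEND = current timestamp.
    conn.execute(
        "UPDATE employee_salary SET tend = ? WHERE id = ? AND tend = ?",
        (ts, emp_id, NOW),
    )

def archive_update(conn, emp_id, new_salary, ts):
    # Update: delete the live version, then insert the new one.
    archive_delete(conn, emp_id, ts)
    archive_insert(conn, emp_id, new_salary, ts)

archive_insert(conn, 1001, 60000, "1995-01-01")
archive_update(conn, 1001, 70000, "1995-06-01")
rows = conn.execute(
    "SELECT salary, tstart, tend FROM employee_salary ORDER BY tstart"
).fetchall()
print(rows)
```

Whether the closed version should end at the update timestamp itself or at the instant before it depends on the chosen time granularity; the sketch follows the rule exactly as stated above.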

The ArchIS System (cont.)

Global Relation Table:
relations(relationname, tstart, tend)
- records the history of all the relations in the database schema, i.e., the time spans covered by the various tables in the database
- corresponds to the root elements of H-documents

The ArchIS System (cont.)

Query Mapping:
- XQuery on H-views → equivalent SQL/XML expressions on H-tables
- SQL/XML constructs used on H-tables:
  - XMLElement: returns elements
  - XMLAttributes: returns attributes
  - XMLAgg: an aggregate function that constructs an XML value from a collection of XML value expressions
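
SQLite has no SQL/XML support, so the following Python sketch only imitates what the three constructs do to query results. The helpers `xml_element` and `xml_agg` are hypothetical stand-ins written for this example, not real SQL/XML or library functions, and the table contents and date are made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_name (id INTEGER, name TEXT, tstart TEXT, tend TEXT);
INSERT INTO employee_name VALUES (1001, 'Bob',   '1995-01-01', '9999-12-31');
INSERT INTO employee_name VALUES (1002, 'Alice', '2003-03-01', '9999-12-31');
""")

def xml_element(tag, attributes=None, text=None, children=()):
    """Stand-in for XMLElement with XMLAttributes: builds one element."""
    elem = ET.Element(tag, attributes or {})
    elem.text = text
    elem.extend(list(children))
    return elem

def xml_agg(elements):
    """Stand-in for XMLAgg: gathers the per-row XML values into one sequence."""
    return list(elements)

rows = conn.execute(
    "SELECT name, tstart, tend FROM employee_name WHERE tstart >= ? ORDER BY id",
    ("2003-02-04",),
).fetchall()

employees = xml_element(
    "employees",
    children=xml_agg(
        xml_element("name", {"tstart": tstart, "tend": tend}, text=name)
        for name, tstart, tend in rows
    ),
)
print(ET.tostring(employees, encoding="unicode"))
```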

The ArchIS System (cont.)

Query: return a new employees element containing all the employees hired after 02/04/2003.
[Figure: SQL/XML query and its output]

The ArchIS System (cont.)
[Figure: SQL/XML query]

The ArchIS System (cont.)
[Figure: SQL/XML query]

The ArchIS System (cont.)

Mapping (H-views → H-tables) in 5 steps:
1. Identification of variable range
   - tuple vs. attribute
   - key relation vs. attribute relation
   - variables (XQuery: for, let) → variables (SQL/XML: from)
2. Generation of join conditions
3. Generation of the where conditions
4. Translation of built-in functions
5. Output generation

SQL/XML queries often contain many natural joins (N.id = T.id). These joins execute very fast because the tables are sorted on the id attribute.

Temporal Clustering and Indexing

Usefulness-Based Clustering:
- Improves snapshot queries
- Stores attribute history in segments
- U = Ulive / Uall, where Ulive is the count of live (or current) tuples and Uall is the count of all tuples

When U < Umin:
- A new segment Si is allocated.
- The interval of this segment is recorded in the table segment(segno, segstart, segend).
- All tuples in SEGlive are copied into the new segment Si, sorted by ID.
- All live tuples in SEGlive are copied into a new live segment SEGlive', and the old live segment is dropped.
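
The archiving steps can be simulated in memory with a short Python sketch. The class name `SegmentedHistory`, the tuple layout, and the threshold of 0.5 are assumptions chosen for the example; a real implementation would operate on H-tables in the RDBMS.

```python
END_OF_TIME = "9999-12-31"

class SegmentedHistory:
    """In-memory model of one attribute history with a live segment."""

    def __init__(self, u_min, start):
        self.u_min = u_min
        self.live = []           # SEGlive: tuples (id, value, tstart, tend)
        self.live_start = start  # start of the live segment's interval
        self.archived = {}       # segno -> archived tuples, sorted by id
        self.segment_table = []  # rows of segment(segno, segstart, segend)

    def usefulness(self):
        if not self.live:
            return 1.0
        n_live = sum(1 for row in self.live if row[3] == END_OF_TIME)
        return n_live / len(self.live)

    def update(self, rid, value, ts):
        # Close the current version of rid (if any), then add the new one.
        self.live = [
            (i, v, s, ts) if i == rid and e == END_OF_TIME else (i, v, s, e)
            for i, v, s, e in self.live
        ]
        self.live.append((rid, value, ts, END_OF_TIME))
        if self.usefulness() < self.u_min:
            self._archive(ts)

    def _archive(self, ts):
        segno = len(self.segment_table) + 1
        self.segment_table.append((segno, self.live_start, ts))
        # Copy all tuples of SEGlive into the new segment, sorted by id.
        self.archived[segno] = sorted(self.live, key=lambda r: (r[0], r[2]))
        # Keep only the live tuples in the new live segment.
        self.live = [row for row in self.live if row[3] == END_OF_TIME]
        self.live_start = ts

h = SegmentedHistory(u_min=0.5, start="1995-01-01")
h.update(1, 100, "1995-01-01")
h.update(1, 110, "1995-02-01")
h.update(1, 120, "1995-03-01")  # usefulness drops to 1/3, so a segment is archived
print(h.segment_table, h.live)
```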

Temporal Clustering and Indexing (cont.)

Advantages of segment-based clustering:
- The current live segment always has a high usefulness, which assures effective updates.
- Records are globally temporally clustered on segments.
- A snapshot query uses only one segment, and a temporal slicing query uses only the segments involved, so such queries can be more efficient.

Temporal Clustering and Indexing (cont.)
[Figure: storage usage]

Temporal Clustering and Indexing (cont.)

Query Mapping with Clustering:
- First find the numbers of the segments that satisfy the tstart and tend conditions, and then search only those segments.
[Figure: SQL/XML query]
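
The two-step lookup can be sketched in Python as follows. The segment rows, the salary tuples, and the helper names `segments_for` and `snapshot` are hypothetical; in the system described here the same lookup is expressed in SQL/XML against the segment table.

```python
# Hypothetical rows of segment(segno, segstart, segend); segment 3 is the live one.
segment = [
    (1, "1995-01-01", "1995-12-31"),
    (2, "1996-01-01", "1997-06-30"),
    (3, "1997-07-01", "9999-12-31"),
]

# Hypothetical salary history per segment: (id, salary, tstart, tend).
# A tuple that is live when a segment is closed is carried into the next one.
history = {
    1: [(1001, 60000, "1995-01-01", "1995-05-31"),
        (1001, 70000, "1995-06-01", "9999-12-31")],
    2: [(1001, 70000, "1995-06-01", "1996-12-31"),
        (1001, 80000, "1997-01-01", "9999-12-31")],
    3: [(1001, 80000, "1997-01-01", "9999-12-31")],
}

def segments_for(qstart, qend):
    """Step 1: segment numbers whose interval overlaps [qstart, qend]."""
    return [segno for segno, segstart, segend in segment
            if segstart <= qend and qstart <= segend]

def snapshot(day):
    """Step 2: search only the selected segment(s) for tuples valid on day."""
    found = set()
    for segno in segments_for(day, day):
        for emp_id, salary, tstart, tend in history[segno]:
            if tstart <= day <= tend:
                found.add((emp_id, salary))
    return sorted(found)

print(segments_for("1995-12-01", "1996-02-01"))
print(snapshot("1996-03-01"))
```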

Temporal Clustering and Indexing (cont.)

Unless the number of segments becomes very large and exceeds the number of main memory blocks available for sort-merge joins, joining H-tables remains a very efficient one-pass operation.

Performance Study
[Figure: test queries]

Performance Study (cont.)

Query Performance:
- Segment-based archiving performs better than archiving without segments.

Performance Study (cont.)

Storage Utilization
[Figure: storage utilization]

Database History Compression

Block-based Compression: BlockZIP
- Compresses the data into block-sized blocks.
- After compression with BlockZIP, the output consists of a set of block-sized compressed blocks concatenated together.
- Uncompressing the whole file is not needed.
- BlockZIP supports decompression at the granularity of a block, so snapshot and temporal slicing queries can be efficient, since only a small number of blocks need to be uncompressed.
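
The idea of block-granular compression can be illustrated with Python's zlib module. This is not the BlockZIP implementation: the 256-byte block size, the zero padding, the record format, and the function names are assumptions, and the greedy packing recompresses the pending records on every step, which is simple but slow.

```python
import zlib

BLOCK_SIZE = 256  # assumed block size in bytes, kept small for the example

def _pack(records, block_size):
    """Compress the records and pad the result to exactly one block."""
    return zlib.compress(b"".join(records)).ljust(block_size, b"\0")

def blockzip(records, block_size=BLOCK_SIZE):
    """Greedily fill each block with as many records as fit once compressed."""
    blocks, pending = [], []
    for record in records:
        if len(zlib.compress(b"".join(pending + [record]))) > block_size:
            if not pending or len(zlib.compress(record)) > block_size:
                raise ValueError("a single record does not fit in one block")
            blocks.append(_pack(pending, block_size))
            pending = []
        pending.append(record)
    if pending:
        blocks.append(_pack(pending, block_size))
    return b"".join(blocks)

def read_block(archive, block_no, block_size=BLOCK_SIZE):
    """Decompress one block without touching the rest of the archive."""
    block = archive[block_no * block_size:(block_no + 1) * block_size]
    # decompressobj stops at the end of the zlib stream and ignores the padding.
    return zlib.decompressobj().decompress(block)

records = [f"{i},{60000 + i},1995-01-01,9999-12-31\n".encode() for i in range(2000)]
archive = blockzip(records)
n_blocks = len(archive) // BLOCK_SIZE
print(n_blocks, len(read_block(archive, 0)))
```

Because every block starts at a multiple of the block size, a query that knows which block it needs can seek directly to it.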

Conclusion
- The transaction-time histories of relational databases can be stored and queried efficiently by using XML and SQL/XML.
- The query mapping, indexing, clustering, and compression techniques used achieve performance levels well above those of a native XML DBMS.
- The approach of ArchIS can be used to add a transaction-time capability to any existing RDBMS.
- Its realization does not require the invention of new techniques, nor costly extensions of existing standards.

THE END!