Ho Chi Minh City University of Technology
Faculty of Computer Science and Engineering
Temporal Databases:
Principles and Practices
Suphamit Chittayasothorn and Vo Thi Ngoc Chau
Part 2: Temporal Data-related Issues
Dr. Vo Thi Ngoc Chau
February 13, 2014
Content
- A Quick Review
- Implementations for Temporal Support
- Temporal Data Aggregation
- Temporal Data Mining
- Conclusion
A Quick Review
- Temporal Database Concepts
Temporal aspect | Valid Time | Transaction Time | User-defined Time
Definition | Time when the fact is true in reality | Time when the fact is current in the database | Time uninterpreted and parallel to data domains like NUMBER
Provided by | User | System | User
Supported by | System | System | User
Special language for semantics | Yes | Yes | No
Temporal Database - a database that supports some aspect of time, not counting user-defined time (e.g., date_of_birth).
Temporal Language – Extensions to QUEL, SQL, OQL
A Quick Review
- The Handling of Temporal Data
Object/tuple timestamping vs. attribute timestamping
   | Area | Owner | Valid Time
r1 | 1000 | ‘ON1’ | [2000, 2003)
r1 | 800 | ‘ON1’ | [2003, 2004)
r1 | 800 | ‘ON2’ | [2004, 2006)
r1 | 1200 | ‘ON1’ | [2006, 9999)
A Quick Review
- The Handling of Temporal Data
Object/tuple timestamping
Area
r1 | 1000 | [2000, 2003)
r1 | 800 | [2003, 2004)
r1 | 800 | [2004, 2006)
r1 | 1200 | [2006, 9999)
Owner
r1 | ‘ON1’ | [2000, 2003)
r1 | ‘ON1’ | [2003, 2004)
r1 | ‘ON2’ | [2004, 2006)
r1 | ‘ON1’ | [2006, 9999)
Shape (the shape values were drawings)
r1 | [2000, 2003)
r1 | [2003, 2004)
r1 | [2004, 2006)
r1 | [2006, 9999)
DATA REDUNDANCY + DATA SPLITTING + TEMPORAL DATA ANOMALIES
A Quick Review
- The Handling of Temporal Data
Attribute timestamping
[Figure: an attribute-timestamped relation with objects r1 and r2. Each Area, Owner, and Shape value carries its own valid-time period, drawn from [2000, 2003), [2003, 2004), [2004, 2006), and [2006, 9999).]
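The difference between the two timestamping styles can be illustrated with a minimal Python sketch. It takes the tuple-timestamped rows of object r1 from the review slides and derives one history per attribute, merging adjacent periods that carry the same value. The function name `attribute_history` and the tuple layout are assumptions made for this illustration only.

```python
# Tuple-timestamped rows for object r1, taken from the review slides:
# (object id, Area, Owner, valid-time start, valid-time end); the end is excluded.
TUPLE_ROWS = [
    ("r1", 1000, "ON1", 2000, 2003),
    ("r1", 800, "ON1", 2003, 2004),
    ("r1", 800, "ON2", 2004, 2006),
    ("r1", 1200, "ON1", 2006, 9999),
]


def attribute_history(rows, column):
    """Project one attribute and merge adjacent periods with equal values."""
    history = []
    for row in sorted(rows, key=lambda r: r[3]):
        value, start, end = row[column], row[3], row[4]
        if history and history[-1][0] == value and history[-1][2] == start:
            # Same value and the two periods meet: extend the previous period.
            history[-1] = (value, history[-1][1], end)
        else:
            history.append((value, start, end))
    return history


area = attribute_history(TUPLE_ROWS, 1)
owner = attribute_history(TUPLE_ROWS, 2)
print(area)   # three periods instead of four
print(owner)  # three periods instead of four
```

With tuple timestamping, a change of Owner in 2004 forces a new row that repeats the unchanged Area value; with one history per attribute, each value is stored once for its whole period.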
Implementations for Temporal Support
- Temporal database system implementation
  - Implementation from scratch
  - Implementation on some existing DBMS: layered or integrated
[Figure: a layered approach vs. an integrated approach]
- A layered approach: Temporal Statement S enters a Temporal Support layer (Scanner, Parser, Code Generator, Output Processor, Metadata Management), which passes a Non-Temporal Statement S’ to the Existing DBMS and returns Error/Result.
- An integrated approach: Temporal Statement S goes to the Existing DBMS, which contains the Temporal Support, and returns Error/Result.
Implementations for Temporal Support
- An example of temporal database system implementation
  - Our proposed temporal compatible object relational database system on Oracle 10g using the integrated approach
    - Valid time
    - Attribute timestamping
    - A temporal transparency environment
  - More at:
    - Vo Thi Ngoc Chau, Suphamit Chittayasothorn, “A Temporal Compatible Object Relational Database System”, in Proceedings of SOUTHEASTCON, 2007.
    - Vo Thi Ngoc Chau, Suphamit Chittayasothorn, “A Temporal Object Relational SQL Language with Attribute Timestamping in a Temporal Transparency Environment”, Data & Knowledge Engineering (2008), doi: 10.1016/j.datak.2008.06.008.
Our Proposed Temporal Compatible Object Relational Database System
[Figure: system architecture]
- Through the Existing Application Programming Interface:
  - Application Programs: SQL with Temporal Types and Methods
  - Temporal Application Programs: Temporal SQL wrapped in Translation Stored-Procedures
  - Non-temporal Application Programs: SQL wrapped in Translation Stored-Procedures
- Through the Existing End-User Interface:
  - End-users: SQL with Temporal Types and Methods
  - Temporal End-users: Temporal SQL wrapped in Translation Stored-Procedures
  - Non-temporal End-users: SQL wrapped in Translation Stored-Procedures
- Temporal Extension on top of the Non-Temporal ORDBMS
- Temporal Database
Our Proposed Temporal Compatible Object Relational Database System
[Figure: the same architecture diagram, repeated once for each of the following aspects]
- Upward Compatibility/Temporal Upward Compatibility
- Sequenced Semantics
- Non-sequenced Semantics
- Interactive SQL Mode
- Embedded SQL Mode
Implementations for Temporal Support
- Built-in temporal support in the SQL standards and DBMSs
  - Part 7 SQL/Temporal was withdrawn.
    - More at: ISO/IEC JTC 1/SC 32 N 0436, Rationale for the Withdrawal of Projects, 2000.
  - Built-in date-time related data types and methods in many existing DBMSs
    - MySQL, Oracle, MS SQL Server, IBM DB2, Informix, …
Implementations for Temporal Support
- Built-in temporal support in the SQL standards and DBMSs
  - Valid time support with tuple timestamping in Oracle 10g/11g Workspace Manager
    - More at: Oracle Valid Time Support, Application developer’s guide – workspace manager, 10g release 1(10.1), No. B1082401.
  - Transaction time support in Oracle 10g/11g Flashback Query
    - More at: Oracle Flashback Query, Application developer’s guide – fundamentals, 10g release 1(10.1), No. B10795-01.
  - Transaction time support in MS SQL Server Immortal DB
    - More at: D. Lomet, R. Barga, M. Mokbel, G. Shegalov, R. Wang, Y. Zhu, Transaction time support inside a database engine, in: Proc. ICDE Conference, (IEEE, 2006) 35-46.
Implementations for Temporal Support
- Built-in temporal support in Oracle
  - Data types and methods related to date and time
    - SQL standards
    - Oracle 10g/11g
  - Oracle valid time support
    - Concepts
    - The WM_PERIOD Data Type
    - Data Definitions with Valid Time Support
    - Data Querying with Valid Time Support
    - Data Modifications with Valid Time Support
Data types and methods related to date and time in SQL standards
- Datetime data types
  - DATE, TIME WITHOUT TIME ZONE, TIMESTAMP WITHOUT TIME ZONE, TIME WITH TIME ZONE, TIMESTAMP WITH TIME ZONE
  - Fields: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, TIMEZONE_HOUR, TIMEZONE_MINUTE
- Interval data type
  - INTERVAL
  - Fields: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND
Data types and methods related to date and time in SQL standards
- Operations involving datetimes and intervals
  - Arithmetic operators: -, +, *, /
  - <overlaps predicate>: the operator OVERLAPS to determine if two chronological periods overlap in time
  - <current date value function>: CURRENT_DATE
  - <current time value function>: CURRENT_TIME
  - <extract expression>: functions on a datetime or interval to return an exact numeric value representing the value of one component (field) of the datetime or interval
  - …
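As a rough illustration of what an overlap test between two periods computes, here is a minimal Python sketch. It assumes closed-open periods [start, end) and ignores the special cases the SQL standard defines for OVERLAPS (for example zero-length periods or periods given as a start plus an interval), so it is a simplification and not the standard's definition. The function name `overlaps` is chosen for this example.

```python
from datetime import date


def overlaps(start1, end1, start2, end2):
    """True if the closed-open periods [start1, end1) and [start2, end2) share a point."""
    return start1 < end2 and start2 < end1


# Two periods that share the second half of 2003.
shared = overlaps(date(2003, 1, 1), date(2004, 1, 1), date(2003, 6, 1), date(2005, 1, 1))
# Two periods that only meet: the first ends exactly where the second starts.
meeting = overlaps(date(2000, 1, 1), date(2003, 1, 1), date(2003, 1, 1), date(2004, 1, 1))
print(shared, meeting)

# The Python counterpart of extracting one field from a datetime value.
print(date(2014, 2, 13).year)
```

Under the closed-open assumption, two periods that merely meet do not overlap, which is why both comparisons are strict.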
Data types and methods related to date and time in Oracle 10g/11g
- Datetime data types
  - DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE
  - Fields: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, TIMEZONE_HOUR, TIMEZONE_MINUTE, TIMEZONE_REGION, TIMEZONE_ABBR
- Interval data types
  - INTERVAL YEAR TO MONTH, INTERVAL DAY TO SECOND
Data types and methods related to date and time in Oracle 10g/11g
- Datetime arithmetic operations: +, -, *, /
- Datetime functions: ADD_MONTHS, CURRENT_DATE, CURRENT_TIMESTAMP, DBTIMEZONE, EXTRACT(datetime), FROM_TZ, LAST_DAY, LOCALTIMESTAMP, MONTHS_BETWEEN, NEW_TIME, NEXT_DAY, NUMTODSINTERVAL, NUMTOYMINTERVAL, ROUND(date), SESSIONTIMEZONE, SYS_EXTRACT_UTC, SYSDATE, SYSTIMESTAMP, TO_CHAR(datetime), TO_TIMESTAMP, TO_TIMESTAMP_TZ, TO_DSINTERVAL, TO_YMINTERVAL, TRUNC(date), TZ_OFFSET
Oracle valid time support
- So-called Workspace
- Concepts
- The WM_PERIOD Data Type
- Data Definitions with Valid Time Support
- Data Querying with Valid Time Support
- Data Modifications with Valid Time Support
So-called Workspace

A workspace is a virtual environment that one or more users can share
to make changes to the data in the database.

A workspace logically groups collections of new row versions from one
or more version-enabled tables, and isolates these versions until they
are explicitly merged with production data or discarded, thus providing
maximum concurrency.

Users in a workspace always see a transactionally consistent view of
the entire database; that is, they see changes made in their current
workspace plus the rest of the data in the database as it existed either
when the workspace was created or when the workspace was most
recently refreshed with changes from the parent workspace.

The history option enables you to timestamp changes made to all rows
in a version-enabled table and to save a copy of either all changes or
only the most recent changes to each row. If you keep all changes
(specifying the “without overwrite” history option) when version-enabling a table, you keep a persistent history of all changes made to
all row versions, and enable users to go to any point in time to view the
database as it existed from the perspective of that workspace.
Concepts
- The valid time support for version-enabled tables
  - Valid time: the validity of the data
    - Each record is valid only within the time range associated with the record.
  - Each row contains an added column to hold the valid time period associated with the row.
  - A valid time range is specified, and query, insert, update, and delete operations reflect and accommodate the valid time range.
Concepts
- Version-enabled tables
  - All rows in a version-enabled table can support multiple versions of the data.
  - The history option timestamps the changes made to all rows in a version-enabled table and saves a copy of either all changes or only the most recent changes to each row.
  - If all changes are kept (specifying the “without overwrite” history option) when a table is version-enabled, a persistent history of all changes made to all row versions is kept, and users can go to any point in time to view the database as it existed from the perspective of that workspace.
Concepts
- Version-enabled tables
  - Referential integrity and uniqueness constraints
  - DDL operations related to version-enabled tables
    - Enclosed by the BeginDDL procedure and the CommitDDL or RollbackDDL procedure
    - DDL statements on <table_name>_LTS instead of <table_name>
  - Querying is carried out as usual.
  - Modification operations (insert, delete, and update) on version-enabled tables
    - Carried out as usual
Examples
[Figures: three slides of examples, shown as images]
The WM_PERIOD Data Type
- Used to specify a valid time range for the session or workspace, and for a row in a version-enabled table
- Each instance of the WM_PERIOD data type is a closed-open period of time [validFrom, validTill) where validTill is excluded.
The WM_PERIOD Data Type
- Operators on instances of the WM_PERIOD data type
  - The relationship checking operators (input: period1, period2; output: 1 = TRUE, 0 = FALSE)
    - WM_OVERLAPS, WM_CONTAINS, WM_MEETS, WM_EQUALS, WM_LESSTHAN, WM_GREATERTHAN
  - The set operators (input: period1, period2; output: period3)
    - WM_INTERSECTION, WM_LDIFF, WM_RDIFF
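To make the operator families concrete, the following Python sketch models closed-open periods and gives one plausible reading of each operator. The `Period` class and the lowercase function names are hypothetical, and the semantics are assumptions inferred from the operator names and the closed-open convention; they are not Oracle's implementation and should be checked against the Workspace Manager documentation.

```python
from typing import NamedTuple, Optional


class Period(NamedTuple):
    """Closed-open period [valid_from, valid_till); valid_till is excluded."""
    valid_from: int
    valid_till: int


# Relationship checks return 1 (TRUE) or 0 (FALSE), as on the slide.
def overlaps(p1, p2):
    return int(p1.valid_from < p2.valid_till and p2.valid_from < p1.valid_till)


def contains(p1, p2):
    return int(p1.valid_from <= p2.valid_from and p2.valid_till <= p1.valid_till)


def meets(p1, p2):
    return int(p1.valid_till == p2.valid_from)


def equals(p1, p2):
    return int(p1 == p2)


def lessthan(p1, p2):
    # Assumed reading: p1 ends strictly before p2 starts.
    return int(p1.valid_till < p2.valid_from)


def greaterthan(p1, p2):
    # Assumed reading: p1 starts strictly after p2 ends.
    return int(p1.valid_from > p2.valid_till)


# Set operators return a third period, or None when the result is empty.
def intersection(p1, p2) -> Optional[Period]:
    start, end = max(p1.valid_from, p2.valid_from), min(p1.valid_till, p2.valid_till)
    return Period(start, end) if start < end else None


def ldiff(p1, p2) -> Optional[Period]:
    # Part of p1 that lies before p2 starts (the "left" remainder).
    end = min(p1.valid_till, p2.valid_from)
    return Period(p1.valid_from, end) if p1.valid_from < end else None


def rdiff(p1, p2) -> Optional[Period]:
    # Part of p1 that lies after p2 ends (the "right" remainder).
    start = max(p1.valid_from, p2.valid_till)
    return Period(start, p1.valid_till) if start < p1.valid_till else None


a, b = Period(2000, 2004), Period(2003, 2006)
print(overlaps(a, b), intersection(a, b), ldiff(a, b), rdiff(b, a))
```

The left and right differences are the pieces a sequenced modification has to preserve when it touches only part of a row's period.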
Data Definitions with Valid Time Support
- Enable valid time support when version-enabling a table
- Add valid time support to an existing version-enabled table
- Constraints on version-enabled tables that have valid time support
Data Definitions with Valid Time Support
- Constraints on version-enabled tables that have valid time support
  - Referential integrity constraints
    - If both tables have valid time support, an insert or update operation on the referring table will fail if the valid time associated with the new value at the referring column is not within the valid time associated with the value at the referred column of the referred table.
    - If either or both tables have no valid time support, valid time periods are ignored in enforcing the constraint.
  - Unique constraints
    - Given an insert or update operation on a version-enabled table with valid time support and a unique constraint on one or more columns: if the existing and inserted rows have the same value combination at these constrained columns, their WM_VALID values must not overlap.
Data Definitions with Valid Time Support
- Constraints on version-enabled tables that have valid time support
  - Referential integrity constraints
    - For example, given two tables, DEPARTMENTS and EMPLOYEES, the DEPARTMENTS.MANAGER_ID column is a foreign key referencing the EMPLOYEES.EMPLOYEE_ID column.
    - Consider an insert or update operation with a new DEPARTMENTS.MANAGER_ID value.
    - The operation will fail if the DEPARTMENTS.WM_VALID value is not within the range of the EMPLOYEES.WM_VALID value for the employee who is being made the department manager.
Data Definitions with Valid Time Support
- Constraints on version-enabled tables that have valid time support
  - Unique constraints
    - For example, consider an EMPLOYEES table with a unique constraint on the EMPLOYEE_ID column.
    - Consider an insert or update operation with a new EMPLOYEE_ID value.
    - If the new EMPLOYEE_ID value is the same as an existing EMPLOYEE_ID value, the operation will fail if the WM_VALID values of the existing and inserted rows overlap.
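The two constraint rules can be sketched in Python as simple checks over closed-open periods. The sample rows and the function names `unique_ok` and `reference_ok` are hypothetical, and the referential check compares against single rows only; a fuller check would also accept a period covered by several adjacent rows of the same employee.

```python
def overlaps(p, q):
    """Closed-open periods (start, end) share at least one point."""
    return p[0] < q[1] and q[0] < p[1]


def within(p, q):
    """Period p lies entirely inside period q."""
    return q[0] <= p[0] and p[1] <= q[1]


# Hypothetical EMPLOYEES rows: (EMPLOYEE_ID, WM_VALID as (valid_from, valid_till)).
EMPLOYEES = [
    (100, (2000, 2010)),
    (101, (2005, 9999)),
]


def unique_ok(rows, new_key, new_period):
    """Temporal uniqueness: equal keys are allowed only if the periods do not overlap."""
    return not any(key == new_key and overlaps(period, new_period) for key, period in rows)


def reference_ok(referred_rows, key, period):
    """Temporal referential integrity: the referring period must fall within
    the period of a referred row with the same key."""
    return any(k == key and within(period, p) for k, p in referred_rows)


print(unique_ok(EMPLOYEES, 100, (2010, 2015)))     # same id, later period
print(unique_ok(EMPLOYEES, 100, (2008, 2012)))     # same id, overlapping period
print(reference_ok(EMPLOYEES, 100, (2001, 2005)))  # manager valid for the whole period
print(reference_ok(EMPLOYEES, 100, (2005, 2012)))  # manager not valid after 2010
```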
Data Querying with Valid Time Support
- Invoke the SetValidTime or SetValidTimeFilterOn procedure with some valid time range
  - Rows whose valid time overlaps the specified valid time range are taken into account in the evaluation of a query.
- Do not invoke the SetValidTime procedure, or invoke it with no parameters
  - Rows that are valid at the current time are considered in the evaluation of a query.
- Operators are explicitly specified on the WM_VALID column of version-enabled tables with the valid time support → the non-sequenced manner
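The filtering effect described above can be imitated with a small Python sketch over the r1 rows from the review slides. The function `visible_rows` and its `now` parameter are assumptions for illustration; the sketch only mimics which rows a query would see and does not call or reproduce the Workspace Manager procedures.

```python
# Tuple-timestamped Area values of object r1, taken from the review slides.
# Each period is closed-open: (valid_from, valid_till), valid_till excluded.
ROWS = [
    {"area": 1000, "valid": (2000, 2003)},
    {"area": 800, "valid": (2003, 2004)},
    {"area": 800, "valid": (2004, 2006)},
    {"area": 1200, "valid": (2006, 9999)},
]


def visible_rows(rows, session_period=None, now=2014):
    """Return the rows a query would take into account.

    Without a session valid time range, only rows valid at `now` are kept;
    with one, every row whose period overlaps the range is kept.
    """
    if session_period is None:
        return [r for r in rows if r["valid"][0] <= now < r["valid"][1]]
    start, end = session_period
    return [r for r in rows if r["valid"][0] < end and start < r["valid"][1]]


current = [r["area"] for r in visible_rows(ROWS)]
ranged = [r["area"] for r in visible_rows(ROWS, (2003, 2005))]
print(current, ranged)
```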
Data Modifications with Valid Time Support
- INSERT statements
  - Specify a valid time period for a new row → the value at the WM_VALID column
  - If NULL is given for the WM_VALID column, the session valid time period or [NOW, NULL) is used → no concept of “temporal upward compatibility”.
  - Primary key and referential integrity constraints are checked.
Data Modifications with Valid Time Support
- DELETE statements
  - A sequenced delete operation deletes the portion of a row that falls within the session valid time range.
  - No concept of a non-sequenced delete operation
    - If a valid time was not set, all rows satisfying the WHERE condition, if any, are considered for the delete operation → incorrect description
    - If a valid time was not set, all rows that are valid at the current time are considered → correct from a test
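What "deleting the portion of a row that falls within the valid time range" means can be sketched in Python as follows. This is a textbook-style sequenced delete under the closed-open assumption, written for illustration; the function name `sequenced_delete` and the row layout are hypothetical, and the sketch does not claim to reproduce what Oracle actually does.

```python
def sequenced_delete(rows, period, predicate=lambda value: True):
    """Remove from each matching row only the part of its period inside `period`.

    rows: list of (value, valid_from, valid_till) with closed-open periods.
    """
    start, end = period
    result = []
    for value, row_start, row_end in rows:
        if not predicate(value) or row_end <= start or end <= row_start:
            # Row does not match, or its period is disjoint from the range: keep as-is.
            result.append((value, row_start, row_end))
            continue
        if row_start < start:
            # Keep the part of the row before the deleted range.
            result.append((value, row_start, start))
        if end < row_end:
            # Keep the part of the row after the deleted range.
            result.append((value, end, row_end))
    return result


AREA = [(1000, 2000, 2003), (800, 2003, 2006), (1200, 2006, 9999)]
after = sequenced_delete(AREA, (2004, 2008))
print(after)
```

Rows that only partly overlap the range are trimmed or split rather than removed, so the data outside the range is kept.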
Data Modifications with Valid Time Support
- DELETE statements
  [Figures: four example slides shown as images: before deletion; a sequenced deletion; after a sequenced deletion; test of a non-sequenced deletion]
Data Modifications with Valid Time Support
- UPDATE statements
  - A sequenced update operation: no change is specified to the WM_VALID column in the UPDATE statement
    - The WM_VALID.ValidTill value for an updated row is changed to the ValidFrom timestamp of the current session valid time range, and a new row is created in which the WM_VALID period reflects the current session valid time range. → only for the overlapping case; this is not generally applied to other cases (disjoint, within, equal, contains)
  - A non-sequenced update operation: a change is explicitly specified to the WM_VALID column in the UPDATE statement and the rows that are valid at the current time are deleted.
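The cases listed above (overlapping, disjoint, within, equal, contains) can be handled uniformly by a textbook-style sequenced update, sketched here in Python. The function name `sequenced_update` and the row layout are assumptions for this illustration, and the new value is applied only to the part of each row inside the range, which may differ from the Oracle behaviour described on the slide.

```python
def sequenced_update(rows, period, new_value):
    """Apply `new_value` only to the part of each row's period inside `period`.

    rows: list of (value, valid_from, valid_till) with closed-open periods.
    """
    start, end = period
    result = []
    for value, row_start, row_end in rows:
        if row_end <= start or end <= row_start:
            # Disjoint case: the row is untouched.
            result.append((value, row_start, row_end))
            continue
        if row_start < start:
            # Old value survives before the update range.
            result.append((value, row_start, start))
        # New value holds on the intersection of the row period and the range.
        result.append((new_value, max(row_start, start), min(row_end, end)))
        if end < row_end:
            # Old value survives after the update range.
            result.append((value, end, row_end))
    return result


ROW = [(800, 2003, 2006)]
print(sequenced_update(ROW, (2005, 2010), 900))  # overlapping case
print(sequenced_update(ROW, (2004, 2005), 900))  # range within the row period
print(sequenced_update(ROW, (2000, 2010), 900))  # range contains the row period
```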
New Oracle Database 12c temporal support
- Examples from “Oracle® Database Development Guide, 12c Release 1 (12.1), E17620-11”
empno | last_name | start_time | end_time
100 | Ames | 01-Jan-10 | 30-Jun-11
101 | Burton | 01-Jan-11 | 30-Jun-11
102 | Chen | 01-Jan-12 |
Summary on Oracle’s Support
- Tuple timestamping
  - Each row is associated with a period of time.
- Constraints: referential integrity constraints, uniqueness constraints
- No sequenced queries
  - Operators on the WM_VALID column are explicitly invoked by users.
- Sequenced deletions
  - Entire rows whose WM_VALID value overlaps the session valid time range are deleted. → A sequenced deletion should affect only the portion in common; the rest should be kept as-is.
- More new Oracle Database 12c temporal support at: www.oracle.com
Temporal Data Aggregation
- Given a valid-time table - Employee
[Table: Employee, with columns REFC, Name, Gender, D_birth, Salary, and Dept. Name, Salary, and Dept are time-varying: each value (vtValue) carries a ValidTime period (vtStart, vtEnd). Name values shown: Ed, Edward, Di, John, Jack, White, Johnson. A second copy of the table is annotated with: a temporal composition, a time-dependent component, an attribute history, a time component.]
REFC | Salary (vtValue: vtStart to vtEnd) | Dept (vtValue: vtStart to vtEnd)
E1 | 20: 2/1/1982 to 6/1/1982; 30: 6/1/1982 to 2/1/1985; 40: 2/1/1985 to 2/1/1987; 40: 4/1/1987 to 12/31/9999 | D1: 2/1/1982 to 2/1/1987; D2: 4/1/1987 to 12/31/9999
E2 | 30: 1/1/1982 to 8/1/1984; 40: 8/1/1984 to 9/1/1986; 50: 9/1/1986 to 12/31/9999 | D1: 1/1/1982 to 12/31/9999
E3 | 45: 1/1/1984 to 1/1/1989; 55: 1/1/1989 to 12/31/9999 | D4: 1/1/1984 to 12/31/9999
E4 | 40: 1/1/1980 to 1/1/1984; 50: 1/1/1984 to 12/31/9999 | D4: 1/1/1980 to 1/1/1984; D3: 1/1/1984 to 12/31/9999
E5 | 30: 1/1/1980 to 1/1/1984; 40: 1/1/1984 to 1/1/1989; 45: 1/1/1989 to 12/31/9999 | D3: 1/1/1980 to 1/1/1984; D4: 1/1/1984 to 12/31/9999
Temporal Data Aggregation
- Challenge yourselves with temporal data aggregation
  - How to provide built-in temporal support for users?
  - How to make use of available built-in support from DBMS?
- Let’s try with different existing DBMSs!
List the highest salary at each department at present.
[Figure: the Employee table with the current Salary and Dept values highlighted]
DEPT_ID | SAL_MAX
D2 | 40
D1 | 50
D4 | 55
D3 | 50
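One way to answer this query outside a DBMS is sketched below in Python. The Salary and Dept histories are transcribed from the Employee table on the slides; the closed-open reading of each period, the M/D/YYYY date format, and the helper names (`d`, `value_at`, `max_salary_by_dept`) are assumptions made for this example.

```python
from datetime import date


def d(text):
    """Parse an M/D/YYYY date as written on the slides."""
    month, day, year = map(int, text.split("/"))
    return date(year, month, day)


END = "12/31/9999"
# Attribute histories per employee: (value, vtStart, vtEnd), vtEnd assumed excluded.
SALARY = {
    "E1": [(20, "2/1/1982", "6/1/1982"), (30, "6/1/1982", "2/1/1985"),
           (40, "2/1/1985", "2/1/1987"), (40, "4/1/1987", END)],
    "E2": [(30, "1/1/1982", "8/1/1984"), (40, "8/1/1984", "9/1/1986"), (50, "9/1/1986", END)],
    "E3": [(45, "1/1/1984", "1/1/1989"), (55, "1/1/1989", END)],
    "E4": [(40, "1/1/1980", "1/1/1984"), (50, "1/1/1984", END)],
    "E5": [(30, "1/1/1980", "1/1/1984"), (40, "1/1/1984", "1/1/1989"), (45, "1/1/1989", END)],
}
DEPT = {
    "E1": [("D1", "2/1/1982", "2/1/1987"), ("D2", "4/1/1987", END)],
    "E2": [("D1", "1/1/1982", END)],
    "E3": [("D4", "1/1/1984", END)],
    "E4": [("D4", "1/1/1980", "1/1/1984"), ("D3", "1/1/1984", END)],
    "E5": [("D3", "1/1/1980", "1/1/1984"), ("D4", "1/1/1984", END)],
}


def value_at(history, day):
    """Value of a time-varying attribute on a given day, or None if undefined."""
    for value, start, end in history:
        if d(start) <= day < d(end):
            return value
    return None


def max_salary_by_dept(day):
    """Highest salary per department among employees valid on `day`."""
    result = {}
    for emp in DEPT:
        dept, salary = value_at(DEPT[emp], day), value_at(SALARY[emp], day)
        if dept is not None and salary is not None:
            result[dept] = max(result.get(dept, salary), salary)
    return result


today = max_salary_by_dept(date(2014, 2, 13))
print(today)
```

Evaluating the same function on an earlier day gives a snapshot of the past, which is the building block for aggregation along time.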
List the number of employees whose names are always different from ‘John’ and their highest salaries along time
[Figure: the Employee table with the qualifying employees highlighted]
EMP# | SAL_MAX (vtValue: vtStart to vtEnd)
4 | 40: 1/1/1980 to 1/1/1984; 50: 1/1/1984 to 12/31/9999
For each gender, list the number of employees and the highest salary that employees have been paid along time.
[Figure: the Employee table grouped by Gender]
Gender | EMP# | SAL_MAX (vtValue: vtStart to vtEnd)
M | 3 | 40: 1/1/1980 to 1/1/1984; 50: 1/1/1984 to 1/1/1989; 55: 1/1/1989 to 12/31/9999
F | 2 | 30: 1/1/1980 to 1/1/1984; 40: 1/1/1984 to 9/1/1986; 50: 9/1/1986 to 12/31/9999
List the highest salary at each department with less than 20 employees along time.
[Figure: the Salary and Dept histories of E1 to E5, grouped by a temporal column (Dept)]
DEPT_ID | SAL_MAX (vtValue: vtStart to vtEnd)
D1 | 30: 1/1/1982 to 8/1/1984; 40: 8/1/1984 to 9/1/1986; 50: 9/1/1986 to 12/31/9999
D2 | 40: 4/1/1987 to 12/31/9999
D4 | 40: 1/1/1980 to 1/1/1984; 45: 1/1/1984 to 1/1/1989; 55: 1/1/1989 to 12/31/9999
D3 | 30: 1/1/1980 to 1/1/1984; 50: 1/1/1984 to 12/31/9999
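Grouping by a temporal column can be sketched in Python as follows: clip each salary period to the time the employee spent in the department, cut the timeline at every period boundary, take the maximum on each elementary interval, and merge neighbouring intervals with the same result. The data is transcribed from the slides; the closed-open reading of the periods and the names `d` and `max_salary_history` are assumptions. The "less than 20 employees" condition is not applied because every department in the sample satisfies it.

```python
from datetime import date


def d(text):
    """Parse an M/D/YYYY date as written on the slides."""
    month, day, year = map(int, text.split("/"))
    return date(year, month, day)


END = "12/31/9999"
SALARY = {
    "E1": [(20, "2/1/1982", "6/1/1982"), (30, "6/1/1982", "2/1/1985"),
           (40, "2/1/1985", "2/1/1987"), (40, "4/1/1987", END)],
    "E2": [(30, "1/1/1982", "8/1/1984"), (40, "8/1/1984", "9/1/1986"), (50, "9/1/1986", END)],
    "E3": [(45, "1/1/1984", "1/1/1989"), (55, "1/1/1989", END)],
    "E4": [(40, "1/1/1980", "1/1/1984"), (50, "1/1/1984", END)],
    "E5": [(30, "1/1/1980", "1/1/1984"), (40, "1/1/1984", "1/1/1989"), (45, "1/1/1989", END)],
}
DEPT = {
    "E1": [("D1", "2/1/1982", "2/1/1987"), ("D2", "4/1/1987", END)],
    "E2": [("D1", "1/1/1982", END)],
    "E3": [("D4", "1/1/1984", END)],
    "E4": [("D4", "1/1/1980", "1/1/1984"), ("D3", "1/1/1984", END)],
    "E5": [("D3", "1/1/1980", "1/1/1984"), ("D4", "1/1/1984", END)],
}


def max_salary_history(dept):
    """History of the highest salary in `dept` as (salary, start, end) periods."""
    # Salary periods clipped to the time each employee belonged to the department.
    pieces = []
    for emp, dept_rows in DEPT.items():
        for name, d_start, d_end in dept_rows:
            if name != dept:
                continue
            for salary, s_start, s_end in SALARY[emp]:
                start, end = max(d(d_start), d(s_start)), min(d(d_end), d(s_end))
                if start < end:
                    pieces.append((salary, start, end))
    # Elementary intervals between consecutive period boundaries.
    points = sorted({p for _, s, e in pieces for p in (s, e)})
    history = []
    for start, end in zip(points, points[1:]):
        active = [sal for sal, s, e in pieces if s <= start and end <= e]
        if not active:
            continue  # nobody in the department during this interval
        top = max(active)
        if history and history[-1][0] == top and history[-1][2] == start:
            history[-1] = (top, history[-1][1], end)  # merge with the previous period
        else:
            history.append((top, start, end))
    return history


for dept in ("D1", "D2", "D3", "D4"):
    print(dept, max_salary_history(dept))
```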
Temporal Data Mining
- What is Data Mining?
- Frequent Pattern Mining
What is Data Mining?
- Obama campaign’s secret strategy – 2012
- Knowing your customers
- Predict final status of undergrad students
- Predict heart disease
- Car classification
- Market analysis
- …
What is Data Mining?
[Figure: Data → Mining → Information/Knowledge]
Data mining – searching for knowledge (interesting patterns) from large amounts of data.
More at: Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining: Concepts and Techniques”, Third Edition, Morgan Kaufmann Publishers, 2012.
[Figure: Data mining as a step in the process of knowledge discovery]
Frequent pattern mining is a mining task.
Frequent Pattern Mining
Application domains
- Market basket analysis
- Predicting stock prices in the future
- Analyzing temporal data encountered in Electronic Health Record systems
- Predicting seismic trends
Frequent Pattern Mining
Frequent patterns
- The patterns that occur frequently in data.
- Milk and bread are frequently bought together in grocery stores → [milk, bread] is a frequent pattern.
- A sequence of buying a PC, followed by a camera and then a memory card, frequently occurred → it is a frequent sequential pattern.
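Both notions can be illustrated with a minimal Python sketch: counting how often pairs of items are bought together, and checking whether a purchase history contains a pattern in order. The transactions, the support threshold, and the function names are hypothetical; real miners such as the algorithms cited in this part use far more efficient search strategies.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, invented for this example.
TRANSACTIONS = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"eggs"},
]


def frequent_pairs(transactions, min_support):
    """Item pairs that occur together in at least `min_support` transactions."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def occurs_in_order(pattern, sequence):
    """True if the items of `pattern` appear in `sequence` in the same order,
    possibly with other items in between."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


pairs = frequent_pairs(TRANSACTIONS, min_support=3)
print(pairs)
print(occurs_in_order(["PC", "camera", "memory card"],
                      ["PC", "printer", "camera", "memory card"]))
```

A sequential pattern is frequent when `occurs_in_order` holds for enough customer sequences, in the same way that an itemset is frequent when it appears in enough baskets.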
- Example of point-based data
- Example of interval-based data
- Example of time series data
  - A time series: a sequence of numbers collected at regular intervals over a period of time.
Are they temporal data?
Frequent Pattern Mining
[Table: Database vs. Frequent pattern, with one example pattern drawn for each kind of database]
- On point-based sequential databases
- On interval-based sequential databases
- On time series databases
- On databases of multiple time series
Frequent Pattern Mining
- More at:
  - R. Agrawal, R. Srikant. Mining sequential patterns. In Proc. ICDE, 1995.
  - I. Batal, D. Fradkin, J. Harrison, F. Mörchen, M. Hauskrecht. Mining recent temporal patterns for event detection in multivariate time series data. In Proc. KDD, pp.280-288, 2012.
  - I. Batyrshin, L. Sheremetov, R. Herrera-Avelar. Perception based patterns in time series data mining. Studies in Computational Intelligence (SCI) 36 (2007) 85–118.
  - M. Lin, S. Lee. Fast discovery of sequential patterns through memory indexing and database partitioning. Journal of Information Science and Engineering 21 (2005) 109–128.
  - F. Mörchen, A. Ultsch. Efficient mining of understandable patterns from multivariate interval time series. Data Min Knowl Disc 15 (2007) 181-215.
  - J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, M-C. Hsu. Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proc. the 17th International Conference on Data Engineering (ICDE), 2001.
Frequent Pattern Mining
- More at:
  - P. Papapetrou, G. Kollios, S. Sclaroff. Discovering frequent arrangements of temporal intervals. In Proc. ICDM, 2005.
  - E. Winarko, J. F. Roddick. ARMADA – An algorithm for discovering richer relative temporal association rules from interval-based data. Data & Knowledge Engineering 63 (2007) 76-90.
  - S-Y. Wu, Y-L. Chen. Mining nonambiguous temporal patterns for interval-based events. IEEE Transactions on Knowledge and Data Engineering 19 (2007) 742–758.
  - Q. Yang, X. Wu. 10 challenging problems in data mining research. International Journal of Information Technology & Decision Making 5 (4) (2006) 597–604.
  - R. Sadasivam, K. Duraiswamy. Efficient approach to discover interval-based sequential patterns. Journal of Computer Science, 9 (2): 225-234, 2013.
  - T. B. T. Phan, T. N. C Vo, T. A. Duong. An efficient interval-based approach to mining frequent patterns in a time series database. In Proc. MIWAI’13,
Conclusion
- Time is ubiquitous.
  - Check if you can find some application domain where time does not exist
- Consider temporal data from semantics to management to implementation
  - Let’s get started with some advanced DBMS where your temporal data can be supported the most
- And now, a very very very … large amount of temporal data gathered over the time
  - What should we do next to make the most of such existing data?
  - Let’s try something with temporal data analysis and mining