2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation
Transaction Management in Fully Temporal System
Michal Kvet, Karol Matiaško
Marek Kvet
University of Zilina,
Faculty of Management Science and Informatics
Zilina, Slovakia
University of Zilina,
University Science Park
Žilina, Slovakia
could not be based on the historical data because of the time
consumption (loading all the needed snapshots sometimes took
days). In addition, the granularity of the data keeps growing,
so the number of backups exceeded an acceptable level. If the
backups and log files were not stored completely, some
operations could even be missing. Figure 1 shows the problem:
the backup does not register two situations that occurred
between backups (multiple updates, and an insert with a
subsequent delete of the same object).
Abstract—Timed data processing belongs to one of the most
important tasks in the development of current database
systems. The conventional database approach offers a paradigm
for processing currently valid data. However, it is also necessary
to store and manage historical values. Moreover, a temporal system
should provide a structure for processing data that become valid in
the future. A basic structure for processing temporal data, able to
store images of objects at different time points or intervals, was
developed in the past. However, a significant factor is the
effectiveness of the whole system. If the data must be processed
using transactions, the problem becomes more complicated. This
paper deals with a new approach: column level temporal data
processing with transaction management. The resulting system can
thus be regarded as a bi-temporal structure, although its root is a
uni-temporal system.
Keywords-transaction; object level temporal data; fully
temporal system; jobs; changes monitoring
Figure 1. Backup problem - possible loss of data
Moreover, it is not enough to find a faster solution for
historical data processing. What does the term “faster”
mean? Shortening the processing time from days to hours, or
from tens of hours to a couple of hours? Such a solution is
still unacceptable and inapplicable. The aim is to create
support for temporal data such that the difference between
processing currently valid data and processing historical data
is minimized. Thus, it is necessary to create a system in which
the data are easy to access.
However, historical data management alone is not a complete
temporal system, because it must also be possible to process
values whose validity begins in the future. When the begin
time of the validity of an object is reached, the database
system must be able to update the data without user
intervention. In addition, such a system offers very poor
support for transaction management.
Later, temporal systems on the object level were
developed. The structure and the principles of these solutions
are described in the following sections. The main criterion for
comparing the solutions and structures is performance,
measured by the processing time needed to get the required
data and by the size of the whole structure. Although large
disc capacity is currently available, there is still a need to
store and process data effectively, because temporal data are
really extensive and contain changes of the object states over
the years [5] [6].
This paper deals with the new approach based on the
column level temporal data, describes the principles and
I. INTRODUCTION
Database systems are one of the most important parts of
information technology. It is generally known that a
database system is usually the basic part, the root, of any
information system. The development of data processing has
brought the need for modelling and accessing large structures
based on the simplicity, reliability and speed of the system
[1] [3] [4]. However, even today, when database technology
is widespread, most databases process and represent only the
states of the data valid at the current moment. Properties and
states of objects evolve over time, become invalid and are
replaced by new ones. Once a state changes, the
corresponding data are updated in the database, which still
contains only the currently valid data. However, temporal data
processing is very important in dynamically evolving
systems, industry, communication systems and also in
systems processing sensitive data, where an incorrect change
would cause great harm. It can also help to optimize
processes and make future decisions [2] [3] [4].
Historical data management using log files and archives
is completely inappropriate. Although the data can be
obtained, the process is too complicated, takes too much time,
and the output data are obviously not in a suitable form (raw
material).
The main disadvantage is the need for intervention by the
administrator (operation manager). Operational decisions
978-1-4799-4923-6/14 $31.00 © 2014 IEEE
DOI 10.1109/UKSim.2014.26
techniques. The column level temporal structure does not deal
with complete object level temporal data, but divides the
object attributes into conventional and temporal columns.
Changes are managed only in the temporal part, and each
column is processed separately. The result is a new concept that
brings the ability to define transaction data in a system which
manages only the validity period (similar to a uni-temporal
system, but on the column level). The new temporal data
definition is compared with existing solutions in terms of access
and processing time and also size. The results are in the
"Experiments" section.
II. REQUIREMENTS AND ASPECTS
System requirements can be divided into two parts with
special aspects [6] [11] [12]:
1. Aspect of usability (easy methods) – the aim is to
provide access to outdated information as easily and quickly
as to the current values. Transactions for managing temporal
data must be as simple as those for current data processing.
Moreover, it is necessary to define a way to combine the past
and the present.
2. Aspect of performance (speed and correctness of
results) – results are required in the same form as when
accessing the current data, with adequate processing time. The
difference in accessing the object at any point in time should
be minimal.
III. TEMPORAL DATA MANAGEMENT
Figure 2. Updating temporal data
Temporal databases define a new paradigm for selecting
one or more rows based on specified criteria, for
projecting one or more columns to the output sets and for
joining tables by specifying relationship criteria. Rows
with different values of the primary key (PK) can
represent one object at different times. Transactions for
inserting, updating and deleting rows must therefore
specify not only the object itself, but also the processed period.
If the valid time of the object is defined by a time interval, the
transaction must include the time period as two time point
values: begin and end timestamps (dates). This means that an
update query causes not only an update of the existing data,
but also an insert of new rows based on the validity
intervals. Figure 2 shows the update of the states of the
object ID=1. The valid time of the new state overlaps the
existing intervals, so they must be split [2] [6] [8].
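The splitting of overlapping validity intervals during an update can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the implementation used in the paper: the `Row` type, the function name `temporal_update` and the sample values are hypothetical, and validity is modelled as a half-open interval [begin, end).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Row:
    obj_id: int
    begin: date   # first day of validity (inclusive)
    end: date     # first day after the validity (exclusive), an assumption
    value: str


def temporal_update(rows, obj_id, begin, end, value):
    """Return a new row list in which `value` is valid for [begin, end)."""
    result = []
    for row in rows:
        overlaps = row.obj_id == obj_id and row.begin < end and begin < row.end
        if not overlaps:
            result.append(row)
            continue
        # Keep the part of the old state that precedes the new interval.
        if row.begin < begin:
            result.append(Row(obj_id, row.begin, begin, row.value))
        # Keep the part of the old state that follows the new interval.
        if end < row.end:
            result.append(Row(obj_id, end, row.end, row.value))
    result.append(Row(obj_id, begin, end, value))
    return sorted(result, key=lambda r: (r.obj_id, r.begin))


rows = [
    Row(1, date(2014, 1, 1), date(2014, 3, 1), "A"),
    Row(1, date(2014, 3, 1), date(2014, 6, 1), "B"),
]
# The new state "C" overlaps both existing states, so both are shortened.
updated = temporal_update(rows, 1, date(2014, 2, 1), date(2014, 4, 1), "C")
for row in updated:
    print(row.begin, row.end, row.value)
```

One logical update therefore turns into two shortened rows and one inserted row, which is why the update statement must carry the processed period as well as the object identifier.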
One row in the temporal system can be represented in
three ways based on the validity time and the transaction time.
Figure 3 shows the principle of time management. The
first type is the method most commonly used today, the
conventional table, which does not provide
management of time validity. The second one, the uni-temporal
system, extends the primary key by the interval of validity
(BD1, ED1). In this case, it is necessary to ensure that
these intervals do not overlap. Simply put, there cannot be more
than one currently valid state of the same object. The bi-temporal
structure extends the uni-temporal system by the transaction
time (BD2, ED2) [12].
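The non-overlap rule for a uni-temporal table can be checked mechanically. The following Python sketch is a hypothetical illustration: the function name `find_overlaps`, the tuple layout (object identifier, BD1, ED1) and the half-open reading of the interval are assumptions, not part of the described system.

```python
from datetime import date


def find_overlaps(rows):
    """Report neighbouring validity intervals of one object that overlap.

    rows: iterable of (obj_id, begin, end) with validity [begin, end).
    """
    by_object = {}
    for obj_id, begin, end in rows:
        by_object.setdefault(obj_id, []).append((begin, end))

    conflicts = []
    for obj_id, intervals in by_object.items():
        intervals.sort()
        # After sorting by begin, any overlap shows up between neighbours.
        for (b1, e1), (b2, e2) in zip(intervals, intervals[1:]):
            if b2 < e1:
                conflicts.append((obj_id, (b1, e1), (b2, e2)))
    return conflicts


rows = [
    (1, date(2014, 1, 1), date(2014, 3, 1)),
    (1, date(2014, 2, 1), date(2014, 4, 1)),   # overlaps the previous row
    (2, date(2014, 1, 1), date(2014, 2, 1)),
    (2, date(2014, 2, 1), date(2014, 3, 1)),   # adjacent, no overlap
]
conflicts = find_overlaps(rows)
print(conflicts)
```

An empty result means that every object has at most one valid state at any time point.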
Figure 3. Conventional and temporal structure [6]
Figure 4 shows the time representation of the
conventional, uni-temporal and bi-temporal tables and also of
the system we developed, the column level temporal structure.
Figure 4. Time representation
IV. UNI-TEMPORAL STRUCTURE – OBJECT LEVEL
The easiest and also frequently used method to manage
temporal data is the uni-temporal system. It is based on an
extension of the conventional (non-timed) model. The
primary key now contains not only the object identifier, but
also one or two attributes determining the validity of the row.
Consequently, one object can be defined by any number of
rows, but no more than one row defines the
object at any time point. Thus, data manipulation operations
must specify not only the object itself, but also the time point
expressing the begin timestamp (or another time attribute
based on the granularity, such as day or month) or two
attributes expressing the time interval. In our case, the row is
defined by the validity.
A special type of uni-temporal system is a solution that
contains only one time attribute, which is part of the primary
key. This means that any change of the corresponding object
determines the end of the validity of the prior state. Figure 5
shows the representation of such a model, as well as the
corresponding standard uni-temporal system. The first part of
the figure consists only of the begin time of the validity. The
second one is a standard model with the closed-closed
representation of the interval, which is intuitive because it
exactly defines the begin and end time of the validity. However,
the problem is how to define the next valid interval. If the
clock tick is a day, the problem is greater because of the last day
of the month, leap years and so on. In addition, what happens if
the minimal time between two states (two modifications of
the database) is changed? If the clock granularity is modified
(e.g. from days to hours), gaps of undefined attributes can
appear, and the whole database must be reconstructed to
eliminate them. The last type in figure 5 is the open-closed
representation.
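The variant with a single time attribute can be sketched in Python as follows. The sketch is illustrative only: the sample states and the function names `state_at` and `validity_intervals` are hypothetical. Because the end of each state is derived from the begin time of its successor, no gap can appear between two states, whatever the clock granularity is.

```python
from bisect import bisect_right
from datetime import date

# (begin of validity, state), sorted by begin; hypothetical sample data.
states = [
    (date(2014, 1, 1), "enrolled"),
    (date(2014, 2, 15), "suspended"),
    (date(2014, 4, 1), "enrolled"),
]


def state_at(states, when):
    """Return the state valid at `when`, or None before the first state."""
    begins = [begin for begin, _ in states]
    idx = bisect_right(begins, when) - 1
    return states[idx][1] if idx >= 0 else None


def validity_intervals(states):
    """Derive (begin, end, state); end is the next begin, None if still open."""
    result = []
    for i, (begin, state) in enumerate(states):
        end = states[i + 1][0] if i + 1 < len(states) else None
        result.append((begin, end, state))
    return result


print(state_at(states, date(2014, 3, 1)))
for interval in validity_intervals(states):
    print(interval)
```

With explicit closed-closed intervals, by contrast, the end value of each row has to be recomputed whenever the granularity changes.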
Figure 6. Example of the database model (student and subjects) [10]
This is only part of the issue, which, however,
sufficiently shows that it is necessary to create a new
complex solution.
V. COLUMN LEVEL TEMPORAL SYSTEM
The object level temporal system does not fulfill the
performance requirement because of duplicate values. The
change rate almost always differs between attributes,
therefore the object level temporal model is not appropriate.
A better way is the system we developed, which manages
not whole objects, but only attributes. If one or more
attributes are changed, only this information is stored in the
database, not the values of the whole object. This
significantly reduces the size of the structure.
Figure 7 shows the principle and the structure.
The external level is based on conventional tables, so
there is no problem for existing applications. History
management is provided by the developed temporal table.
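The idea of storing only attribute changes can be illustrated with a small in-memory Python sketch. It is a simplification under stated assumptions, not the authors' implementation: the table `employees`, the column `salary` and the helper names are hypothetical, the field names loosely follow the temporal table attributes described in this paper, the old value is stored directly instead of being referenced, and changes are assumed to arrive in chronological order.

```python
from datetime import date

# Conventional (current) view; "salary" is treated as the only temporal column.
employees = {1: {"name": "Smith", "salary": 1000}}

# One entry per changed temporal attribute value, not per whole object state.
temporal_table = []


def update_temporal(row_id, column, new_value, bd):
    """Log the old attribute value, then update the current row."""
    previous = [c["id_change"] for c in temporal_table if c["id_orig"] == row_id]
    temporal_table.append({
        "id_change": len(temporal_table) + 1,
        "id_previous_change": previous[-1] if previous else None,
        "id_tab": "employees",
        "id_orig": row_id,
        "column": column,
        "old_value": employees[row_id][column],  # simplification: value, not reference
        "bd": bd,                                # begin date of the new state
    })
    employees[row_id][column] = new_value


def value_at(row_id, column, when):
    """Return the value of a temporal column that was valid on `when`."""
    # The first change beginning after `when` holds the value valid before it.
    for change in temporal_table:
        if (change["id_orig"] == row_id and change["column"] == column
                and change["bd"] > when):
            return change["old_value"]
    return employees[row_id][column]


update_temporal(1, "salary", 1200, date(2014, 3, 1))
update_temporal(1, "salary", 1500, date(2014, 9, 1))
print(value_at(1, "salary", date(2014, 5, 1)))
```

The conventional column `name` is never copied into the change log, which is where the size reduction compared with the object level model comes from.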
Figure 5. Types of uni-temporal table modelling [10]
The mentioned solution seems to be easy, but it does not
address the fundamental problem of undefined states, i.e. the
time intervals during which the state of the object is partially
or completely undefined. We cannot simply replace the
previous value with the NULL value, because NULL can have a
special meaning. In addition, some attributes cannot hold
the NULL value because of the definition of the attribute
column. The problem occurs even if only one attribute
value is changed to the undefined state: the whole state of
the object must then be denoted as undefined, or rather
incorrect. It must, therefore, be possible to distinguish
between the correctness and the completeness of the object state.
Another solution can be based on moving the object into a
table containing only undefined states. However, what about
the foreign keys? Imagine the manipulation in fig. 6. These
problems describe the limitations of the system very well.
Figure 7. Column level temporal system [10]
Temporal table consists of the following attributes
[7] [8] [9] (see also fig. 7):
• ID change
• ID previous change – references the last change of
  an object identified by ID. This attribute can also
  have the NULL value, which means that the data have not
  been updated yet, so the data were inserted for the
  first time in the past and are still current.
• ID_tab – references the table whose record has
  been processed by a DML statement (INSERT,
  DELETE, UPDATE).
• ID_orig – carries the identifier of the row that has
  been changed.
• ID_column, ID_row – hold the referential
  information to the old value of the attribute (if the
  DML statement was UPDATE). Only an UPDATE
  statement on a temporal column sets a non-NULL value.
• BD – the begin date of the validity of the new state of
  an object.
The process of dropping a job is much easier and
requires only the identifier of the job to be dropped:
dbms_scheduler.drop_job('planned_job_1');
The disadvantage of this system is that it is not
possible to get data valid in the future before the Job is
executed, because the new values are not directly available.
The system ensures that the script is executed at the specified
time, but information about future changes cannot be obtained
directly, although the new values can be found in the system
tables. The problem is the structure of these records. That was
the reason for creating the future table. This table holds the
information about the planned job: the identifier of the object,
the operation to be performed and the data attribute values. When
the job is executed, these data are deleted from this table. Fig. 9
shows the structure of this table.
VI. FUTURE VALID DATA PROCESSING
The proposed solution is fully temporal and, although it has
not been stated explicitly, the structure also allows managing
future valid data. One possible way to provide this is the
Job functionality, which makes it possible to plan changes of
individual attributes. A Job ensures the automatic
change of values: it is designed to call methods or to execute
a script at a defined time.
Triggers are connected to the Insert, Update and Delete
operations, so the information about the change is
automatically stored in the temporal table.
Figure 9. Productive and future table
VII. TRANSACTIONS
Transaction management in a fully temporal system is
difficult. Each transaction can create dependent objects. If
the transaction is rolled back, these dependent objects
should be cancelled too. When the transaction is
successfully committed, these objects can be affected or
modified. A typical example is an earlier update of the object
or an improperly planned change that will happen either at a
different time, or will not happen at all.
The model described above is extended by the transaction
definition in the temporal table (fig. 10).
Planning these events requires several parameters, which
are described in Table I.
TABLE I. JOB PARAMETERS

Create_job
Job_name          Identifier of the job
Job_type          Type of the job (plsql_block, stored_procedure, executable)
Job_action        Statement
Start_date        Planned time to run the job
Repeat_interval   How often or when the job should be started
Enabled           The state of the job
Fig. 8 shows an example of job planning. A job is
identified by a unique name and consists of the code to be
started at a defined time point. The state of the job is
expressed using the attribute Enabled.
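The interplay of planned jobs and the future table can be simulated with a short Python sketch. This is an in-memory illustration under assumptions, not Oracle's scheduler and not the authors' code: the `productive` table, its `price` column and all function names are hypothetical.

```python
from datetime import datetime

productive = {1: {"price": 100}}   # currently valid data (hypothetical table)
future_table = []                  # planned changes that are not valid yet


def plan_change(job_name, obj_id, column, value, start_date):
    """Register a planned change; it stays queryable until it is executed."""
    future_table.append({"job_name": job_name, "obj_id": obj_id,
                         "column": column, "value": value,
                         "start_date": start_date})


def drop_job(job_name):
    """Cancel a planned change before it runs."""
    future_table[:] = [j for j in future_table if j["job_name"] != job_name]


def run_due_jobs(now):
    """Apply every planned change whose start date has been reached."""
    due = [j for j in future_table if j["start_date"] <= now]
    for job in sorted(due, key=lambda j: j["start_date"]):
        productive[job["obj_id"]][job["column"]] = job["value"]
        future_table.remove(job)   # executed jobs leave the future table


def planned_value(obj_id, column, when):
    """Value expected to be valid at `when`, including planned changes."""
    value = productive[obj_id][column]
    for job in sorted(future_table, key=lambda j: j["start_date"]):
        same_target = (job["obj_id"], job["column"]) == (obj_id, column)
        if same_target and job["start_date"] <= when:
            value = job["value"]
    return value


plan_change("planned_job_1", 1, "price", 120, datetime(2014, 6, 1))
before = planned_value(1, "price", datetime(2014, 7, 1))  # read from future table
run_due_jobs(datetime(2014, 6, 1))                        # the job fires
print(before, productive[1]["price"], future_table)
```

Keeping the planned values in a separate table makes future states queryable before the job runs, which the scheduler alone does not offer.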
Figure 10. Column level temporal system with transaction management
Figure 8. Job plan
In principle, several situations can arise based on READ
and WRITE operations. If the transactions do not
manipulate the same database object, there is no
problem. If the transactions manipulate the same
object, but the first one ends before any operation on that
object in the second transaction, no problem occurs either.
However, figure 11 shows a situation in which RO1T2 (the
read operation on the object O1 by the transaction T2) works
with incorrect data if the first transaction is rolled back.
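The problematic read can be reproduced with a toy Python store that writes in place. The `Store` class and its methods are hypothetical and serve only to illustrate the situation: T1 writes O1, T2 reads the uncommitted value, and T1 is then rolled back.

```python
class Store:
    """Toy store that writes in place and keeps an undo log per transaction."""

    def __init__(self, data):
        self.data = dict(data)
        self.undo = {}  # transaction id -> list of (key, old value)

    def write(self, tx, key, value):
        self.undo.setdefault(tx, []).append((key, self.data[key]))
        self.data[key] = value  # immediately visible to other transactions

    def read(self, tx, key):
        # No isolation: the reader sees whatever is currently stored.
        return self.data[key]

    def commit(self, tx):
        self.undo.pop(tx, None)

    def rollback(self, tx):
        # Restore the old values in reverse order of the writes.
        for key, old in reversed(self.undo.pop(tx, [])):
            self.data[key] = old


store = Store({"O1": "old"})
store.write("T1", "O1", "new")        # WO1T1
seen_by_t2 = store.read("T2", "O1")   # RO1T2 reads the uncommitted value
store.rollback("T1")                  # T1 is rolled back
print(seen_by_t2, store.read("T2", "O1"))
```

T2 has worked with a value that was never committed, which is exactly the case that the transaction definition in the temporal table has to handle.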
in our opinion, clearly defines the features, properties,
limitations and finally the overall quality and reliability of
the described solutions. Our experiments and evaluations are
based on the processing time, which best represents the
quality of the model. All of the experiments were performed
using the Oracle 11g database system. The total number of
records in the main structure is 100 000.
The experiments compare the
implemented temporal solution on the column level with the
standard uni-temporal and bi-temporal approaches. Although
the implemented solution seems to be complicated, it
provides better performance results.
Model 1 deals with fully temporal data (historical,
current and future valid data) on the object level. In comparison,
the fully temporal structure on the column level (model 2) with
one temporal and two conventional columns significantly
accelerates the system. The values below are reductions relative
to model 1 (100% = performance of model 1):
• Size: 54.77%.
• Time to get current snapshot: 76.26%.
• Time to get all life-cycle object data: 33.43%.
In a fully temporal system, it is often necessary to manage
future valid data using transactions. There are basically two
ways. The standard approach is the bi-temporal table
(model 4), which also contains the transaction time definition.
In our case, the column level uni-temporal system with
transaction definition tables (model 3) is used. The
developed solution brings the following reductions in the
monitored indicators (100% = performance of model 4):
• Size: 59.11%.
• Time to get current snapshot: 76.79%.
• Time to get all life-cycle object data: 38.34%.
Although model 3 uses only a uni-temporal structure, it
provides complete transaction management.
Table II and figures 15 and 16 show the complete
performance results: the required size and the processing times.
Figure 11. Transaction management (1)
A similar situation arises if the validity of WO1T1 is shifted to
the future (fig. 12). There is no problem, because T2 does
not have to know about the future update, regardless of whether
the transaction T1 is committed or not.
Figure 12. Transaction management (2)
Another situation occurs when one transaction
determines the validity of the object managed (updated)
using the second transaction (fig. 13).
Figure 13. Transaction management (3)
However, a problem occurs in a modification of the previous
example: transaction T2 is committed before T1, and
T1 then reads the state of the updated object (fig. 14).
Figure 14. Transaction management (4)
The situations described above occur only if the
transactions write directly into the database during a
transaction.
With regard to transaction management, the
developed system can be directly compared to the bi-temporal
system, which is also temporal and transaction oriented.
VIII. EXPERIMENTS
The results and quality of the developed system are evaluated by
performance comparison with the existing structures. This,

TABLE II. EXPERIMENTS RESULTS

                                   Uni-temporal system                                Bi-temporal system
                                   Temporal table (full)        Temporal table (full)
                                                                + transaction management
                                   Object level  Column level   Column level          Object level
                                   Model 1       Model 2        Model 3               Model 4
Size of the DB (kB)                41 235        18 650         19 612                47 957
Time to get all actual data
(current snapshot) (ms)            3 921         931            931                   4 012
Time to get all data during the
life-cycle of one object (ms)      721           480            579                   939
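The percentages quoted for models 2 and 3 can be reproduced from the measured values in Table II if they are read as reductions relative to the baseline model. The short Python check below makes this arithmetic explicit; the dictionary layout and the function name `reduction` are illustrative assumptions.

```python
# Measured values from Table II, keyed by model number.
results = {
    "Size of the DB (kB)": {1: 41235, 2: 18650, 3: 19612, 4: 47957},
    "Time to get current snapshot (ms)": {1: 3921, 2: 931, 3: 931, 4: 4012},
    "Time to get all life-cycle data (ms)": {1: 721, 2: 480, 3: 579, 4: 939},
}


def reduction(baseline, value):
    """Reduction of `value` against `baseline` in percent (baseline = 100%)."""
    return round(100 * (1 - value / baseline), 2)


for indicator, values in results.items():
    model2_vs_1 = reduction(values[1], values[2])
    model3_vs_4 = reduction(values[4], values[3])
    print(f"{indicator}: model 2 vs 1 = {model2_vs_1}%, "
          f"model 3 vs 4 = {model3_vs_4}%")
```

For example, the size of model 2 is 18 650 / 41 235, about 45.23% of model 1, which corresponds to the reported reduction of 54.77%.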
Figure 15. Size of the database based on models
Figure 16. Processing time results based on models
IX. CONCLUSION
A conventional database object is represented by one row, the
current state of the object, whereas a temporal management
system offers processing of valid object data and their changes
and progress in time. Developers require access to the whole
information about the evolution of the states during the
life-cycle, therefore a new paradigm has been created: temporal
system processing. Effective management of temporal data can be
very useful for decision making, analyses and process
optimization and can be used in any area: industry,
communication systems, medicine, transport systems, etc.
However, the temporal data management used today does
not cover the complexity of data management in time.
It does not offer sufficient power to manage large volumes
of data. Significant aspects are the processing time and also
the size.
Standard temporal database support is based on the object
level. Our developed system is based on the column (attribute)
level; the whole state is created by grouping the
properties and states of the attributes. The main advantage is
the possibility of processing data that change in time with
different granularity, such as sensory data.
A fully temporal system also requires transaction
management; in our case, the uni-temporal system with table
transaction management is used. It offers a good performance
rate.
This paper deals with the principles and characteristics and
describes implementation methods to provide complex
temporal data management with transaction management.
Temporal data are usually large, so their processing
requires sophisticated access methods. In future
development, we will focus on creating various index structures,
which can also improve the performance of the model.
ACKNOWLEDGMENT
This publication is the result of the project implementation:
Centre of excellence for systems and services of
intelligent transport II., ITMS 26220120050 supported by
the Research & Development Operational Programme
funded by the ERDF.
"WE SUPPORT RESEARCH ACTIVITIES IN SLOVAKIA
THE PROJECT IS CO-FINANCED FROM EU SOURCES"
This paper is supported by the following project:
University Science Park of the University of Zilina (ITMS:
26220220184) supported by the Research & Development
Operational Program funded by the European Regional
Development Fund.
The work is also supported by the project VEGA
1/1116/11 - Adaptive data distribution.
REFERENCES
[1] C. J. Date, “Date on Database”, Apress, 2006.
[2] C. J. Date, H. Darwen, and N. A. Lorentzos, “Temporal Data and the
Relational Model”, Morgan Kaufmann, 2003.
[3] C. J. Date, “Logic and Databases – The Roots of Relational Theory”,
Trafford Publishing, 2007.
[4] P. N. Hubler and N. Edelweiss, “Implementing a Temporal Database
on Top of a Conventional Database”, Conference SCCC ’00, 2000,
pp. 58–67.
[5] Ch. S. Jensen, “Introduction to Temporal Database Research”.
[6] T. Johnston and R. Weis, “Managing Time in Relational Databases”,
Morgan Kaufmann, 2010.
[7] M. Kvet, A. Lieskovský, and K. Matiaško, “Temporal data
modelling”, IEEE conference ICCSE 2013, 26.4. – 28.4.2013,
pp. 452–459.
[8] M. Kvet and K. Matiaško, “Column level uni-temporal”,
Journal Communications, 2013, in press.
[9] M. Kvet and K. Matiaško, “Management of Temporal System”,
International Journal of New Architectures and Their
Applications, Vol. 3, No. 3, 2013, pp. 70–80.
[10] M. Kvet and K. Matiaško, “Uni-temporal Modelling Extension at
Object vs. Attribute Level”, IEEE conference EMS 2013,
20.11. – 22.11.2013, pp. 7–11.
[11] M. Kvet, J. Mešina, and K. Matiaško, “Algorithm for brain tumour
detections”, Acta Electrotechnica et Informatica, vol. 12, no. 2,
2012, pp. 45–50.
[12] J. Maté, “Transformation of Relational Databases to
Transaction-Time Temporal Databases”, in ECBS-EERC, 2011, pp. 27–34.