2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation
978-1-4799-4923-6/14 $31.00 © 2014 IEEE, DOI 10.1109/UKSim.2014.26

Transaction Management in Fully Temporal System

Michal Kvet, Karol Matiaško
University of Zilina, Faculty of Management Science and Informatics, Zilina, Slovakia

Marek Kvet
University of Zilina, University Science Park, Žilina, Slovakia

Abstract: Processing of time-dependent data is one of the most important tasks in the development of current database systems. The conventional database approach offers a paradigm for processing currently valid data. However, it is also necessary to store and manage historical values. Moreover, a temporal system should provide a structure for processing data that become valid in the future. The basic structure for processing temporal data, developed in the past, was able to store images of objects at different time points or intervals. However, a significant factor is the effectiveness of the whole system. If the data must be processed using transactions, the problem is more complicated. This paper deals with a new approach: column level temporal data processing with transaction management. The system can thus be defined as a bi-temporal structure, although its root is a uni-temporal system.

Keywords-transaction; object level temporal data; fully temporal system; jobs; changes monitoring

I. INTRODUCTION

Database systems are one of the most important parts of information technology. A database system is usually the basic part, the root, of any information system. The development of data processing has brought the need for modelling and accessing large structures with an emphasis on the simplicity, reliability and speed of the system [1] [3] [4]. However, even today, when database technology is widespread, most databases process and represent only the states of the data valid at the present moment.

Properties and states of objects evolve over time, become invalid and are replaced by new ones. Once a state changes, the corresponding data are updated in the database, so it still contains only the currently valid data. However, temporal data processing is very important in dynamically evolving systems, industry, communication systems and also in systems processing sensitive data, where an incorrect change would cause great harm. It can also help to optimize processes and make future decisions [2] [3] [4].

Historical data management using log files and archives is completely inappropriate. Although the data can be obtained, the process is too complicated, takes too much time and the output data are not in a suitable form (raw material). The main disadvantage is the need for intervention by the administrator (operation manager). Operational decisions could not be based on the historical data because of the time consumption (sometimes even days to load all needed snapshots). In addition, the granularity of the data is still growing, so the number of backups was above the acceptable level. If the backups and log files were not stored completely, some operations could even be missing. The following figure shows the problem: the backup does not register two of the operations that occurred (multiple updates, and an insert with a subsequent delete of the same object).

Figure 1. Backup problem - possible loss of data

Moreover, it is not enough to find a faster solution for historical data processing. What does the term "faster" mean? Shortening the processing time from days to hours, or from tens of hours to a couple of hours? Such a solution is still unacceptable and inapplicable. The aim is to create support for temporal data such that the difference between processing currently valid data and processing historical data is minimized. Thus, it is necessary to create a system with easy access to the data. However, historical data management alone is not a complete temporal system, because it must also be possible to process values whose validity begins in the future. When the begin time of the validity of an object occurs, the database system must be able to update the data without user intervention. Such a system, in addition, offers very poor support for transaction management.

Later, temporal systems on the object level were developed. The structure and the principles of these solutions are described in the next sections. The main criterion for comparing the solutions and structures is performance, which is based on the processing time needed to get the required data and also on the size of the whole structure. Although large disc capacity is currently available, there is still a need to store and process data effectively, because temporal data are really extensive and contain changes of the object states over the years [5] [6].

This paper deals with a new approach based on column level temporal data and describes its principles and techniques. The column level temporal structure does not manage complete object level temporal data, but divides the object attributes into conventional and temporal columns. Changes are managed only in the temporal part. Each column is processed separately. The result is a new concept that brings the ability to define transaction data in a system which manages only the validity period (similar to a uni-temporal system, but on the column level). The new temporal data definition is compared to existing solutions in terms of access and processing time and also size. The results are in the "Experiments" section.

II. REQUIREMENTS AND ASPECTS

System requirements can be divided into two parts with special aspects [6] [11] [12]:

1. Aspect of usability (easy methods): the aim is to provide access to outdated information as easily and quickly as to the actual values. Transactions for managing temporal data must be as simple as for current data processing. Moreover, it is necessary to define a way to combine the past and present.
2. Aspect of performance (speed and correctness of results): requires results in the same form as when accessing the actual data, with adequate processing time. The difference in accessing the object at any point in time should be minimal.

III. TEMPORAL DATA MANAGEMENT

Temporal databases define a new paradigm for selecting one or more rows based on the specified criteria, for projecting one or more columns to the output sets and for joining tables by specifying relationship criteria. Rows with different values of the primary key (PK) can represent one object at different times. Transactions for inserting, updating and deleting rows must therefore specify not only the object itself, but also the processed period. If the valid time of the object is defined by a time interval, the transaction must include the time period as two time point values, the begin and end timestamps (dates). This means that an update query does not only update existing data, but also inserts new rows based on the validity intervals. Figure 2 shows the update of the states of the object ID=1. The valid time of the new state overlaps the existing intervals, so they must be split [2] [6] [8].

Figure 2. Updating temporal data

One row in the temporal system can be represented in three ways based on the validity time and the transaction time. Figure 3 shows the principle of time management. The first type represents the most commonly used method today, the conventional table, which does not provide management of time validity. The second one, the uni-temporal system, extends the primary key by the interval of validity (BD1, ED1). In this case, it is necessary to ensure that these intervals do not overlap. Simply put, there cannot be more than one currently valid state of the same object. The bi-temporal structure extends the uni-temporal system by the transaction time (BD2, ED2) [12].

Figure 3. Conventional and temporal structure [6]

Figure 4 shows the time representation of the conventional, uni-temporal and bi-temporal table and also of our developed system, the column level temporal structure.

Figure 4. Time representation

IV. UNI-TEMPORAL STRUCTURE – OBJECT LEVEL

The easiest and also frequently used method of managing temporal data is the uni-temporal system. It is based on an extension of the conventional (non-timed) model. The primary key now contains not only the object identifier, but also one or two attributes determining the validity of the row. Consequently, one object can be defined by a varying number of rows, but no more than one row defines the object at any time point. Thus, the data modelling operations must define not only the object itself, but also the time point expressing the begin timestamp (or another timed attribute based on the granularity, such as day or month) or two attributes expressing the time interval. In our case, the row is defined by the validity.

A special type of uni-temporal system is a solution that contains only one time attribute that is part of the primary key. This means that any change of the corresponding object determines the validity of the prior state. The following figure shows the representation of such a model, as well as the corresponding standard uni-temporal system. The first part of the figure consists only of the begin time of the validity. The second one is a standard model with the closed-closed representation of the interval, which is intuitive because it exactly defines the begin and end time of the validity. However, the problem is how to define the next valid interval. If the clock tick is a day, the problem is greater due to the last day of the month, leap years and so on. In addition, what happens if the minimal time between two states (two modifications of the database) changes? Gaps of undefined attributes can arise if the clock tick is modified (e.g. from days to hours).
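The gap caused by a change of the clock tick can be illustrated with a short sketch. It is not taken from the paper: the function names, the sample dates and the half-open variant (begin included, end excluded) are assumptions chosen only for illustration. Two consecutive states of one object are stored with a one-day tick, and the validity is then queried at an hourly time point.

```python
from datetime import datetime


def covers_closed_closed(begin, end, t):
    """Closed-closed interval: both boundary time points belong to the validity."""
    return begin <= t <= end


def covers_half_open(begin, end, t):
    """Half-open interval (assumed variant): begin is included, end is excluded."""
    return begin <= t < end


def is_defined(states, t, covers):
    """True if at least one stored state of the object is valid at time t."""
    return any(covers(begin, end, t) for begin, end in states)


# Two consecutive states of one object, stored with a one-day tick (hypothetical data).
closed_states = [
    (datetime(2014, 1, 1), datetime(2014, 1, 31)),
    (datetime(2014, 2, 1), datetime(2014, 2, 28)),
]
# The same two states, with the end of each state equal to the begin of the next one.
half_open_states = [
    (datetime(2014, 1, 1), datetime(2014, 2, 1)),
    (datetime(2014, 2, 1), datetime(2014, 3, 1)),
]

day_probe = datetime(2014, 1, 31)        # a time point on the original day tick
hour_probe = datetime(2014, 1, 31, 12)   # a time point that exists only with an hourly tick

print(is_defined(closed_states, day_probe, covers_closed_closed))    # True
print(is_defined(closed_states, hour_probe, covers_closed_closed))   # False: gap
print(is_defined(half_open_states, hour_probe, covers_half_open))    # True
```

With the day tick, every time point is covered. After the switch to hours, noon of 31 January falls between the stored end of the first state and the begin of the second one, so the closed-closed rows would have to be rewritten, whereas the half-open rows in this sketch need no change.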
The whole database must then be reconstructed to eliminate the gaps. The last type in figure 5 is the open-closed representation.

Figure 5. Types of uni-temporal table modelling [10]

The mentioned solution seems to be easy, but it does not address the fundamental problem of undefined states, i.e. the time intervals during which the state of the object is partially or completely undefined. We cannot simply replace the previous value with the NULL value, because NULL can have a special meaning. In addition, some attributes cannot hold the NULL value because of the definition of the attribute column. The problem occurs even if only one attribute value is changed to the undefined state: the whole state of the object must then be denoted as undefined, or incorrect. It must, therefore, be possible to distinguish the correctness and the completeness of the object state. Another solution can be based on moving the object into a table containing only undefined states. However, what about the foreign keys? Imagine the manipulation (fig. 6). These problems describe the limitations of the system very well.

Figure 6. Example of the database model (student and subjects) [10]

This is only part of the issue, which, however, sufficiently shows that it is necessary to create a new complex solution.

V. COLUMN LEVEL TEMPORAL SYSTEM

The object level temporal system does not fulfil the performance requirement because of duplicated values. The rates of change of the individual attributes are almost always different, therefore it is not appropriate to use the object level temporal model. A better way is our developed system, which manages not whole objects, but only attributes. If one or more attributes are changed, only this information is stored in the database, not the values of the whole object. This significantly reduces the size of the structure. The following figure shows the principle and the structure. The external level is based on conventional tables, so there is no problem for existing applications. History management is provided by the developed temporal table.

Figure 7. Column level temporal system [10]

The temporal table consists of the following attributes [7] [8] [9] (see also fig. 7):

- ID change
- ID previous change: references the last change of an object identified by ID. This attribute can also have the NULL value, which means that the data have not been updated yet, so the data were inserted for the first time in the past and are still actual.
- ID_tab: references the table whose record has been processed by a DML statement (INSERT, DELETE, UPDATE).
- ID_orig: carries the identifier of the row that has been changed.
- ID_column, ID_row: hold the referential information to the old value of the attribute (if the DML statement was UPDATE). Only an update statement on a temporal column sets a non-NULL value.
- BD: the begin date of the validity of the new state of an object.

VI. FUTURE VALID DATA PROCESSING

The proposed solution is fully temporal and, although it has not been stated explicitly, this structure also allows managing future valid data. One possible way to provide it is the Job functionality, which makes it possible to plan changes of individual attributes. The Job functionality ensures the automatic change of values; it is designed to execute a script at a defined time. Triggers are connected to the Insert, Update and Delete operations, so the information about the change is automatically stored in the temporal table.

Planning of these events requires several parameters, which are described in Table I.

TABLE I. JOB PARAMETERS (Create_job)

Job_name          Identifier of the job
Job_type          Type of the job (plsql_block, stored_procedure, executable)
Job_action        Statement
Start_date        Planned time to run the job
Repeat_interval   How often or when the job should be started
Enabled           The state of the job

Fig. 8 shows an example of job planning. A job is identified by a unique name and consists of the code to be started at a defined time point. The state of the job is expressed using the attribute Enabled.

Figure 8. Job plan

Dropping a job is much easier and requires only the identifier of the job to be killed: dbms_scheduler.drop_job('planned_job_1');

A disadvantage of this system is that it is not possible to get data valid in the future before the job is executed, because the new values are not directly available. The system ensures that the script is executed at the specified time, but the information about future changes cannot be obtained directly, although the new values can be found in the system tables. The problem is the structure of these records. That was the reason for creating future tables. Such a table consists of the information about the planned job: the identifier of the object, the operation to be performed and the data attribute values. When the job is executed, these data are deleted from the table. Fig. 9 shows the structure of this table.

Figure 9. Productive and future table

VII. TRANSACTIONS

Transaction management in a fully temporal system is difficult. Each transaction can create dependent objects. In case of a rollback of the transaction, these dependent objects should be cancelled too. When the transaction is successfully committed, these objects can be affected or modified. A typical example is an earlier update of the object or an improperly planned change that will happen either at a different time or not at all. The model described above is extended by the transaction definition in the temporal table (fig. 10).

Figure 10. Column level temporal system with transaction management

In principle, there can be several situations based on READ and WRITE operations. If the transactions do not manipulate the same database object, there is no problem.
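The first case can be made concrete with a small sketch. The `Operation` record and the function name are hypothetical and not part of the described system. The sketch also assumes that two reads of the same object never interfere, so only pairs of operations involving at least one write are reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    transaction: str  # hypothetical label of the transaction, e.g. "T1"
    kind: str         # "R" for a read operation, "W" for a write operation
    obj: str          # identifier of the database object, e.g. "O1"


def conflicting_objects(ops_a, ops_b):
    """Return the objects touched by both transactions with at least one write.

    An empty result corresponds to the case in which the transactions do not
    manipulate the same database object (assumption: read/read pairs are harmless).
    """
    result = set()
    for a in ops_a:
        for b in ops_b:
            if a.obj == b.obj and "W" in (a.kind, b.kind):
                result.add(a.obj)
    return result


t1 = [Operation("T1", "W", "O1"), Operation("T1", "R", "O2")]
t2_disjoint = [Operation("T2", "R", "O3")]
t2_shared = [Operation("T2", "R", "O1"), Operation("T2", "R", "O2")]

print(conflicting_objects(t1, t2_disjoint))  # set()
print(conflicting_objects(t1, t2_shared))    # {'O1'}
```

Only a non-empty result requires a closer look at the order of the operations and at the validity of the written state.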
If the transactions manipulate the same object, but the first one ends before all operations on that object in the second transaction, the problem does not occur either. However, the following figure shows the situation in which RO1T2 (the read operation on the object O1 by the transaction T2) works with incorrect data if the first transaction is rolled back.

Figure 11. Transaction management (1)

A similar situation arises if the validity of WO1T1 is shifted to the future (fig. 12). There is no problem, because T2 does not have to know about the future update regardless of whether the transaction T1 is committed or not.

Figure 12. Transaction management (2)

Another situation occurs when one transaction determines the validity of the object managed (updated) by the second transaction (fig. 13).

Figure 13. Transaction management (3)

However, a problem occurs when the previous example is modified: the transaction T2 is committed before T1 and then T1 reads the state of the updated object.

Figure 14. Transaction management (4)

The situations described above occur only if the transactions write directly into the database during a transaction. Considering transaction management, the developed system can be directly compared to the bi-temporal system, which is also temporal and transaction oriented.

VIII. EXPERIMENTS

The results and quality of the developed system are assessed by comparing its performance with the existing structures. This, in our opinion, clearly defines the features, properties, limitations and finally the overall quality and reliability of the described solutions. Our experiments and evaluations are based on the processing time, which best represents the quality of the model. All of the experiments were performed using the Oracle 11g database system. The total number of records in the main structure is 100 000. The experiments compare the implemented temporal solution on the column level with the standard uni-temporal and bi-temporal approaches. Although the implemented solution seems to be complicated, it provides better performance results.

Model 1 deals with the fully temporal system (historical, current and future valid data) on the object level. In comparison, the fully temporal structure on the column level (model 2), with one temporal and two conventional columns, brings a significant acceleration of the system. The values below are savings relative to model 1 (100% = performance of model 1); for example, the size saving is 1 − 18 650 / 41 235 = 54.77%:

- Size: 54.77%.
- Time to get current snapshot: 76.26%.
- Time to get all life-cycle object data: 33.43%.

Using a fully temporal system, it is often necessary to manage future valid data using transactions. There are basically two ways. The standard approach is defined by the bi-temporal table (model 4), which also contains the transaction time definition. In our case, the column level uni-temporal system with transaction definition tables (model 3) is used. The developed solution brings the following savings in the monitored indicators (100% = performance of model 4):

- Size: 59.11%.
- Time to get current snapshot: 76.79%.
- Time to get all life-cycle object data: 38.34%.

Although model 3 uses only a uni-temporal structure, complete transaction management is provided by this system. The following table and graphs show the complete performance results: required size and processing times.

TABLE II. EXPERIMENTS RESULTS

Model 1: uni-temporal system, temporal table (full), object level
Model 2: uni-temporal system, temporal table (full), column level
Model 3: uni-temporal system, temporal table (full) + transaction management, column level
Model 4: bi-temporal system, object level

Indicator                                                         Model 1   Model 2   Model 3   Model 4
Size of the DB (kB)                                                41 235    18 650    19 612    47 957
Time to get all actual data (current snapshot) (ms)                 3 921       931       931     4 012
Time to get all data during the life-cycle of one object (ms)         721       480       579       939

Figure 15. Size of the database based on models

Figure 16. Processing time results based on models

IX. CONCLUSION

A conventional database object is represented by one row, the current state of the object, whereas a temporal management system offers processing of the valid data of the object and of their changes and progress in time. Developers require access to the whole information about the evolution of the states during the life-cycle, therefore a new paradigm has been created: temporal system processing. Effective management of temporal data can be very useful for decision making, analyses and process optimization and can be used in any area: industry, communication systems, medicine, transport systems and others.

However, temporal data management used today does not cover the complexity of data management in time. It does not offer sufficient power to manage large volumes of data. Significant aspects are the processing time and also the size. Standard temporal database support is based on the object level. Our developed system is based on the column (attribute) level; the whole state is created by grouping the properties and states of the attributes. The main advantage is the possibility of processing data that change in time with different granularity, such as sensor data. A fully temporal system also requires transaction management; in our case, the uni-temporal system with table transaction management is used. It offers good performance. This paper deals with the principles and characteristics and describes implementation methods to provide complex temporal data management with transaction management.

Temporal data are usually large; their processing requires sophisticated access methods. In future development, we will focus on the creation of various index structures, which can improve the performance of the model, too.

ACKNOWLEDGMENT

This publication is the result of the implementation of the project Centre of excellence for systems and services of intelligent transport II., ITMS 26220120050, supported by the Research & Development Operational Programme funded by the ERDF.

"We support research activities in Slovakia. The project is co-financed from EU sources."

This paper is supported by the following project: University Science Park of the University of Zilina (ITMS: 26220220184) supported by the Research & Development Operational Program funded by the European Regional Development Fund. The work is also supported by the project VEGA 1/1116/11 - Adaptive data distribution.

REFERENCES

[1] C. J. Date, "Date on Database", Apress, 2006.
[2] C. J. Date, H. Darwen, and N. A. Lorentzos, "Temporal data and the relational model", Morgan Kaufmann, 2003.
[3] C. J. Date, "Logic and Databases – The Roots of Relational Theory", Trafford Publishing, 2007.
[4] P. N. Hubler and N. Edelweiss, "Implementing a Temporal Database on Top of a Conventional Database", 2000. Conference SCCC '00, pp. 58-67.
[5] Ch. S. Jensen, "Introduction to Temporal Database Research".
[6] T. Johnson and R. Weis, "Managing Time in Relational Databases", Morgan Kaufmann, 2010.
[7] M. Kvet, A. Lieskovský, and K. Matiaško, "Temporal data modelling", 2013. IEEE conference ICCSE 2013, 26.4. – 28.4.2013, pp. 452-459.
[8] M. Kvet and K. Matiaško, "Column level uni-temporal", 2013. In Journal Communications, in press.
[9] M. Kvet and K. Matiaško, "Management of Temporal System", 2013. In International Journal of New Architectures and Their Applications, Vol. 3, No. 3, pp. 70-80.
[10] M. Kvet and K. Matiaško, "Uni-temporal Modelling Extension at Object vs. Attribute Level", 2013. IEEE conference EMS 2013, 20.11. – 22.11.2013, pp. 7-11.
[11] M. Kvet, J. Mešina, and K. Matiaško, "Algorithm for brain tumour detections", in Acta Electrotechnica et Informatica, vol. 12, no. 2, 2012, pp. 45-50.
[12] J. Maté, "Transformation of Relational Databases to Transaction-Time Temporal Databases", in ECBS-EERC, 2011, pp. 27-34.