In the world of data warehouse utilities, great emphasis is placed on the contrasts between one vendor's performance benchmark and another's. But benchmarks no longer tell the whole story. Recovery has become one of the most important characteristics of any product. Everybody knows that nothing works perfectly a hundred percent of the time. People worry about losing all their data during a brief power outage. A database administrator's most common nightmare is finding their databases empty because of a system or transaction failure. Therefore, recovery has become as important to users as other performance criteria such as OLAP, data mining, etc.
Recovery of a database system means recovering the database itself: restoring the database to a previous known correct state after some failure has occurred and affected the state of the database. Failures can be caused by almost anything: programming errors in the applications, hardware errors in a device, channel or CPU, operator errors such as mounting the wrong tape, power outages, fire in the machine room, or even sabotage. The list is endless. In all these cases, the recovery principle is the same: redundancy. How is this redundancy achieved? One possible way is to:
• Periodically copy or dump the entire database to archive storage.
• For every change, make a log entry with the old and new value of the changed item.
So, in case a failure happens:
• If the database is damaged, it can be restored by loading the most recent archive copy and then using the log file to redo all the changes made since that copy was taken.
• If only the content of the database is unreliable, the database is restored by undoing all the "unreliable" changes, using the log file.
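To make the dump-plus-log idea concrete, here is a minimal Python sketch; the record format and function names are purely illustrative, not taken from any particular product:

# Minimal sketch of dump-plus-log recovery (illustrative names only).
# Each log entry records the changed item with its old and new values.

def restore_from_archive(archive_copy, log_entries):
    """Rebuild the database from the most recent archive copy, then redo
    every change logged since that copy was taken (oldest entry first)."""
    database = dict(archive_copy)                   # load the archive copy
    for entry in log_entries:
        database[entry["item"]] = entry["new"]      # redo: apply the new value
    return database

def undo_unreliable_changes(database, unreliable_entries):
    """Database intact but some changes are suspect: undo them by restoring
    the old values recorded in the log, newest entry first."""
    for entry in reversed(unreliable_entries):
        database[entry["item"]] = entry["old"]      # undo: restore the old value
    return database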
Some people claim that a better approach is simply to duplex the data warehouse: keep two identical copies and apply every update to both of them simultaneously. It is true that this solution improves reliability and performance. However, it implies having twice as much storage, and the two copies should be independent of each other, for example on different channels or with separate CPUs, to reduce the chance that a single failure affects both copies. This separation is hard to achieve, and full independence cannot be attained. Knowing all this, we need to take a closer look at how transactions are made and how we can recover from a faulty transaction.
The first thing we learn in a database course is that every database system is built to carry out transactions. A transaction is the smallest unit of work and consists of executing a sequence of operations. It begins with a special BEGIN TRANSACTION operation and executes a set of updates, ending with either a COMMIT or a ROLLBACK operation. COMMIT means successful termination; ROLLBACK signals an unsuccessful one. This implies that all recoverable operations must be done within the bounds of a transaction. By a "recoverable" operation we mean any operation that can be undone or redone in case of failure. All database updates (for which an entry has been logged) and message I/O are recoverable operations. Throughout this paper, we will use the following banking transaction code as an example. It takes as input:
TRANSFER $1000 3452332 TO 9087665

TRANSFER: PROC;
    GET (FROM, TO, AMOUNT);
    FIND UNIQUE (ACCOUNT WHERE ACCOUNT = FROM);
    ASSIGN (BALANCE - AMOUNT) TO BALANCE;
    IF BALANCE < 0
    THEN
        DO;
            ROLLBACK;
            PUT ('INSUFFICIENT FUNDS');
        END;
    ELSE
        DO;
            FIND UNIQUE (ACCOUNT WHERE ACCOUNT = TO);
            ASSIGN (BALANCE + AMOUNT) TO BALANCE;
            COMMIT;
            PUT ('TRANSFER COMPLETE');
        END;
END /* TRANSFER */;
It is important to know that, for the end user, transactions are atomic. To the user, "transfer x dollars from account A to account B" is one single atomic transaction that either succeeds or fails. If it succeeds, well and good; if it fails, then nothing should have changed in the database. Therefore, the database must not be left in a state where the FROM account has been decremented but the TO account has not been incremented. We can conclude that transactions are all-or-nothing operations.
As mentioned before, recovery has implications for messages as well as for databases. In the TRANSFER example, the transaction not only updates the database, it also sends messages to the end user (INSUFFICIENT FUNDS or TRANSFER COMPLETE). As we can see, output messages should not be transmitted until the planned end of transaction. Handling messages is the job of the Data Communication Manager (DC Manager). Thus it is the DC Manager that receives the original input message (giving FROM, TO and AMOUNT in the TRANSFER example). On receipt of this message, the DC Manager (a) writes a log record and (b) places the message on the input queue. The GET operation is used to retrieve a message from the input queue, while the PUT operation places a message on the output queue. The other operations that affect the message queues are COMMIT and ROLLBACK. They cause the DC Manager to:
• Write a log entry for the messages on the output queue.
• Arrange for the actual transmission of those messages.
• Remove the message from the input queue.
A transaction failure such as an overflow, by contrast, causes the DC Manager to cancel the output messages.
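Here is a minimal sketch of this deferred-output behavior, assuming a simplified DC Manager; the class and method names are hypothetical, not the real DC Manager interface:

# Illustrative sketch: output messages are queued during the transaction and
# only transmitted at the planned end of transaction (COMMIT or ROLLBACK);
# a transaction failure cancels them instead.

class DCManagerSketch:
    def __init__(self):
        self.input_queue = []
        self.output_queue = []
        self.log = []

    def receive(self, message):
        self.log.append(("input", message))        # (a) write a log record
        self.input_queue.append(message)           # (b) place it on the input queue

    def get(self):
        return self.input_queue[0] if self.input_queue else None   # GET: read the input message

    def put(self, message):
        self.output_queue.append(message)          # PUT: queue, do not transmit yet

    def end_transaction(self):
        # COMMIT and ROLLBACK both reach this planned end of transaction:
        self.log.extend(("output", m) for m in self.output_queue)   # log the output messages
        for m in self.output_queue:
            print("TRANSMIT:", m)                  # arrange actual transmission
        if self.input_queue:
            self.input_queue.pop(0)                # remove the processed input message
        self.output_queue.clear()

    def transaction_failure(self):
        self.output_queue.clear()                  # cancel the un-transmitted output messages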
The logical structure of the TRANSFER example may be considered typical of transactions in general:
• Accept an input message;
• Perform database processing;
• Send output message(s).
More complex structures, with multiple exchanges between the end user and the program, can be handled in two ways: either they can be subdivided into a sequence of simple transactions, each having the structure above, or they can be treated as one big transaction that repeats the input-process-output cycle many times and then issues COMMIT or ROLLBACK. The first approach suffers from the drawback that the database may be changed in the interval between two "conversations", which may cause the user to work on out-of-date information. The second suffers from the drawback that at any time the end user must be prepared to receive a message like "ignore all previous messages, a failure occurred".
There are different types of failure. The TRANSFER example illustrates how the ROLLBACK statement ensures that a failure does not corrupt the database. Failures can be subdivided as follows:
• Transaction-local failures that are detected by the application code itself (INSUFFICIENT FUNDS).
• Transaction-local failures that are not explicitly handled by the application code (arithmetic overflow).
• System-wide failures (CPU failure) that affect all transactions currently in progress but do not damage the database.
• Media failures (disk head failure) that damage the database, or some portion of it, and affect all transactions currently using that portion.
In the rest of this paper, we will use "transaction failure" to mean a failure caused by unplanned, abnormal program termination. Conditions that may cause such terminations include arithmetic overflow, division by zero and storage protection violations. Transaction failures are handled in the following way: suppose a division by zero occurs. The operation raises a ZERODIVIDE condition that can be handled by the program code. If it is not, the system action for ZERODIVIDE is to raise an ERROR condition, which the programmer again has the option to catch. Finally, the system action for ERROR is to cause abnormal program termination. It is only at that point that we can say that a "transaction failure" has occurred.
A transaction failure means that the program did not reach its planned termination, so it is necessary to force a rollback: undo all changes the transaction made to the database and cancel all its output messages. Undoing changes involves working backward through the log. There are three basic types of change we may need to undo: updating an existing record, deleting an existing record, and inserting a new record. When the recovery manager encounters a particular log record, it simply invokes the undo operation for that specific type of change. For convenience, and to help the recovery manager perform the rollback, the log file should be kept on a direct access device. However, for a large database, managing 200 million log records a day is not a simple task. Therefore, the active log is indeed written to a direct access data set; when it is full, the log manager switches to another data set and dumps the first to archive storage. The dumping process is done in parallel with on-line use of the second data set. Thus the total log consists of the current on-line portion, on direct access storage, plus an arbitrary number of earlier portions on archive storage.
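As a rough sketch of this log switching (the class and its behavior are illustrative assumptions; a real log manager would use dedicated data sets and asynchronous I/O):

# Illustrative sketch: when the active log data set fills up, the log manager
# switches to a fresh one and dumps the full one to archive storage "in
# parallel" (simulated here with a background thread).

import threading

class LogManagerSketch:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.active = []            # current direct-access log data set
        self.archive = []           # earlier log portions on archive storage

    def write(self, record):
        self.active.append(record)
        if len(self.active) >= self.capacity:
            full, self.active = self.active, []                     # switch data sets
            threading.Thread(target=self._dump, args=(full,)).start()

    def _dump(self, full_data_set):
        self.archive.append(full_data_set)                          # dump the full one to archive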
All this is good, but the process of rolling back a transaction can itself be subject to failure. Such a failure will cause the rollback procedure to be restarted from the beginning. Because of this, the undo logic for update, delete, and insert must be idempotent:
UNDO(UNDO( ... UNDO(x) ... )) = UNDO(x), for all x.
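A minimal sketch of idempotent undo, assuming log records carry before-images ("old" values); the record format is illustrative. Restoring a before-image a second time leaves exactly the same state as restoring it once:

# Illustrative sketch: undo by restoring before-images. Repeating the undo of
# the same log record leaves the database unchanged, so UNDO is idempotent.

def undo(database, log_record):
    kind = log_record["type"]
    key = log_record["key"]
    if kind == "update":
        database[key] = log_record["old"]        # put back the old value
    elif kind == "insert":
        database.pop(key, None)                  # remove the inserted record
    elif kind == "delete":
        database[key] = log_record["old"]        # re-insert the deleted record
    return database

def rollback(database, transaction_log_records):
    # Work backward through the transaction's log records.
    for record in reversed(transaction_log_records):
        undo(database, record)
    return database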
When dealing with "transaction failure", we clearly see that a transaction is a unit of recovery as well as a unit of work.
Another major type of failure is "system failure", by which we mean any event that causes the system to stop and thus requires a system restart. The question arises of how the recovery manager knows at restart which transactions to roll back. Searching the entire log for transaction records with no termination (COMMIT or ROLLBACK) would be time-consuming. By introducing the notion of checkpoints, we reduce the search time drastically.
The concept of a checkpoint is very straightforward. Periodically, the system "takes a checkpoint", which is done through the following steps:
1. Force-writing any log records that are still in main storage out to the actual log;
2. Forcing a "checkpoint record" out to the log data set;
3. Force-writing any updates that are still in main storage out to the actual database;
4. Writing the address of the checkpoint record within the log data set into a "restart file".
Each checkpoint record contains:
• A list of all transactions active at the time of the checkpoint; together with
• The address within the log of each such transaction's most recent log record.
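A minimal sketch of the four checkpoint steps above, with purely illustrative data structures (lists and dictionaries standing in for the log, the buffers, and the restart file):

# Illustrative sketch of "taking a checkpoint".

def take_checkpoint(log_buffer, log_file, db_buffer, database,
                    active_txns, restart_file):
    """active_txns maps each active transaction id to the log address of its
    most recent log record (an assumed, simplified representation)."""
    log_file.extend(log_buffer)                  # 1. force buffered log records to the log
    log_buffer.clear()
    log_file.append({"type": "checkpoint",       # 2. force a checkpoint record to the log,
                     "active": dict(active_txns)})   #    listing the active transactions
    database.update(db_buffer)                   # 3. force buffered updates to the database
    db_buffer.clear()
    restart_file["checkpoint_address"] = len(log_file) - 1   # 4. note its address in the restart file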
At restart time, the recovery manager obtains the address of the most recent checkpoint record from the restart file, locates that checkpoint record in the log, and searches forward through the log from that point to the end. The manager then needs to determine which transactions must be undone and which must be redone. In this context, we define five distinct categories of transaction, as shown in Figure I:
• A system failure has occurred at time t2.
• The most recent checkpoint prior to time t2 was taken at time t1.
• Transactions of type T1 completed before time t1.
• Transactions of type T2 started after time t1 and completed before time t2.
• Transactions of type T3 started prior to time t1 and completed after t1 but before time t2.
• Transactions of type T4 started prior to time t1 but did not complete by time t2.
• Finally, transactions of type T5 started after time t1 but did not complete by time t2.
[Figure I: The five transaction categories. Timeline of transactions T1 through T5 relative to the checkpoint at time t1 and the system crash at time t2.]
It is clear that at restart, transactions of type T4 and T5 must be undone. What is less obvious is that transactions of type T2 and T3 must be redone, because even though those transactions completed before the system crash, there is no guarantee that their updates were actually written to the database. Therefore, just as the update, delete, and insert components of the DBMS include an UNDO entry point, they must also implement a REDO entry point. The recovery manager is thus able to work forward through the log records and invoke REDO for the appropriate transactions. REDO, just like UNDO, must be idempotent. Moreover, REDO also has certain implications for message handling. Since the system may have to reschedule transactions of type T2 and T3, it must force-write log records for input messages, because the parameters provided by those messages are needed to redo the transactions.
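A minimal sketch of how the undo-list and redo-list might be built at restart, under an assumed, simplified log-record format (this is only an outline of the algorithm, not the exact one from Date's book):

# Illustrative sketch: classify transactions at restart into an undo list
# (T4, T5: not complete at the crash) and a redo list (T2, T3: completed after
# the checkpoint, but their updates may not have reached the database).

def build_restart_lists(log_file, checkpoint_address):
    checkpoint = log_file[checkpoint_address]
    undo_list = set(checkpoint["active"])              # active at the checkpoint: assume undo
    redo_list = set()
    for record in log_file[checkpoint_address + 1:]:   # search forward to the end of the log
        if record["type"] == "begin":
            undo_list.add(record["txn"])               # started after the checkpoint
        elif record["type"] in ("commit", "rollback"):
            undo_list.discard(record["txn"])           # reached its planned termination
            if record["type"] == "commit":
                redo_list.add(record["txn"])           # must be redone
    return undo_list, redo_list

The recovery manager then works backward through the log invoking UNDO for the undo-list transactions, and forward invoking REDO for the redo-list transactions.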
So far, we have been talking about changing the database and writing a log record as two separate operations. This separation is real, and it introduces the possibility of a failure occurring in the interval between the two. Suppose such a failure occurs and that we have made the change in the database without recording it in the log file. That change can never be undone. Therefore, for safety, the log record should always be written first. This is the basis of the Write-Ahead Log Protocol:
• A transaction is not allowed to write a record to the physical database until at least the undo portion of the corresponding log record has been written to the physical log.
• A transaction is not allowed to complete COMMIT processing until both the redo and the undo portions of all log records for the transaction have been written to the physical log.
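A minimal sketch of a buffer manager enforcing these two rules; the structures and method names are illustrative assumptions, not a specific product's implementation:

# Illustrative sketch of the Write-Ahead Log Protocol: before a changed page
# goes to the physical database, the log records covering it must be on disk,
# and all of a transaction's log records must be on disk before it commits.

class WALBufferSketch:
    def __init__(self):
        self.log_buffer = []        # log records not yet on the physical log
        self.physical_log = []
        self.physical_db = {}

    def log_update(self, txn, page_id, old, new):
        self.log_buffer.append({"txn": txn, "page": page_id,
                                "undo": old, "redo": new})

    def flush_log(self):
        self.physical_log.extend(self.log_buffer)        # force log records to disk
        self.log_buffer.clear()

    def write_page(self, page_id, contents):
        # Rule 1: the undo information for this page must already be logged.
        if any(r["page"] == page_id for r in self.log_buffer):
            self.flush_log()
        self.physical_db[page_id] = contents

    def commit(self, txn):
        # Rule 2: all of the transaction's log records (undo and redo portions)
        # must be on the physical log before COMMIT processing completes.
        if any(r["txn"] == txn for r in self.log_buffer):
            self.flush_log()
        self.physical_log.append({"txn": txn, "type": "commit"})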
Now, let us talk about system startup. How does the system react to a failure? There are three types of system startup:
• Emergency restart is the process invoked after a system failure has occurred. It involves the recovery procedures (UNDO and REDO).
• Warm start is the process of starting up the system after a controlled system shutdown. On receipt of a SHUTDOWN command, the system refuses any new transaction initiation, waits for all active transactions to terminate, and then terminates itself. Warm start is a special case of emergency restart.
• Starting the system from scratch is a cold start. The term refers to the process of starting the system after some disastrous failure that makes a warm start impossible (loss of the restart file). It involves starting again from some archive version of the database; work done since that version needs to be redone. Ideally, a cold start should be performed exactly once, when the system is first installed.
The checkpoint procedure discussed earlier in this paper is workable, but it can be slightly modified for better efficiency. The procedure as presented "undoes" updates even if they were never done and "redoes" updates even if they have in fact already been done. The third step of the process, force-writing the database buffers, is not really necessary; its only purpose was to establish a bound on the amount of recovery processing needed. Instead, we can introduce some logic into our recovery procedure and thus avoid force-writing the database buffers. The new technique starts by assigning, in ascending order, a unique log sequence number (LSN) to each log record at the time it is written. Then, for each page updated and written to the database, we place in that page the LSN of the log record corresponding to that update. Now, we modify the previous checkpoint and recovery procedures by eliminating step 3 and incorporating into the checkpoint record (step 2) the value m, where m is the LSN of the log record corresponding to the oldest unwritten page in the database buffer (m being the minimum of the LSNs for unwritten pages). The log record with LSN m corresponds to the furthest point back in time to which the recovery procedure needs to go to redo work. The procedure is as follows:
• The undo-list and redo-list are established as before.
• Work backward through the log, undoing the updates of undo-list transactions.
• Work forward through the log, starting at the record with LSN m, redoing the updates of redo-list transactions; an update whose LSN is not greater than the LSN already stored in the corresponding database page has evidently reached the database and need not be redone.
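A minimal sketch of the redo pass using this LSN comparison, with an illustrative page and log-record format:

# Illustrative sketch: during the redo pass, a logged update is skipped when
# the LSN stored in the database page shows the update already reached disk.

def redo_pass(log_file, start_lsn, redo_list, pages):
    for record in log_file:
        if "page" not in record:
            continue                              # skip non-update records
        if record["lsn"] < start_lsn or record["txn"] not in redo_list:
            continue                              # before LSN m, or not on the redo-list
        page = pages[record["page"]]
        if page["lsn"] >= record["lsn"]:
            continue                              # update already on the page: skip
        page["data"][record["item"]] = record["redo"]   # reapply the update
        page["lsn"] = record["lsn"]               # record that this LSN is now applied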
The third and last type of failure discussed in this paper is media failure. A media failure is one in which a portion of the secondary storage medium is damaged. Recovery in this case consists of restoring the database from an archive dump and then using the log to redo the transactions run since that dump was taken. Suppose a media failure occurs: all current transactions are abnormally terminated, and a new device must be allocated to replace the one that failed. A utility program is then run which (a) loads the database onto the new device from the most recent archive dump and (b) uses the log to redo all the transactions that completed since the dump was taken.
So far, we have assumed that transactions deal with a single resource manager (the DBMS) and that they involve just one recoverable resource (the database). Now, imagine a transaction T that updates information in both a System R database and an IMS database. Since transactions are all-or-nothing, imagine transaction T executing a SQL END TRANSACTION and then failing before it can execute a DL/I SYNC (to commit its updates to the IMS database). Considerations such as these lead to the requirement for a protocol known as two-phase commit.
Two-phase commit is required whenever a transaction is able to invoke multiple independent resource managers. It works as follows: a new system component, called the coordinator, is introduced. Transactions then do not issue separate COMMIT operations to each resource manager, but instead issue a single "global" COMMIT to the coordinator. On receipt of such a request, the coordinator goes through two phases:
• Phase 1: the coordinator requests all resource managers to get themselves into a state from which they can either commit or roll back the transaction. In practice, each resource manager forces all log records involving that transaction out to its local log. If the resource manager succeeds in reaching this state, it replies "OK" to the coordinator ("NOT OK" otherwise).
• Phase 2: if all replies are "OK", the coordinator broadcasts the command "COMMIT" to all resource managers, which then complete their local COMMIT processing. Otherwise, the coordinator broadcasts the command "ROLLBACK", so that all resource managers undo all local effects of the transaction, using their local log.
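A minimal sketch of the coordinator side of this protocol; the prepare/commit/rollback calls on the resource managers are a hypothetical interface, and a production coordinator would also log its own decision so that it survives a coordinator crash:

# Illustrative sketch of a two-phase-commit coordinator. Each resource manager
# is assumed to expose prepare/commit/rollback operations (hypothetical API).

def two_phase_commit(resource_managers):
    # Phase 1: ask every resource manager to force its log records out and
    # get into a state from which it can either commit or roll back.
    votes = [rm.prepare() for rm in resource_managers]   # each returns True ("OK") or False

    # Phase 2: broadcast the global decision.
    if all(votes):
        for rm in resource_managers:
            rm.commit()          # complete local COMMIT processing
        return "COMMIT"
    else:
        for rm in resource_managers:
            rm.rollback()        # undo all local effects using the local log
        return "ROLLBACK"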
Most of the recovery operations discussed above are provided in various data manipulation languages (DMLs), namely SQL, DL/I, DBTG, and UDL. SQL provides the operations BEGIN TRANSACTION, END TRANSACTION (successful termination), and RESTORE TRANSACTION (unsuccessful termination). DL/I, for IMS, distinguishes different classes of application program for recovery purposes:
• Batch programs.
• Message processing programs.
• Fast path, message-driven programs.
It also has other recovery calls, such as CHKP to establish an explicit checkpoint and SYNC to establish a synchpoint in non-message-driven programs. DBTG, like UDL, provides two recovery statements, COMMIT and ROLLBACK.
To sum up this paper: from my point of view, recovery is as important when building a data warehouse system as all the benchmarks run to test the system for performance, efficiency and other criteria. Downtime due to the recovery process can lead to losses in revenue, productivity, profitability, and customers. Therefore, improving our recovery procedures will improve our system's availability, and our database administrators can sleep without nightmares.
Acronyms:
o OLAP: On-Line Analytic Processing.
o IMS: Information Management System (IBM's hierarchical DBMS).
o DBMS: DataBase Management System.
o SQL: Structured Query Language.
o DL/I: Data Language/I (the data manipulation language of IMS).
o DBTG: (CODASYL) DataBase Task Group.
o UDL: Unified Database Language.
References:
o C. J. Date, An Introduction to Database Systems, Chapter 1: Recovery.
o Figure I: http://roseanne.tesoriero.washcoll.edu/Courses/404/Recovery.ppt
o http://www.sun.com/storage/white-papers/shadowimage_wp.pdf