Download Distributed Systems - Distributed Transactions

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Blockchain wikipedia , lookup

Database model wikipedia , lookup

Global serializability wikipedia , lookup

Clusterpoint wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Commitment ordering wikipedia , lookup

Serializability wikipedia , lookup

Concurrency control wikipedia , lookup

Transcript
5. Distributed Transactions
Distributed Systems
Prof. Dr. Alexander Schill
http://www.rn.inf.tu-dresden.de
Outline
 Transactions – Fundamental Concepts
 Remote Database Access
 Distributed Transactions
 Transaction Monitor
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 2
Transactions: Fundamentals
OLTP (Online Transaction Processing) :
(Distributed) Transaction processing:
 Reliable data processing, also during system failures and
multi-user access.
 Application examples:
•
•
•
•
Reservation systems (Booking)
Bank transactions
Accounting
Generally: data access in write mode
 Implementation via Transaction Processing Systems / Monitors,
since 1960; System examples:
• JDBC (pure access interface)
• CICS (IBM)
• Encina (Transarc / IBM)
Prof. Dr. Alexander
Schill
Distributed Systems – Lecture 5: Distributed Transactions
(BEA Systems)
• Tuxedo
Folie 3
Transactions: Fundamentals
Transactions implement the ACID-criteria:
Atomicity
Execution complete or without impacts
Consistency
Transformation between consistent states
Isolation
No overlapping of concurrent transaction executions
Durability
Survival of system failures due to persistent storage
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 4
Transactions: Remote Database Access
Two and Three Tier Architecture
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 5
Transactions: Remote Database Access
JDBC (Java Database Connectivity)
Programming interface for access to relational data bases
Corresponds to ODBC (Open Database Connectivity)
Database operations in Structured Query Language (SQL)
Numerous drivers for different data bases (for instance,
Oracle, Sybase, DB2, SQL Server etc.)
 Realizes however only direct data base access; improved
distributed transaction logic in heterogenic systems
requires Transaction Services / Transaction Monitors




Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 6
Transactions: Database Access via JDBC
try {
Warehouse.con.setAutoCommit(false);
// Update the warehouse inventory
pstmt = Warehouse.con.prepareStatement(“UPDATE
Warehouse SET Count=Count-? WHERE ProductID = ?”);
pstmt.setInt(1, countOrdered);
pstmt.setInt(2, productNumber);
int updated = pstmt.executeUpdate();
pstmt.close();
// Adding of the product to the dispatch list. Number of items = countOrdered.
...
// Check product availability
pstmt = Warehouse.con.prepareStatement(“SELECT Count FROM Warehouse
WHERE ProductID = ?”);
pstmt.setInt(1, productNumber);
resultSet = pstmt.executeQuery();
if(resultSet.first()) {
int count = resultSet.getInt(“Count“); // Available products
if (count >= 0) Warehouse.con.commit(); }
Warehouse.con.rollback();
... }
catch (SQLException se) { Warehouse.con.rollback();
}
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 7
Distributed Transactions: Example
Product Order Processing
Remove Product
from Warehouse
Warehouse
Management
Add Product
to Dispatch List
Dispatch
Order
Processing
Ensuring ACID properties (Atomicity, Consistence, Isolation and Durability):

Either both Warehouse Management and the Dispatch list are updated or non at all

Sum of products in the Dispatch List must be consistent with that in Warehouse
Management

Modifications to number of products are not allowed to become visible for other
operations outside of the transaction until Warehouse Management and Dispatch are
both updated and transaction successfully closed.
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 8
Distributed Transactions: Coordination
Two Phase Commit Protocol

Necessary for coordination between the various participants
involved in a distributed transaction

Assumptions
• Participants save relevant data before transaction so that the original
consistent state can be reset if required
• Transaction started: All required operations carried out but only on
temporary copies of the data.
• Operations can be carried out through RPC and RMI as well as through
various local mechanisms on the individual participants

When is the protocol initiated:
• Only after all temporary changes are made, does the coordinator (a client
or an interconnected server) initiate the Two Phase Protocol
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 9
Distributed Transactions: Coordination
Two Phase Commit Protocol
old Version: K
save Data: K‘
Participant
(for instance,
Server 1)
Coordinator
(for instance,
Client)
K := K‘
Old Version: L
save Data: L‘
Participant
(for instance,
Server 2)
L := L‘
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 10
Distributed Transactions: Coordination
Two Phase Commit Protocol
Input from the network /
Output to the network
start
- / prepare
start
prepare / abort
wait
abort / abort
ready / abort
prepare / ready
ready
ready / commit
commit / abort / -
aborted
committed
a) Coordinator
Prof. Dr. Alexander Schill
aborted
committed
b) Participant
Distributed Systems – Lecture 5: Distributed Transactions
Folie 11
Distributed Transactions: Coordination
Two Phase Commit Protocol

1st Phase
• “Prepare” messages sent by coordinator to participants
• Persistent copy of data with changes made is saved
• Previous version also saved. Participants can therefore use previous or
new version in the event of crash and restart.
• “Ready” message sent by participants when they have successfully
saved the new version of the data.
• “Abort” sent if for some reason this is not possible

2nd Phase
• If “Ready” received from all participants:
o “Commit” sent by coordinator to all participants after “ready” messages
received from all participants.
o Old version of data replaced by new.
o Participants send confirmation to coordinator.
o Transaction complete.
• Else if just one participant sends “Abort” or the coordinator itself
encounters a failure, then aborts are sent to all participants. The new
version of the data is rejected and the old is reinstated.
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 12
Distributed Transactions: Coordination
Pessimistic 2PC Protocol:


Locks held on data until conclusion of transaction
Unnecessary blocking of data access
Optimistic 2PC Protocol:

Locks on the data used by each participant are removed after first phase – i.e.
after ready/abort messages have been sent to coordinator

Other transactions can then read the modified data and make their own
changes

Assumes the first transaction will be successfully concluded

If this does not happen measures must be applied to ensure ACID properties
are maintained.
•
Compensation Transactions – undo all modifications carried out during
unsuccessful transaction
Advantage: Data accessible earlier for other transactions at no higher
expense
 However cancelled transactions
must be specially handled
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions

Folie 13
Distributed Transactions: Concurrency Control
Concurrency Control – isolation of transactions to ensure:


Operations maintain consistent views of data
Intermediate results withheld until conclusion of transaction

Pessimistic Approach (assumptions):
•
•

Frequent conflicts
High expense for undoing transactions
Optimistic Approach (assumptions):
•
•
Conflicts are rare
Undoing and retrying a transaction is less expensive than the blocking of
transactions (through locking) and the analysis of Deadlocks
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 14
Distributed Transactions: Concurrency Control
Pessimistic Approach:
 Locking - i.e. data is reserved exclusively for a particular transaction
 Other transactions must wait until release of locks to access same data
 Issues:
• Avoiding Deadlocks – two transactions requiring access to each others locked data
• Achieving higher concurrency
 Algorithms to ensure transactions executed sequentially – e.g. 2 Phase Locking
a) Simple 2 Phase Locking - lock released as soon as access finished – higher concurrency
Lock
Access
...
Lock
Access
Release
Setting of Locks
Access
...
Release
Release of Locks
Transaction
b) Strict 2 Phase Locking - locks only released at conclusion of transaction - lower concurrency End
Lock
Access
...
Lock
Setting of Locks
 Extension required for distributed systems
Access
Release
Transaction
End
• Central lock manager - bottlenecks
• Distributed lock management – local databases handle setting and releasing of local locks
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 15
Distributed Transactions: Concurrency Control
Optimistic Approach:
 Work Phase
• Data access without locks
• Read-Set – data objects involved in transaction
• Write-Set – changes made to the Read-Set
 Validation Phase
• Check if Read-Set contains inconsistent data – i.e. if it has been altered by other
transactions since the data was read in the Work Phase
• If Read-Set inconsistent then current transaction is cancelled and the changes it has
made are undone
 Securing Phase
• Write-Set is committed to database
• Takes place in the distributed case through a coordinator on the basis of the 2 Phase
Commit Protocol
Work Phase
Transaction
Start
Prof. Dr. Alexander Schill
Validation Phase
Securing Phase
Transaction
End
Distributed Systems – Lecture 5: Distributed Transactions
Folie 16
Distributed Transactions: Nested Transactions
Problem of Simple Transactions:


Success of many individual operations within the transaction are lost with the
resetting of the whole transaction
Lack of parallelism
Nested Transactions - Properties:
Parallel partial transactions through separate threads (editing time reduced)
Separate backtracking of partial transactions (Atomicity maintained through
Transaction Contexts – record of modified data)
 Selective repeats
 Abortion of all partial transactions during abortion of the general transaction
( Data versions to be kept until end)
 Inheritance of locks in both directions within the hierarchy
Systems: Encina, Tuxedo


Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 17
Distributed Transactions: Nested Transactions
Example
Order Execution
Dispatch List
Creation
Check Product
Availability

Create
Dispatch
List
Calculate
Price
Calculate
Discount
Validate
Payment
Information
Validate
Address
Check
Dispatch
Options
Initial failures in Payment partial transactions do not affect Dispatch
List Creation and Dispatch transactions
•

Update
Warehouse
Table
Dispatch
Payment
E.g. incorrect credit card details -> retry with another payment method
If Payment transaction continues to fail then all other partial
transactions under Order Execution root transaction will have to be
reset
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 18
Transaction Monitor
Definition




middleware for supporting
distributed transactions
Realisation of 2PC
protocols
Driver software for linking
different databases
(resource managers)
Administration tools
Applications and Tools
Screen
Managers
Transaction
Processing
Services
Resource
Managers
Base Services
Operating and Transport system
System example IBM Encina - Properties:




TP-Monitor
Integration of different data bases (RDBMS; relational data bases) and
transaction monitors (for instance, CICS over LU 6.2 - SNA)
Integration with CORBA and EJB
Standard conformity (X A, X /Open, ISO/OSI, TP)
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 19
Transaction Monitor: Possible Operations
Possible operations:





local transactions
remote object calls
Parallelization of remote transactions through threads
call of other TP-Monitors
transaction {...
book ("Lufthansa","FRA-JFK");
nested transactions
book ("USAir","JFK-SFA");
output_ticket();
...}
Scenario:
Resource Managers
Application
server
Application
server
Clients
Prof. Dr. Alexander Schill
Mainframe
Distributed Systems – Lecture 5: Distributed Transactions
Folie 20
Transaction Monitor: Possible Operations

Optimization by serving of several clients via one server process
(multi-threaded)

Replication of the servers and partially automatic, heuristic load
balancing, fault tolerance

Advanced load balancing mechanisms - interpolation of future load
Parallel Transactions
transaction {...
concurrent {...
book ("Lufthansa","FRA-JFK");
book ("USAir","JFK-SFA");
output_ticket();
...}
...}
onCommit
printf ("OK");
onAbort
printf ("Failure");
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 21
Transaction Monitor: System Model
Distributed Transaction Processing (DTP) Model



Standardized (by Open Group) collaboration between transaction monitors,
applications and different database systems.
Software Architecture for the realization of distributed transaction systems.
Definition of components and interfaces
Application Programme (AP)
Resources
Specific
(e.g. SQL)
Resources
Manager
(RM)
CPI-C, XATM,
TxRPC
TX
XA
Transaction
Monitor
(TM)
XA+
Communication
Ressources
Manager (CRM)
AP
RM
XAP-TP
TM
CRM
OSI TP
OSI TP
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 22
Transaction Monitor: System Model
DTP Model – Components

Resource Manager (RM)
•
•
•

Application Programme (AP)
•
•
•

Represents resources involved within a distributed transaction e.g. databases, print
servers etc.
Executes local transactions on these resources
Coordinates conclusion of such transactions
Executes operations on various resources involved within a distributed transaction
Defines begin and end of transactions
Decides whether a transaction is concluded with a “commit” or “rollback”
Transaction Manager (TM)
•
•
•
Manages execution and coordination of transaction conclusion with various Resource
Managers
Achieved on the basis of the 2 Phase-Commit-Protocol
Executes commit and reset operations
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 23
Transaction Monitor: System Model
DTP Model – Components (continued)

Communication Resource Manager (CRM)
•
•
•
•
Represents services for the communication between components in different
management domains of transaction monitors
Mediate calls and associated call data between application components
Local representation of components that are in remote management domains
Communication between CRMs via specific network protocol OSI-TP
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 24
Transaction Monitor: Migration and Host-Integration
Example: Integration with CICS legacy system
Server (Unix)
CPI-C
CPI-C
SNA
CORBA /
EJB
LU 6.2
Clients


Internet SNA
(CICS runs on
mainframe)
CPI-C
Mainframe
Encina as CICS region (from point of view of the mainframe)
Access to mainframe data base possible
Basis:



SNA-Gateway; but today also TCP/IP in mainframe environment available
CPI-C (Common Programming Interface for Communications from IBM SAASystems Application Architecture)
COBOL-interface to transactional Encina-data systems
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 25
Transaction Monitor: Tuxedo (BEA Systems)
 Transaction Monitor, comparable with Encina, X Aconform
 Synchronous and asynchronous transactional calls
 Reliable Message Queuing
 Event Communication
 Language support for C, C++, Cobol, Java
 Wide availability
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 26
Transaction Monitor
Component Based Transaction Control



Separation of properties from application functionality
Properties (i.e. specifics of how transactions should be carried out)
defined at deployment time
E.g. Transaction Properties Supported through EJB 3.0:
Attribute
Description
NOT_SUPPORTED
Execution of methods of a Bean within a transaction is not supported
(if applicable – temporary suspension of transaction)
SUPPORTS
Use of a Bean is possible within and without transaction context
REQUIRED
Transaction obligatory; if applicable – implicit starting of a new transaction
(in case a transaction is not active)
REQUIRES_NEW
Transaction obligatory, always started with the method call of a Bean
(if applicable – temporary suspension of a presently running transaction)
MANDATORY
Transaction obligatory, must exist beforehand
(otherwise exception message)
NEVER
Bean not allowed to be used within a transaction
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 27
Summary: Transactions

OLTP / TP-Monitors are essential components for commercial
applications

Transaction Monitors provide greater flexibility than simple JDBC
applications

Trusted, consistent, distributed data management

Optional nested transactions, loading balancing, security aspects

Integration with mainframe systems (for instance, CICS-Monitor)

Important products: Encina (Transarc), Tuxedo (BEA Systems)

Improvement in the Internet-area: TIP (Transaction Internet Protocol):
flexible Two-Phase-Commit via TCP/IP
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 28
References
 Tanenbaum, A.S., van Stehen, M.: Distributed Systems:
Principles and Paradigms. Prentice Hall International, 2008
 Schill, A., Springer, T.: Verteilte Systeme - Grundlagen
und Basistechnologien. Springer, Berlin, 2007
 Grey, J., Reuter, A.: Transaction Processing. Concepts and
Techniques. Morgan Kaufmann Series in Data
Management Systems, 1992
 Reese, G.: Database Programming with JDBC and Java.
O’Reilly Media, 2000
 Open Group: Distributed TP: Reference Model, ISBN
1859121705, free PDF, available online, 1996
Prof. Dr. Alexander Schill
Distributed Systems – Lecture 5: Distributed Transactions
Folie 29