ITEC 65 - ADVANCED DATABASE MANAGEMENT SYSTEM

CHAPTER II: Database Backup and Recovery

At the end of each topic, the students are expected:
1. To define the problem of database recovery.
2. To enumerate the five techniques used to enhance security.
3. To discuss the importance of data quality and data availability.
4. To enumerate several measures to improve data quality and data availability.

Database Recovery
Mechanisms for restoring a database quickly and accurately after loss or damage.

Basic Recovery Facilities
1. Backup facilities – provide back-up copies of portions of or the entire database. You can use operating system commands or SELECT . . . INTO SQL commands to perform backups.
   Cold back-up – a full back-up taken while the database is shut down.
   Hot back-up – only a selected portion of the database is shut down from use while it is backed up.
2. Journalizing facilities – maintain an audit trail of transactions and database changes.
   [Figure: audit trail of transactions and database changes. Labels: Transaction; Database Management System; Effect of transaction or recovery action; Database (current); Copy of transaction; Transaction Log; Copy of database affected by transaction; Database Change Log; Recovery Action; Database (Backup).]
3. Checkpoint facility – a facility by which the DBMS periodically suspends all processing and synchronizes its files and journals to establish a recovery point. All transactions in progress are completed, and the journal files are brought up to date.
4. Recovery manager – restores the database to a correct condition when a failure occurs and then resumes processing user requests.

Recovery and Restart Procedures
Disk Mirroring – at least two copies of the database must be kept and updated simultaneously. When a media failure occurs, processing is switched to the duplicate copy of the database.
Restore/Rerun – a technique that involves reprocessing the day’s transactions (up to the point of failure) against the backup copy of the database.
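The restore/rerun technique described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any particular DBMS implements it: the database is modeled as a dictionary of account balances, each logged transaction is an (account, amount) pair, and the names restore_rerun, backup, and transaction_log are hypothetical.

```python
import copy


def restore_rerun(backup, transaction_log):
    """Rebuild the current database from a backup copy plus the logged transactions.

    Assumption: the database is a dict of balances and each log entry is an
    (account, amount) pair. Real systems log far richer records.
    """
    # Start from a copy of the backup so the backup itself is never modified.
    database = copy.deepcopy(backup)
    # Reprocess the transactions in their original order, up to the point of failure.
    for account, amount in transaction_log:
        database[account] = database.get(account, 0) + amount
    return database


# Hypothetical data: last night's backup and today's transactions.
backup = {"A": 100, "B": 50}
transaction_log = [("A", -30), ("B", 30), ("A", 10)]

current = restore_rerun(backup, transaction_log)
print(current)  # {'A': 80, 'B': 80}
```

The sketch also shows why the technique can be slow: every transaction since the backup is reprocessed, so recovery time grows with the length of the log.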
Maintaining Transaction Integrity
A business transaction is a sequence of steps that constitute some well-defined business activity. Normally, a business transaction requires several actions against the database. When processing transactions, the DBMS must ensure that the transactions follow four well-accepted properties called the ACID properties.
1. Atomic
   The transaction cannot be subdivided; hence, it must be processed in its entirety or not at all.
2. Consistent
   Any database constraints that must be true before the transaction must also be true after the transaction.
3. Isolated
   Changes to the database are not revealed to users until the transaction is committed.
4. Durable
   Once a transaction is committed, its changes are permanent.

Backward Recovery / Rollback
The back out, or undo, of unwanted changes to the database. Used to reverse the changes made by transactions that have aborted, or terminated abnormally.

Forward Recovery / Rollforward
A technique that starts with an earlier copy of the database. After-images (the results of good transactions) are applied to quickly move the database forward to a later state.

Types of Database Failure
1. Aborted Transactions
   A transaction in progress that terminates abnormally.
   Causes: human error, input of invalid data, hardware failure, and deadlock.
   Recovery technique: rollback
2. Incorrect Data
   When erroneous data have been processed, the database may be recovered in one of the following ways:
   a. If the error is discovered soon enough, backward recovery may be used.
   b. If only a few errors have occurred, a series of compensating transactions may be introduced through human intervention to correct the errors.
   c. If the first two measures are not feasible, it may be necessary to restart from the most recent checkpoint before the error occurred and process subsequent transactions without the error.
   Recovery technique: rollback
3. System Failure
   Some component fails, but the database is not damaged.
   Causes: power loss, operator error, loss of communications transmission, and system software failure.
   Recovery technique: switch to duplicate database
4. Database Destruction
   The database itself is lost, destroyed, or cannot be read.
   Cause: disk failure.
   Recovery technique: switch to duplicate database
5. Disaster Recovery
   Cause: natural or man-made disaster.
   Major components of a recovery plan:
   o Develop a detailed, written disaster recovery plan. Schedule regular tests of the plan.
   o Choose and train a multidisciplinary team to carry out the plan.
   o Establish a back-up data center at an off-site location.
   o Send back-up copies of the database to the back-up data center on a scheduled basis.

Controlling Concurrent Access
The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment.

The Problem of Lost Updates
The most common problem encountered when multiple users attempt to update a database without adequate concurrency control is that of lost updates: one user's update overwrites another's, and the earlier update is lost. A similar problem that may occur when concurrency control is not established is the inconsistent read problem, which occurs when one user reads data that have been partially updated by another user.

Serializability
Procedures that process concurrent transactions so that the outcome is the same as if the transactions had been processed serially. Processing transactions using a serializable schedule will give the same results as if the transactions had been processed one after the other. Schedules are designed so that transactions that will not interfere with each other can still be run in parallel.

Locking – any data that are retrieved by a user for updating must be locked, or denied to other users, until the update is completed or aborted.

Locking Mechanisms
Locking Level
An important consideration in implementing concurrency control is choosing the locking level. The locking level is the extent of the database resource that is included with each lock.
Levels:
o Database – the entire database is locked and becomes unavailable to other users.
o Table – the entire table containing a requested record is locked.
o Block or page – the physical storage block (or page) containing a requested record is locked.
o Record – only the requested record (or row) is locked.
o Field – only the particular field (or column) in a requested record is locked.

Types of Locks
Shared locks (also called S locks or read locks)
o Allow other transactions to read (but not update) a record or other resource.
Exclusive locks (also called X locks or write locks)
o Prevent another transaction from reading (and therefore updating) a record until it is unlocked.

Deadlock
o An impasse that results when two or more transactions have locked a common resource, and each waits for the other to unlock that resource.

Managing Deadlock
Two ways to manage deadlocks:
1. Deadlock Prevention
   User programs must lock all records they require at the beginning of a transaction (rather than one at a time). Where all locking operations necessary for a transaction occur before any resources are unlocked, a two-phase locking protocol is being used.
   o Two-phase locking protocol – a procedure for acquiring the necessary locks for a transaction where all necessary locks are acquired before any locks are released, resulting in a growing phase, when locks are acquired, and a shrinking phase, when they are released.
2. Deadlock Resolution
   An approach that allows deadlocks to occur but builds mechanisms into the DBMS for detecting and breaking the deadlocks.

Versioning
An approach to concurrency control that uses no form of locking. Each transaction is restricted to a view of the database as of the time that transaction started, and when a transaction modifies a record, the DBMS creates a new record version instead of overwriting the old record.

Managing Data Quality
Three important reasons why the quality of data in organizational databases has deteriorated in the past few years:
1.
External Data Sources
   Organizations often purchase data files or databases from external organizations, and these sources may contain data that are inaccurate or incompatible with internal data.
2. Redundant Data Storage
   Many organizations have allowed the uncontrolled proliferation of spreadsheets, desktop databases, legacy databases, data marts, data warehouses, and other repositories of data. Much of this data is redundant and filled with inconsistencies and incompatibilities.
3. Lack of Organizational Commitment
   Some organizations are simply in denial that they have problems with data quality. Others realize they have a problem but fear that the solution will be too costly or that they cannot quantify the return on investment.

Data Dictionaries and Repositories
Data Dictionary
Repository of information about a database that documents the data elements of the database.

Active data dictionary
Managed automatically by the database management software. Always consistent with the current structure and definition of the database because it is maintained by the system itself.

Passive data dictionary
Managed by the user(s) of the system and modified manually whenever the structure of the database is changed. May be maintained as a separate database. Not limited to information that can be discerned by the database management system.

Information Repositories
Used by data administrators and other information specialists to manage the total information processing environment. An information repository is a component that stores metadata that describe an organization’s data and data processing resources, manages the total information processing environment, and combines information about an organization’s business information and its applications.

Information Repository Dictionary System (IRDS)
A computer software tool that is used to manage and control access to the information repository.
It provides facilities for recording, storing, and processing descriptions of an organization’s significant data and data processing resources.

Three Components of Repository System Architecture
1. Information model
2. Repository engine (objects, relationships, extensible types, version and configuration management)
3. Repository database

Information Model – a schema of the information stored in the repository, which can then be used by the tools associated with the database to interpret the contents of the repository.
Repository Engine – manages the repository objects, providing services such as reading and writing repository objects, browsing, and extending the information model.

Five core functions of the repository engine
1. Object Management
   Object-oriented repositories store information about objects.
2. Relationship Management
   The repository engine contains information about object relationships that can be used to facilitate the use of software tools that attach to the database.
3. Version Management
   During development, it is important to establish version control. The information repository can be used to facilitate version control for software design tools.
4. Configuration Management
   It may help to think of a configuration as similar to a file directory, except that configurations can be versioned and they contain objects rather than files.
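The directory analogy for configuration management can be made concrete with a small Python sketch. It is a toy model only, not the interface of any real repository product: the class names RepoObject and Configuration, and the methods commit and get, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RepoObject:
    """A repository object, identified by a name and a version number (assumed model)."""
    name: str
    version: int


@dataclass
class Configuration:
    """Like a file directory, but it holds objects and every saved state is kept."""
    name: str
    versions: list = field(default_factory=list)

    def commit(self, objects):
        """Save a new version of the configuration and return its 1-based number."""
        self.versions.append(tuple(objects))
        return len(self.versions)

    def get(self, version):
        """Return the objects contained in the given version of the configuration."""
        return self.versions[version - 1]


# Hypothetical usage: two successive versions of a "billing" configuration.
config = Configuration("billing")
v1 = config.commit([RepoObject("Customer", 1)])
v2 = config.commit([RepoObject("Customer", 2), RepoObject("Invoice", 1)])

print(config.get(v1))
print(len(config.get(v2)))  # 2
```

Committing a new version appends a new snapshot rather than overwriting the previous one, so an earlier version of the configuration can still be retrieved after later changes.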