SIEMENS SAP Implementation for PIFRA
PIFRA Disaster Recovery Plan

Overview:

The most important aspect of the SAP R/3 implementation at PIFRA is establishing an effective backup and recovery strategy. Recovery entails restoring all or part of the database after a hardware or software error and then recovering the PIFRA system(s) to a point just before the failure. Many situations may require a restore and a recovery; they are discussed in detail in this document.

The backup strategy should be as simple as possible, because complications in the backup strategy create difficult situations during restoration and recovery. One of the main aspects of a disaster recovery plan is that the procedures, problem identification, and handling are well documented, so that all individuals clearly understand their roles and required tasks. The strategy should also not adversely affect daily transactions at PIFRA. This document discusses the backup and restore strategy with respect to the setup and implementation at PIFRA.

A System Disaster may be considered "any event that causes significant disruption in services for a period of time that affects the organization". This plan therefore covers various levels of service interruption. The information is organized so that other sites can choose the pieces of the plan needed for recovery, depending on the type of interruption. For example, in the event of a server failure, the DRP must provide the contact numbers of the persons needed for recovery, as well as information on onsite and offsite backups. With this information, the System Administration Team can begin the recovery process in the most efficient and planned manner.

A sensible recovery plan may be the one thing that keeps the organization going. Here are the basic steps for creating such a plan and making sure it will work when required.
These steps were identified through an extensive study of the ongoing processes at the PIFRA Test Site.

Basic Steps

Make disaster recovery an integral part of the way the PIFRA business process runs. Someone at the top needs explicit responsibility for overseeing the plan, as it is too easy to make dangerous mistakes when times are tough.

Prioritize

Decide which data and systems need to be recovered first. Each section thinks that its own are the most important, but the decision has to be made, and it usually ends up with System Administration, which has the appropriate business insight. Don't forget to look outside the data centre for things that need protecting. If employees have heavily customized desktops to do their work, how will it affect them if they have to start from scratch? Paper records are also always important.

SIEMENS SAP Implementation for PIFRA Disaster Recovery Plan Prepared By Muhammad Bilal Date May 27, 2004 Page 1 of 25

Redundancy

Make sure the System Administrator has redundancy for critical systems, whether it is a RAID storage system, server mirroring, or even a complete duplicate data centre. There should be no single point of failure, including power supplies, telecommunications, or even the office building itself, that can disrupt the SAP servers for any length of time.

Backup

Along with redundancy, backup is the most important part of disaster recovery. Once the System Administrator knows what needs to be backed up, he decides when and how to take the backups. A common scheme is a full backup at the beginning of each week, followed by deltas (backups of changes) at least daily, if not more often. These can be differential backups, where the entire difference from the starting state is copied each time, or incremental backups, where only the difference since the last backup is stored. Incremental backups take less time but produce more individual backups, which have to be restored in order; with differential backups, the administrator has just two restorations to make: the full backup and the latest differential.
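The difference in restore effort between the two delta schemes can be made concrete with a small sketch. This is an illustration only, not part of the PIFRA procedure: the `Backup` record, the day numbering, and the `restore_chain` helper are hypothetical names, and the sketch assumes a week uses either differentials or incrementals, not a mixture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day index within the weekly cycle; 0 is the day of the full backup
    kind: str  # "full", "differential" or "incremental"


def restore_chain(backups, target_day):
    """Return the backups to restore, in order, to reach target_day."""
    usable = sorted((b for b in backups if b.day <= target_day), key=lambda b: b.day)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available on or before the target day")
    full = fulls[-1]
    later = [b for b in usable if b.day > full.day]
    diffs = [b for b in later if b.kind == "differential"]
    if diffs:
        # A differential holds every change since the full backup,
        # so only the most recent one is needed.
        return [full, diffs[-1]]
    # An incremental holds only the changes since the previous backup,
    # so every one of them must be replayed in order.
    return [full] + [b for b in later if b.kind == "incremental"]


# Hypothetical week: full backup on day 0, one delta on each of days 1 to 4.
incremental_week = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 5)]
differential_week = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 5)]

print(len(restore_chain(incremental_week, 4)))   # prints 5
print(len(restore_chain(differential_week, 4)))  # prints 2
```

With incrementals, a failure late in the week means restoring the full backup plus every delta taken since; with differentials the count stays at two however late in the week the failure occurs.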
Offsite backups are essential but difficult to manage, especially for smaller organizations. Where teleworking is common, it may be possible to automate the keeping of remote copies of information as part of the standard access arrangements. Whatever the backup medium (floppy disks, CD-Rs, removable hard disks, tapes, leased lines, and VPNs are all common), ensure that access to offsite backups does not depend on just one person. It is common to duplicate the weekly backup and keep it offsite, and also to keep monthly backups.

Security

Don't neglect security. If the System Administrator needs to make backups of sensitive information, are they adequately protected if someone gains access to, or steals, the backup? Conversely, if a backup is protected by encryption or strict access controls, is it still possible to retrieve the information when key persons are not available?

Regular Tests

Run regular tests to shake the bugs out of the SAP servers, and that means testing absolutely everything. Tests that produce no errors aren't tough enough: administrators are testing not to make sure the procedure works, but to find out when it doesn't. Testing will also reveal whether the recovery procedure works but is too slow or cumbersome; a system that comes back but takes two days to rebuild may be inappropriate.

Deciding backup and restoration strategies should be part of the initial architectural planning of any major system and should influence bus types, storage devices, and the segmentation of the network. When the SAP business processes change, reassess the plans. An acquisition, a new operating system installation, or a reorganization can trigger this.
Also, when PIFRA changes an underlying system and migrates data over, make sure it is possible to recover to the old system for as long as may be necessary; it is no good having old data desperately needed if the System Administrator no longer has a system that will read it. There is no point in having your data in the hands of inexperienced persons. Keep contact information up to date (lists of employees with addresses and mobile phone numbers, and supplier contacts), and make everyone's role in the recovery plan part of their basic training.

The Siemens team offers a full range of recovery options, depending on PIFRA's specific needs and budget, from simple hardware replacement to complex mirroring.

Backups:

Backup means making copies of files from one location to another. The source and the target can be on the same or on different storage devices, but it is best to copy the files to a storage device other than the source. Data can be restored from this backup if the original files are damaged or lost. The backup strategy adopted at PIFRA is designed to minimize the chances of a system disaster and to recover the system as early as possible if one occurs.

A System Disaster:

A System Disaster can mean anything: theft, fire, flood, an earthquake, a virus, or anything else that could keep users from accessing the SAP servers and hence the data. If the System Administrator loses the entire system (possibly including hardware), he has to recover as much of the system as possible, as soon as possible.

Disaster Recovery Planning and Disaster Recovery

At many sites, most attention is paid to database growth and little or none to planning disaster recovery. We can divide disasters into two major categories:

Loss of data caused by the failure of a hard disk drive or a disk controller.
If hardware redundancy and backup copies are provided, we should usually be able to perform onsite recovery within a short period of time.

Disasters of a catastrophic nature, such as fire, explosion, or flood. In this case we require a comprehensive plan that organizes offsite recovery and the continuation of business.

The data recovery plan for the first case should be part of the comprehensive disaster recovery plan for the second case. In both cases it is important to test the recovery plans and to repeat the testing procedure periodically.

Critical Steps Involved In SAP Recovery

Database installation and maintenance:
- Planning and conducting the installation
- Configuration and upgrades
- Rearranging the data structure
- Periodic maintenance of the database
- Planning platform and system upgrades and migration

Plan

This plan follows the document needed for PIFRA Test Site Services and was prepared by the Disaster Recovery Team in co-ordination with the PIFRA Directorate.

Definition of need for a plan

A disaster recovery plan is needed to provide data security and user access both during normal operations and at times of disaster.

Disaster recovery Team

Members of the Disaster Recovery Team responsible for the data processing component of disaster recovery include:
- System Administrator
- WAN Administrator
- LAN Administrator

Members of the Disaster Recovery Team responsible for the administrative component of disaster recovery include:
- Manager of Administrative Services for the Test as well as the Pilot site

Document goals and objective

The Information Technology Services of Siemens will plan and implement disaster recovery techniques applicable to:
- Computer hardware systems
- Computer software systems
- PIFRA project data maintained by computing systems located at the Central Office (Test Site)
- Data communications systems, including hardware, software, and lines
As per the system data security audit, the Assistant Manager for Administrative Services will be the primary administrator of the institutional disaster recovery plan. Each computer-oriented Pilot Site will develop a disaster recovery and/or business interruption plan to assure its continued operation when the Central systems are inoperable.

Equipment inventory

The Test Site at the Central Office is responsible for disaster recovery for the following hardware and software systems at the Pilot Sites:
- SAP servers: hardware, software, and data
- Goods inventory system: hardware, software, and data
- Personnel responsible for hardware, software, and data tapes
- PIFRA electronic mail hardware and software
- WAN hardware, software, configurations, and lines for systems within.
- Central Office LAN hardware, software, configurations, data, and wiring
- UPS equipment located at the Test Site Office
- Key client workstations at the Test Site Office

Disaster Recovery Phases:

Mission Critical Elements and Applications
1. SAP Database Servers
2. PIFRA Information System
3. Local Area Network
4. Electronic Mail System
5. Inventory System
6. LAN/WAN

Systems Required To Restore Mission Critical Services
1. SAP Servers - Development server (multiprocessor based, 512MB memory, with RAID configuration)
2. Quality assurance Server (8GB RAM, 120GB HD)
3. Production Server
4. Ethernet network
5. Intel based LAN file server with 128MB memory and 40GB disk storage
6. Test site UPS (Server) - 2.2KVA
7. Intel based Print server with 128RAM and 20GB hard disk storage
8. Blank tapes for backups
9. Data circuits between Pilot site locations

Equipment and Space

Physical Space for Restoring Institutional Systems

In the event that the current office space is unusable, new office space would be necessary that meets the following requirements:
- Minimum of 180 square feet of floor space
- Temperature between 60 and 75 degrees Fahrenheit
- Humidity: noncondensing
- Ethernet or Radio Link/DXX access to backbone
- Wiring capable of supporting an Ethernet LAN, to support the new Central Office LAN

Equipment

PIFRA and SAP servers would need to be acquired from the supplier. The LAN server could be any Intel based PC of sufficient size. Access to 2 ports on a link integrity hub would be required for the PIFRA and electronic mail servers. A 48 port link integrity switch would be required for restoring the new Central Office LAN.

Personnel

All System Administration staff would be required to restore mission critical systems to operation. Key members of other areas, including the Functional Consultants and members of their staffs, would be required to assist in the restoration of critical systems.

Record Storage

Offsite backup tapes are stored in a safe deposit box at the ITS Office of Siemens. Each Monday the backups for the previous Friday are moved to this site and the previously stored tapes are returned. Access to the offsite tapes is available 24 hours a day. The System Administrator has access to this box. The box also includes a copy of this plan, as well as brief documentation on PIFRA, Inventory.
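The Monday transfer rule under Record Storage is simple date arithmetic. The following is a minimal sketch of it; the function name `friday_backup_to_move` is hypothetical and the sketch ignores public holidays and missed transfers.

```python
from datetime import date, timedelta


def friday_backup_to_move(today):
    """Return the date of the Friday backup that goes offsite today.

    Transfers happen only on Mondays; on any other day nothing is moved
    and the function returns None.
    """
    if today.weekday() != 0:  # date.weekday(): Monday is 0
        return None
    return today - timedelta(days=3)  # the Friday just before this Monday


print(friday_backup_to_move(date(2004, 5, 31)))  # prints 2004-05-28
print(friday_backup_to_move(date(2004, 5, 27)))  # prints None
```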
SAP Servers recovery includes the time needed to:
- Find the problem
- Repair the damage
- Restore the system
- Bring the system online for all users

Factors involved in a disaster recovery plan:
- Business process cost of downtime to recover
- Operational schedule
- Global or local users
- Number of transactions an hour
- Budget

Steps involved in Disaster recovery plan:

A disaster recovery plan, sometimes referred to as a business continuity plan (BCP) or business process contingency plan (BPCP), describes how the PIFRA System Administrator is to deal with potential disasters. Just as a disaster is an event that makes the continuation of normal functions impossible, a disaster recovery plan consists of the precautions taken so that the effects of a disaster are minimized and the System Administration Team is able to either maintain or quickly resume mission-critical functions. Typically, disaster recovery planning involves an analysis of business processes and continuity needs; it may also include a significant focus on disaster prevention.

Risk Analysis

The first step in drafting a disaster recovery plan is a thorough risk analysis of the SAP R/3 systems. List all the possible risks that threaten system uptime and evaluate how imminent they are at the Test Site. Anything that can cause a system outage is a threat, from relatively common man-made threats such as virus attacks and accidental data deletions to rarer natural threats such as floods and fires.

Establish the Budget:

Once the risks have been identified, ask what can be done to suppress them and how much it will cost. The result of the risk analysis should be a comprehensive list of possible threats, each with its corresponding solution and cost.

Develop the Plan:

The recovery procedure should be written in a detailed plan or "script."
Establish a Recovery Team from among the System Administration Team and assign specific recovery duties to each member. Define how to deal with the loss of various aspects of the network (databases, servers, bridges/routers, communications links, etc.) and specify who arranges for repairs or reconstruction and how the data recovery process occurs. The script will also outline priorities for the recovery: What needs to be recovered first? What is the communication procedure for the initial respondents? To complement the script, create a checklist or test procedure to verify that everything is back to normal once repairs and data recovery have taken place.

Test, Test, Test:

Once the Disaster Recovery Plan is set, test it frequently. Eventually a component-level restoration of the largest databases will be needed to get a realistic assessment of the recovery procedure, but a periodic walk-through of the procedure with the Recovery Team will ensure that everyone knows their roles. Regularly test the systems that will be used in recovery to validate that all the pieces work. Always record the test results and update the Disaster Recovery Plan to address any shortcomings.

Update the Disaster Recovery Plan:

It is very important to update the disaster recovery plan from time to time. How often depends on how rapidly the organization changes with respect to system architecture, system activities, etc.
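As an illustration of the Risk Analysis and Establish the Budget steps, the list of threats with their solutions and costs can be kept as a small risk register. The sketch below is one possible arrangement, not the method used at PIFRA: the 1 to 5 likelihood and impact scales, the score (likelihood times impact), the greedy budget selection, and every number in the sample register are assumptions made up for the example. Only the threat names come from this plan.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    threat: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (total outage)
    solution: str
    cost: float      # cost of the solution, in arbitrary units

    @property
    def score(self):
        # Simple ranking heuristic (an assumption, not a standard).
        return self.likelihood * self.impact


def prioritize(risks):
    """Highest score first; cheaper solution first when scores tie."""
    return sorted(risks, key=lambda r: (-r.score, r.cost))


def within_budget(risks, budget):
    """Greedily pick solutions in priority order until the budget runs out."""
    chosen, remaining = [], budget
    for risk in prioritize(risks):
        if risk.cost <= remaining:
            chosen.append(risk)
            remaining -= risk.cost
    return chosen


# Placeholder figures for illustration only.
register = [
    Risk("Virus attack", 4, 3, "Up-to-date anti-virus software", 10),
    Risk("Accidental data deletion", 4, 4, "Daily backups", 20),
    Risk("Flood", 1, 5, "Offsite backup tapes", 15),
    Risk("Fire", 1, 5, "Duplicate data centre", 100),
]

for risk in within_budget(register, budget=50):
    print(risk.threat, "->", risk.solution)
```

Keeping the register as data makes it easy to re-run the prioritization whenever the plan is updated or the budget changes.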
Systems/devices to be protected from a disaster at PIFRA:

System SID IP Address Category Location

SAP Development Level Development System Play / Sandbox Sindh Play System R3T PLY R3D 192.168.1.251 192.168.1.240 C B Central Site Central Site AGPR s/o Sindh C B C C - Central Site Siemens ITS Siemens ITS ATI, Lahore FD Baloch AGPR s/oSindh AGPR s/oSindh

SAP Quality Assurance Level Quality Assurance System Data Migration System I Data Migration System II Training System II (Punjab) Training System I (Baloch) Training System III (Sindh) Data Migration III (Baloch) QAS DMF DMH TR2 TR1 R3Q UTH 192.168.243 192.168.1.9 192.168.1.212

SAP Production Level Federal Production Database Server Application Server I Application Server II NWFP Production Database Server Application Server I Application Server II Punjab Production Sindh Production Balochistan Production MoF Production Database Server Application Server FD NWFP Production Database Server Application Server FD Punjab Production Database Server Application Server FD Sindh Production FDN FDN 192.168.10.151 192.168.10.150 A C C A B B A B B B B B B FDP FDP 192.168.18.2 192.180.18.3 B B FED FED FED 192.168.8.116 192.168.8.114 192.168.8.115 PSH PSH PSH PRP PRS PRB 192.168.1.36 192.168.1.102 192.168.1.85 192.168.1.253 192.168.3.126 192.168.1.253 MOF MOF AGPR Islamabad AGPR Islamabad AGPR Islamabad Central Site Central Site Central Site Siemens ITS AGPR s/o Sindh AGPR s/o Sindh MoF MoF FD NWFP FD NWFP FD Punjab FD Punjab

Database Server  FDS  192.168.1.79  B  FD Sindh
Application Server  FDS  192.168.1.33  B  FD Sindh
FD Baloch Production
Database Server  FDB  B  Siemens ITS
Application Server  FDB  B  Siemens ITS
District Functionality System  ABF  C  Siemens ITS

C Central Site Central Site Other Servers Exchange Server File Server Network Devices Switch 1 Switch 2 Switch 3 Hub 1 Hub 2 PIFRA SIP 192.168.1.7 192.168.1.250 Brand Allied Telesyn Allied Telesyn Allied Telesyn 3-Com 3-Com Ports 24 ports Location Server Room 24 ports Server Room 16 ports Computer Centre 8 ports 12 ports SDC-124 Computer Centre

LAN Equipment:
- Centre COM
- 8 Port Hub, 10-Base-T/BNC Port
- 24 Port Switchable Ethernet 10BaseT/100Base-TX switch, Allied Telesyn
- 24 Port Switchable Ethernet 10BaseT/100Base-TX switch, Allied Telesyn
- 16 Port Fast Ethernet Switch 10Base-T-16 Port

WIRELESS NETWORKING
- Wave, Wireless Networking-Speed LAN

Uninterruptible Power Supply (UPS)
- 2.2 KVA UPS for each Server (R3T, QAS, Production, Play)

Application/system software inventory

PIFRA Information System:
System Software:
Application Software
Database Software: ORACLE, SAP

Personnel Desktop Publishing:
System Software: Windows 2000, XP
Application Software:
Database Software:
Backups: Incremental (M-TH), Full (F)

PIFRA Data Warehouse:
System Software: Windows 2000, XP
Application Software: Client preference
Database Software: SQL Server
Backups: Incremental

Preventive Measures for a System Disaster:
- Rapid Recovery: advanced options for managing data, recovering quickly, and minimizing data loss. This may include duplicate systems at a Siemens recovery facility that mirror PIFRA's primary systems to provide fast and current data recovery.
- Fixed Site: quick, complete recovery capability at a specialized, strategically located site.
- Mobile: self-contained mobile trailers configured to PIFRA's requirements and delivered to the location of choice.
- Managed Delivery: temporary hardware replacement delivered to the location of the Pilot site.
Backup and Recovery Considerations
- Define business, operational, and technical requirements for a backup and recovery strategy
- Identify the components of a disaster recovery plan
- Discuss the importance of testing a backup and recovery strategy

Database Failures:

There are cases in which the database of a system crashes or fails to open. These database errors can occur due to the following:
- Failures caused by user errors (such as logical errors)
- Failures due to errors in an upgrade
- Failures due to errors in data transfer
- Limited database free space
- Corruption of data files
- Data files going offline
- Limited space in rollback segments

SAP Recovery Structures and Processes
- Identify Oracle processes, file structures, and memory components as they pertain to backup and recovery
- Observe the importance of checkpoints, redo logs, and archives
- Identify the process of synchronizing files during a checkpoint
- Multiplex control files and redo logs

Oracle Backup and Recovery Configuration
- Identify recovery implications of operating in "Noarchivelog" mode
- Describe the differences between "Archivelog" mode and "Noarchivelog" mode
- Configure a database for "Archivelog" mode and automatic archiving
- Use init.ora parameters to duplex archive log files

Oracle Recovery Manager Overview
- Determine when to use RMAN
- List the uses of Backup Manager
- Identify the advantages of RMAN with and without a recovery catalog
- Create a recovery catalog
- Connect to Recovery Manager

Oracle Recovery Catalog Maintenance
- Use Recovery Manager to register, resynch, and reset a database
- Maintain the recovery catalog using change, delete, and catalog commands
- Query the recovery catalog to generate reports and lists
- Create and execute scripts to perform backup and recovery operations
- Create, store, and run scripts

Physical Backups without Oracle Recovery Manager
- Perform database backups using operating system commands
- Describe the recovery implications of closed and open backups
- Perform closed and open database backups
- Identify the backup implications of the "Logging" and "Nologging" modes
- Identify the different types of control file backups
- Discuss backup issues associated with "read only" tablespaces
- List the data dictionary views useful for backup operations

Physical Backups Using Oracle Recovery Manager
- Identify types of RMAN backups
- Describe backup concepts using RMAN
- Perform incremental and cumulative backups
- Troubleshoot backup problems
- View information from the data dictionary

Types of Failures and Troubleshooting
- List the types of failure that may occur in an Oracle database environment
- Describe the structures for instance and media recovery
- Use the DBVERIFY utility to validate the structure of an Oracle database file
- Configure checksum operations
- Use log and trace files to diagnose backup and recovery problems

Oracle Recovery without Archiving
- Note the implications of media failure with a database in noarchivelog mode
- Recover a database in noarchivelog mode after media failure
- Restore files to a different location if media failure occurs
- Recover a database in noarchivelog mode using RMAN

Complete Oracle Recovery with Archiving
- Note the implications of instance failure with an archivelog database
- Describe a complete recovery operation
- Note the advantages and disadvantages of recovering an archivelog database
- Recover an archivelog database after media failure
- Recover an archivelog database using RMAN and Backup Manager

Incomplete Oracle Recovery with Archiving
- Identify the situations in which to use an incomplete recovery to recover the system
- Perform an incomplete database recovery
- Recover after losing current and active logs
- Use RMAN in an incomplete recovery
- Work with tablespace point-in-time recovery

Oracle Export and Import Utilities
- Use the Export utility to create a complete logical backup of a database object
- Use the Export utility to create an incremental backup of a database object
- Invoke the direct-path method export
- Use the Import utility to recover a database object

Additional Oracle Recovery Issues
- List methods for minimizing downtime
- Diagnose and recover from database corruption errors
- Reconstruct a lost or damaged control file
- List recovery issues associated with an offline or read-only tablespace

Additional Security Options

Backups:

Backup means making copies of files from one location to another. The source and the target can be on the same or on different storage devices, but it is best to copy the files to a storage device other than the source. Data can be restored from this backup if the original files are damaged or lost. The backup strategy adopted at PIFRA is designed to minimize the chances of a system disaster and to recover the system as early as possible if one occurs.

System Backups
- Both full and incremental backups
- Nightly incremental (M-Th)
- Weekly full backups
- Weekly backups rotated offsite
- Weekly exports of Oracle tables

Restore:

Restoration of data is usually carried out for the following reasons:
- Recover after a system disaster
- Test your disaster recovery plan
- Copy your database to another system

For continuous business transactions, the restoration procedures should be well outlined and time-effective, so that the system becomes operational quickly.

RAID Technology:

The basic idea behind RAID (Redundant Array of Independent Disks) is to combine multiple small, inexpensive disk drives into an array that yields performance exceeding that of one large and expensive drive.
This array of drives appears to the computer as a single logical storage unit or drive.

Since there are large quantities of data to keep at PIFRA, using RAID technology is beneficial. One of the primary reasons to use RAID is greater efficiency in recovering from a disk failure. RAID therefore reduces the chances of a disaster to a system.

ERDs:

The Emergency Repair Disk (ERD) creation procedure is integrated with Microsoft servers for use in case of registry corruption. The registry is the main database of the operating system and holds all the information related to the hardware and software installed on the machine. It is recommended that the ERD be updated frequently.

Backups on Tapes (DDS Tapes)

Data to backup:

SAP R/3 Systems:
- Operating system files
- SAP application files
- Oracle Database
- Log files

Other Systems (Domain Servers, File Servers etc):
- Operating system files
- PIFRA documents (Microsoft Office documents)
- Miscellaneous

Types of Backup

A backup on DDS tapes is taken in either of two modes:
- Offline
- Online

Offline

In the case of the SAP R/3 System, an offline backup is taken with the application and database stopped; that is, the users cannot work. An offline backup of the complete database is consistent. If you work with the database after the backup, the backup is still consistent but no longer up to date. In that case, you have to recover the database after you restore the backup.

Online

An online backup is taken with the application and database running; that is, the users can continue to work normally. The management of database changes by the corresponding Oracle background processes is not affected either.
Backup Utilities

Operating System backup utility:

For SAP R/3 systems, the offline backup is taken while the SAP application and database are down. Similarly, for systems other than SAP, it is essential that no users are connected to those systems during an offline backup. Since high capacity tape drives are now more common, it is simple and safe to back up the entire server. This full server backup eliminates the possibility of missing an important file. In an offline backup, the data in the database does not change while the backup is being made, so you have a static "picture" of the database and do not have to deal with data changing while the backup runs. A full server offline backup also gives you the most complete backup in the event of a catastrophic disaster: one tape holds everything on the server.

SAP R/3 backup utilities:

SAP R/3 offers the utility programs BRBACKUP, BRARCHIVE, and BRRESTORE. Their functions are, respectively, backing up the database, archiving the redo log files, and restoring.

Backup Schedule to be followed at PIFRA

PIFRA SAP R/3 Development System

1. Monday     Online Backup (Online Tape # 1)
2. Tuesday    -----------------------------------------------
3. Wednesday  Online Backup (Online Tape # 2)
4. Thursday   -----------------------------------------------
5. Friday     Online Backup (Online Tape # 3)
6. Saturday   -----------------------------------------------
7. Sunday     Offline Backup (Offline Tape # 1)

On the next Monday the first tape is reused for online backups. DDS tapes for offline backups will be recycled on every 4th Sunday.

Note: We have 3 days of online backups of the development system and offline backups of 2 Sundays in hand. The first DDS tape of the offline backup is reused on every 4th Sunday. Offsite backups are to be rotated on a weekly basis.
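The weekly rotation above can be expressed as a small lookup, which is handy for printing tape labels or checking which tape should be in the drive. This is a sketch under stated assumptions, not a PIFRA tool: the function name `tape_for` is hypothetical, day 1 is taken to be a Monday, and the number of offline tapes defaults to 3 because the first offline tape is reused on every 4th Sunday. That default is an inference and should be changed if a different number of offline tapes is in circulation.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Online backups run on three weekdays, each with its own tape.
ONLINE_TAPES = {"Monday": 1, "Wednesday": 2, "Friday": 3}


def tape_for(day_number, offline_tapes=3):
    """Return (weekday, tape label) for a 1-based day counted from a Monday.

    The label is None on days without a scheduled backup.
    offline_tapes=3 is an assumption derived from the "every 4th Sunday" rule.
    """
    if day_number < 1:
        raise ValueError("day_number starts at 1")
    weekday = WEEKDAYS[(day_number - 1) % 7]
    if weekday == "Sunday":
        week_index = (day_number - 1) // 7  # 0 for the first Sunday
        return weekday, f"Offline Tape # {week_index % offline_tapes + 1}"
    if weekday in ONLINE_TAPES:
        return weekday, f"Online Tape # {ONLINE_TAPES[weekday]}"
    return weekday, None


print(tape_for(1))   # prints ('Monday', 'Online Tape # 1')
print(tape_for(2))   # prints ('Tuesday', None)
print(tape_for(28))  # prints ('Sunday', 'Offline Tape # 1')
```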
PIFRA SAP R/3 Quality Assurance System

1. Monday     Online Backup (Online Tape # 1)
2. Tuesday    -----------------------------------------------
3. Wednesday  Online Backup (Online Tape # 2)
4. Thursday   -----------------------------------------------
5. Friday     Online Backup (Online Tape # 3)
6. Saturday   -----------------------------------------------
7. Sunday     Offline Backup (Offline Tape # 1)

On the next Monday the first tape is reused for online backups. DDS tapes for offline backups will be recycled on every 4th Sunday.

Note: We have 3 days of online backups of the Quality Assurance system and offline backups of 2 Sundays in hand. The first DDS tape of the offline backup is reused on every 4th Sunday. Offsite backups are to be rotated on a weekly basis.

PIFRA SAP R/3 Production Systems:

1. Monday      Online Backup (Online Tape # 1)
2. Tuesday     Online Backup (Online Tape # 2)
3. Wednesday   Online Backup (Online Tape # 3)
4. Thursday    Online Backup (Online Tape # 1)
5. Friday      Online Backup (Online Tape # 2)
6. Saturday    Online Backup (Online Tape # 3)
7. Sunday      Offline Backup (Offline Tape # 1)
8. Monday      Online Backup (Online Tape # 1)
9. Tuesday     Online Backup (Online Tape # 2)
10. Wednesday  Online Backup (Online Tape # 3)
11. Thursday   Online Backup (Online Tape # 1)
12. Friday     Online Backup (Online Tape # 2)
13. Saturday   Online Backup (Online Tape # 3)
14. Sunday     Offline Backup (Offline Tape # 2)

From every 15th day the first DDS tape is to be reused for online backups.

15. Monday     Online Backup (Online Tape # 1)
16. Tuesday    Online Backup (Online Tape # 2)
17. Wednesday  Online Backup (Online Tape # 3)
18. Thursday   Online Backup (Online Tape # 1)
19. Friday     Online Backup (Online Tape # 2)
20. Saturday   Online Backup (Online Tape # 3)
21. Sunday     Offline Backup (Offline Tape # 3)
22. Monday     Online Backup (Online Tape # 1)
23. Tuesday    Online Backup (Online Tape # 2)
24. Wednesday  Online Backup (Online Tape # 3)
25. Thursday   Online Backup (Online Tape # 1)
26. Friday     Online Backup (Online Tape # 2)
27. Saturday   Online Backup (Online Tape # 3)
28. Sunday     Offline Backup (Offline Tape # 1)

Note: We have 3 days of online backups and offline backups of 3 Sundays in hand. The first DDS tape of the offline backup is reused on the 4th Sunday. Offsite backups are to be rotated on a monthly basis.

Other PIFRA Servers (File Servers, Domain Servers etc):

1. Tuesday   Complete Backup (Tape # 1)
2. Thursday  Complete Backup (Tape # 2)
3. Saturday  Complete Backup (Tape # 3)

On every Tuesday the first tape is reused for backups.

Note: We have 2 days of backups of the systems in hand. Offsite backups are to be rotated on a weekly basis.

Possible Failures at Test Site

A system can fail or crash in various situations. The possible failures are:
- Hardware failures
- Network failures
- Operating system failure
- SAP software/application failure
- Database failure

Reason Of Failures

The possible reasons for failure are:
- Fire
- Theft
- Earthquake
- Flood
- Electric fluctuations
- UPS failure
- Hard disk problems
- Problems in the hardware, e.g. RAM, motherboard, VGA, data buses etc.
- Abnormal shutdown of server
- Virus attack
- Password policy
- Illegal operation in the system

Precautions/Safety Measures at PIFRA

To save the system from a possible crash, a number of options must be taken into consideration:
- To protect against data loss and system disaster, RAID (software and hardware) is configured. For safety, there should be two array controllers, and the SCSI disks should be distributed equally.
- It is also important that the hard disks be hot-pluggable (i.e. in case of a disk failure, the disk can be replaced while the system is running).
- There must be at least two network cards to protect the system against network card failure.
- To protect the system from electrical faults, the LAN setup needs a power system dedicated solely to it; a separate electric power system backed by a generator is a necessary prerequisite.
- If the UPS fails to support the servers, the operating system may survive, but in the extreme case it may crash. In that case, the Repair option of the operating system installation must be used from the operating system CD.
- If a hard disk failure occurs, first repair or replace the hard disk.
- Do not install unnecessary software on the servers, and view the error logs daily to rectify problems.
- Avoid abnormal shutdown of servers. If it happens, recover the system from the last known good configuration settings.
- To recover a crashed operating system, ERDs must be prepared weekly along with the full backup. ERDs also store the registry information needed for operating system recovery.
- If the operating system crashes and a blue screen appears, a fresh installation of the operating system is recommended on the same drive where the previous operating system resides, under a new folder name.
- Systems should be kept updated with the latest antivirus software in order to be secure from virus attacks.
- To secure the system and protect it from any hazard, operating system passwords must be changed frequently and must be difficult for others to guess. A proper password policy should be maintained.
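The recurring precautions above (weekly ERD preparation and daily review of the error logs) could be tracked with a small reminder script. The following is a minimal sketch only. The task names and the due_tasks helper are hypothetical labels, and it assumes the administrators record the date each task was last completed.

```python
from datetime import date, timedelta

# Intervals taken from the precautions above.
# The task names are hypothetical labels chosen for this sketch.
TASK_INTERVALS = {
    "prepare_erd": timedelta(days=7),        # ERDs are prepared weekly
    "review_error_logs": timedelta(days=1),  # error logs are viewed daily
}


def due_tasks(last_done, today):
    """Return the names of tasks that are due on `today`, sorted by name.

    `last_done` maps a task name to the date it was last completed.
    A task with no recorded completion date is treated as due.
    """
    due = []
    for task, interval in TASK_INTERVALS.items():
        done = last_done.get(task)
        if done is None or today - done >= interval:
            due.append(task)
    return sorted(due)
```

For example, with the ERD last prepared three days ago and the logs last reviewed yesterday, only the log review is reported as due.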
Benefits of Preventive Measures
- Provides relief from complex business continuity recovery issues, allowing a focus on PIFRA's core business process.
- Provides long-term cost efficiencies with complete project management and professional task execution.
- Ensures a comprehensive business continuity program consistent with PIFRA's policies.
- Provides a single point of contact for improved communication and coordination between client and customer.

WAN Connectivity Problems
- Loss of site connectivity to the SAP Servers (Dev, QAS, PRO).
- Loss of electronic mail.
- Loss of internet connectivity.

Central Office LAN Connectivity Problem
- Loss of access to word processing software, spreadsheet software, database software, and other shared software.
- Loss of access to personnel data stored on the File & Print Server.
- Loss of access to departmental data stored on the server.

Other Important Tasks To Avert Disaster

Performance tuning
Performance should be one of the main concerns during every phase of system development: application analysis, design, and implementation. Usually, however, it becomes an issue only once it has become a problem, when the system is already running. Solving performance problems requires a close look at the whole system to identify potential bottlenecks. Performance problems often arise from several different areas, such as:
- hardware configuration
- software configuration
- application design
- transaction and query design

Steps To Be Taken
Typically the action consists of the following steps:
- Analysis of the overall function of the system
- Server analysis: collection of OS and RDBMS statistics
- Query analysis: execution plans and concurrency issues

Remote database monitoring and administration
There are situations where the service of a full-time database administrator is not required.
It is enough to provide remote monitoring of performance and to trace potential sources of failure, such as a lack of system resources. Many database administration services can be performed remotely, and on-site assistance can be provided on an as-needed basis. For example, Abbottabad can be connected to the Test Site at Islamabad over a leased line or dial-up, so that all the servers at the test site can be accessed from Abbottabad when required. By combining on-site and online administration, costs can be cut to a minimum with the same productivity and reliability of the system.

Database audit
It is essential to review and evaluate existing database practices and procedures. Special attention is paid to backup and recovery, migration from development to production, tape handling, upgrades and patch installations. The audit may also cover any site-specific procedures that may affect data security and integrity.

TIPS & HINTS
Each entry lists a possible disaster, its repercussions, and the feasible solutions/tools specific to PIFRA.

Possible disaster: Server room may catch fire, or a natural calamity may occur
Repercussions: Servers are burnt or become inoperative.
Feasible solutions/tools specific to PIFRA:
- A fire alarm system must be installed in the server rooms.
- The latest offsite and online backups must be available.
- All software (SAP, Oracle, OS etc.) must be available.
- Acquire new hardware, install the software and restore the complete data from backups.

Possible disaster: Hardware failure
Repercussions: Server may not be available for users.
Feasible solutions/tools specific to PIFRA:
- To avert hardware failure, RAID technology must be used, as well as dual network cards and SCSI disks.
- Repair or replace the hardware as soon as possible.
- Restore the system.

Possible disaster: UPS failure/shutdown
Repercussions: Abnormal shutdown of the database may occur. The database may become corrupt.
Feasible solutions/tools specific to PIFRA:
- Power supply must be secured through generators; availability of generators is a prerequisite.
- The UPS must be replaced.
- Check hardware performance from time to time using tools.
- The hardware should be thoroughly inspected.

Possible disaster: Operating System Failure
Repercussions: Operating system crashes. Abnormal shutdown of the database.
Feasible solutions/tools specific to PIFRA:
- A fresh installation of the operating system is recommended on the same drive where the previous operating system resides, but under a new folder name.
- Use ERDs to recover the operating system.

Possible disaster: System variables are deleted or corrupted, intentionally or unintentionally
Repercussions: OS, SAP and database may be corrupted. Server may not be operative for SAP users.
Feasible solutions/tools specific to PIFRA:
- Servers must be kept under lock and key, in the charge of a responsible person.
- Troubleshoot the exact damage (if any) to the system and recover the system.
- A secure password policy should be implemented. The passwords of the OS and the sapadm user must be changed frequently.
- A soft copy and a hard copy of the environment variables should always be available in case environment/system variables have to be added manually to the system.

Possible disaster: Database failure (Scenario 1)
Repercussions: An error occurred during an upgrade.
Feasible solutions/tools specific to PIFRA:
- Restore the last complete backup without control files and online redo log files through SAPDBA.
- Recover the database through SAPDBA.

Possible disaster: Database failure (Scenario 2)
Repercussions: A logical error occurred during normal database operations. The SAP System will remain unavailable to users.
Feasible solutions/tools specific to PIFRA:
- Restore the last complete backup without control files and online redo log files through SAPDBA.
- Import the redo logs and recover the database.

Possible disaster: Database failure (Scenario 3)
Repercussions: A logical error occurred during normal database operation and was only recognized later. The structure of the database was changed between the error and the last complete backup.
Feasible solutions/tools specific to PIFRA:
- Restore the last complete backup without control files and online redo log files.
- Recover the structural changes (CREATE DATAFILE <filename> AS <filespec>).
- Import the redo log files and recover the database.

Possible disaster: Database failure (Scenario 4)
Repercussions: A logical error occurred during normal database operation and was only recognized later. The database was reorganized between the occurrence of the error and its discovery. You want to recover the database to the point in time before the error.
Feasible solutions/tools specific to PIFRA:
- SAPDBA restores the last complete backup without control files and online redo log files.
- Restore the control files as they were before the reorganization. During the reorganization the control files were backed up in the directory <ORACLE_HOME>/sapreorg/.
- Import the redo log files and recover the database (recovery with the USING BACKUP CONTROLFILE option).

Possible disaster: Database failure (Scenario 5)
Repercussions: Datafile(s) may go offline and become inaccessible to the users.
Feasible solutions/tools specific to PIFRA:
- Bring the datafile(s) online using the Server Manager tool.

Possible disaster: Database failure (Scenario 6)
Repercussions: Limited or no space left in the database and rollback segments.
Feasible solutions/tools specific to PIFRA:
- Increase the space of the database, either by increasing the size of the datafile or by adding a new datafile.

Possible disaster: Database failure (Scenario 7)
Repercussions: System hangs because archiving is stuck.
Feasible solutions/tools specific to PIFRA:
- Remove the old archive logs or back them up to a separate storage device.

Possible disaster: Database failure (Scenario 8)
Repercussions: Database could not be opened and hence is not accessible to users.
Feasible solutions/tools specific to PIFRA:
- Data files may be corrupted; recover the specific datafile.
- If required, restore the datafile from a backup and then recover the database.

Important Instructions:
The System Administration team is assigned the task of recovering the system in case of a disaster. They should know the following regarding the disaster recovery plan:
- Monitor your system closely and efficiently in order to reduce the chances of a system disaster.
- Take regular backups of the systems according to the backup schedule.
- Take immediate action in case of backup errors and resolve them.
- Backup tapes should be properly tagged.
- The exact location of all the software necessary for recovering a complete system should be known.
- The exact location of the onsite backup should be known.
- There should be a sufficient number of tapes for backups.
- Separate tapes should be kept for offline and online backups.
- Weekly rotation of offsite backup tapes should be carried out.
- The exact location of the offsite backup should be known.
- Backup tapes should be kept in a secure place and should be easily accessible to the System Administrator in case of a system disaster.
- In case of a disaster, find the cause of the problem.
- Take appropriate action for the problem that occurred; some of these actions have already been mentioned above.
- Check the logs and take precautionary actions in order to avoid the same disaster in future.
- Update the disaster recovery plan as per changes in the system architecture, system activities etc.

Present contact persons and numbers (Siemens Pakistan):

S.No.  Name                  Contact Address                 Contact Number             Company
1.     Abdus Samad           PIFRA Central Site, Islamabad   051-9224036, 051-9224056   Siemens Pakistan
2.     Rehan Saleem          PIFRA Central Site, Islamabad   051-9224036, 051-9224056   Siemens Pakistan
3.     Muhammad Ali Shaikh   PIFRA Central Site, Islamabad   051-9224036, 051-9224056   Siemens Pakistan

We suggest that the PIFRA Directorate nominate some PIFRA system administrators to be part of the PIFRA Disaster Recovery Team.

SIEMENS SAP Implementation for PIFRA Disaster Recovery Plan, Prepared By Muhammad Bilal, Date May 27, 2004