* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Mitilec09
Open Database Connectivity wikipedia , lookup
Serializability wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Ingres (database) wikipedia , lookup
Oracle Database wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Relational model wikipedia , lookup
Functional Database Model wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Clusterpoint wikipedia , lookup
Layouts and Scripts In this session, we will be looking at a number of different physical layouts The physical layout very much influences How much data a database can hold How many users How many concurrent processes Recovery capability Performance Nature of Database Administration Cost Expansion © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 1 Oracle Architecture Oracle 8i is an object-relational database management system. It contains the capabilities of relational and object-oriented database systems It is a database server for many types of business applications including On Line Transaction Processing (OLTP) Decision Support Systems Data Warehousing © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 2 Oracle Architecture In perspective, Oracle is a not strictly high end application DBMS - but it is getting close A high end system has one or more of these characteristics: Management of a very large database (VLDB) - probably hundreds of gigabytes or terabytes Provides access to many concurrent users - in the thousands, or tens of thousands Gives a guarantee of constant database availability for mission critical applications - 24 hours a day, 7 days a week. © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 3 Oracle Architecture High end applications environments are not normally controlled by Relational Database Management Systems High end database environments are controlled by mainframe computers and non-relational DBMSs. Current RDBMSs (such as Oracle 9i, DB2/6000) manage very large amounts of data, or perform under demanding transaction loads. However, very fast processors, fast and secure communications, XML, large and fast storage facilities, ‘intelligent’ software, and advanced distributed database engines are making RDBMSs very attractive © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 4 Layouts and Scripts There are some guidelines for designing a database with files distributed so that optimum performance, from a specific configuration, can be achieved The primary aspect which needs to be clearly understood is the nature of the database Is it transaction oriented ? Is it read-intensive ? What is the mix ? © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 5 Layouts and Scripts The key items which need to be understood are Identifying Input/Output contention among datafiles Identifying Input/Output bottlenecks among all database files Identifying concurrent Input/Output among background processes Defining the security and performance goals for the database Defining the system hardware and mirroring architecture Identifying disks which can be dedicated to the database © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 6 Layouts and Scripts Let’s look at tablespaces : These ones will be present in some combination System Data Data_2 Indexes Indexes_2 RBS RBS_2 Temp Temp_user © 2003 Monash IT Pty Ltd Data dictionary Standard-operation tables Static tables used during standard operation Indexes for the standard-operation tables Indexes for the static tables Standard-operation Rollback Segments Special rollback segments used for data loads Standard operation temporary segments Temporary segments created by a temporary user DataBase Design Lect 9 / 7 Layouts and Scripts Tools Tools_1 Users Agg_data Partitions RDBMS tools tables Indexes for the RDBMS tools tables User objects in development tables Aggregation data and materialised views Partitions of a table or index segments; create multiple tablespaces for them Temp_Work Temporary tables used during data load processing © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 8 Layouts and Scripts (A materialised view stores replicated data based on an underlying query. A materialised view stores data which is replicated from within the current database. A Snapshot stores data from a remote database. The system optimiser may choose to use a materialised view instead of a query against a larger table if the materialised view will return the same data and thus improve response time. A materialised view does however incur an overhead of additional space usage, and maintenance) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 9 Layouts and Scripts Each of the tablespaces will require a separate datafile Monitoring of I/O performance among datafiles is done after the database has been created, and the DBA must estimate the I/O load for each datafile (based on what information ?) The physical layout planning is commenced by estimating the relative I/O among the datafiles, with the most active tablespace given a weight of 100. Estimate the I/O from the other datafiles relative to the most active datafile © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 10 Layouts and Scripts Assign a weight of 35 for the System tablespace files and the index tablespaces a value of 1/3 or their data tablespaces Rdb’s may go as high as 70 (depending on the database activity) - between 30 and 50 is ‘normal’ In production, Temp will be used by large sorts Tools will be used rarely in production - as will the Tools_2 tablespace © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 11 Layouts and Scripts So, what do we have ? Tablespace Weight Data 100 Rbs 40 System 35 Indexes 33 Temp 5 Data_2 4 Indexes_2 2 Tools 1 - Something like this % of Total 45 18 16 15 2 2 1 1 (220) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 12 Layouts and Scripts 94% of the Input/Output is associated with the top four tablespaces This indicates then that in order to properly the datafile activity, 5 disks would be needed, AND that NO other database files should be put on the disks which are accommodating the top 4 tablespaces There are some rules which apply : 1. Data tablespaces should be stored separately from their Index tablespaces 2. Rbs tablespaces should be stored separately from their Index tablespaces © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 13 Layouts and Scripts and 3. The System tablespace should be stored separately from the other tablespaces in the database In my example, there is only 1 Data tablespace. In production databases there will probably be many Data spaces (which will happen if Partitions are used). If/when this occurs, the weightings of each of the Data tablespaces will need to be made (but for my efforts, 1 Data tablespace will be used). © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 14 Data Protection There is a Log Writer process (LGWR) whose function is to write this data to the log file. Transaction Log Group Log Switch Log Log group group Archived transaction log Log Switch © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 15 Layouts and Scripts As you have probably guessed, there are other tablespaces which require to be considered - many used by the many and various ‘processes’ of Oracle One of these considerations is the on-line redo log files (you remember these and their purpose ?) They store the records of each transaction. Each database must have at least 2 online redo log files available to it - the database will write to one log in sequential mode until the redo log file is filled, then it will start writing to the second redo log file. © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 16 Layouts and Scripts Redo log files (cont’d) The Online Redo Log files maintain data about current transactions and they cannot be recovered from a backup unless the database is/was shut down prior to backup - this is a requirement of the ‘Offline Backup’ procedure (if we have time we will look at this) On line redo log files need to be ‘mirrored’ A method of doing this is to employ redo log groups - which dynamically maintain multiple sets of the online redo logs The operating system is also a good ally for mirroring files © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 17 Layouts and Scripts Redo log files should be placed away from datafiles because of the performance implications, and this means knowing how the 2 types of files are used Every transaction (unless it is tagged with the nologging parameter) is recorded in the redo log files The entries are written by the LogWriter (LGWR) process The data in the transaction is concurrently written to a number of tablespaces(the RBS rollback segments and the Data tablespace come to mind) via the DataBase Writer (DBWR) and this raises possible contention issues if a datafile is located on the same disk as a redo log file © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 18 Layouts and Scripts Redo log files are written sequentially Datafiles are written in ‘random’ order - it is a good move to have these 2 different demands separated If a datafile must be stored on the same disk as a redo log files, then it should not belong to the System tablespace, the Rbs tablespace, or a very active Data or Index tablespace So what about Control Files ? There is much less traffic here, and they can be internally mirrored. (config.ora or init.orafile). The database will maintain the control files as identical copies of each other. There should be 3 copies, across 3 disks © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 19 Layouts and Scripts The LGWR background process write to the online redo files in a cyclical manner When the lst redo file is full, it directs writing to the 2nd file …. When the ‘last’ file is full, LWGR starts overwriting the contents of the 1st file .. and so on When ARCHIVELOG mode is used, the contents of the ‘about to be overwritten file’ are written to a redo file on a disk device © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 20 Layouts and Scripts There will be contention on the online redo log as LGWR will be attempting to write to one redo log file while the Archiver (ARCH) will be trying to read another. The solution is to distribute the redo log files across multiple disks The archived redo log files are high I/O and therefore should NOT be on the same device as System, Rbs, Data, or Indexes tablespaces Neither should they be stored on the same device as any of the online redo log files. © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 21 Layouts and Scripts The database will stall if there is not enough disk space, and the archived files should directed to a disk which contains small and preferably static files Concurrent I/O A commendable goal, and one which needs careful planning to achieve. Placing two random access files which are never accessed at the same time will quite happily avoid contention for I/O capability © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 22 Layouts and Scripts What we have just covered is known as 1. Concurrent I/O - when concurrent processes are being performed against the same device (disk) This is overcome by isolating data tables from their Indexes for instance 2. Interference - when sequential writing is interfered by reads or writes to other files on the same disk © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 23 Layouts and Scripts At the risk of labouring this a bit, The 3 background processes to watch are 1. DBWR, which writes in a random manner 2. LGWR, which writes sequentially 3. ARCH, which reads and writes sequentially LGWR and ARCH write to 1 file at a time, but DBWR may be attempting to write to multiple files at once - (can you think of an example ?) Multiple DBWR processes for each instance or multiple I/O slaves for each DBWR is a solution © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 24 Layouts and Scripts What are the disk layout goals ? Are they (1) recoverability or (2) performance Recoverability must address all processes which impact disks (storage area for archived redo logs and for Export dump files (which so far we haven’t mentioned) come to mind)). Performance calls for file I/O performance and relative speeds of the disk drives © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 25 Layouts and Scripts What are some recoverability issues ? All critical database files should be placed on mirrored drives, and the database run in ARCHIVELOG mode The online redo files must also be mirrored (Operating system or mirrored redo log groups) Recoverability issues involve a few disks and this is where we start to look at hardware specification © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 26 Layouts and Scripts Mirroring architecture leads to specifying the number of disks required the models of disks (capacity and speed) the strategy If the hardware system if heterogeneous, the faster drives should be dedicated to Oracle database files RAID systems should be carefully analysed as to their capability and the optimum benefit sought - RAID-1, RAID-3 and RAID-5 have different processes relating to parity © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 27 Layouts and Scripts The disks chosen for mirroring architecture must be dedicated to the database This guarantees that non-database load on these disks will not interfere with database processes © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 28 Layouts and Scripts Goals for disk layout : The database must be recoverable The online redo log files must be mirrored via the system or the database The database file I/O weights must be estimated Contention between DBWR, LGWR and ARCH must be minimised Contention between disks for DBWR must be minimised The performance goals must be defined The disk hardware options must be known The disk mirroring architecture must be known Disks must be dedicated to the database © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 29 Layouts and Scripts So where does that leave us ? We’re going to look at ‘solutions’ from Optimal to Practical and we’ll assume that : the disks are dedicated to the database the online redo log files are being mirrored by the Operating System the disks are of identical size the disks have identical performance characteristics (obviously the best case scenario !) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 30 Layouts and Scripts So, with that optimistic outlook let’s proceed Case 1 - The Optimum Physical Layout Disk No Contents 1 Oracle Software 2 SYSTEM tablespace 3 RBS tablespace 4 DATA tablespace 5 INDEXES tablespace 6 TEMP tablespace 7 TOOLS tablespace 8 OnLine Redo Log 1 9 OnLine redo log 2 10 OnLine redo Log 3 11 Control file 1 © 2003 Monash IT Pty Ltd Disk No. Contents 12 Control file 2 13 Control file 3 14 Application software 15 RBS_2 16 DATA_2 17 INDEXES_2 18 TEMP_USER 19 TOOLS_1 20 USERS 21 Archived redo dest. disk 22 Archived dump file DataBase Design Lect 9 / 31 Hardware Configurations The 22 disk solution is an optimal solution. It may not be feasible for a number of reasons, including hardware costs In the following overheads there will be efforts to reduce the number of disks, commensurate with preserving performance © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 32 Hardware Configurations This leads to - 17 disk configuration Disk Contents Disk Contents 1 Oracle software 11 Application software 2 SYSTEM tablespace 12 RBS_2 3 RBS tablespace 13 DATA_2 4 DATA tablespace 14 INDEXES_2 5 INDEXES tablespace 15 TEMP_USER 6 TEMP tablespace 16 Archived redo log 7 TOOLS tablespace destination disk 8 Online Redo log 1, Control file 1 17 Export dump 9 Online Redo log 2, Control file 2 destination disk 10 Online Redo log 3, Control file 3 © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 33 Hardware Configurations The Control Files are candidates for placement onto the three redo log disks. The altered arrangement reflects this. The Control files will interfere with the online redo logfiles but only at log switch points and during recovery © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 34 Hardware Configurations The TOOLS_1 tablespace will be merged with the TOOLS tablespace In a production environment, users will not have resource privileges, and the USERS tablespace can be ignored However, what will be the case if users require development and test access ? Create another database ? (test ?) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 35 Hardware Configurations The RBS and RBS_2 tablespaces have special rollback segments used during data loading. Data loads should not occur during production usage, and so if the 17 disk option is not practical, we can look at combining RBS and RBS_2 - there should be no contention TEMP and TEMP_USER can be placed on the same disk The TEMP tablespace weighting (5 in the previous table) can vary. It should be possible to store these 2 tablespaces on the same disk. TEMP_USER is dedicated to a specific user - (such as Oracle Financials, and these have temporary segments requirements which are greater than the system’s users) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 36 Hardware Configurations The revised solution is now Disk Contents Disk Content 1 Oracle software 11 Application software 2 SYSTEM tablespace 12 DATA_2 3 RBS, RBS_2 tablespace 13 INDEXES_2 4 DATA tablespace 14 Archived Redo Log 5 INDEXES tablespaces destination disk 6 TEMP, TEMP_USER tablespace 15 Export dump file 7 TOOLS tablespace destination disk 8 Online Redo Log 1, Control file 1 9 Online Redo Log 2, Control file 2 15 disks 10 Online Redo Log 3, Control file 3 © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 37 Hardware Configurations What if there aren’t 15 disks ? -->> Move to attempt 3 Here the online Redo Logs will be placed onto the same disk. Where there are ARCHIVELOG backups, this will cause concurrent I/O and interference contention between LGWR and ARCH on that disk What we can deduce from this, is that the combination about to be proposed is NOT appropriate for a high transaction system or systems running in ARCHIVELOG mode (why is this so - Prof. Julius Sumner Miller ?) © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 38 Hardware Configurations The ‘new’ solution Disk Contents 1 Oracle software 2 SYSTEM tablespace, Control file 1 3 RBS, RBS_2 tablespaces, Control file 2 4 DATA tablespace, Control file 3 5 INDEXES tablespaces 6 TEMP, TEMP_USER tablespaces 7 TOOLS, INDEXES_2 tablespaces 8 OnLine Redo Logs 1, 2 and 3 9 Application software 10 DATA_2 11 Archived redo log destination disk 12 Export dump file destination disk © 2003 Monash IT Pty Ltd 12 disks DataBase Design Lect 9 / 39 Hardware Configurations Notice that the Control Files have been moved to Disks 3, 4 and 5 The Control Files are not I/O demanding, and can safely coexist with SYSTEM, RBS and DATA What we have done so far is to ‘move’ the high numbered disks to the ‘low’ numbered disks - these are the most critical in the database. The next attempt to ‘rationalise’ the disk arrangement is to look carefully at the high numbered disks. © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 40 Hardware Configurations DATA_2 can be combined with with the TEMP tablespaces (this disk has 4% of the I/O load). This should be safe as the static tables (which ones are those ?) are not as likely to have group operations performed on them as the ones in the DATA tablespace The Export dump files have been moved to the Online Redo disk (the Redo log files are about 100Mb and don’t increase in size -(is that correct ?) Exporting causes minor transaction activity. The other is the combination of the application software with the archived redo log file destination area. This leaves ARCH space to write log files, and avoids conflicts with DBWR © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 41 Hardware Configurations Disk Content 1 Oracle software 2 SYSTEM tablespace, Control file 1 3 RBS tablespace, RBS_2 tablespace, Control file 2 4 DATA tablespace, Control file 3 5 INDEXES tablespace 9 disks 6 TEMP, TEMP_USER, DATA_2 tablespaces 7 TOOLS, INDEXES_2 tablespaces 8 Online Redo logs 1, 2 and 3, Export dump file 9 Application software, Archived Redo log destination disk © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 42 Hardware Configurations Can the number of required disks be further reduced ? Remember that the performance characteristics will deteriorate It’s now important to look closely at the weights set during the I/O estimation process. © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 43 Hardware Configurations Estimated Weightings for the previous (9 disk) solution are Disk Weight Contents 1 Oracle software 2 35 SYSTEM tablespace, Control file 1 3 40 RBS, RBS_2 tablespace, Control file 2 4 100 DATA tablespace, Control file 3 5 33 INDEXES tablespaces 6 9 TEMP, TEMP_USER, DATA_2 tablespace 7 3 TOOLS, INDESES_2 tablespaces 8 40+ Online Redo logs 1,2 and 3, Export dump file destination disk 9 40+ Application software, archived redo log destination disk © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 44 Hardware Configurations A further compromise distribution could be Disk Weight Contents 1 Oracle software 2 38 SYSTEM, TOOLS, INDEXES_2 tablespaces, Control file1 3 40 RBS, RBS_2 tablespaces, Control file 2 4 100 DATA tablespace, Control file 3 5 42 INDEXES, TEMP, TEMP_USER, DATA_2 tablespaces 6 40+ Online redo logs 1,2 and 3, Export dump file destination disk 7 40+ Application software, Archived redo log destination disk © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 45 Hardware Configurations A few thoughts for a small database system - 3 disks 1. Suitable for an OLTP application. Assumes that the transactions a small in size, large in number and variety, and randomly scattered among the available tables. The application should be as index intensive as possible, and the full table scans must be kept to the minimum possible © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 46 Hardware Configurations 2. Isolate the SYSTEM tablespace. This stores the data dictionary - which is accessed for every query and is accessed many time for every query In a ‘typical case’, query execution requires the column names to be checked in CODES_TABLE table the user’s privilege of access to the CODES_TABLE table the user’s privilege to access the Code column of the CODES_TABLE table the user’s role definition(s) the indexes defined on the CODES_TABLE table the columns of the columns defined on the CODES_TABLE table © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 47 Hardware Configurations 3. Isolate the INDEXES tablespace. This probably accounts for 35 to 40% of the I/O 4. Separate the rollback segments and DATA tablespaces There is a point to watch here - with 3 disks there are 4 tablespaces - SYSTEM, INDXES, DATA and RBS. The placement of RBS is determined by the volume of transactions. If high, RBS and DATA should be kept apart. If low, RBS and DATA should work together without causing contention © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 48 Hardware Configurations The 3 disk layout would be one of these : Disk 1: SYSTEM tablespace, control file, redo log Disk 2 : INDEXES tablespace, control file, redo log, RBS tablespace Disk 3 : DATA tablespace, control file, redo log OR Disk 1 : SYSTEM tablespace, control file, redo log Disk 2 : INDEXES tablespace, control file, redo log Disk 3 : DATA tablespace, control file, redo log, RBS tablespace © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 49 Hardware Configurations Summary : Database Type Tablespaces Small development SYSTEM database DATA INDEXES RBS TEMP USERS TOOLS © 2003 Monash IT Pty Ltd DataBase Design Lect 9 / 50 Hardware Configurations Summary : Database Type Production OLTP database © 2003 Monash IT Pty Ltd Tablespaces SYSTEM DATA DATA_2 INDEXES INDEXES_2 RBS RBS_2 TEMP TEMP_USER TOOLS DataBase Design Lect 9 / 51 Hardware Configurations Summary : Database Type Production OLTP with historical data © 2003 Monash IT Pty Ltd Tablespaces Tablespaces SYSTEM TEMP DATA TEMP_USER DATA_2 TOOLS DATA_ARCHIVE INDEXES INDEXES_2 INDEXES_ARCHIVE RBS RBS_2 DataBase Design Lect 9 / 52 Hardware Configurations Summary : Database Type Data Warehouse © 2003 Monash IT Pty Ltd Tablespaces SYSTEM DATA DATA_2 INDEXES INDEXES_2 RBS RBS_2 TEMP TEMP_USER TOOLS Tablespaces PARTITIONS PARTITIONS_2 AGG_DATA AGG_DATA_1 SNAPS SNAPS_1 TEMP_WORK TEMP_WORK_1 DataBase Design Lect 9 / 53