A Rough Guide to RAC
Julian Dyke, Independent Consultant
Web Version
© 2005 Julian Dyke juliandyke.com

Agenda
- Introduction
- Availability
- Scalability
- Manageability
- Total Cost of Ownership
- Conclusion

Introduction

Some RAC Terminology
OCRDUMP, RAC, CSS, CRSCTL, SRVCTL, GCS, LMD, CLUVFY, OCFS2, LMS, PI, OCR, OIFCFG, LCK, OCRCHECK, VIP, OCSSD, CRSD, GRD, DIAG, VIPCA, CRS, FAN, EVMD, ONS, LMON, BAST, OCFS, AST, OCRCONFIG, GES, ASM, TAF, LKDEBUG, FCF, CRS_STAT, GSD

What is RAC?
- Multiple instances running on separate servers (nodes)
- Single database on shared storage accessible to all nodes
- Instances exchange information over an interconnect network
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, each node with its own local disk, linked by the interconnect and both attached to shared storage]

Instances versus Databases
- A RAC cluster includes
  - one database
  - one or more instances
- A database is a set of files
  - Located on shared storage
  - Contains all persistent resources
- An instance is a set of memory structures and processes
  - Contains all transient resources
  - Can be started and stopped independently

Instances versus Databases
[Diagram: Instances 1 to 4 on Nodes 1 to 4, joined by a private network (interconnect) and a public network, and attached through a storage network to a single database]

What is a RAC Database?
- Located on shared storage accessible by all instances
- Includes
  - Control Files
  - Data Files
  - Online Redo Logs
  - Server Parameter File
- May optionally include
  - Archived Redo Logs
  - Backups
  - Flashback Logs (Oracle 10.1 and above)
  - Change Tracking Writer files (Oracle 10.1 and above)

What is a RAC Database?
- Contents similar to single-instance database except
  - One redo thread per instance
        ALTER DATABASE ADD LOGFILE THREAD 2
          GROUP 3 SIZE 51200K,
          GROUP 4 SIZE 51200K;
        ALTER DATABASE ENABLE PUBLIC THREAD 2;
  - If using Automatic Undo Management, also requires one UNDO tablespace per instance
        CREATE UNDO TABLESPACE "UNDOTBS2" DATAFILE SIZE 25600K
          AUTOEXTEND ON MAXSIZE UNLIMITED EXTENT MANAGEMENT LOCAL;
  - Additional dynamic performance views (V$, GV$ but not X$) created by $ORACLE_HOME/rdbms/admin/catclust.sql

What is the Interconnect?
- Instances communicate with each other over the interconnect (network)
- Information transferred between instances includes
  - data blocks
  - locks
  - SCNs
- Typically 1 Gb Ethernet
  - UDP protocol
  - Often teamed in pairs to avoid SPOFs
- Can also use Infiniband
  - Fewer levels in stack
- Other proprietary protocols are available

Why Use Shared Storage?
- Mandatory for
  - Database files
  - Control files
  - Online redo logs
  - Server Parameter file (if used)
- Optional for
  - Archived redo logs (recommended)
  - Executables (Binaries)
  - Password files
  - Parameter files
  - Network configuration files
  - Administrative directories
    - Alert Log
    - Dump Files

What Shared Storage is Supported?
- Oracle supplied options
  - Oracle Cluster File System (OCFS)
    - Version 1
      - Windows and Linux
      - Supports database and archived redo logs
      - No executables
    - Version 2 - August 2005
      - Linux, Windows and Solaris
      - As OCFS1 plus executables
  - Automatic Storage Management (ASM)
    - Oracle 10.1 and above
    - More transparent in Oracle 10.2 and above
  - Both require underlying SAN or NAS
  - Do not require LVM

What Shared Storage is Supported?
- Can use (continued)
  - Network Attached Storage
    - NFS-based
    - Potentially lower cost - no fibre channel required
    - Easy to administer
  - Raw devices
    - Difficult to administer
    - Cannot be used with archived redo logs
  - Third-party Cluster File System
    - Still a popular choice with many sites
  - Others (not supported)
    - Firewire - maximum two nodes - recommended in 10g
    - NBD - Network Block Devices - Solaris and Linux
    - NFS - not supported, but might still work

What is a Shared Oracle Home?
- Can install multiple copies of Oracle executables on local disks on each node
- Can also install Shared Oracle Home
  - single copy of Oracle executables on shared storage
- Oracle 9.2
  - Only Oracle database software
- Oracle 10.1
  - Cluster Ready Services (CRS)
  - Oracle database software + ASM
- Oracle 10.2
  - Oracle Clusterware (CRS)
  - ASM
  - Oracle database software

Internal Structures and Services
- Global Resource Directory (GRD)
  - Records current state and owner of each resource
  - Contains convert and write queues
  - Distributed across all instances in cluster
- Global Cache Services (GCS)
  - Implements cache coherency for database
  - Coordinates access to database blocks for instances
  - Maintains GRD
- Global Enqueue Services (GES)
  - Controls access to other resources (locks) including
    - library cache
    - dictionary cache

Background Processes
- Each RAC instance has set of standard background processes e.g.
  - PMON
  - SMON
  - LGWR
  - DBWn
  - ARCn
- RAC instances use additional background processes to support GCS and GES including
  - LMON
  - LCK0
  - LMDn
  - LMSn
  - DIAG

Portability
- Most single-instance applications should port to RAC
- Some exceptions
  - Application must scale well on single instance
    - Can be difficult to evaluate
  - Some features do not work e.g.
    - DBMS_ALERT
    - DBMS_PIPE
  - External inputs/outputs may need modification
    - Flat files etc
  - Some RAC features require additional coding
    - TAF
  - Code may need upgrading to use RAC functionality e.g.
    - FCF requires JDBC Implicit Connection Cache

Why Do Users Deploy RAC?
- Users may deploy RAC to achieve
  - Increasing availability
  - Increasing scalability
  - Improving maintainability
  - Reduction in total cost of ownership

Why Do DBAs Deploy RAC?
- DBAs may want to deploy RAC because:
  - Realistic next step for experienced Oracle DBAs
  - Intellectual challenge
  - Job protection - ties organisation to Oracle technology
  - Possible improved earnings
  - It looks good on their CV

Availability

What is Failover?
- If one node or instance fails
  - Node detecting failure will
    - Read redo log of failed instance from last checkpoint
    - Apply redo to datafiles including undo segments (roll forward)
    - Rollback uncommitted transactions
  - Cluster is frozen during part of this process
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, linked by the interconnect]

What are Database Services?
- Database Services are logical groups of sessions
- Can be configured using
  - DBCA
  - Enterprise Manager (10.2 and above)
- Can also be configured using
  - SRVCTL (Oracle Cluster Registry only)
  - SQL*Plus (Data Dictionary only)
  - Text editor (Network Configuration)
- In Oracle 10.1 and above, each service has
  - Preferred Nodes (used by default)
  - Available Nodes (used if preferred node fails)

What are Database Services?
- Can be used with Resource Manager to control resource usage e.g.
  - CPU
  - Parallel execution
- Can be used for monitoring
  - V$SERVICE_STATS
- Can be used for diagnostics
  - DBMS_MONITOR
    - trace
    - statistics

What is Oracle Clusterware?
- Introduced in Oracle 10.1 (Cluster Ready Services - CRS)
- Renamed in Oracle 10.2 to Oracle Clusterware
- Cluster Manager providing
  - Node membership services
  - Global resource management
  - High availability functions
- On Linux
  - Configured in /etc/inittab
  - Implemented using three daemons
    - CRS - Cluster Ready Service
    - CSS - Cluster Synchronization Service
    - EVM - Event Manager
- In Oracle 10.2 includes High Availability framework
  - Allows non-Oracle applications to be managed

What is the OCR?
- Oracle Cluster Registry (OCR)
  - Configuration information for Oracle Clusterware / CRS
- Introduced in Oracle 10.1
  - Replaced Server Management (SRVM) disk/file
- Similar to Windows Registry
- Located on shared storage
- In Oracle 10.2 and above can be mirrored
  - Maximum two copies

What is the OCR?
- Defines cluster resources including:
  - Databases
  - Instances
    - RDBMS
    - ASM
  - Services
  - Node Applications
    - VIP
    - ONS
    - GSD
  - Listener Process

What is a Voting Disk?
- Known as Quorum Disk / File in Oracle 9i
- Located on shared storage accessible to all instances
- Used to determine RAC instance membership
- In the event of node failure voting disk is used to determine which instance takes control of cluster
  - Avoids split brain
- In Oracle 10.2 and above can be mirrored
  - Odd number of copies (1, 3, 5 etc)

What is VIP?
- Node application introduced in Oracle 10.1
- Allows Virtual IP address to be defined for each node
- All applications connect using Virtual IP addresses
- If node fails Virtual IP address is automatically relocated to another node
- Only applies to newly connecting sessions

What is TAF?
- TAF is Transparent Application Failover
- Sessions connected to a failed instance will be terminated
- Uncommitted transactions will be rolled back
- Sessions can be reconnected to another instance automatically if using TAF
  - Can optionally re-execute in-progress SELECT statements
    - Statement re-executed with same SCN
    - Fetches resume at point of failure
- Session state is lost including
  - Session parameters
  - Package variables
  - Class and ADT instantiations

What is TAF?
- TAF is Transparent Application Failover
- Requires additional coding in client
- Requires configuration in TNSNAMES.ORA

    RAC_FAILOVER =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (FAILOVER = ON)
          (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = RAC)
          (SERVER = DEDICATED)
          (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 30)(DELAY = 5))
        )
      )

What is FAN?
- Fast Application Notification (FAN)
- Introduced in Oracle 10.1
- Method by which applications can be informed of changes in cluster status
  - Handle node failures
  - Workload balancing
- Applications must connect using services
- Can be notified using
  - Server side callouts
  - Fast Connection Failover (FCF)
  - ONS API

What is ONS?
- Oracle Notification Service (ONS)
- Introduced in Oracle 10.1
- Allows out-of-band messages to be sent to
  - Nodes in cluster
  - Middle-tier application servers
  - Clients
- Underlying mechanism for Fast Application Notification (FAN)

Does RAC Increase Availability?
- Depends on definition of availability
  - May achieve less unplanned downtime
  - May have more time to respond to failures
- Instance failover means any node can fail without total loss of service
- Must provide overcapacity in cluster to survive failover
  - Additional Oracle and RAC licenses
- Load can be distributed over all running nodes
- Can use Grid to provision additional nodes

Does RAC Increase Availability?
- Can still get data corruptions
  - Human errors / software errors
  - Only one logical copy of data
  - Only one logical copy of application / Oracle software
- Lots of possibility for human errors
  - Power / network cabling / storage configuration
- Upgrades and patches are more complex
  - Can upgrade software on subset of nodes
  - If database is affected then still need downtime

Scalability

What is Scalability?
- RAC overhead means that linear scalability is difficult to achieve
  - Global Cache Services (blocks)
  - Global Enqueue Services (locks)
- As number of instances increases, probability that instance is a resource master decreases
- Scaling factor of 1.8 is considered good
- Dependent on application design and implementation
- Scaling factor improves with
  - Node affinity
  - Elimination of contention

What is Scalability?
- Scalability is the relationship between increments of resources and workloads
- Can be any resource but with RAC normally refers to adding instances
- Scalability can be
  - linear - optimal but rare
  - non-linear - suboptimal but normal
[Figure: two plots of workload against resource, one labelled Linear and one labelled NonLinear]

What is Workload Balancing?
- Balancing of workload across available instances
- Can have
  - Client-side connection balancing
  - Server-side connection balancing
- Client-side connection balancing
  - Workload distributed randomly across nodes

    RAC =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
        (LOAD_BALANCE = ON)
        (FAILOVER = ON)
        (CONNECT_DATA =
          (SERVICE_NAME = RAC)
          (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
        )
      )

What is Workload Balancing?
- Server-side connection balancing
- Dependent on current workload on each node
- PMON monitors workload and updates listeners
- Depends on long or short connections
  - In Oracle 10.1
    - Set PREFER_LEAST_LOADED_NODE in listener.ora
      - OFF for long connections
      - ON for short connections (default)
  - In Oracle 10.2
    - Can specify load balancing goal for each service
      - NONE, SERVICE_TIME or THROUGHPUT
    - Can also specify connection load balancing goal
      - SHORT or LONG

Increasing Scalability
- If application scales well on a single instance then it should scale well on RAC
- Eliminate contention
  - Use sequences
  - Use locally partitioned tables and indexes
- Attempt to achieve node affinity
- Avoid contention for single blocks
  - Distribute rows for hot blocks
    - Small block size e.g. 2048 or 4096
    - ALTER TABLE <table_name> MINIMIZE RECORDS_PER_BLOCK
    - High PCTFREE / Low PCTUSED
    - Filler columns e.g. CHAR (2000)

Increasing Scalability
- Use Automatic Segment Space Management
  - Default in Oracle 10.2
- Use larger block size for read-only objects
  - Reduce number of GCS messages required
- Minimize lock usage
- Eliminate unnecessary parsing
  - Increase size of shared pool
  - Bind variables
  - Cursor sharing
- Use optimistic locking
  - Eliminate unnecessary SELECT FOR UPDATE statements

Manageability

Server Parameter File
- Introduced in Oracle 9.0.1
- Must reside on shared storage
- Shared by all RAC instances
- Binary (not text) files
- Parameters can be changed using ALTER SYSTEM
- Can be backed up using the Recovery Manager (RMAN)
- Created using
      CREATE SPFILE [ = 'SPFILE_NAME' ] FROM PFILE [ = 'PFILE_NAME' ];
- init.ora file on each node must contain SPFILE parameter
      SPFILE = <pathname>

Parameters
- RAC uses same parameters as single-instance
  - Some must be different on each instance
  - Some must be same on each instance
- Can be global or local
      [*.]<parameter_name> = <value>
      [<sid>.]<parameter_name> = <value>
- Must be set using ALTER SYSTEM statement
      ALTER SYSTEM SET parameter = value
        [ SCOPE = MEMORY | SPFILE | BOTH ] [ SID = <sid> ]
      ALTER SYSTEM RESET parameter
        [ SCOPE = MEMORY | SPFILE | BOTH ] [ SID = <sid> ]

Parameters
- Some parameters must be same on each instance including *:
  - ACTIVE_INSTANCE_COUNT
  - ARCHIVE_LAG_TARGET
  - CLUSTER_DATABASE
  - CONTROL_FILES
  - DB_BLOCK_SIZE
  - DB_DOMAIN
  - DB_FILES
  - DB_NAME
  - DB_RECOVERY_FILE_DEST
  - DB_RECOVERY_FILE_DEST_SIZE
  - DB_UNIQUE_NAME
  - MAX_COMMIT_PROPAGATION_DELAY
  - TRACE_ENABLED
  - UNDO_MANAGEMENT
* Correct for Oracle 10.1

Parameters
- Some parameters, if used, must be different on each instance including
  - THREAD
  - INSTANCE_NUMBER
  - INSTANCE_NAME
  - UNDO_TABLESPACE
  - ROLLBACK_SEGMENTS
- DML_LOCKS must be identical on each instance if set to zero

DBCA
- Can be used to
  - Create RAC database and instances
  - Create ASM instance
  - Manage ASM instance (10.2)
  - Add RAC instances
  - Create RAC database templates
    - structure only
    - with data
  - Create clone RAC database (10.2)
  - Create, Manage and Drop Services
  - Drop instances and database

What is SRVCTL?
- Utility used to manage cluster database
- Configured in Oracle Cluster Registry (OCR)
- Controls
  - Database
  - Instance
  - ASM
  - Listener
  - Node Applications
  - Services
- Options include
  - Start / Stop
  - Enable / Disable
  - Add / Delete
  - Show current configuration
  - Show current status

SRVCTL - Examples
- Starting and Stopping a Database
      srvctl start database -d RAC
      srvctl stop database -d RAC
- Starting and Stopping an Instance
      srvctl start instance -d RAC -i RAC1
      srvctl stop instance -d RAC -i RAC1
- Starting and Stopping a Service
      srvctl start service -d RAC -s SERVICE1
      srvctl stop service -d RAC -s SERVICE1
- Starting and Stopping ASM on a specified node
      srvctl start asm -n node1
      srvctl stop asm -n node1

Enterprise Manager
- In Oracle 10.1 and above
  - Database Control
    - Installed by DBCA
    - Controls single cluster
  - Grid Control
    - Uses separate repository
    - Oracle 10.2 version available
    - Requires Oracle 10.1 database
- Fully supports RAC in both versions
- Except
  - Oracle 10.1 cannot create / delete services
  - Oracle 10.2 better interconnect performance monitoring

What is CLUVFY?
- Introduced in Oracle 10.2
- Supplied with Oracle Clusterware
- Can be downloaded from OTN (Linux and Windows)
- Written in Java - requires JRE (supplied)
- Also works with 10.1 (specify -10gR1 option)
- Checks cluster configuration
  - stages - verifies all steps for specified stage have been completed
  - components - verifies specified component has been correctly installed

CLUVFY
- Stages include
      -post hwos       post-check for hardware and operating system
      -pre cfs         pre-check for CFS setup
      -post cfs        post-check for CFS setup
      -pre crsinst     pre-check for Oracle Clusterware installation
      -post crsinst    post-check for Oracle Clusterware installation
      -pre dbinst      pre-check for database installation
      -pre dbcfg       pre-check for database configuration

CLUVFY
- Components include
      nodereach    Checks reachability between nodes
      nodecon      Checks node connectivity
      cfs          Checks CFS integrity
      ssa          Checks shared storage accessibility
      space        Checks space availability
      sys          Checks minimum system requirements
      clu          Checks cluster integrity
      clumgr       Checks cluster manager integrity
      ocr          Checks OCR integrity
      crs          Checks Oracle Clusterware (CRS) integrity
      nodeapp      Checks node applications exist
      admprv       Checks administrative privileges
      peer         Compares properties with peers

CLUVFY
- For example, to check configuration before installing Oracle Clusterware on node1 and node2 use:
      sh runcluvfy.sh stage -pre crsinst -n node1,node2
- Checks:
  - node reachability
  - user equivalence
  - administrative privileges
  - node connectivity
  - shared storage accessibility
- If any checks fail append -verbose to display more information

Other Utilities
- Additional RAC utilities and diagnostics include
  - OCRCONFIG
  - OCRCHECK
  - OCRDUMP
  - CRSCTL
  - CRS_STAT
- Additional RAC diagnostics can be obtained using
  - ORADEBUG utility
    - DUMP option
    - LKDEBUG option
  - Events

Does RAC Improve Manageability?
- Advantages
  - Fewer databases to manage
  - Easier to monitor
  - Easier to upgrade
  - Easier to control resource allocation
  - Resources can be shared between applications
- Disadvantages
  - Upgrades potentially more complex
  - Downtime may affect more applications
  - Requires more experienced operational staff
    - Higher cost / harder to replace

Total Cost of Ownership

Reduction in TCO?
- Possible for sites with legacy systems
  - Mainframes / Minicomputers
  - Applications / Packages
- RAC option adds 50% to licence costs except for
  - Users with site licences
  - Standard edition (10.1+, max 4 CPU with ASM)
- Retrain existing staff or use dedicated staff
- Consolidation may bring economies of scale
  - Monitoring
  - Backups
  - Disaster Recovery

Reduction in TCO?
- Additional resources required
  - Redundant hardware
    - Nodes
    - Network switches
    - SAN fabric
  - Hardware e.g. fibre channel cards
- Reduction in hardware support costs
  - May not require 24 hour support
  - Viable to hold stock of spare components

What are the Alternatives to RAC?
- Data Guard
  - Physical Standby
    - Introduced in Oracle 7.3.4
    - Stable, well proven technology
    - Requires redundant hardware
    - Implemented by many sites
    - Can be used with RAC
  - Logical Standby
    - Introduced in Oracle 9.2
    - Still not widely adopted
- Streams
  - Introduced in Oracle 9.2
  - Implemented by increasing number of sites
- Advanced Replication

What are the Alternatives to RAC?
- Symmetric Multiprocessing (SMP) Systems
  - Single Point of Failure
  - Simplified configuration
  - Eliminate RAC overhead
- Parallel systems
  - For systems with deterministic input
    - Messaging
    - Data Warehouses
- Other Clustering Technologies
  - SAN
  - Operating System
  - etc

Conclusion
- Success of RAC deployments dependent on
  - Application design and implementation
  - Failover requirements
  - IT infrastructure
  - Flexibility and commitment of IT department(s)
- Before deploying RAC
  - Investigate and reject alternatives
  - Perform proof of concept
  - Test application
  - Evaluate benefits and costs
  - Learn RAC concepts and administration
  - Buy a good book :)

Thank you for your interest
For more information and to provide feedback please contact me
My e-mail address is: [email protected]
My website address is: www.juliandyke.com
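
Worked example: why voting disks come in odd numbers
The voting disk slide says mirrored voting disks are configured in odd numbers (1, 3, 5 etc). The arithmetic behind that can be sketched as follows. This is a minimal illustration, assuming a node must be able to reach strictly more than half of the configured voting disks to remain in the cluster; the function names are hypothetical and are not part of any Oracle utility.

```python
def majority(copies: int) -> int:
    """Smallest number of voting disks that is strictly more than half."""
    if copies < 1:
        raise ValueError("need at least one voting disk")
    return copies // 2 + 1


def tolerated_failures(copies: int) -> int:
    """Voting disks that can be lost while a majority is still reachable."""
    return copies - majority(copies)


if __name__ == "__main__":
    # Columns: configured copies, majority needed, failures tolerated
    for copies in range(1, 6):
        print(copies, majority(copies), tolerated_failures(copies))
```

Under that assumption, two copies tolerate no more failures than one, and four tolerate no more than three, so an even count adds a disk without adding resilience.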