Grid Computing Meets the Database
Chris Smith, Platform Computing
Session # 36686
© Platform Computing Inc. 2003

"The best thing about the Grid is that it is unstoppable."
The Economist, June 21, 2001

What is Grid computing?
Grid: transparent, secure and coordinated computing resource sharing across geographically disparate sites.

Benefits of Grid Computing
Grid technology is used to aggregate computing resources across the entire organization, regardless of location or business unit.
• Provides virtually unlimited computing capacity
• Delivers reliable, “always-on” computing infrastructure
• Virtualizes IT infrastructure for end-users
• Coordinates the usage of heterogeneous computing resources in order to accomplish business processing tasks

Example Use Cases
• Batch Process Automation
• Multi-Site Capacity Computing
• Service Virtualization

Batch Process Automation

What is Platform JobScheduler?
• Intelligent batch process automation
• Grid-enabled enterprise batch process automation software
• Provides a Graphical Design Studio & Management console to design and control the scheduling of Oracle jobs and compute jobs with various dependencies (Line-of-Business Processes) across a virtualized environment

Simplified Scheduling Environment for Oracle jobs and Compute jobs
• Single Point of Control to Design & Monitor: Job Events, File Events, Time Events
• Central Repository for Storing/Sharing Jobs: Business flows, Sub flows, Proxy dependencies
• Consistent, Flexible & Extensible
• Automated Exception Handling: Re-running jobs, Killing jobs, Triggering other jobs

More Efficient Use of Computing Resources for Oracle jobs and Compute jobs
• Resource Virtualization
• Ensures the reliability of mission-critical business flows and always-on availability of resources
• Provision additional databases for specific tasks across time
• Matching demand for resources with the supply of resources

JobScheduler Architecture
[Diagram labels: Client (Process Designing/Control, Load XML, Save XML); Jobflow Server (Scheduling: Time, Job, File, Other events); Grid Master & Grid Agents (Grid-Enabled Application Execution Infrastructure); Oracle Database; Log]

JobScheduler and Oracle scheduler integration
[Diagram labels, with steps numbered 1 to 4: Platform JobScheduler client; Platform JobScheduler server; LSF Master host; LSF Cluster; LSF host; Oracle client; elim.oracle.B; elim.oracle.C; orajobstart; Oracle instance B; Oracle instance C]

ETL using Platform JobScheduler
A common use of the Platform JobScheduler and Oracle scheduler integration is for ETL into a data warehouse.
Example: a brokerage firm wants to load the day’s trading data into its data warehouse for analysis (e.g. risk positions, trending, etc.).
ETL flow is triggered by:
• Time of day event
• Arrival of market data in flat-file format
• Completion of a stored procedure which collects location brokerage data
Data is cleansed and loaded with SQL*Loader into the database.
Stored procedures are invoked which do some analysis and initial reporting.
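
The ETL trigger logic can be sketched in a few lines of Python. This is only an illustration, not Platform JobScheduler's API: the event names and the EtlFlow class are hypothetical, and the sketch assumes the flow starts once all three trigger events have been seen (the slide does not say whether the triggers combine as AND or OR).

```python
from dataclasses import dataclass, field

# Hypothetical event names; the slides do not define identifiers.
TIME_OF_DAY = "time_of_day"
MARKET_FILE_ARRIVED = "market_file_arrived"
BROKERAGE_PROC_DONE = "brokerage_proc_done"


@dataclass
class EtlFlow:
    """Runs its steps once every required trigger event has been seen."""
    # Assumption: all three triggers are required (AND semantics).
    required: frozenset = frozenset(
        {TIME_OF_DAY, MARKET_FILE_ARRIVED, BROKERAGE_PROC_DONE}
    )
    seen: set = field(default_factory=set)
    log: list = field(default_factory=list)
    started: bool = False

    def on_event(self, name: str) -> bool:
        """Record an event; return True only when this event starts the flow."""
        if name in self.required:
            self.seen.add(name)
        if not self.started and self.seen == self.required:
            self.started = True
            self._run()
            return True
        return False

    def _run(self) -> None:
        # Placeholders for the steps named on the slide, in order.
        for step in ("cleanse", "load with SQL*Loader", "run analysis procedures"):
            self.log.append(step)
```

Feeding the three events to on_event in any order starts the flow exactly once; later events are ignored. A real deployment would replace the placeholder steps with the actual cleansing job, SQL*Loader invocation and stored-procedure calls.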

Multi-Site Capacity Computing

Increasing Computing Capacity with Platform MultiCluster
• A parameter space study is done on tens of thousands of individual sets of parameters, resulting in tens of thousands of analysis jobs
• Local cluster doesn’t have enough capacity, so Platform MultiCluster is used to allow the forwarding of analysis jobs to clusters located at other sites of the organization
• The DBMS_STREAMS_ADM.MAINTAIN_TABLESPACES procedure provided with Oracle Database 10g is used to replicate input data for the analysis at the remote site
• Database aware scheduling is used to make intelligent decisions about which sites are suitable for receiving jobs

Platform MultiCluster Job Forwarding Model
[Diagram labels: Site A (Send queue, Compute Servers); Site B (Receive queue, Compute Servers)]
You submit. We do:
• Job transfer
• Data staging
• Account mapping
• Accounting

Enterprise Grid Architecture
[Diagram]

Workload driven data management
[Diagram labels: Master molecular database (MOL); Tablespaces for MOL; Pre-exec script; Streams maintained version of MOL; Streams DML updates; Application]
1. Job forwarded
2. Run pre-exec
3. Connect to MOL and run MAINTAIN_TABLESPACES
4. MOL metadata and tablespaces transferred
5. Pre-exec finished
6. Job is run
7. Job uses copy

Database aware scheduling
[Diagram labels: Data Management Service; Site 1 – MOL, MOL2; Site 2 – (none); Site 3 – MOL]
1. Poll for datasets
2. Update cache info
3. bsub -extsched MOL
4. Local site is overloaded. Database aware scheduler plug-in decides to forward the job to site 3, since it has the MOL database
5. Job forwarded to site 3
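
The forwarding decision in the database aware scheduling steps can be illustrated with a small Python sketch. It is not the scheduler plug-in's real interface: choose_site, the site keys and the overloaded set are hypothetical, and the sketch assumes that Site 1 is the local site and that the plug-in prefers the local site whenever it has both the dataset and spare capacity.

```python
from typing import Dict, Optional, Set


def choose_site(required: str, local: str, overloaded: Set[str],
                datasets: Dict[str, Set[str]]) -> Optional[str]:
    """Pick a site that holds `required`, preferring the local site.

    `datasets` plays the role of the cache kept by the Data Management
    Service: a map from site name to the databases available there.
    """
    # Only sites that hold the dataset and are not overloaded qualify.
    candidates = [
        site for site in sorted(datasets)
        if required in datasets[site] and site not in overloaded
    ]
    if local in candidates:
        return local
    # Otherwise forward to the first remaining candidate, if any.
    return candidates[0] if candidates else None


# Cache contents as shown on the slide (site keys are hypothetical).
site_datasets = {
    "site1": {"MOL", "MOL2"},
    "site2": set(),
    "site3": {"MOL"},
}

# Assumption: site1 is the local, overloaded site from step 4.
target = choose_site("MOL", local="site1", overloaded={"site1"},
                     datasets=site_datasets)
```

With the cache shown on the slide and site1 overloaded, the only remaining site holding MOL is site3, which matches step 5. A job needing MOL2 would have no eligible site in that situation, so the function returns None and the caller would have to queue the job or replicate the data first.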

Service Virtualization

Demo Lab Hardware -- A Common Web Service/Application Environment
[Diagram labels: Public network; Interconnect network; Storage network; Cisco Hardware Load Balancer; Web Server & App Server nodes (Linux); Oracle RAC nodes (Linux AS 2.1); NAS/SAN]

Oracle RAC Provisioning Demo System
[Diagram labels: Provisioner; Web Layer/Nodes (Linux); Application Layer/Nodes (Linux); RAC Layer/Nodes (Linux AS 2.1); managed nodes Node5, Node6, Node8; Agent Manager; Service Agent; App Agent; RAC Agent; Web Server, App Server, Application and RAC apps instances; managed cluster RAC Node1 … Node4]

Proof of Concept Demos
• Dynamic Provisioning within Database Layer
• Dynamic Provisioning across Database and Application Layers

Provisioning Within DB Layer
[Diagram labels: Web Layer (Web Server node); App Layer (App Server nodes); RAC Layer (dbFinance, dbHR)]
• Show one RAC node running dbFinance, two RAC nodes running dbHR, and one idle RAC node
• There is a lot of data access to dbFinance and little data access to dbHR
• Without dynamic provisioning, the response time of dbFinance is very slow, while other RAC nodes are idle
• Applying dynamic provisioning, the idle node is added to dbFinance, and one dbHR node is shut down and moved to dbFinance
• The response time of dbFinance is improved

Provisioning Across DB & App Layers
[Diagram labels: Web Layer (Web Server node); App Layer (App Server nodes); RAC Layer (dbFinance, dbHR, App Servers)]
• Show one RAC node running dbFinance, one RAC node running dbHR, and two idle RAC nodes
• Many applications need to run on the App Layer
• Without dynamic provisioning, the response time of the App Layer is very slow, while some RAC nodes are idle
• Applying dynamic provisioning, some applications run on the two idle RAC nodes
• The response time of the App Layer is improved
• When there is data access to dbFinance, more database instances are needed
• Applications on the RAC nodes are gracefully preempted, and two more dbFinance instances are started

RAC Agent
Gathers Metrics:
• numInstances – Instances in a given database
• instanceState – Operation state of an instance
• dbLoad – Various load metrics from a database: User Calls, Recursive Calls, Physical Reads, Physical Writes, Consistent Gets, DB Block Gets
Takes Actions:
• startInstance – Start an instance on a candidate
• stopInstance – Stop an instance on a candidate

Policy Functions
Discover State of System
• What is the current state of the candidates?
Database High Load
• If a candidate is free, start an instance of the loaded database.
Database Low Load
• If a candidate was added, shut down the database instance on the candidate.

Scenario 1: Results
Discovery
• Discover pe02 and pe03 are free
High Load
• Detect high load on the HR database.
• Have a candidate free.
• Remove candidate from free host list.
• Start another instance of the HR database.
• Add the candidate to the list of HR instances.

Scenario 1: Results Continued
High Load
• Add the remaining candidate to the HR instances.
Low Load
• Detect low load on the HR database.
• Detect that candidate hosts are in use.
• Remove the last added candidate from the list of HR instances.
• Stop HR instance on candidate.
• Return candidate to list of free hosts.
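
The high-load and low-load policy functions, as walked through in Scenario 1, can be modelled roughly as follows in Python. The RacPolicy class and its callback parameters are hypothetical stand-ins for the RAC Agent's startInstance and stopInstance actions; the slides do not show the real signatures, so this is a sketch of the bookkeeping only.

```python
class RacPolicy:
    """Toy model of the high-load / low-load policy for one database."""

    def __init__(self, free_hosts):
        self.free_hosts = list(free_hosts)  # candidates with no instance
        self.added = []                     # candidates added by the policy, in order

    def on_high_load(self, start_instance):
        """If a candidate is free, start an instance of the loaded database on it."""
        if not self.free_hosts:
            return None
        host = self.free_hosts.pop(0)       # remove candidate from free host list
        start_instance(host)                # stand-in for the startInstance action
        self.added.append(host)             # add candidate to the instance list
        return host

    def on_low_load(self, stop_instance):
        """If a candidate was added, stop the instance on the last added one."""
        if not self.added:
            return None
        host = self.added.pop()             # remove last added candidate
        stop_instance(host)                 # stand-in for the stopInstance action
        self.free_hosts.append(host)        # return candidate to free hosts
        return host


# Replaying Scenario 1: pe02 and pe03 are discovered as free hosts.
started, stopped = [], []
policy = RacPolicy(["pe02", "pe03"])
policy.on_high_load(started.append)   # first candidate joins the HR instances
policy.on_high_load(started.append)   # remaining candidate joins as well
policy.on_low_load(stopped.append)    # last added candidate is released
```

After this replay, pe02 is still serving the HR database and pe03 is back in the free list. Releasing the most recently added candidate first follows the "last added" wording on the slide.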

Questions?