Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Session id: 40177 Building a Terabyte Data Warehouse, Using Linux and RAC George Lumpkin Director Product Management Oracle Corporation Do More with Less More performance More scalability More users Less capital cost Less administration cost RAC for Scalability, Availability, and Flexibility Linux and RAC for DW Scalability Linux ‘Starter’ Cluster: -Two nodes -One shared database Data Warehouse DB Linux and RAC for DW Scalability As the Business Grows … Data Warehouse DB Linux and RAC for DW Scalability … so does your Environment: As the Business Grows … -Three Nodes -One Database Data Warehouse DB Linux and RAC for DW Scalability … and again: As the Business Grows … -Four Nodes -One Database Data Warehouse DB Linux and RAC for DW Availability When one node fails … Data Warehouse DB Linux and RAC for DW Availability … the load is rebalanced and When one node fails … 3/4th of the cluster continues the work Data Warehouse DB Linux and RAC for DW Flexibility ETL Query ETL Query ETL Query ETL Query The Cluster can share all workload ubiquitously … Data Warehouse DB Linux and RAC for DW Flexibility ETL ETL ETL ETL Query Query Query Query … or do workload partitioning Data Warehouse DB Linux and RAC for DW Flexibility ETL ETL ETL ETL Workload Management and Provisioning made easy Data ETL ETL Query Query Query Query Christmas – “Data Season” for Retail Warehouse DB Linux and RAC for DW Flexibility ETL ETL ETL ETL Workload Management and Provisioning made easy Data Query Query Query Query Query Query January – “Analysis Season” Warehouse DB RAC and Parallel Execution RAC and Parallel Execution • Very large queries utilize all resources on the cluster Large Query RAC and Parallel Execution • Many large-scale DWs have many concurrrent jobs – Multiple “small-to-medium” size queries – Degree of parallelism < CPUs-per-node • With Oracle, queries will automatically run on a single node, eliminating traffic over the interconnect Q1 Q2 Q5 Q9 Q3 Q4 Q6 Q7 Q8 Q10 Q11 Q12 Recipe for a RAC Linux DW Processors I/O Interconnect Recipe for a RAC Linux DW: Processors Data warehouse workload determines total number of CPU’s – Same sizing considerations as non-clustered DW How many processors per node? – Enough CPU’s so that a single node can handle most database operations Often, 4 cpu’s is a good balance Recipe for a RAC Linux DW: I/O I/O is typically the primary determinant of data warehouse performance – Storage configurations for a data warehouse should always be chosen based on I/O bandwidth not storage capacity Rule of thumb: at least 100 MBytes/sec of IO bandwidth per gigahertz of processing power Every component of the IO system should provide enough bandwidth: disks, IO channels, IO adapters Recipe for a RAC Linux DW: I/O CPU power and IO bandwidth should be balanced within a server – Example: Each node has 4 x 2ghz processors each node can utilize at least 800 MB/sec Each node should have enough slots to accommodate the necessary IO throughput - If one host bus adapter drives 150 MB/sec, then 6 HBA’s should accommodate the needed IO bandwidth - Note that at least one slot is required for the interconnect Recipe for a RAC Linux DW: Interconnect Gigabit ethernets are generally sufficient for data-warehouse workloads – Oracle minimizes interconnect traffic for multiuser workloads Workloads requiring inter-node parallel query will utilize more interconnect bandwidth – 10Gb ethernet, fibre channel, Infiniband ‘Typical’ Cluster configuration 1 Gigabit ethernet 4 nodes, each with 4 x 2 Ghz CPUs 5 PCI slots 16-port switch 16-port switch 16 Storage arrays, each with 10-20 disks Oracle Linux/RAC DW Customers RAC/Linux DW Customers Euronext – – – – – Database size: 1.5 TB Hardware: 2 x HP DL580 (4 CPUs) Storage: HP MSA 1000 Interconnect: 1 Gb ethernet OS: Red Hat AOK Berlin – – – – – Database size: 780 GB Hardware: 2 x HP DL580 (4 CPUs) Storage: EMC Symmetrix Interconnect: 2 x 1Gb ethernet OS: SuSE Vanderbilt University – – – – – Database size: 50 GB Hardware: 3 x HP DL580 (4 CPUs) Storage: EMC Symmetrix Interconnect: 1 Gb ethernet OS: Red Hat National Bank AG – – – – Database size: 75 GB Hardware: 3 x IBM Express5800 (2 CPUs) Interconnect: 100 Mb ethernet OS: SuSE Ellis Island Foundation – – – – – Database size: 60 GB Hardware: 2 x HP DL580 (4 CPUs) Storage: NetApp Interconnect: 1Gb ethernet OS: Red Hat Linux-RAC and the Grid Evolution of Business Intelligence with Oracle Increasingly common customer theme these days is “provisioning” Customers want more value out of their hardware expenditures – they want to take advantage of unused capacity Oracle’s architecture is unique in being able to truly support flexible provisioning of processing power across multiple databases Oracle will be widely deployed in large commercial computing “grids” in the future Real Application Clusters Order Entry, Shipments, Procurement, Inventory, … ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis Resource Provisioning December: Order Processing Heavy – Analytics Light Order Entry, Shipments, Procurement, Inventory, … ETL processing, Query & Reporting, Data Mining, … Resource Provisioning January: Order Processing Light – Heavy Analytics Order Entry, Shipments, Procurement, Inventory, … ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis Oracle RAC Brings Flexible Processing Power to Databases on the Grid Next Steps … Data Warehousing DB Sessions Monday Tuesday 11:00 AM #40153, Room 304 8:30 AM #40125, Room 130 Oracle Warehouse Builder: New Oracle Database 10g Release Oracle Database 10g: A Spatial VLDB Case Study 3:30 PM #40176, Room 303 3:30 PM #40177, Room 303 Security and the Data Warehouse Building a Terabyte Data Warehouse, Using Linux and RAC 4:00 PM #40166, Room 130 5:00 PM #40043, Room 104 Oracle Database 10g SQL Model Clause Data Pump in Oracle Database 10g: Foundation for Ultrahigh-Speed Data Movement For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html Next Steps … Data Warehousing DB Sessions Thursday 8:30 AM #40179, Room 304 Oracle Database 10g Data Warehouse Backup and Recovery Business Intelligence and Data Warehousing Demos All Four Days In The Oracle Demo Campground Oracle Database 10g Oracle OLAP 11:00 AM #36782, Room 304 Experiences with Real-Time Data Warehousing using Oracle 10g 1:00PM #40150, Room 102 Oracle Data Mining Oracle Warehouse Builder Oracle Application Server 10g Turbocharge your Database, Using the Oracle Database 10g SQLAccess Advisor For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html Reminder – please complete the OracleWorld online session survey Thank you.