SQL Server 2008 R2 Parallel Data Warehouse: Under the Hood
Brian Mitchell
Senior Premier Field Engineer
Presented by Dell
PASS SQLRally, Orlando, Florida

Introducing Parallel Data Warehouse
Tier 1 Enterprise Data Warehouse Appliance Offering
• Highly Scalable
  – High scalability from 10s to 100s of terabytes
  – High performance through MPP system
• Flexibility and Choice
  – Choice of deployment options through distributed architecture
• Most Comprehensive Solution
  – Complete data warehouse solution spanning desktop, enterprise data warehouse (EDW), and data marts
  – Deep integration with Microsoft business intelligence (BI)
  – Comprehensive toolset for BI, ETL, MDM, and streaming data

Agenda
• SQL Server 2008 R2 PDW Overview
• Disk
• CPU
• Memory

Appliance Model
• Sold as a “black box” to customers
• End-to-end solution includes software and hardware
• Preconfigured from vendor
• Based on a balanced reference architecture
• Hardware specifications promote data warehousing workloads
• Provides enterprise-level redundancy

Appliance Hardware Schema
[Diagram slide]

PDW: High Availability
• Failover Clustering
• Dual Networking
  – Dual InfiniBand
  – Dual Ethernet
  – Dual Fibre Channel
• Dual Power
• Storage
  – RAID 0
  – Hot Spare

PDW Benefits
• Appliance Model
  – System arrives assembled with software preinstalled
• Appliance optimized for DW workloads
• CPU and I/O bandwidth are balanced for scan-intensive queries
• Simple to get running and productive

PDW Advantages
• All loads and queries are highly parallel, automatically
• All DML (Inserts, Updates) are also parallel*
• Can increase scale and reduce execution time by adding compute racks
• Fewer “knobs”, less complexity at DBA level
  – Eliminates physical file layout considerations from database and table creation
  – Memory, parallelism, and many other SQL configuration options are preset and fixed

PDW: Built on Tech You Know
• Windows Server 2008 (SP2)
• SQL Server 2008 (SP2)
• Failover Clustering
• Web Based Admin Console
• SQL Server 2008 R2 BI Tools connect natively
  – Analysis Services
  – Reporting Services
  – Integration Services
  – PowerPivot

Demo: PDW Built on Tech You Know

PDW: Basic Concepts

Create Database
Syntax
• CREATE DATABASE database_name
  WITH (
      [ AUTOGROW = ON | OFF , ]
      REPLICATED_SIZE = replicated_size [ GB ] ,
      DISTRIBUTED_SIZE = distributed_size [ GB ] ,
      LOG_SIZE = log_size [ GB ]
  ) [;]
Example
• CREATE DATABASE BigData
  WITH (
      AUTOGROW = ON ,
      REPLICATED_SIZE = 1024 ,
      DISTRIBUTED_SIZE = 16384 ,
      LOG_SIZE = 1024
  );

Create Table Examples
Replicated Table
• CREATE TABLE myTable (
      id integer NOT NULL,
      lastName varchar(20),
      zipCode varchar(6)
  );
Distributed Table
• CREATE TABLE myTable (
      id integer NOT NULL,
      lastName varchar(20),
      zipCode varchar(6)
  )
  WITH ( DISTRIBUTION = HASH (id) );

PDW: Handling Disk I/O

Replicating Tables
[Diagram: star schema, with a SQL Server instance on each compute node]
• factSales: Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp ID, Qty Sold, Dollars Sold
• dimTime: Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day
• dimProduct: Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc
• dimStore: Store Dim ID, Store Name, Store Mgr, Store Size
• dimMktCampaign: Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End
Smaller dimension tables are replicated on every compute node.

Ultra Shared Nothing
• SQL Server PDW
  – Stores a portion of each table in each compute node
  – Stores 8 “portions” per compute node
    • Called Distributions
  – Table Scan: all distributions on all nodes

Distributing Tables
[Diagram: the same star schema as on the Replicating Tables slide]
The larger fact table (factSales) is hash distributed across all compute nodes.

PDW: Database Filegroups
[Diagram slide]

PDW: Database Files
[Diagram slide]

PDW Compute Node SAN Architecture
[Diagram slide]
**N+1 cluster architecture

Handling Processing Throughput

CPU
• Each Compute Node is set up using Soft-NUMA
• Each Compute Node listens on multiple ports
• Each port is mapped to a Soft-NUMA node

PDW: Affinity
[Diagram: the PDW Engine sends SELECT Name FROM tableA WHERE state = 'TX' over eight connections to SQL Server on a compute node]
Each connection is affinitized to cores (Soft-NUMA node), to tables on filegroups, and to disks (LUNs):

Port    Soft-NUMA node    Filegroup    LUN
1500    1                 A            1
1501    2                 B            2
1502    3                 C            3
1503    4                 D            4
1504    5                 E            5
1505    6                 F            6
1506    7                 G            7
1507    8                 H            8

PDW: Memory For Everyone

PDW Memory: Resource Governor
[Diagram: the PDW Engine sends SELECT Name FROM tableA WHERE state = 'TX' over eight connections to Compute Node 1; each port maps to a query group and a query pool with a share of the node's RAM]

Connection    Port    Query group     Query pool     RAM
1             1500    QueryGroup_A    QueryPool_A    11%
2             1501    QueryGroup_B    QueryPool_B    11%
3             1502    QueryGroup_C    QueryPool_C    11%
4             1503    QueryGroup_D    QueryPool_D    11%
5             1504    QueryGroup_E    QueryPool_E    11%
6             1505    QueryGroup_F    QueryPool_F    11%
7             1506    QueryGroup_G    QueryPool_G    11%
8             1507    QueryGroup_H    QueryPool_H    11%

Monitoring PDW
• Admin Console
• DMVs
• DBCC Commands
• PDW Logs
• DMS Logs
• SQL Server Logs
• Event Logs
• Cluster Logs

Monitoring PDW - Demo

Please Complete the Evaluation Form
Pick up your evaluation form:
• In each presentation room
Drop off your completed form:
• Near the exit of each presentation room
• At the registration area

THANK YOU!
For attending this session and PASS SQLRally
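
Supplementary sketch: hash distribution
The “Ultra Shared Nothing” and “Distributing Tables” slides say that a distributed table is hashed on its distribution column and stored as 8 distributions on every compute node. The Python sketch below is one way to picture that placement; it is an illustration, not the appliance's implementation. The slides do not describe the hash function PDW uses, so MD5 stands in as a stable placeholder, and the function names and the 10-node appliance size are hypothetical.

```python
import hashlib
from collections import Counter

# From the slides: each compute node stores 8 distributions of a table.
DISTRIBUTIONS_PER_NODE = 8


def distribution_for(key, compute_nodes):
    """Map a distribution-column value to (node index, distribution index).

    MD5 is only a stable stand-in hash (an assumption); the slides do not
    say which hash function PDW actually uses.
    """
    total_slots = compute_nodes * DISTRIBUTIONS_PER_NODE
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % total_slots
    # divmod splits the global slot number into (node, distribution on node).
    return divmod(slot, DISTRIBUTIONS_PER_NODE)


def spread(keys, compute_nodes):
    """Count how many rows land in each (node, distribution) slot."""
    return Counter(distribution_for(k, compute_nodes) for k in keys)


if __name__ == "__main__":
    # Hypothetical appliance with 10 compute nodes and 100,000 row ids.
    counts = spread(range(100_000), compute_nodes=10)
    print("slots in use:", len(counts))
    print("rows per slot, min/max:", min(counts.values()), max(counts.values()))
```

A key with many distinct values, such as the id column in the distributed CREATE TABLE example, spreads rows over every slot, so a table scan can use all distributions on all nodes. A key with only a few distinct values would leave most slots empty.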