Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Tandem Computers wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Clusterpoint wikipedia , lookup
Object-relational impedance mismatch wikipedia , lookup
Relational model wikipedia , lookup
Database model wikipedia , lookup
Boosting DWH-Performance with SQL Server 2014 ColumnStore Index, In-Memory Tables & Natively Compiled Stored Procedures Introduction Markus Ehrenmüller-Jensen Business Intelligence Architect @MEhrenmueller [email protected] runtastic Pluskaufstraße 7 4061 Pasching Austria, Europe http://www.runtastic.com SQL Server 2005 - 2014 BI Developer Database Developer Database Admin Agenda Introduction Disk vs. In-Memory Column Store Columnar Storage In-Memory OLTP In Memory but durable Native Compiled Stored Procedures Procedures as DLL Turbo-DWH Case Study Sneak Preview In-Memory in SQL Server Cache Buffer Pool Pin-to-Memory DBCC PINTABLE ColumnStore (Non-)Clustered Index SSAS Tabular Power Pivot In-Memory OLTP Memory Optimized Table Natively Compiled Stored Procedure Ordinary Report http://www.flickr.com/photos/marts-pics/3554153954/sizes/l/in/photostream/ ColumnStore Index http://commons.wikimedia.org/wiki/User:Malene Improve Query Performance with ColumnStore Index DEMO How? xVelocity (VertiPaq, PowerPivot, BISM Tabular) Compression (7x) Column-Elimination Segment-Elimination Parallel Read Operations Query Processor in Batch-Mode Typically 10x faster RowStore vs. ColumnStore Page 1 of row store Page 2 of row store Key AlternateKey Name 1 AR-5381 Adjustable Race 2 BA-8327 Bearing Ball 3 BE-2349 Ball Bearing Cage 4 BE-2908 Ball Bearing Grease 5 BL-2036 Blade 6 CA-5965 LL Crankarm 7 CA-6738 ML Crankarm Segment for column 1 Segment for column 2 Segment for column 3 Stock 1000 1000 800 800 800 500 500 Segment for column 4 Use Cases > 1 Mio rows Aggregations, groupings & filters (DWH/OLAP) Write once, read multiple times Less distinctive values per column Sweet spot Design (eg. matching data type, no functions), star & snowflake schema, inner joins Can substitute datamarts/aggregation-tables ROLAP & Tabular Model DirectQuery Query Optimizer includes CS Index Restrictions Data types: ntext, text, and image, vardecimal, varchar(max) and nvarchar(max), rowversion (and timestamp), sql_variant, CLR types (hierarchyid and spatial types), xml, uniqueidentifier Page/Row compression Replication Change Tracking, Change Data Capture Filestream Enterprise Edition only ColumnStore Index Column Store Power Pivot Analysis Services SQL Server Non clustered Index SQL 2012+ Clustered Index SQL 2014+ Nonclustered ColumnStore Index (NCI) May combined with other indices Decide which column to include Only one NCI per table Reduntant storage No sort order SQL Server 2012+ NCI Restrictions No constraints allowed (unique) Indexed table is read-only NCI Best Practice Memory, Memory, Memory Include all columns MAXDOP > 1 Fact-tables and big dimension tables Update/Insert Disable/enable index Partitioning View/Union all Clustered ColumnStore Index (CCI) Physical columnare storage (instead of row based) No reduntant storage All columns included automatically No other index allowed UPDATE-able Supports more data types Swichting between ROW-mode and BATCH-mode allowed SQL Server 2014+ CCI: Structure CREATE CLUSTERED COLUMNSTORE INDEX Deleted Bitmap Columnstore BULK INSERT INSERT DELETE REORGANIZE Deltastore(s) REBUILD CCI: Deltastore Ordinary Rowstore OPEN / CLOSED / (COMPRESSED) Compression is expensive operation Tuple-Mover (every 5 minutes) CCI: Restrictions No Constraints (unique, primary, foreign key) No Triggers Not all datatypes supported No ISOLATION LEVEL SNAPSHOT Enterprise Edition only ColumnStore Index Non-clustered Clustered SQL Server 2012+ Additional index (redundancy) Read-only Subset of columns Max. one NCI per table SQL Server 2014+ Master Update-able All columns No additional index allowed In-Memory Myth Buster In-Memory OLTP ia a response to competitors Microsoft worked on this technology since 2009 In-Memory OLTP is like DBCC PINTABLE PINTABLE bounds pages to buffer pool; In-Memory OLTP does neither have a buffer-pool nor pages In-Memory OLTP is a separated product In-Memory OLTP is fully integrated in SQL Server In-Memory OLTP can be used without app-changes Small (schema) changes are necessary In-Memory tables are nor durable or high available Schema and data are durable; AlwaysOn is supported Disk-based storage (not In-memory) Page based „Latch contention“ during concurrent access Random IO when writing on disk Transactionslog Every change to the data and all indices UNDO with rollback Memory Optimized Tables Based on Filestream Table converted into a DLL Data files 16MB (<=16GB memory) or 128MB 256KB chunks Inserted rows only Chronological row versions Delta files 1MB (<=16GB memory) or 8 MB 4KB chunks IDs of deleted rows In-Memory Rows New row format Optimized for in-memory No data page container Versioning No direct UPDATE But DELETE & INSERT instead Non-blocking SELECT, UPDATE, … Database CREATE DATABASE MyDatabase GO ALTER DATABASE MyDatabase ADD FILEGROUP MyIMFilegroup CONTAINS MEMORY_OPTIMIZED_DATA GO ALTER DATABASE MyDatabase ADD FILE (name='MyIMFile',filename='C:\DATA\MyIMFile') TO FILEGROUP MyIMFilegroup GO Memory Optimized Table Filegroup Durability by default Index is never durable Neither in data file nor in transaction-log Write-access sequentially only No Random IO Write Transaction-log: synchronous Datafile: asynchronous Option: SCHEMA_ONLY No writes to transaction-log not durable ETL, temporary tables, … Optimized for In-Memory No latches, no locks No pages Direct access to rows No Heaps All indecis are covering Multiversion optimistic concurrency Durability Durability by defulat Deactivate explicitly for eg. ETL Succesfully (comitted) transactions in log only No UNDO necessary (only sequentially) SSD recommended In-Memory Filegroup Sequential access only (appending) SSD recommended Create multiple files (container) recovery Indexing In-Memory only Never on disk Rebuild during server restart & database restore No fragmentation All indices are covering Direct pointer to in-memory row Not supported Filtered Index, nullable key, … Max. 8 indices per table (N)VARCHAR must use _BIN2-collation Case sensitive Accent sensitive Hash or Range Index Hash Range Multi-column index: All columns used for hashing (no scan for single columns possible) Index is no sorted Size determined during creation through BUCKET_COUNT (rounded to binary value) Multi-column index: Similar to non-clustered indices of disk-based tables Sort order is crucial Index can grow Unequal & sorted scans Multiversion Adress 4711 Adress 4711 4712 Timestamp Timestamp Link to Content From Until next 100 ∞ null Hello! Timestamp From 100 200 Timestamp Until 200 ∞ Link to Content next 4712 Hello! null Hello World! Multi-Version Optimistic Concurrency Multi-Version Timestamp As long as a version is needed Pointer to next version Optimistic No locks Synchronous UPDATE ROLLBACK READ is never blocked Garbage Collection Old row version Slows down access (xcan) Occupy memory Automatically During transaction Separate system job Lowest active timestamp Every transaction has a timestamp Restrictions Whole table has to fit into memory Max. 8060 Bytes per row (size of disk page) Max 250GB per table One-stop CREATE TABLE Columns, primary key, indices No ALTER TABLE, no CREATE INDEX Not supported Varchar(max), SPARSE column, DBCC CHECKTABLE, XML, CLR, … Workarounds for foreign key constraints, constraints, triggers, …, SQL_VARIANT, LOB, …, ALTER TABLE/INDEX, CREATE INDEX Must have Primary Key, Index on Primary Key Use Cases Performance-critical OLTP Orders, stock exchange, live bets High Data-Input Rate Shock-absorber Staging for ETL Relational Cache Session State No Use Case Schema-changes not allowed Full DWH Reporting ColumnStore XML/Fulltext Search Parallel queries Best Practice Sufficient Memory Ca. 4x the size of the table Try out with productive environment Though not on productive system Multiple data files Can be read in parallel (restart/restore) IO spread on several disks Statistics Are not maintained automatically UPDATE STATISTICS <tbl> WITH FULLSCAN, NORECOMPUTE Use VIEWS for access ALTER VIEW is allowed Natively Compiled Stored Procedures T-SQL C DLL Execution Plan is created during compilation Won‘t be updated later automatically DROP/CREATE necessary Lives in SQL Server Process Performance >10x Use Case CPU is bottleneck Latence is problem Query/DML NCSP helpful Validation/COMMIT IO-System to be optimized instead You can live with the restrictions Native Compilation Advisor Use NCSP only for parts of your implementation Restriction Memory optimized tables only SCHEMABINDING mandatory T-SQL DISTINCT, CTE, RANK, EXISTS, IN, Sub-Queries, OUTER-JOIN, …, ROLLUP, EXTERNAL, … Built-in functions EXECUTE AS CALLER No parallel operation Workarounds SELECT INTO, CASE, GOTO, … Cursors, table-valued functions, UNION, OR, NOT, … Best Practice Avoid mixture of compiled & non-compiled precedures Low efficiency Might be less performant then non-compiled at all As much logic into NCSP as possible Single-Statement precedures do not benefit ColumnStore vs. In-Memory OLTP DWH vs. OLTP Aggregation vs. Single Rows Disk vs. Memory Compressed vs. uncompressed Peformance-Sample Staging-Load After TRUNCATE/DELETE Initial-Load DWH DWH-Load without changes DWH-Load after changed column in all rows DWH-Load with empty staging table DWH-typical query Performance-Measurement InMemory has no IO Elapsed time instead AdventureWorksDW2012..FactResellerSales Blown up to 6 Mio rows Challange for ETL Non-Clustered ColumnStore Index Read only, no UPDATE/DELETE Memory Optimized Tables No TRUNCATE Collation *_BIN2 only No automatic UPDATE of STATISTICS No UPDATE on Primary Key (eg. SCD2/ValidUntil) Natively Compiled Procedures Memory Optimized Tables only Restriction in WHERE: no OR, no IN, … MEhrenmueller‘s Assumptions Staging Memory Optimized Table SCD1 Clustered Column Store Index SCD2 Clustered Column Store Index Answer of Puzzle Disk Heap Disk CCI Disk PK (SSIS) Disk Heap Disk Heap Disk Heap Disk Heap Disk Heap MOT Disk PK NCI Disk Heap SQL Server v2016 UPDATE-able Nonclustered ColumnStoreIndex In-Memory ColumnStore Index Call to Action Try Non-Clustered Column Store Index Clustered Column Store Index In-Memory Table Natively Compiled Stored Procedure On Staging, DWH, Datamarts Wrap up Introduction Disk vs. In-Memory Column Store Columnar Storage In-Memory OLTP In Memory but durable Native Compiled Stored Procedures Procedures as DLL Turbo-DWH Case Study Questions? Markus Ehrenmüller-Jensen Business Intelligence Architect @MEhrenmueller [email protected] runtastic Pluskaufstraße 7 4061 Pasching Austria, Europe http://www.runtastic.com SQL Server 2005 - 2014 BI Developer Database Developer Database Admin