* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download SQL Server Analysis Services Tabular Model
Survey
Document related concepts
Transcript
SQL Server Analysis Services Tabular Model A BR I E F E N D TO E N D BY KI RSTEN BUR N E T T - E N T ELEC T Our Journey through Tabular Models 1. Tabular in a nut shell 2. The Business Intelligence Semantic Model 3. A brief history of Tabular 4. Tabular benefits over Power Pivot 5. Tabular vs Multidimensional and choosing between the two 6. xVelocity In Memory Analytics Engine 7. DirectQuery 8. Managing a Tabular Model (Processing and Partitioning) 9. Automating with ASSL/TMSL and AMO 10. Creating and Deploying a Tabular model (Demo) 11. Tabular and xVelocity in the market and the future for SQL Server Analysis Services Image from: https://sqldusty.com/2013/08/06/performance-tuning-of-tabular-models-in-ssas-2012-whitepaper-now-available/ Tabular in a nutshell Measures, KPI’s, Hierarchies Perspectives & Translations DAX ASSL, AMO, TMSL, DMV’s DirectQuery Role and Row Security xVelocity Relationships, Tables, Columns, Rows SSAS Database Partitioning & Processing Tabular Various Data Sources SSDT & SSMS http://www.nutritionbynature.com.au/blog/nuts-seeds-are-for-birds-and-squirrels-not-humans Business Intelligence Semantic Model BI APPLICATIONS SSRS EXCEL 3RD PARTY APPS DATA MODEL MULTIDIMENSIONAL TABULAR BUSINESS LOGIC MDX DAX DATA ACCESS DATA SOURCES ROLAP RDBMS MOLAP FILE DIRECT QUERY LOB xVelocity CLOUD A Brief History of Tabular 2 Internal papers in 2006 by Amir Netz about a BI Sandbox and Inmemory technology Power Pivot for Excel was first released as an add-in in May 2010 as part of the SQL Server 2008 R2 release SQL Server 2012 saw the introduction of a completely new data model, SQL Server Analysis Services Tabular model Microsoft’s entry into the Self-Service BI market. https://news.microsoft.com/stories/people/amir-netz.html Power Pivot vs Tabular SCALABILITY TABULAR BENEFITS OVER POWER PIVOT MANAGEABILITY SECURABILITY PowerPivot has essentially become the desktop version of Analysis Services, but remains an excellent prototyping tool for Tabular. Tabular vs Multidimensional EXCELLENT QUERY PERFORMANCE VERY LARGE DATABASES (TERABYTES) VARIED DATA SOURCES EASIER, QUICKER, NEWER TECHNOLOGY CALCULATED COLUMNS/TABLES TABULAR MORE GRANULAR SECURITY PROVEN, STABLE, OLDER TECHNOLOGY MULTIDIMENTIONAL WRITEBACK, AGGREGATIONS, ACTIONS, NAMED SETS xVelocity In Memory Analytics Engine Compression algorithm Compression techniques • Less RAM usage and decreased costs • Increased performance as less data to scan • Dictionary/Hash Encoding • Value Encoding • Run Length Encoding Originally called the VertiPaq engine, xVelocity is an in-memory columnar database where all of the data in a model lives in RAM. Columnstore indexes use this same compression technology. xVelocity Dictionary Encoding A dictionary of unique values in a column is built and then data values are replaced with the values of the dictionary index. SIZE Regardless of datatype the same performance gains are achieved. Small Small SIZE ID 0 ID SIZE 0 0 Small Extra Large 1 1 Extra Large Medium 2 2 Medium Medium 2 3 Extra Small Extra Small 3 4 Large Large 4 xVelocity Value Encoding xVelocity looks for mathematical relationships between the values in a column, then uses the value itself, with or without a mathematical operation applied, to decrease the memory usage. Provides better performance as computations can operate directly on the data without a separate lookup step as with dictionary encoding. PRODUCT_KEY 100023 100024 100025 100026 PRODUCT_KEY 100000 23 24 25 26 100027 27 100028 28 100029 29 Less storage used for smaller value by subtracting 100000. Adds the value back in when required. xVelocity Run Length Encoding Complementary compression technique that tries to reduce the size of a dataset by avoiding repeated values. QUARTER Values are replaced with a slightly more complex structure that contains the value only once, with the number of contiguous rows having the same value. Q1 Q1 Q1 Repeated 210 times ….. Q2 Q2 Q2 QUARTER START COUNT Q1 1 210 Q2 211 260 Repeated 260 times … Example from Ferrari, Alberto; Russo, Marco. The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI (Business Skills) (p. 411). Pearson Education. Kindle Edition. xVelocity Re-encoding SAMPLE CHOOSE COMPRESS RE-ENCODE COMPRESS • Read a sample of the data • Choose a compression technique based on sample • Compress the data using chosen technique • Determine a new compression technique, if required • Compress the data using chosen technique Provide a good sample of data when doing an initial partition process to keep re-encoding to a minimum. xVelocity Relationships To improve performance, relationships between tables are stored in memory as pairs of ID’s and row numbers. Given the ID in one table, it can immediately find the corresponding rows in the joined table. SALE AMOUNT PRODUCT ID 10 2 12 3 34 7 PRODUCT ID PRODUCT 2 Nutty Kitty ROW 1 3 Chocolate Bunny ROW 2 7 Nutty Chocolate Bunny ROW 3 SALES[PRODUCT ID] PRODUCT[ROW NUMBER] 2 1 3 2 7 3 xVelocity Design Considerations A column can have both Run Length Encoding AND Dictionary OR Value encoding. 1. CARDINALITY Number of unique values in the column, lower is better 2. DISTRIBUTION Repeated values have higher compression rates than those columns with frequently changing values DESIGN CONSIDERATIONS 3. TABLE SIZE Number of rows in the table 4. DATA TYPE Affects the dictionary size DirectQuery REAL TIME REPORTING VERY LARGE DATABASES DAX CONVERTED TO SQL DIRECTQUERY SEMANTIC LAYER DIFFERENCES IN DAX FORMULAS AND SUPPORT SINGLE RELATIONAL SOURCE Partitioning Partitions divide a table into logical parts and provides several benefits. AMOUNT DATE 220 2015-12-31 …. ….. 468 2015-01-01 1290 2014-12-31 … … 34 2014-01-01 3 2013-12-31 … … 98 2013-01-01 DECREASED PROCESSING TIME Partition for 2015 Partition for 2014 Partition for 2013 EASILY REMOVE OR UPDATE DATA CONSOLIDATE DATA FROM VARIOUS SOURCES Processing – Full, Add, Recalc, Default FULL ADD RECALC DEFAULT REFRESHES ALL STRUCTURES UPDATES PARTITION WITH NEW DATA UPDATES AND RECALCULATES MAKES OBJECT QUERYABLE AVAILABLE AT DATABASE, TABLE AND PARTITION LEVEL AVAILABLE AT PARTITION LEVEL AVAILABLE AT DATABASE LEVEL AVAILABLE AT DATABASE, TABLE AND PARITION LEVEL MODEL IS QUERYABLE THROUGHOUT MODEL IS QUERYABLE THROUGHOUT NEEDS TO BE MANUALLY ISSUED AFTER PROCESS DATA ONLY OPERATIONS NECCESARY TO MAKE OBJECT QUERYABLE NO AUTOMATIC DETECTION AFFECTS HIERARCHIES, CALCS AND RELATIONSHIPS OF DUPLICATES CAN RUN PROCESS DATA AND OR PROCESS RECALC ENTIRE COPY OF DATABASE STORED IN MEMORY Processing - Data, Clear, Defrag PROCESS DATA PROCESS CLEAR PROCESS DEFRAG LOADS DATA ONLY DROPS ALL DATA REBUILDS TABLE DICTIONARY AVAILABLE AT TABLE AND PARTITION LEVEL AVAILABLE AT DATABASE, TABLE AND PARTITION LEVEL AVAILABLE AT TABLE LEVEL MODEL IS NOT QUERYABLE AFTERWARDS MODEL IS NOT QUERYABLE AFTERWARDS USED WHEN PARTITIONS FREQUENTLY CHANGE RUN PROCESS DEFAULT OR PROCESS RECALC AFTER NEED TO LOAD DATA AND RECALC USELESS AFTER PROCESS FULL OR PROCESS DATA SQL Server 2016 Analysis Services (SSAS) includes parallel processing for tables with two or more partitions. Automating with ASSL, TMSL & AMO Analysis Services Scripting Language (ASSL) is an extension of the XMLA Open Standard and can be used to automate admin tasks. Tabular Model Scripting Language (TMSL) – JSON. New in SQL Server 2016 with relevant Tabular language elements. ASSL TMSL AMO Analysis Management Objects (AMO) – Managed Interface used to run XMLA, MDX etc. Basic Tabular Demo Image from: http://quoteaddicts.com/topic/good-luck-funny/ The Future MERGED TECHNOLOGIES TABULAR REPLACES MULTI-DIMENSIONAL ? ANALYSIS SERVICES AS A SERVICE (AZURE) HYBRID SOLUTIONS References http://sqlblog.com/ Ferrari, Alberto; Russo, Marco; Webb, Chris. Microsoft SQL Server 2012 Analysis Services: The BISM Tabular Model (Developer Reference) Pearson Education. Kindle Edition. https://msdn.microsoft.com/en-us/library/hh212945.aspx https://www.mssqltips.com/sqlservertip/2818/understanding-the-sql-server-2012-bi-semanticmodel-bism/ Credit: Business Intelligence in Microsoft SharePoint 2010 By Norm Warren, Mariano Neto, John Campbell, Stacia Misner Ferrari, Alberto; Russo, Marco. The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI (Business Skills) (p. 411). Pearson Education. Kindle Edition. Thank you to our sponsors