Download Analysis Services

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Expense and cost recovery system (ECRS) wikipedia , lookup

Data analysis wikipedia , lookup

Versant Object Database wikipedia , lookup

Clusterpoint wikipedia , lookup

Information privacy law wikipedia , lookup

PL/SQL wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Operational transformation wikipedia , lookup

3D optical data storage wikipedia , lookup

Data model wikipedia , lookup

SAP IQ wikipedia , lookup

Open data in the United Kingdom wikipedia , lookup

Business intelligence wikipedia , lookup

SQL wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Data vault modeling wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Analysis Services
Choosing between
Multidimensional and Tabular
Helena Cabral - BI4ALL
[email protected]
Our Main Sponsors:
Say Thank you to Volunteers:
 They spend their FREE time to give you this
event.
 Because they are crazy. 
 Because they want YOU
to learn from the BEST IN THE WORLD.
Paulo Matos:
Pedro Simões:
André Batista:
Paulo Borges:
André Melancia:
Murilo Miranda:
Quilson Antunes:
5 Sponsor Sessions at 15:10
 Don’t miss them, they might be getting
distributing some awesome prizes!





Rumos
BI4ALL
Bold Int
CozyRoc
Pythian
Important Activities:
WIT – Women in Technology
 15:10 at BizSpark Room (Ground Floor)
SQLClinic Challenges
 10:00 DEV (Neil Hambly)
 11:50 DBA (Uwe Ricken)
 17:00 BI (Steph Locke)
Agenda
 SSAS Evolution
 Tabular: What’s it and how does it work?
 Multidimensional vs Tabular







Development experience
Data modeling
Scalability
Query performance
Process performance
Security
Client applications support
 Conclusions
v12
SQL Server 2014
v11
SQL Server 2012
v10
SQL Server 2008
(and R2)
v9
SQL Server 2005
v8
SQL Server 2000
v7
SQL Server 7.0
•First release with OLAP Services
•MOLAP, ROLAP and HOLAP architectures
•MDX query language
•SSDT-BI
•Power View for Multidimensional Models (using DAX)
•xVelocity engine
•Power Pivot and Tabular Model
•DAX language
•Business Intelligence Semantic Model (BISM)
•Improved Scalability and Performance
•Attribute and User Hierarchies
•Business Intelligence Development Studio (BIDS)
•Unified Dimension Model (UDM)
•Data Mining services
•Improved dimension design (changing dimensions, parent-child,
virtual dimensions)
•Improved calculation engine (custom rollups, cell calculations)
•Dimension security
•Connectivity over HTTP
SSAS Evolution
SSAS Evolution
 Why two approaches?

Only one data modeling tool for
different needs

Similar development
experience

Easy migration between SelfService and Corporate BI

New column-oriented and inmemory storage technology

Multidimensional concepts are
still difficult to learn
What is it? How does it work?
Does it still have dimensions and measure groups?
SSAS TABULAR
SSAS Tabular
 What is it?

Relational modeling

Uses xVelocity which is an in-memory
engine with column-oriented storage

Provides higher compression and faster
in-memory operations such as
aggregations

Supports for both MDX and DAX query
languages

Available in SQL Server 2012/14
Enterprise and Business Intelligence
editions

Also supports a DirectQuery mode

You can add calculated columns and
measures to any table in the model

You can create hierarchies in any table
in the model

You can partition any table in the model

Unlike Multidimensional models, there’s no
difference between a dimension table and a
measure group table.

Sometimes we don’t need a dimensional
model!
Dimensions vs Fact Tables
Which development tools can I use?
How long is the learning curve?
DEVELOPMENT EXPERIENCE
Development tools
TABULAR
 SQL Server Data Tools for
Visual Studio
 Open an existing Power Pivot
model
MULTIDIMENSIONAL
 SQL Server Data Tools for
Visual Studio
Development process
TABULAR
 Easy to wrap a model over raw
sources or DW for prototyping
purposes
 Model can be created from a
Self-Service experience
 Smaller learning curve (Excel
like)
 Need to learn DAX
MULTIDIMENSIONAL
 Model should be created over
a DW source
 Not so easy prototyping
 Higher learning curve
 IT required
 Wide knowledge base
 Need to learn MDX
Development experience
TABULAR
 Workspace database directly
related with the project
 WYSIWYG experience, but
also can make the
development experience too
slow
 Single developer at a time (one
model per solution)
MULTIDIMENSIONAL
 Need to be deployed and
processed to server to see the
results
 Support for multiple developers
Does Tabular have the same advanced model options?
Do I have to learn DAX?
DATA MODELING
Data Modeling
TABULAR
MULTIDIMENSIONAL
Relational Modeling:
 Tables and more tables
Dimensional Modeling:
 Facts and Dimensions
 Basic Relationships (…and
advanced relationships
through DAX calculations)
 Advanced Relationships
natively supported (parentchild, many-to-many, attribute
relationships, …)
 No role playing dimension (two
tables can only have one
active relationship)
 Support multiple granularity
 DAX calculations
 Do not support multiple
granularity in the same
measure group
 MDX calculations
Other (un)supported features
TABULAR
MULTIDIMENSIONAL
 Some features (as Actions,
Translations and Display
Folders) are possible by
editing the generated XMLA or
by using BIDS Helper tool
available at CodePlex
 Named-Sets, Scopes, Actions,
Translations
 Scopes can be defined by
using calculated columns
 Display Folders, custom format
of measures, default values for
dimensions attributes
 Write-back, Data Mining,
Custom Assemblies, Custom
Rollups
 Process behavior configuration
DimSalesReason
SalesReasonKey
SalesReasonName
SalesReasonReasonType
FactInternetSalesReason
SalesOrderNumber
SalesOrderLineNumber
SalesReasonKey
Relational model
FactInternetSales
SalesOrderNumber
SalesOrderLineNumber
SalesAmount
SalesAmountWithReason :=
CALCULATE (
SUM ( 'Internet Sales'[Sales Amount] );
FILTER (
FactInternetSalesReason;
CALCULATE ( COUNTROWS ( DimSalesReason ) > 0 )
)
)

Many to many relationships are not
supported by Tabular (natively)

But can be achieved by using some
advanced DAX calculations!
DAX Calculation
DEMO #1
DimProduct
ProductKey
(...)
Relational model
FactProductInventory
DimDate
ProductKey
DateKey
UnitsIn
UnitsOut
DateKey
(...)
DAX Query
EVALUATE
SUMMARIZE (
CALCULATETABLE (
'Product Inventory';
'Product Category'[Product Category Name] = "Clothing"
);
'Product'[Product Name];
'Date'[Date];
"Total Units Movement", [Total Units Movement]
)
ORDER BY 'Product'[Product Name]; 'Date'[Date]

Calculating balances by using
transactions

Need to sum all transactions up to a
given day

A common solution is to create a
snaphost table with the quantity
(calculated in ETL) for each product/day

Calculate balances on the fly using
Tabular/DAX!
DEMO #2
Does Tabular support a large volume of data?
SCALABILITY AND DEPLOY
Operation modes
TABULAR
MULTIDIMENSIONAL
In-memory/cached:
 All data is loaded in memory
and all queries are answered
from there
MOLAP:
 Data is pulled from relational
store, is aggregated and stored
in a multidimensional format
Direct Query:
 Bypass the in-memory model,
queries are answered directly
from database source (SQL
Server only)
ROLAP:
 All data stay in the relational
store and additiona objetcs are
created for calculated
aggregations
Model scalability
TABULAR
MULTIDIMENSIONAL
 Data volume mostly limited to
physical memory
 Data volumes can scale to
multiple terabytes
 Small/medium size models
 Medium/large size models
Processor centric:
 Memory is the most critical
resource
Disk centric:
 Disk performance is important
because is used as the primary
data storage for preaggregated data
 Disk performance is not an
important factor
How fast is Tabular compared to Multidimensional?
Is always faster?
QUERY PERFORMANCE
Querying scenarios
TABULAR
 Always read columnar data
from RAM
 Aggregate data in memory
 Second (warm) run is just as
fast the first (cold) one
MULTIDIMENSIONAL
 With no aggregations defined:
read atomic data from disk and
aggregate data in memory
 With predefined aggregations:
 Cold Cache: read aggregated
data from disk
 Warm Cache: read
aggregated data from RAM
Querying preparation
TABULAR
MULTIDIMENSIONAL
 Use DAX for detailed reports
 Use MDX or DAX for querying
 Use MDX for aggregated data
 Need to define aggregations
 No special tuning is required
 Benefits from warmup cache
mechanisms
Outperforms Multidimensional:
 Report on low granularity data
Outperforms Tabular:
 Report on aggregated data
with predefined aggregations
on Warm Cache
 Report on aggregated data
with no predefined
aggregations
MDX Query
SELECT
{
[Measures].[Discount Amount]
,[Measures].[Reseller Freight Cost]
,[Measures].[Reseller Gross Profit]
,[Measures].[Reseller Total Product Cost]
,[Measures].[Reseller Sales Amount]
,[Measures].[Reseller Tax Amount]
} ON COLUMNS,
NON EMPTY
(
[Geography].[Geography].[City]
*[Product].[Product].[Product]
*[Sales Territory].[Sales Territory Region].[Sales
Territory Region]
*[Promotion].[Promotion].[Promotion]
)
ON ROWS
FROM [Adventure Works]

Detailed queries are normally an
headache

Most of the times, we opt to query
directly the DW relational source

With Tabular, this detailed queries
perform better!
DEMO #3
DimCustomer
CustomerKey
Relational model
(...)
FactInternetSales
CustomerKey
OrderDateKey
SalesAmount
DimDate

Distinct Count (DC) operations are
expensive in Multidimensional

Moreover, need to create and maintain
multiple measure groups (one for each
DC measure)

With Tabular, DC measures performs
better without need special modeling!
DateKey
(...)
DAX Query
NewCustomers :=
CALCULATE (
DISTINCTCOUNT ( 'Internet Sales'[CustomerKey] );
FILTER (
ALL ( Date );
'Date'[DateKey] <= MAX ( 'Date'[DateKey] )
)
)
- CALCULATE (
DISTINCTCOUNT ( 'Internet Sales'[CustomerKey] );
FILTER (
ALL ( Date );
'Date'[DateKey] < MAX ( 'Date'[DateKey] )
)
)
DEMO #4
What are the process options?
Is there any dependencies?
PROCESS PERFORMANCE
Processing preparation
TABULAR
 Every table can be partitioned
 No processing dependencies
(one table can be processed
without processing related
tables)
MULTIDIMENSIONAL
 Only measure groups can be
partitioned
 Processing dependencies
(need to process dimensions
first)
 Don’t support parallel partition
processing
 Support for parallel partition
processing
 Data in other tables/partitions
can be queried during
processing
 Aggregations must be
recalculated when dimensions
processing occurs
Processing options
TABULAR
 Full process of a whole DB
might consume too much
memory
 Process Defrag: rebuilds the
table dictionary
 Process Recalc: recalculates
calculated columns, rebuilds
relationships, user hierarchies
and other internal engine
structures
MULTIDIMENSIONAL
• Process Index: creates or
rebuilds indexes and
aggregations for all processed
partitions
Does Tabular have the same security options as Multidimensional?
SECURITY
Security
TABULAR
MULTIDIMENSIONAL
 Don’t support cell security, only
table and row level security
• Advanced security features that
can go up to cell-level
 Drill through permission is set
to all roles
• It’s possible to set drill through
permission at role level
 Dynamic Security
• Dynamic Security
 Only supports visual totals
• Supports both visual totals and
non-visual totals
 Use DAX to restrict access to
Row Sets
Do I have to change my client applications?
CLIENT APPLICATIONS SUPPORT
Client Applications Support
TABULAR
Query languages:
MULTIDIMENSIONAL
Query languages:
 In Cached Mode: MDX and
DAX
 In DirectQuery mode: only
DAX
• MDX and DAX (as of SQL
Server 2012 SP1 CU4)
Client tool
Language
Microsoft Excel PivotTable
MDX
Power View
DAX
SQL Server Reporting Services
MDX / DAX
PerformancePoint
MDX
Other applications (ADO.NET of OLE DB
provider)
MDX / DAX
So.. What do I choose?
CONCLUSIONS
You should choose Multidimensional!
 You have an extremely large cube that exceeds server memory
 You need some of the features not supported by Tabular (even
with DAX)
 You have a complex model (budgeting/forecasting, multiple
many-to-many, parent-child…)
 You are an MDX expert
 You hate workarounds
 Your reporting needs are typically aggregated
 You have all your models in Multidimensional mode and want to
migrate to Tabular just because!
You should choose Tabular!
 You have small/medium models that fit in server memory
 You need a non-multidimensional model
 You have simple models (and not need neither of the
unsupported features)
 You really never understood MDX (just give a try to DAX,
although is not so simple than they say)
 You are an enthusiastic about workarounds!
 You have detailed and ad-hoc reports that takes to long to run
 You have a new model with no special needs and it will be great
if you can show it just after lunch (prototyping!)
Our Main Sponsors: