Download Kevin S. Goff – Brief BIO

Document related concepts

Database wikipedia , lookup

Big data wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Clusterpoint wikipedia , lookup

Relational model wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Database model wikipedia , lookup

Transcript
The Baker’s Dozen
Business Intelligence
13 Productivity Tips for Data Warehouse Patterns in SQL
Server/Business Intelligence
Kevin S. Goff
Microsoft SQL Server MVP
Kevin S. Goff – Brief BIO
• Developer/architect since 1987 / Microsoft SQL Server MVP
• Columnist for CoDe Magazine since 2004,
“The Baker’s Dozen” Productivity Series”, 13 tips on a SQL/BI topic
• Wrote a book, collaborated on a 2nd book
• Frequent speaker for SQL Server community events and SQL Live!360
Conferences
• Email: [email protected]
• My site/blog: www.KevinSGoff.Net (includes SQL/BI webcasts)
• Releasing some SQL/BI video courseware in 2015
5/22/2017
Data Warehousing w/SQL-BI
2
Overview/Objectives
• Today: 13 tips for Data Warehousing w/SQL and BI Tools
• Much of successful data warehousing is like running a medical
practice, a legal practice, etc. – many best/recommended
practices, many proven methodologies, many patterns, etc.
• This presentation is going to cover a variety of scenarios
• Based on custom courseware - packed w/details
to help long after this presentation
• CoDe Magazine article on this:
– http://www.codemag.com/Article/1304071
•
•
•
•
•
If you have not read this book – go out and buy it NOW!
Not tied to any one technology
Written several years ago – about 99.9% is still as relevant today (third
edition also out)
Amazon link:
Kimball Group Website
5/22/2017
Data Warehousing w/SQL-BI
3
Some of the concepts to cover…
•
•
•
•
•
•
•
•
•
•
•
•
Granularity/Transaction Grain
Conformed/Common Dimensionality
Role playing dimensions
Hierarchies
Fact-Dimension Usage/Bus Matrix
Transactional vs Snapshot Fact Tables
Factless Fact Tables
Junk Dimensions
Star and Snowflake Schemas
Dimension Outriggers
Slowly changing dimensions
Early and Late Arriving Data
5/22/2017
Data Warehousing w/SQL-BI
4
Topics for today
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
Goals/overview of a data warehouse / analytic database
Major components of a Data Warehouse (Facts and Dimensions)
Cumulative Transactional Fact Tables
Factless Fact Tables
Periodic Snapshot Fact Tables
Dimension Tables in General
Role-Playing Dimensions
Junk Dimensions
Dimension Outriggers
Many-to-Many Bridge Relationships
Type 2 Slowly Changing Dimensions
Storing NULL values in Fact Tables - DON'T!!!
Storing Ratios in Fact Tables - DON'T!!!
5/22/2017
Data Warehousing w/SQL-BI
5
1-Goals of a Data Warehouse/Analytic Database
A few key points
• Data Warehouse methodology is different than Relational
Normalized Methodology
• Data Warehouses and Data Marts are generally more denormalized and flat
• Why?
• Transaction systems utilizing normalized methodologies need to get data into the
database as efficiently as possible
• Data Warehouse/Data Mart systems – more interested in getting data “out” of the
system
• A key process in data warehousing – shaping data into
Fact/Dimension models
• Once data in Fact/Dimension structures, can use many tools
• Discovery process can be challenging – requirements come from
many areas
5/22/2017
Data Warehousing w/SQL-BI
6
1-Goals of a Data Warehouse/Analytic Database
Customer (Manufacturer of Steel) wants to look at
•
•
•
•
•
•
•
•
Customer Orders in Tons
Might need
Material Production in Tons
multiple plant
Defects (the count)
support
Amount of Material Regraded in Tons
Regraded % with respect to Regrade Thresholds
Amount of Material Reworks in Tons
Material Quality Test Results
Heat Chemistries
And they want to look at these numbers “BY”
•
•
•
•
•
•
•
•
What is our “end game?”
You might build:
Build for the
future!!!
Customer and Order
Material Type, Name, Line, Size, Width, Thickness
Responsible Department
Defect Type
Sometimes
Chemical Element
existing reports
Date/Week/Month/Quarter/Year
can help to drive
Disposition
requirements
Regrade Type
5/22/2017
• A relational data mart/Data
Warehouse using
denormalized star-schema
models according to the
Kimball Methodology
• You might use Self-Service
BI Tools (Power Pivot and
Power View) for power users
to “get at” the data
• You might create Analytic
OLAP Cubes or SSAS
Tabular Models from the
Data Mart for more
powerful/advanced analytics
Data Warehousing w/SQL-BI
7
1-Goals of a Data Warehouse/Analytic Database
Fact/Dimension Usage – Key Area
Here is an “END GOAL”
Note that a single date serves
many proposes (roles)
Note that material
production is related to
Rep/Resp Units
“through” a Unit to
Material Line Bridge
Table
Note that a business unit
serves as a Reporting or
Responsible Unit
5/22/2017
Data Warehousing w/SQL-BI
8
1-Goals of a Data Warehouse/Analytic Database
Every
intersection
point between a
Fact table and
a dimension
“tells a story”
Common
dimensionality
Role playing
dimension
5/22/2017
Data Warehousing w/SQL-BI
9
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
5/22/2017
Data Warehousing w/SQL-BI
10
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
Here’s the magic word
“by” – We want to see
volume “by” first time
regrades
5/22/2017
Data Warehousing w/SQL-BI
11
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
Percentages calculated
on the fly – not stored in
the cube
5/22/2017
Data Warehousing w/SQL-BI
12
1-Goals of a Data Warehouse/Analytic Database
All numbers are based on tallies (# of
instances where dimension member
values come together to form an
event…Factless Fact Table
5/22/2017
Data Warehousing w/SQL-BI
Some of the things
we can build
13
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
5/22/2017
Data Warehousing w/SQL-BI
14
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
Calculation of
week number
based on current
date
5/22/2017
Data Warehousing w/SQL-BI
15
1-Goals of a Data Warehouse/Analytic Database
Some of the things
we can build
Lowest level of detail
5/22/2017
Data Warehousing w/SQL-BI
16
1-Goals of a Data Warehouse/Analytic Database
People want drilldown!!!
User might want to
know – for these 29
regrades, what was all
the detail underlying
data?
The database cube
(and Excel) allow user
to right-click on the
measure (either the
regrade tons or the
count), and under
“Additional Actions”,
drill through to the
lowest level.
That will launch a 2nd
Excel sheet (see next
slide) with all
underlying detail
5/22/2017
Data Warehousing w/SQL-BI
17
1-Goals of a Data Warehouse/Analytic Database
Lowest level details for the 29 regrades in August for Caster/CC1
User can scroll out to the right for more details
Note: each database/BI tool supports
drilldown/drillthrough differently – there is no core
standard
5/22/2017
Data Warehousing w/SQL-BI
18
1-Goals of a Data Warehouse/Analytic Database
Fact tables represent “what
happened” with
measurements we can
aggregate
Shaping each
business activity into
Fact and related
dimension structures
Sometimes best to prototype a few at
a time
The supporting data warehouse has 2
components: Fact tables and
Dimension tables
5/22/2017
Dimension tables provide
business context to fact
tables. Loosely speaking,
they are the “business
master
tables”
Data Warehousing
w/SQL-BI
Key point in Dimensional
Modeling is the relationships
between fact and dimension
tables. Sometimes very
easy and clean, sometimes
more complicated
19
1-Goals of a Data Warehouse/Analytic Database
Role playing
relationship – Date
can serve multiple
roles in Fact Table
Self-join relationship (often
seen in organization
hierarchies
5/22/2017
Data Warehousing w/SQL-BI
20
1-Goals of a Data Warehouse/Analytic Database
Factless Fact table – no
measures, just a tally of
“how many times something
happened”
5/22/2017
Data Warehousing w/SQL-BI
21
1-Goals of a Data Warehouse/Analytic Database
Not all Fact table measures are “summed”.
We might want MAX, MIN, and AVG values.
What about percentages? We’ll talk about
them later
5/22/2017
Data Warehousing w/SQL-BI
22
1-Goals of a Data Warehouse/Analytic Database
Some fact tables are large,
contain hundreds of millions of
rows
Others, like threshold or goal or
target tables for KPIs, tend to be
smaller
5/22/2017
Data Warehousing w/SQL-BI
23
1-Goals of a Data Warehouse/Analytic Database
Hierarchies (parent/child
relationships play a big
part, in drilldowns and
roll-ups
5/22/2017
Data Warehousing w/SQL-BI
24
1-Goals of a Data Warehouse/Analytic Database
Key Performance
Indicators
(Some database/OLAP
tools have strong KPI
features, other times
you just build the
measure and target
manually
5/22/2017
Data Warehousing w/SQL-BI
25
1-Goals of a Data Warehouse/Analytic Database
Key Performance
Indicators: need to define
rules and Data for
Thresholds/Goals
Maybe 1 fact table holds
sales, another holds
quotas – common
dimensionality is
employee and time period
5/22/2017
Data Warehousing w/SQL-BI
26
2-Major Components of a Data Warehouse
What’s the overall story?
Everyone should have this
5/22/2017
Data Warehousing w/SQL-BI
27
2-Major Components of a Data Warehouse
•
ETL process usually
significant
Could be operational
data store (ODS) in
between
•
•
Often this is reversed – data
marts used to accumulate a DW,
not vice-versa
•
Possibly
rules for
what rows a
user can see
5/22/2017
•
Data Warehousing w/SQL-BI
10 different
companies could
have 7 different
variations
Some companies
might use OLAP (or
other Analytic
databases), others
might not
A data
warehouse/data mart
is a relational
database – just might
not be normalized
In most Kimball
databases, data is
“flattened” and
denormalized
Don’t discount ETL
efforts – they can be
MASSIVE!
28
2-Major Components of a Data Warehouse
The measures in a Fact
Table have a common
“grain” (dimension
granularity)
•
•
•
•
Joined with
surrogate keys
•
The process of
identifying
facts/dimensions and
establishing direct (or
indirect) relationships is
what we call
Dimensional Modeling
5/22/2017
Data Warehousing w/SQL-BI
•
•
Data warehouses consist of
2 main elements: Fact
Tables and dimension
Tables
Again, these are relational
tables
Fact Tables contain
measures that businesses
aggregate/evaluate
Dimension tables provide
business context for the
facts
Loosely speaking,
dimensions are often the
“master tables” from OLTP
systems
Facts are related to
dimensions in PK/FK
relationships w/integer
keys
Big paradigm shift from
OLTP/normalized platform
29
2-Major Components of a Data Warehouse
•
•
•
•
•
•
•
•
From Kimball methodology
Data Warehouse Dimension Usage Matrix
“BUS” architecture
Reflects the fact tables across different business processes (“value chain”) and the intersection points with
dimensions
From Kimball methodology
Data Warehouse Dimension Usage Matrix
(Image above from the Ralph Kimball book)
Key up-front deliverable: helps communications regarding proj mgmt & technical design
5/22/2017
Data Warehousing w/SQL-BI
30
2-Major Components of a Data Warehouse
• So before you begin….
• Make sure you’ve shaped your data into star-schema
Fact/Dimension tables, using surrogate integer keys
• Fact tables ideally should only contain numeric measures
(dollars, units sold) and foreign key integer values that relate to
Business Dimension master tables
• Database engine features like xVelocity, Columnstore index can
optimize these structures
• Recommend: use the Kimball Methodology
• Read this book, and read it again, and again!
– Data Warehouse Toolkit: Complete Guide to
Dimensional Modeling
5/22/2017
Data Warehousing w/SQL-BI
31
3-Cumulative Transactional Fact Tables
• Measures are fully additive by all
related dimensions
• Each fact table must have a fully
understood “grain statement” (level
of detail, level of granularity)
• Sometimes facts are at a very low
level (product sku, ship to account)
or much higher (by region, by
market, by month, etc.)
• Populated by ETL processes that run
daily, or weekly, or monthly, or even
throughout the day
• Some people might store an identity
column in a fact table
• Best fact tables – only numeric data
(Columnstore index in SQL 2012)
5/22/2017
Data Warehousing w/SQL-BI
32
3-Cumulative Transactional Fact Tables
• Clean Transactional Fact table, with foreign keys
• (some might name them StateFK, DateFK instead of PK suffix)
• Some might assign an identity column
• Some might also store a datetime last updated
• As for “Last User”, ideally, only one process should be writing out
Fact/Dimension Data
• T-SQL MERGE statement can be used to populate
• SQL Server 2012/2014 can use columnstore indexes to optimize fact tables
• In my webcast area, a webinar on columnstore indexes: (2/24/2013)
5/22/2017
Data Warehousing w/SQL-BI
33
4-Factless Fact tables
• Special type of
transactional fact table
• Each row represents an
“event” (such as a person
signing up for a course,
sold by a sales rep)
• We are looking to tally the
# of instances where
dimensions come together
• In some databases, the rollup of these tallies could be
very critical!!!
5/22/2017
Data Warehousing w/SQL-BI
34
4-Factless Fact tables
• Tally of Visits for a
Health Care
Provider
• By Year, Gender,
Visit Type,
Physician Group
5/22/2017
Data Warehousing w/SQL-BI
35
4-Factless Fact tables
Visit tally “by”
Patient, Visit
Date, Age at Visit,
Physician,
Location, Visit
Type, Quality
Indicator
5/22/2017
Data Warehousing w/SQL-BI
36
5-Periodic Snapshot Fact Tables
• Populated on some interval/period
• Measures represent a “point in time”
count or balance or value
• Unlike transactional fact tables, the
measures in snapshot fact tables are
SEMI-ADDITIVE
• Meaning, they can be rolled up by some
dimensions, and maybe averaged across
some dimensions
• But measures are NOT “fully-additive”,
“full-aggregatable”
• End of month or end of period ETL
processes to load these tables
• Variation of this, discussed in the Kimball
methodology: accumulating snapshot fact
tables
• Think of a mortgage fact table, with dollar
values and dates for initial approval,
underwriting approval, final approval
5/22/2017
Data Warehousing w/SQL-BI
37
6-Dimension Tables
General contents of dimension tables
Many use
script to create
Multiple
hierarchies
Might be a
range of
values
Fiscal
as well
• Represents the context by which users want to aggregate or
“slice and dice” data
• Each dimension should have a surrogate integer key, a
business key, one or more descriptions, and one or more
attributes (that might form one or more parent-child
hierarchy relationships)
5/22/2017
Data Warehousing w/SQL-BI
38
6-Dimension Tables
• Might use script to Create
• Many people will generate the DatePK as
YYYYMMDD…still an integer, but makes it
easier to read in Fact Table
• Might have both Fiscal Quarter and Calendar
Quarter
• Allows rollups of sales by month, quarter,
etc.
5/22/2017
Data Warehousing w/SQL-BI
39
6-Dimension Tables
Rules/guidelines about populating Fact/Dimension Tables
• Always populate Dimensions first!
• Populate Fact tables second (since Dimensions provide
context for the facts)
• Good use for T-SQL MERGE (for both fact and dimensions)
– Might have a million incoming rows
– Maybe 100,000 represent new rows, 50,000 represent changed
rows, and the rest are rows that haven’t changed
– MERGE TargetTable T
USING IncomingSource I on T.BzKey = I.BzKey
WHEN NOT MATCHED THEN INSERT….
WHEN MATCHED and T.NonKeyCol <> I.NonKeyCol THEN UPDATE SET T…
OUTPUT $Action, INSERTED.*, DELETED.* TO TableOfInsertsAndUpdates
5/22/2017
Data Warehousing w/SQL-BI
40
6-Dimension Tables
Snowflake dimension schemas
• As a general rule, build fact-dimension relationships in
flat, denormalized structures (star-schema)
• Sometimes, however, the repetition of data might be so
high that you might make an exception and normalize
one or more dimensions (snowflake schema based on
dimension outrigger)
• Snowflake schemas are not “horrible”, but they can
introduce complications (sometimes minor) in ETL
processes and for end users who build reports against a
snowflake model
5/22/2017
Data Warehousing w/SQL-BI
41
7-Role-Playing Dimension Relationships
• A single dimension key
might serve multiple
purposes, or “roles” in a
fact table
• Example: an order
might have an order
date, a due date, a ship
date, etc.
• No need to create 3
versions of a Date
dimension – just one,
with 3 relationships
• Products like Analysis
Services will
automatically create 3
“views” into the date
Dimension
• Once had a client with
SIX roles!!!
5/22/2017
Data Warehousing w/SQL-BI
42
7-Role-Playing Dimension Relationships
Limitation in SSAS Tabular!
Must either create three views of
the Date dimension
Or…Must extend the Tabular
Model with DAX formulas:
Sum of ShipSalesAmount Reseller :=
CALCULATE(
sum( [SalesAmount]),
userelationship(
'Date'[DateKey],
ResellerSales[ShipDateKey]))
5/22/2017
Data Warehousing w/SQL-BI
43
7-Role-Playing Dimension Relationships
SSAS OLAP handles natively
One single date can be used as an Invoice Date, or
a Paid Date, or a PO Date in a transactional
spending table
An account might be the invoicing account or the
PO account
A department might be the invoicing department
or the PO department
5/22/2017
Data Warehousing w/SQL-BI
44
8-Junk dimensions
840 rows
5 rows
4 rows
6 rows
This “works”
but we need
to maintain
several small
tables
5/22/2017
7 rows
This is cleaner, and
users can still
aggregate and slice
Sales by any of the
attributes
• If you have several dimensions that
each contain a small # of rows,
consider creating a Cartesian
product
• No “absolute rule”, more a
judgment call
Data Warehousing w/SQL-BI
45
9-Dimension Outriggers
Might belong to
hundreds of
parent counties
where each
county has
attributes to
describe it
Customer might
contain millions
of rows
Straight from Ralph Kimball
book
•
•
•
•
Image above is straight from Kimball Dimensional Modeling book
Might have millions of customers that fall into hundreds of counties
Each county has a number of attributes specific to that county
If we stored the county attributes directly in customer dimension, you’d have a large
number of unique values that don’t vary much by customer
• This can happen, but isn’t common
5/22/2017
Data Warehousing w/SQL-BI
46
10-Many-to-many Bridge Relationships
• One of the more
complicated
model
relationships
• Often involves
rates or ratios
• Consider another
example:
• A book could be
written by
multiple authors
(who each
contribute a %)
• An author can
write multiple
books
5/22/2017
Data Warehousing w/SQL-BI
47
10-Many-to-many Bridge Relationships
• Another example: we
want to look at sales by
day, but across different
currencies for different
countries where the rate
varies by day
5/22/2017
Data Warehousing w/SQL-BI
48
10-Many-to-many Bridge Relationships
• Suppose the bridge table
represents conversion rates from
a base volumetric (Lbs) to other
Units of Measure
• Fact Shipments cannot be sliced
directly to Units of Measure - but
it can be related to a fact table
(FactUOMConversionRates) that is
also related to a common
dimension: product
• The bridge table permits us to
take core shipments for a given
product, and apply the conversion
rate for whatever unit of measure
we wish to view
5/22/2017
Data Warehousing w/SQL-BI
49
10-Many-to-many Bridge Relationships
Limitation in SSAS Tabular!•
Single book can be written multiple authors.
• An author can write multiple books
• Sales table tracks the sales of the book, not the author
• We want to see sales by author’s contribution towards
book (for purposes of calculating royalty)
SSAS Tabular does
not support bridge
table relationships at
all
To express author
dollars (using share),
must implement DAX
calculation:
AuthorDollars :=
SUMX(
'DimBookPrice',
CALCULATE(
SUM(
'FactBookSales‘
[SalesDollars]) *
SUM(
'BooksXAuthors‘
[AuthorShare])/100 ) )
5/22/2017
Data Warehousing w/SQL-BI
50
10-Many-to-many Bridge Relationships
SSAS OLAP handles natively
Regrades are by Resp/Rep Unit
Material Production is by Line
We want to look at Regrades and associated Production Tons
We have a bridge table of M2M between Material Line and Unit
(One Line can span many Units, a Unit can span many Lines)
We tell relationship editor that we look at Material Product “by”
Unit “through” the M2M bridge table.
5/22/2017
Now we have “common dimensionality”
Data Warehousing w/SQL-BI
51
10-Many-to-many Bridge Relationships
SSAS OLAP handles natively – but build proof of concept cases to demonstrate
Only
sums
one
instance
of HRM1
5/22/2017
Data Warehousing w/SQL-BI
HRM1 is a line
associated with units
CC1, CC2, DEV,
HRM1, IT, MSP, and
SCH
52
11-Type 2 Slowly Changing Dimensions
Need to write out Book key that
was in effect at time of sale
5/22/2017
• When Dimension attributes change and we care
about tracking history associated with the change,
this is known as a Type 2 Slowly Changing
Dimension
• In the book sales application, we might have a
Fact Sales table. Want to track sales of a book
based on its historical price point
• Any time a book price changes, we “retire” the
dimension row that’s been the “current row” (by
setting an end date), and then insert a new
dimension row
• When we write out the fact row, we use the
BookPK that’s “ in effect” at the time of the sales.
So the surrogate key from the dimension is put
into the fact table, based on the effective date of
the sale with respect to the StartDate and EndDate
from the dimension table. This allows the fact data
to be easily joined to the correct dimension data
for the corresponding effective date
• Allows us to report on sales by the book as a
whole, or based on sales history of the book
Data Warehousing w/SQL-BI
53
11-Type 2 Slowly Changing Dimensions
•
Late arriving dimension data (price changed on
Oct 1, we post sales throughout Oct, but we only
get price change on Nov 1) might involve
updates to fact table, or posting of reversing
entries
•
•
•
•
•
5/22/2017
Data Warehousing w/SQL-BI
Items to take into account when
designing a type 2 SCD:
Clearly defining the business process
Accounting for all necessary columns
& relationships in data model
Capturing the change to the attribute
(using database triggers, Change Data
Capture, SSIS SCD task)
Determining the correct dimension
surrogate key to use, when
populating the fact table
Dealing with early and late arriving
data (row is posted into a dimension
table for a change that won’t take
effect for another month….or latearriving sales data that occurred
months prior, where we need to
determine the correct product PK “at
that point in time”
54
11-Type 2 Slowly Changing Dimensions
•
•
•
•
Options for implementing:
Pure T-SQL code to do lookups to “retire old row”, “insert new one”
Custom SSIS packages with custom T-SQL code
SSIS Slowly Changing Dimension “Super-Transformation”
• You provide the input pipeline of columns
• You provide the target table, the business key to do lookup, and the columns representing
“historical changes” you want to track (i.e. You want to generate new row in price change)
• SSIS generates an entire workflow of lookups and transformations and inserts/updates
• Works nicely, though performance isn’t great and not very flexible (if you alter the generated
workflow and then go back and change original parameters, SSIS will overwrite your changes)
• Some free open source alternatives with benefits:
• https://scdmergewizard.codeplex.com/
• http://dimensionmergescd.codeplex.com/
• Key points, however you do it:
• Define columns you want to historically preserve
• Retire old row, insert new row
• Write out correct dimension PK into Fact table (as FK)
• Deal with early and late arriving data
5/22/2017
Data Warehousing w/SQL-BI
55
11-Type 2 Slowly Changing Dimensions
•
•
•
•
•
•
Full video demo using the SSIS Type 2 Slowly Changing Dimension
www.commongroundsolutions.net/Type2SCDDemo.zip
Zip file contains 3 videos
The SSIS Type 2 SCD demo starts at the one hour, 59 minute mark of Video 1
Goes all the way through Video 2
Ends at Video 3, at the 53 minute mark
Starts at 1 hr, 59 minutes
Entire video
Ends at the 53 minute mark
5/22/2017
Data Warehousing w/SQL-BI
56
12-Storing NULL values in Fact Tables – DON’T!!!
• Suppose you have a Fact Table with sales measures, and a
CostCenterFK (that relates to a costCenterPK in a CostCenter
Master)
• Suppose that on 5% of the Sales rows, there is no Cost Center
• While databases will optionally permit it, you should NOT store a
NULL for the Foreign Key! This is a very bad idea
• Instead, store an “Unclassified cost center” or an “N/A Cost
Center” in the Cost Center Master (maybe with a key of -1) and use
that value in the Master Table
• This allows users to aggregate sales by the valid cost centers and
also see the sales where there was an “unused cost center”
5/22/2017
Data Warehousing w/SQL-BI
57
12-Storing NULL values in Fact Tables – DON’T!!!
Don’t store NULL for foreign key in
Fact Table – makes it less than
ideal for reporting
Store as a -1 in the
Dimension table for
the master row, and
then in the Fact
table.
Better for reporting
5/22/2017
Data Warehousing w/SQL-BI
58
13-Storing ratios in Fact Tables – DON’T!!!
• In fact tables, you can store measures that are either fully additive
(sales) or partly additive (end of month inventory count)
• You can also store measures that are derived from simple math (Net
Revenue = Gross Revenue less Returns, less Damages, etc)
• But DON’T store measures that represent percentages or ratios that are
derived from division – calculate them “on the fly”. Why? Because they
won’t aggregate to any sensible value
• For Instance:
– Store A
$10 in returns, $20 in sales (return % of 50%)
– Store B
$10 in returns, $100 in sales (return % of 10%)
– We want to roll up the returns % for the region – but we obviously want a
“weighted” returns %. Storing the returns % is meaningless when we want to
aggregate/roll-up.
– Best practice: store the numbers that represent the numerator/denominator, and
then calculate “on the fly”
5/22/2017
Data Warehousing w/SQL-BI
59
Book recommendation
• Blogs for all 3 authors (have
written a great deal on both
SSAS OLAP and SSAS Tabular)
• Chris Webb's Blog
• Alberto Ferrari's Blog
• Marco Russo's Blog
5/22/2017
• The best book available on
the Tabular Model
• Get this book – read it, read it
again
• The “Three Wise Men” of
SSAS Tabular and SSAS OLAP
• Just like Mosha Pasumansky
was the MDX Expert…Chris
Webb, Alberto Ferrari, and
Marco Russo are the Tabular
and DAX gurus
Data Warehousing w/SQL-BI
60
Book recommendation
•
•
•
•
Book recommendation for SSAS OLAP
The best book available on SSAS
Amazon link:
Blogs for all 3 authors (have written a
great deal on both SSAS OLAP and SSAS Tabular)
• Chris Webb's Blog
• Alberto Ferrari's Blog
• Marco Russo's Blog
5/22/2017
Data Warehousing w/SQL-BI
61