Schemas
Stars, Snowflakes, and Fact Constellations:
Schemas for Multidimensional Databases
The E-R data model is commonly used in the design of
relational databases, where a database schema consists
of a set of entities and the relationships between them.
Such a data model is appropriate for OLTP.
A data warehouse requires a subject-oriented schema
that facilitates OLAP.
For a data warehouse, three types of schema exist:
the star schema, the snowflake schema, and the fact
constellation schema.
Star schema
The most common schema is the star schema. In a star
schema, the data warehouse contains:
(1) a large central table (fact table) containing the bulk of the
data, with no redundancy.
(2) a set of smaller dimension tables, one for each
dimension.
The schema graph looks like a star, with the
dimension tables arranged in a circle around the central fact table.
An example of a star schema for All Electronics sales:
sales are considered along four dimensions,
namely time, item, branch, and location.
The schema contains a central fact table for sales that
contains keys to each of the four dimensions, along with
two measures: dollars_sold and units_sold.
Each dimension is represented by only one table,
and each table contains a set of attributes.
In the figure, the "location" dimension table contains the attribute set
{location_key, street, city, province_or_state, country}.
Example of Star Schema
Sales Fact Table: time_key, item_key, branch_key, location_key
Measures: units_sold, dollars_sold
Dimension Table time: time_key, day, day_of_the_week, month, quarter, year
Dimension Table item: item_key, item_name, brand, type, supplier_type
Dimension Table branch: branch_key, branch_name, branch_type
Dimension Table location: location_key, street, city, state_or_province, country
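The star schema example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the original design: the table names (time_dim, item_dim, branch_dim, location_dim, sales_fact), the column types, and all sample rows are assumptions made up for the sketch; only the column names come from the example.

```python
import sqlite3

# In-memory database. Table names and column types are assumptions;
# column names follow the star schema example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, day INTEGER,
    day_of_the_week TEXT, month INTEGER, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT,
    brand TEXT, type TEXT, supplier_type TEXT);
CREATE TABLE branch_dim (branch_key INTEGER PRIMARY KEY, branch_name TEXT,
    branch_type TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, street TEXT,
    city TEXT, state_or_province TEXT, country TEXT);
-- The fact table holds one key per dimension plus the two measures.
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    branch_key INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    units_sold INTEGER,
    dollars_sold REAL);
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO time_dim VALUES (1, 1, 'Monday', 1, 'Q1', 2024)")
conn.execute("INSERT INTO time_dim VALUES (2, 2, 'Tuesday', 4, 'Q2', 2024)")
conn.execute(
    "INSERT INTO item_dim VALUES (1, 'TV', 'BrandA', 'electronics', 'wholesale')")
conn.execute("INSERT INTO branch_dim VALUES (1, 'Main', 'retail')")
conn.execute(
    "INSERT INTO location_dim VALUES (1, '1 Main St', 'Vancouver', 'BC', 'Canada')")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 2, 1000.0), (1, 1, 1, 1, 1, 500.0), (2, 1, 1, 1, 3, 1500.0)],
)

# A typical OLAP-style query: each dimension is reached with a single join.
rows = conn.execute("""
    SELECT t.quarter, l.city, SUM(f.units_sold), SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON f.time_key = t.time_key
    JOIN location_dim AS l ON f.location_key = l.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter
""").fetchall()
print(rows)
```

Every dimension attribute is one join away from the fact table, which is what keeps queries over a star schema short.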
Snowflake schema
The snowflake schema is a variant of the star schema
model, where some dimension tables are normalized.
Normalization in this schema is done by splitting the data into
additional tables.
The resulting schema graph forms a shape similar to a
snowflake.
The major difference between the snowflake and star
schema models is that the dimension tables of the
snowflake model may be kept in normalized form to
reduce redundancies.
Such a table is easy to maintain and saves storage
space.
However, this space saving is negligible in comparison to the size of the fact table.
The snowflake structure can also reduce the effectiveness of browsing,
since more joins will be needed to execute a query.
As a result, system performance may be adversely impacted.
Hence, the snowflake schema is not as popular as the star
schema in data warehouse design.
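The extra join cost can be seen in a small sqlite3 sketch. It assumes the supplier information of an item is either kept in the item dimension (star layout) or moved into a separate supplier table (snowflake layout). The table names item_star, item_snow, supplier, and sales_fact, and all rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star layout: supplier_type lives directly in the item dimension.
CREATE TABLE item_star (item_key INTEGER PRIMARY KEY, item_name TEXT,
    supplier_type TEXT);
-- Snowflake layout: supplier_type is moved to its own table.
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
CREATE TABLE item_snow (item_key INTEGER PRIMARY KEY, item_name TEXT,
    supplier_key INTEGER REFERENCES supplier(supplier_key));
CREATE TABLE sales_fact (item_key INTEGER, units_sold INTEGER);

-- Made-up rows, identical content in both layouts.
INSERT INTO item_star VALUES (1, 'TV', 'wholesale'), (2, 'Radio', 'retail');
INSERT INTO supplier VALUES (10, 'wholesale'), (20, 'retail');
INSERT INTO item_snow VALUES (1, 'TV', 10), (2, 'Radio', 20);
INSERT INTO sales_fact VALUES (1, 5), (2, 3), (1, 2);
""")

# Star: one join reaches supplier_type.
star_query = """
    SELECT i.supplier_type, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN item_star AS i ON f.item_key = i.item_key
    GROUP BY i.supplier_type
    ORDER BY i.supplier_type
"""

# Snowflake: the same question needs a second join through supplier.
snowflake_query = """
    SELECT s.supplier_type, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN item_snow AS i ON f.item_key = i.item_key
    JOIN supplier AS s ON i.supplier_key = s.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
"""

star_rows = conn.execute(star_query).fetchall()
snowflake_rows = conn.execute(snowflake_query).fetchall()
print(star_rows, snowflake_rows)
```

Both queries return the same totals; the snowflake version simply pays for one more join per normalized level it has to cross.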
Example of Snowflake schema
The main difference between the two schemas is in the definition of
dimension tables.
The single dimension table for item in the star schema is normalized in
the snowflake schema, resulting in new item and supplier tables.
For example, the item dimension table now contains the attributes
item_key, item_name, brand, type, and supplier_key.
The supplier_key is linked to the supplier dimension table, which contains
supplier_key and supplier_type information.
Example of Snowflake Schema
Sales Fact Table: time_key, item_key, branch_key, location_key
Measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, state_or_province, country
Similarly, the single dimension table for location in
the star schema can be normalized into two new
tables: location and city.
The city_key in the location table links to the city
dimension. Notice that further normalization can
be performed on province_or_state and country
in the snowflake schema, when desirable.
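The split of location into location and city can be sketched in plain Python. The function name normalize_location and the sample rows are hypothetical; the attribute names follow the location and city tables described above.

```python
def normalize_location(rows):
    """Split denormalized location rows into a location and a city table."""
    city_keys = {}       # (city, state_or_province, country) -> city_key
    city_table = []
    location_table = []
    for row in rows:
        city_id = (row["city"], row["state_or_province"], row["country"])
        if city_id not in city_keys:
            # First time this city is seen: assign the next surrogate key.
            city_keys[city_id] = len(city_keys) + 1
            city_table.append({
                "city_key": city_keys[city_id],
                "city": row["city"],
                "state_or_province": row["state_or_province"],
                "country": row["country"],
            })
        # The location row keeps only its own attributes plus the city_key link.
        location_table.append({
            "location_key": row["location_key"],
            "street": row["street"],
            "city_key": city_keys[city_id],
        })
    return location_table, city_table


# Made-up star-schema location rows, for illustration only.
star_locations = [
    {"location_key": 1, "street": "1 Main St", "city": "Vancouver",
     "state_or_province": "BC", "country": "Canada"},
    {"location_key": 2, "street": "9 Oak Ave", "city": "Vancouver",
     "state_or_province": "BC", "country": "Canada"},
    {"location_key": 3, "street": "5 Lake Rd", "city": "Chicago",
     "state_or_province": "IL", "country": "USA"},
]

location_table, city_table = normalize_location(star_locations)
print(len(location_table), len(city_table))
```

The city, state_or_province, and country values for Vancouver are stored once in the city table instead of once per street address, which is the redundancy that normalization removes.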
Fact constellation
Sophisticated applications may require multiple fact tables
to share dimension tables.
This kind of schema can be viewed as a collection of stars,
and hence is called a galaxy schema or fact constellation.
Example of a fact constellation
This schema specifies two fact tables, sales and shipping.
The sales table definition is identical to that of the star
schema.
The shipping table has five dimensions (or keys):
item_key, time_key, shipper_key, from_location, and
to_location, and two measures: dollars_cost and
units_shipped.
A fact constellation schema allows dimension
tables to be shared between fact tables.
For example, the dimension tables for time,
item, and location are shared between both the
sales and shipping fact tables.
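A fact constellation can be sketched in sqlite3 as two fact tables that reference the same dimension table. Only the shared item dimension is materialized here to keep the sketch short. The table names item_dim, sales_fact, and shipping_fact, the column types, and all rows are assumptions; the key and measure names follow the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimension, referenced by both fact tables.
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT);

CREATE TABLE sales_fact (
    time_key INTEGER, item_key INTEGER REFERENCES item_dim(item_key),
    branch_key INTEGER, location_key INTEGER,
    dollars_sold REAL, units_sold INTEGER);

CREATE TABLE shipping_fact (
    item_key INTEGER REFERENCES item_dim(item_key), time_key INTEGER,
    shipper_key INTEGER, from_location INTEGER, to_location INTEGER,
    dollars_cost REAL, units_shipped INTEGER);

-- Made-up rows, for illustration only.
INSERT INTO item_dim VALUES (1, 'TV'), (2, 'Radio');
INSERT INTO sales_fact
    (time_key, item_key, branch_key, location_key, dollars_sold, units_sold)
VALUES (1, 1, 1, 1, 1000.0, 2), (1, 2, 1, 1, 100.0, 4), (2, 1, 1, 1, 500.0, 1);
INSERT INTO shipping_fact
    (item_key, time_key, shipper_key, from_location, to_location,
     dollars_cost, units_shipped)
VALUES (1, 1, 1, 1, 2, 50.0, 3), (2, 1, 1, 1, 2, 20.0, 2),
       (2, 2, 1, 2, 1, 30.0, 3);
""")

# Aggregate each fact table on its own, then meet at the shared dimension.
# Joining the two fact tables to each other directly would multiply rows.
rows = conn.execute("""
    SELECT i.item_name, s.units_sold, sh.units_shipped
    FROM item_dim AS i
    JOIN (SELECT item_key, SUM(units_sold) AS units_sold
          FROM sales_fact GROUP BY item_key) AS s
      ON s.item_key = i.item_key
    JOIN (SELECT item_key, SUM(units_shipped) AS units_shipped
          FROM shipping_fact GROUP BY item_key) AS sh
      ON sh.item_key = i.item_key
    ORDER BY i.item_name
""").fetchall()
print(rows)
```

Because both fact tables use the same item_key values, the shared dimension lets sales and shipping figures be compared side by side.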
Diagram showing fact constellation schema
Distinction between
Data Warehouse and Data Mart
A data warehouse collects information about subjects that span the entire
organization, such as customers, items, and sales, and its scope is
enterprise-wide.
In data warehouses, the fact constellation schema is commonly
used since it can model multiple, interrelated subjects.
A data mart is a department subset of the data warehouse that
focuses on selected subjects, and its scope is department-wide.
In data marts, the star or snowflake schema is commonly used
because both are geared toward modeling single subjects.