Download Data Mining and Data Warehouse - Home

yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Business Intelligence
Instructor: Bajuna Salehe
Email: [email protected]
Design the Data Warehouse
Data Warehouse Schema
• The entity-relationship data model (ERD)
is commonly used in the design of
relational databases, where a database
schema consists of a set of entities and
the relationships between them.
• Such a data model is appropriate for online transaction processing (OLTP)
Data Warehouse Schema
• A data warehouse, however, requires a
concise, subject-oriented schema that
facilitates on-line data analysis (OLAP).
• The most popular data model for a data
warehouse is a multidimensional model.
Data Warehouse Schema
• Such a model can exist in the following
– a star schema
– a snowflake schema
– a fact constellation schema.
• The major focus will be on the star
schema which is commonly used in the
design of many data warehouse.
Star Schema
• This is the most common modeling
paradigm for designing data warehouse.
• In this model a data warehouse consists
– a large central table (fact table) containing the
bulk of the data, with no redundancy
– a set of smaller attendant tables (dimension
tables), one for each dimension.
• The diagram below show an example of
star schema
Star Schema
• Star schema of a data warehouse for sales.
Star Schema
• A star schema for AllElectronics sales is shown
in Figure in the above slide. Sales are
considered along four dimensions namely, time,
item, branch, and location.
• The schema contains a central fact table for
sales that contains keys to each of the four
dimensions, along with two measures: dollars
sold and units sold.
• To minimize the size of the fact table, dimension
identifiers (such as time key and item key) are
system-generated identifiers.
Star Schema
• Notice that in the star schema, each
dimension is represented by only one
table, and each table contains a set of
• For example, the location dimension table
contains the attribute set {location key,
street, city, province or state, country}
Issues With Data Warehousing
• Underestimation of resources for data
• Hidden problems with source systems
• Required data not captured
• Increased end-user demands
• Data homogenization
• High demand for resources
• Data ownership
• High maintenance
• Long duration projects
• Complexity of integration
Related documents