Module 4: BIC#4 – Data Integration Capability
Populating Data Warehouse (Data Mart)
Data Integration - The ETL Process
BI System Process
Data Source
• Flat Files
• Transactions DB
(OLTP)
• XML Files
• Excel Files
• Etc.
Populate the Repository
- ETL Process
- SSIS (SQL Server Integration Services)
- Chapter 7/8 – Larson Book
Outline
• ETL Concept
• ETL Process: Three Step Process
ETL Process Defined
• Extraction, transformation, and load (ETL)
A data warehousing process that consists of three steps: extraction
(reading data from a source such as a database), transformation
(converting the extracted data from its original form into the form
required by the data warehouse or other target database), and load
(writing the data into the data warehouse).
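The three steps in this definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source is a small in-memory CSV, the target is an in-memory SQLite database standing in for the data warehouse, and the column and table names (`product`, `quantity`, `sales`) are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data; the column names are illustrative only.
SOURCE_CSV = """product,quantity
 widget ,3
gadget,5
"""

def extract(text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and upper-case names, convert quantities to integers."""
    return [(r["product"].strip().upper(), int(r["quantity"])) for r in rows]

def load(conn, rows):
    """Load: write the transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (product TEXT, quantity INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

def run_etl(text, conn):
    """Run the three steps in order: extract, then transform, then load."""
    load(conn, transform(extract(text)))

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
run_etl(SOURCE_CSV, conn)
loaded = conn.execute("SELECT product, quantity FROM sales ORDER BY product").fetchall()
print(loaded)
```

Keeping each step in its own function mirrors the definition: the extract step knows only about the source, the load step knows only about the target, and all format changes happen in between.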
Steps involved
• Step 1: Data Profiling (Class Handout – ICA#4)
• Step 2: Source to Target Mapping (Class Handout – ICA#4)
• Step 3: Creating and Implementing Integration Service Package
   - Populating the Dimension Tables (LBD#1, Larson Chapter 8 – ICA#5)
   - Populating the Fact Tables (LBD#2, Larson Chapter 8 – Makeup HW)
ETL Process
Step 1: Data Profiling
Refer to the Class Handout for more details
Data Profiling
• There are three data sources for the Manufacturing Data Mart:
1. OLTP #1: Manufacturing Data: Transaction data stored in a
CSV file (Batch.csv).
2. OLTP #2: Accounting Database: Transaction data stored in a
SQL Server DB Engine database
(KJMaxMinAccountingSystemDataSource). This database is on
the same server where your DB Engine accounts are located.
3. OLTP #3: Order Processing Data: Transaction data stored in
a SQL Server DB Engine database
(KJOrderProcessingSystemDataSource). This database is on the
same server where your DB Engine accounts are located.
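As a rough illustration of what profiling a flat-file source can involve, the sketch below counts missing and distinct values per column. The actual layout of Batch.csv is not given here, so the sample data and its column names (`batch_id`, `product`, `qty`) are hypothetical, and the class handout remains the reference for the real profiling steps.

```python
import csv
import io

def profile(rows):
    """Return per-column counts of missing and distinct values."""
    report = {}
    for row in rows:
        for column, value in row.items():
            stats = report.setdefault(column, {"missing": 0, "distinct": set()})
            if value is None or value.strip() == "":
                stats["missing"] += 1
            else:
                stats["distinct"].add(value.strip())
    return {
        column: {"missing": s["missing"], "distinct": len(s["distinct"])}
        for column, s in report.items()
    }

# Hypothetical sample; the real columns of Batch.csv may differ.
SAMPLE = """batch_id,product,qty
1,A10,100
2,A10,
3,B20,95
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = profile(rows)
print(len(rows), summary)
```

A column with as many distinct values as rows and nothing missing (here `batch_id`) is a candidate key, while a column with missing values (here `qty`) needs a handling rule before it is loaded.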
ETL Process
Step 2: Source to Target Mapping
Refer to the Class ETL Handout for more details
Step 2: Source to Target Mapping
Source: Transaction Data
• Batch Flat File
• Accounting DB
Target: Manufacturing Data Mart
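A source-to-target mapping can be written down as data: for each source column, the target column it feeds and the conversion applied on the way. The sketch below assumes hypothetical column names on both sides (`batch_id`, `BatchNumber`, and so on); the real mapping is the one worked out in the class handout.

```python
# Each entry maps a hypothetical source column to a (target column, conversion) pair.
MAPPING = {
    "batch_id": ("BatchNumber", int),
    "product": ("ProductName", str.strip),
    "qty": ("AcceptedQuantity", int),
}

def apply_mapping(source_row, mapping):
    """Build a target row by renaming and converting each mapped source column."""
    target_row = {}
    for source_col, (target_col, convert) in mapping.items():
        target_row[target_col] = convert(source_row[source_col])
    return target_row

# One hypothetical row as it might arrive from a flat file (all values are text).
source_row = {"batch_id": "7", "product": " Widget ", "qty": "120"}
target_row = apply_mapping(source_row, MAPPING)
print(target_row)
```

Writing the mapping as a table like this keeps the decisions (which column goes where, with what type) separate from the code that applies them.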
ETL Process
Step 3: Creating and Implementing a SQL Server Integration Services (SSIS) Package
Refer to LBD#1 and #2 in Chapter 8 for this Section
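SSIS packages are typically assembled in a graphical designer, so the following is not SSIS code. It is a small Python sketch, with SQLite standing in for SQL Server, of the load order used in this step: dimension tables first, then fact tables, because each fact row stores the key of a dimension row. The table and column names (`DimProduct`, `FactManufacturing`, `ProductCode`, `AcceptedQuantity`) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data mart database
conn.executescript("""
    CREATE TABLE DimProduct (
        ProductKey INTEGER PRIMARY KEY AUTOINCREMENT,
        ProductCode TEXT UNIQUE
    );
    CREATE TABLE FactManufacturing (
        ProductKey INTEGER REFERENCES DimProduct(ProductKey),
        AcceptedQuantity INTEGER
    );
""")

# Hypothetical, already-transformed source rows: (product code, quantity).
source_rows = [("A10", 100), ("B20", 95), ("A10", 40)]

# 1. Populate the dimension table; duplicate product codes are ignored.
conn.executemany(
    "INSERT OR IGNORE INTO DimProduct (ProductCode) VALUES (?)",
    [(code,) for code, _ in source_rows],
)

# 2. Populate the fact table, looking up the dimension key for each row.
for code, qty in source_rows:
    (key,) = conn.execute(
        "SELECT ProductKey FROM DimProduct WHERE ProductCode = ?", (code,)
    ).fetchone()
    conn.execute("INSERT INTO FactManufacturing VALUES (?, ?)", (key, qty))
conn.commit()

dim_count = conn.execute("SELECT COUNT(*) FROM DimProduct").fetchone()[0]
fact_total = conn.execute(
    "SELECT SUM(AcceptedQuantity) FROM FactManufacturing"
).fetchone()[0]
print(dim_count, fact_total)
```

If the fact load ran first, the key lookup in step 2 would find no matching dimension row, which is why the dimension tables are populated before the fact tables.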
Summary
• ETL Concept
• Data Profiling
• Data Mapping – Source to Target
• Integration Package using SSIS