Cairo University
Faculty of Computers and Information
Information Systems Department
SSIS package Example
How to transform your data from source database to data warehouse?
Consider the Sales process in the AdventureWorks database, which uses the following tables:
Product (Production): ProductID, Name, ProductNumber, MakeFlag, FinishedGoodsFlag
SpecialOfferProduct (Sales): SpecialOfferID, ProductID, rowguid, ModifiedDate
SalesOrderHeader (Sales): SalesOrderID, RevisionNumber, OrderDate, DueDate, ShipDate, Status, OnlineOrderFlag, SalesOrderNumber, PurchaseOrderNumber
SalesOrderDetail (Sales): SalesOrderID, SalesOrderDetailID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount
SalesTerritory (Sales): TerritoryID, Name, CountryRegionCode, [Group], SalesYTD, SalesLastYear
SalesPerson (Sales): SalesPersonID, TerritoryID, SalesQuota, Bonus, CommissionPct
Customer (Sales): CustomerID, TerritoryID, AccountNumber, CustomerType, rowguid, ModifiedDate
SSIS steps:
1. Build your star schema in the data warehouse database on SQL Server as follows:
TimeDim: OrderDate, OrderDay, OrderMonth, OrderYear
SalesFact: ProductID, OrderDate, Quantity, Price
ProductDim: ProductID, ProductName, ProductSubcategoryName, ProductCategoryName
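The star schema above can be sketched in code. The following is a minimal illustration only: it uses Python's standard-library sqlite3 module as a stand-in for SQL Server, and the column types, the primary keys, and the composite key on SalesFact are assumptions, since the handout lists only table and column names.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the DW database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE TimeDim (
    OrderDate  TEXT PRIMARY KEY,   -- assumed key: one row per distinct order date
    OrderDay   INTEGER,
    OrderMonth INTEGER,
    OrderYear  INTEGER
);
CREATE TABLE ProductDim (
    ProductID              INTEGER PRIMARY KEY,
    ProductName            TEXT,
    ProductSubcategoryName TEXT,
    ProductCategoryName    TEXT
);
CREATE TABLE SalesFact (
    ProductID INTEGER REFERENCES ProductDim (ProductID),
    OrderDate TEXT    REFERENCES TimeDim (OrderDate),
    Quantity  INTEGER,
    Price     REAL,
    PRIMARY KEY (ProductID, OrderDate)  -- assumed: one fact row per product per date
);
""")

def table_columns(table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

With the foreign keys declared this way, a fact row can only be inserted after its product and its order date exist in the two dimension tables, which is why the dimensions are loaded first.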
Lab 3
2. Transform the data from the source tables into the dimension and fact tables of the
data warehouse, one table at a time.
a. Transform the products data into the product dimension.
i. Right-click the DW database and select Tasks → Import Data.
ii. Choose the source of the data (the AdventureWorks database).
iii. Choose the destination of the data (the database containing the data warehouse,
for example DW).
iv. Choose to specify the data by writing a query (the second option).
v. Write the following query:
SELECT
    p.ProductID,
    p.Name AS ProductName,
    ps.Name AS ProductSubcategoryName,
    pc.Name AS ProductCategoryName  -- alias matches the ProductDim column name
FROM Production.Product AS p
INNER JOIN Production.ProductSubcategory AS ps
    ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory AS pc
    ON ps.ProductCategoryID = pc.ProductCategoryID
vi. Parse the query to check its validity, then press Next.
vii. Choose the product dimension table (ProductDim) in the Destination column.
viii. Choose where to save the package, check the Run immediately checkbox, and
press Next.
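To see what the product query produces, here is a small sketch that runs the same two joins on made-up rows. It uses Python's sqlite3 module instead of SQL Server, drops the Production schema prefix (SQLite has no schemas), and the sample products are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductCategory (ProductCategoryID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ProductSubcategory (
    ProductSubcategoryID INTEGER PRIMARY KEY,
    ProductCategoryID    INTEGER,
    Name                 TEXT
);
CREATE TABLE Product (
    ProductID            INTEGER PRIMARY KEY,
    Name                 TEXT,
    ProductSubcategoryID INTEGER   -- nullable, as in the source table
);
-- Hypothetical sample rows, not taken from AdventureWorks.
INSERT INTO ProductCategory VALUES (1, 'Bikes');
INSERT INTO ProductSubcategory VALUES (10, 1, 'Road Bikes');
INSERT INTO Product VALUES (100, 'Sample Road Bike', 10);
INSERT INTO Product VALUES (101, 'Sample Part', NULL);
""")

# Flatten product -> subcategory -> category into one dimension row per product.
product_dim_rows = conn.execute("""
SELECT p.ProductID, p.Name, ps.Name, pc.Name
FROM Product AS p
INNER JOIN ProductSubcategory AS ps
    ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN ProductCategory AS pc
    ON ps.ProductCategoryID = pc.ProductCategoryID
ORDER BY p.ProductID
""").fetchall()
```

In this sketch only product 100 reaches the dimension: an INNER JOIN drops any product whose ProductSubcategoryID is NULL, so such products would be missing from ProductDim.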
b. Transform the order dates into the time dimension.
i. Right-click the DW database and select Tasks → Import Data.
ii. Choose the source of the data (the AdventureWorks database).
iii. Choose the destination of the data (the database containing the data warehouse,
for example DW).
iv. Choose to specify the data by writing a query (the second option).
v. Write the following query:
SELECT DISTINCT
    OrderDate,
    DAY(OrderDate) AS OrderDay,
    MONTH(OrderDate) AS OrderMonth,
    YEAR(OrderDate) AS OrderYear
FROM Sales.SalesOrderHeader
ORDER BY OrderDate
vi. Parse the query to check its validity, then press Next.
vii. Choose the time dimension table (TimeDim) in the Destination column.
viii. Press the button with three dots (…) in the transform column.
ix. Choose where to save the package, check the Run immediately checkbox, and
press Next.
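The time dimension query amounts to "take each distinct order date once and split it into day, month, and year". A minimal pure-Python sketch of that logic follows. The function name time_dim_rows is hypothetical, and it assumes the order dates are plain dates with no time component.

```python
from datetime import date

def time_dim_rows(order_dates):
    """Build one (OrderDate, OrderDay, OrderMonth, OrderYear) row per distinct date.

    Mirrors SELECT DISTINCT ... ORDER BY OrderDate: duplicates are removed
    and the rows come back in ascending date order.
    """
    return [(d.isoformat(), d.day, d.month, d.year) for d in sorted(set(order_dates))]

# Hypothetical order dates; two orders fall on the same day.
sample_rows = time_dim_rows([date(2004, 7, 1), date(2004, 7, 1), date(2003, 12, 31)])
```

The duplicate date collapses into a single row, which is what lets OrderDate serve as the key of TimeDim.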
c. Transform the ordering data into the SalesFact table.
i. Right-click the DW database and select Tasks → Import Data.
ii. Choose the source of the data (the AdventureWorks database).
iii. Choose the destination of the data (the database containing the data warehouse,
for example DW).
iv. Choose to specify the data by writing a query (the second option).
v. Write the following query:
SELECT
    d.ProductID,
    h.OrderDate,
    SUM(d.OrderQty) AS Quantity,
    SUM(d.UnitPrice * (1 - d.UnitPriceDiscount) * d.OrderQty) AS Price
FROM Sales.SalesOrderDetail AS d
INNER JOIN Sales.SalesOrderHeader AS h
    ON d.SalesOrderID = h.SalesOrderID
GROUP BY d.ProductID, h.OrderDate
vi. Parse the query to check its validity, then press Next.
vii. Choose the SalesFact table in the Destination column.
viii. Choose where to save the package, check the Run immediately checkbox, and
press Next.
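The fact query groups order lines by product and order date, summing the quantity and the discounted line total UnitPrice * (1 - UnitPriceDiscount) * OrderQty. The sketch below checks that arithmetic on made-up rows. It uses Python's sqlite3 module as a stand-in for SQL Server, omits the Sales schema prefix, and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER PRIMARY KEY, OrderDate TEXT);
CREATE TABLE SalesOrderDetail (
    SalesOrderID      INTEGER,
    ProductID         INTEGER,
    OrderQty          INTEGER,
    UnitPrice         REAL,
    UnitPriceDiscount REAL
);
-- Hypothetical sample data: orders 1 and 2 share a date, order 3 is a day later.
INSERT INTO SalesOrderHeader VALUES (1, '2004-07-01'), (2, '2004-07-01'), (3, '2004-07-02');
INSERT INTO SalesOrderDetail VALUES
    (1, 100, 2, 10.0, 0.0),   -- 2 units at full price      -> 20.0
    (2, 100, 1, 10.0, 0.5),   -- 1 unit with a 50% discount ->  5.0
    (3, 100, 3, 10.0, 0.0);   -- 3 units at full price      -> 30.0
""")

# One fact row per (ProductID, OrderDate) with summed quantity and revenue.
fact_rows = conn.execute("""
SELECT d.ProductID, h.OrderDate,
       SUM(d.OrderQty) AS Quantity,
       SUM(d.UnitPrice * (1 - d.UnitPriceDiscount) * d.OrderQty) AS Price
FROM SalesOrderDetail AS d
INNER JOIN SalesOrderHeader AS h ON d.SalesOrderID = h.SalesOrderID
GROUP BY d.ProductID, h.OrderDate
ORDER BY d.ProductID, h.OrderDate
""").fetchall()
```

The two order lines on 2004-07-01 merge into a single fact row with Quantity 3 and Price 25.0, while the line on 2004-07-02 stays a separate row.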
3. Use a database diagram to make sure that the tables are connected correctly. This can be
done as follows:
a. Create a new database diagram.
b. Add our 3 tables to the diagram.
c. Drag each FK column from the SalesFact table to the table in which it is the PK:
i. First, OrderDate from SalesFact to TimeDim.
ii. Second, ProductID from SalesFact to ProductDim.
d. A window will open asking for the column mappings and any extra specifications.
Just check them and press OK.
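Creating a relationship may be rejected if SalesFact contains key values that do not exist in the dimension table, so it can help to look for such orphan rows first. The following is a minimal sketch of that check, again using Python's sqlite3 module as a stand-in for SQL Server. The helper orphan_keys and the sample rows are hypothetical.

```python
import sqlite3

def orphan_keys(conn, fact, fk, dim, pk):
    """Return the distinct fact.fk values that have no matching dim.pk.

    Table and column names are interpolated directly into the SQL, so they
    must come from trusted code, never from user input.
    """
    sql = f"""
        SELECT DISTINCT f.{fk}
        FROM {fact} AS f
        LEFT JOIN {dim} AS d ON f.{fk} = d.{pk}
        WHERE d.{pk} IS NULL
        ORDER BY f.{fk}
    """
    return [row[0] for row in conn.execute(sql)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDim (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE SalesFact (ProductID INTEGER, OrderDate TEXT, Quantity INTEGER, Price REAL);
-- Hypothetical rows: product 999 was sold but never loaded into ProductDim.
INSERT INTO ProductDim VALUES (100, 'Sample Road Bike');
INSERT INTO SalesFact VALUES (100, '2004-07-01', 3, 25.0), (999, '2004-07-01', 1, 5.0);
""")

missing_products = orphan_keys(conn, "SalesFact", "ProductID", "ProductDim", "ProductID")
```

An empty result for both ProductID and OrderDate suggests the two relationships in step c can be created without conflicts.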
4. Your data has now been transformed successfully and the data warehouse is ready.
Note that you can skip building the tables first and simply use the query-based import to transfer both
the data and the structure, then modify the names later; however, this is not recommended.
Important Notice:
If you have a problem with the SSIS import and export in SQL Server 2008 R2 Express Edition, you will
need to install Microsoft .NET Framework 4 (Standalone Installer):
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=0a391abd-25c1-4fc0-919f-b21f31ab88b7
Or you can install it online using the following link:
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992&displaylang=en
The solution was found in the following forum thread:
http://social.msdn.microsoft.com/Forums/en/sqltools/thread/e8be2b2a-57f9-485e-90df-2f857f3a0c55