Download Chapter 8 - Jerry Post

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Regression analysis wikipedia , lookup

Linear regression wikipedia , lookup

Forecasting wikipedia , lookup

Coefficient of determination wikipedia , lookup

Data assimilation wikipedia , lookup

Transcript
All Powder Board and Ski
Microsoft Access Workbook
Chapter 8: Data Warehouses and Data Mining
Jerry Post
Copyright © 2003
1
Desired Sales Cube Dimensions
Sales Dimensions
State (ship)
Month
Category
Style
SkillLevel
Size
Color
Manufacturer
BindingStyle
WeightMax?
ItemMaterial?
WaistWidth?
2
Early Data: Spreadsheets
3
Create Customer and Employee
CustomerID and EmployeeID are missing from the old data.
Instead of relying on blank cell values, create a new customer called
“Walk-in” and a new employee called “Employee”
Write down the ID numbers generated for these anonymous entries.
If you use SQL, you can assign a value of zero to these entries.
INSERT INTO Customer (CustomerID, LastName)
Values (0,'Walk-in')
INSERT INTO Employee (EmployeeID, LastName)
Values (0,'Staff')
4
Extract Model Data
SELECT DISTINCT OldSales.ModelID, OldSales.ManufacturerID,
OldSales.Category, OldSales.Color, OldSales.ModelYear,
OldSales.Graphics, OldSales.ItemMaterial, OldSales.ListPrice,
OldSales.Style, OldSales.SkillLevel, OldSales.WeightMax,
OldSales.WaistWidth, OldSales.BindingStyle
FROM OldSales;
5
UNION Query for Models
SELECT DISTINCT ModelID, ManufacturerID, Category, …
FROM OldSales
UNION
SELECT DISTINCT ModelID, ManufacturerID, Category, …
FROM OldRentals
6
Insert Model Data into ItemModel
INSERT INTO ItemModel (ModelID, ManufacturerID, Category, Color,
ModelYear, Graphics, ItemMaterial, ListPrice, Style, SkillLevel, WeightMax,
WaistWidth, BindingStyle)
SELECT DISTINCT qryOldModels.ModelID, qryOldModels.ManufacturerID,
qryOldModels.Category, qryOldModels.Color, qryOldModels.ModelYear,
qryOldModels.Graphics, qryOldModels.ItemMaterial,
qryOldModels.ListPrice, qryOldModels.Style, qryOldModels.SkillLevel,
qryOldModels.WeightMax, qryOldModels.WaistWidth,
qryOldModels.BindingStyle
FROM qryOldModels;
7
Insert SKU Data into Inventory
INSERT INTO Inventory (ModelID, SKU, Size, QuantityOnHand)
SELECT DISTINCT qryOldInventory.ModelID, qryOldInventory.SKU,
qryOldInventory.Size, 0 As QuantityOnHand
FROM qryOldInventory;
Note the use of the column
alias to force a zero value for
QuantityOnHand for each row
8
Copy Sales Data
INSERT INTO Sales (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSales.SaleID, OldSales.SaleDate, OldSales.ShipState,
OldSales.ShipZIP, OldSales.PaymentMethod
FROM OldSales;
Note that if you have added
data to your Sales table, your
existing SaleID values might
conflict with these
You can solve the problem by
adding a number to these
values so they are all larger
than your highest ID
INSERT INTO Sales (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSales.SaleID+5000, OldSales.SaleDate,
OldSales.ShipState, OldSales.ShipZIP, OldSales.PaymentMethod
FROM OldSales;
9
Copy SaleItem Rows
INSERT INTO SaleItem (SaleID, SKU, QuantitySold, SalePrice)
SELECT DISTINCT OldSales.SaleID+5000, OldSales.SKU,
OldSales.QuantitySold, OldSales.SalePrice
FROM OldSales;
If you transformed the SaleID
in the prior step for the Sale
data, you must do the exact
same calculation for SaleID in
the SaleItem table
10
Query for PivotTable
Format SaleDate as year
and month: yyyy-mm
Include all desired sale dimensions
Compute Value as
quantity times price
11
PivotTable Form Wizard
12
PivotTable Screen
Place other
columns
Place columns
(month)
Place rows (State,
Category)
Place Value last
13
PivotTable
Right click
to select all
columns
and choose
Hide Details
14
PivotTable Groups
15
Time Series Analysis: Excel
16
GIS: Microsoft MapPoint
The PivotTable places the
data into rows and columns
A dynamic copy of this sheet is
used to remove the top rows
17
MapPoint Data Wizard
18
GIS Analysis of Sales
19
Sales by State for Regression
Note that some states are
missing from the list.
20
Regression Setup
You should include the label
row but be sure to check the
box to show you included it
21
Regression Results
Relatively high R-square
Population is a significant
predictor, income is not
22