Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 5 Study Sections 5.2, 5.3, 5.5, Pages:231-233 & Snowflake schema DATA WAREHOUSING Business Intelligence • Companies collect a large amount of data from their business operations. • To keep track of that information, a business uses disparate software applications , such as Excel, Access, etc. • Using multiple software makes it difficult to retrieve information in a timely manner and to perform analysis of the data. • Business Intelligence (BI) represents the tools and systems that play a key role in integrating and analyzing all corporate data. • • Generally illustrates intelligence in the areas of customer profiling, market research, product profitability (by product, region, year), etc. BI Architecture Consists of 3 system components – Data warehouse – Business analytics – Performance management (BPM) Data warehouse – A repository of cleaned and integrated historical /stable data for the entire business – Extracted from independent databases (internal & external) – Transformed (ie. cleaned and reformatted) - A subset of a warehouse limited to a business function is called a Data Mart (eg. Sales). Business analytics These are tools that help analyze the data towards finding solutions: – Reporting and queries • Multi-dimensional reports, eg. Pivot tables [see Exercise 8]; • SQL Queries [Exercise 9] • Cube analysis [Chapter 6] – Data, text and Web mining and other sophisticated mathematical and statistical tools for searching relationships [Chapter 7] Business Performance Management (BPM) BPM supports monitoring, measuring, and comparing of sales, profit, cost, profitability, and other performance indicators • Dashboard reports • Production reports Data Source Views Data Source = DW Analytic Tools SQL, Cubes Transaction Data Systems Extraction, Transformation, and Load (ETL) Process A data warehousing process consists of : • Extraction (i.e., reading data from a database), • Transformation (i.e., converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse), and • Load (i.e., storing the data into the data warehouse) Data Integration and the Extraction, Transformation, and Load (ETL) Process ETL Newly integrated schema for the Data Warehouse Data from multiple Sources DW Schema Structures: Star Note that data is un-normalized DW Schema Structures: Snowflake Note that data is normalized Designing Fact Tables: Normalization • Normalization is the process of gathering attributes into tables to eliminate redundant data (the redundancy here is EquipID EquipType) Normalization Exercise • FIRST (Supplier#, City, CityCode, Part#, Qty) • Split the table into 3 different tables: – (Supplier#, City) – (City, CityCode) – (Supplier#, Part#,Qty) Although normalized databases have less data redundancies, they are less efficient in quickly processing the data. Hence, many DWs use Star schema. Data Marts provide ‘views’ of the data in the Data Warehouse (we will be working with this in our SQL exercises) Summary