Advanced Database Management Systems
Lecture 16: Data Warehouse
CS 424
DAVID A. SAMPAH
ASHESI
MARCH 2016

Outline
- Data Warehouse
- Reading for Topic:
  - Elmasri & Navathe, Chapter 29
  - Connolly & Begg, Chapters 31 - 34

Data Warehouse
- Why?
- What is it?
- How designed?

Data Warehouse – Why?
- Data scattered around the organisation in different locations and possibly different formats
- Many “Data islands”
- Want to bring it together for the purposes of supporting management decision making

Data Warehouse – Definition
A subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process

Data Warehouse
- Subject-oriented as the warehouse is organized around the major subjects of the enterprise (such as customers, products, and sales) rather than the major application areas (such as customer invoicing, stock control, and product sales).
- Integrated because of the coming together of source data from different enterprise-wide applications systems.
- Time-variant because data in the warehouse is only accurate and valid at some point in time or over some time interval.
- Non-volatile as the data is not updated in real time but is refreshed from operational systems on a regular basis. New data is always added as a supplement to the database, rather than a replacement.

Data Warehouse
- Central repository of corporate data (data warehouse), separate from the operational systems within the business
- Data is organised in accordance with business needs for decision making support (i.e., by subject rather than by event) and is read only
- Consistent and repeatable process for loading operational data – ETL process
- Will continue to grow as data is added to the data repository
- Unifies business views – one standard view!
- Several end-user tools available for the effective manipulation of data

Data Warehouse – Architecture
[Diagram, top to bottom: Client Access Tools; Data Mart; Meta Data; DW; Data Transformation (ETL); Operational Data and External Data]

Data Transformation
Includes:
- Removing unwanted data from the operational databases
- Converting to common definitions and data names
- Calculating summaries and derived data
- Establishing values for missing data
The Meta Data contains the details of the data transformation process, as well as other important data.

Data Warehouse Access Tools
- Query and reporting tools
  - SQL
  - Easy to use front end query tools that generate SQL statements
- In-house applications developed for use with a particular Data Warehouse
- Executive Information Systems (EIS) tools
- Management support tools
- OLAP tools
- Data mining tools

Data Mart
- Subset of data warehouse, used by a particular unit or function
- Evolution of data warehouse:
  - Data mart to data warehouse
  - Data warehouse to data mart
- Access tools sometimes have their own “data marts”

Data Warehouse – Architecture
[Diagram repeated: Client Access Tools; Data Mart; Meta Data; DW; Data Transformation (ETL); Operational Data and External Data]

Data Warehouse Development Issues
- Data Transformation support is immature and somewhat restrictive – principal focus of key data warehousing/data analytics software developers (e.g., SAS)
- Data Warehouse Design is captured using Star Schema Database Design
- Star Schema – each star relating to a particular subject

Star Schema Database Design
- Star Schema Database Design == Star Schema ER Model
- Fact table (holding the measures) in the middle
- Dimensions around the outside
- Only one level of dimensions
  - More than one level -> snowflake schema design
- Denormalised form in star schema
- One fact table surrounded by as many dimension tables as required, which allows different perspectives of the data to be formed.
- Use of surrogate keys

Example Star Schema for Batch facts
[Diagram: the Batch fact table in the centre, linked to the dimension tables Factory, Employee, Product, Machine and Time; each dimension row (1..1) relates to one or more batches (1..*)]

Summary
- Data Warehousing
  - Concept
  - Definition
  - Architecture including data transformation
  - Data access tools
- Data Warehouse Design
  - Star Schema Database Design

Any questions?
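
Supplementary sketch: the star schema idea from the slides (one fact table, one level of dimension tables, surrogate keys) can be tried out with Python's built-in sqlite3 module. This is only an illustration under stated assumptions, not the lecture's own design: the table names, column names, the single measure units_made and the sample rows are all hypothetical, and only three of the five dimensions from the Batch example (Factory, Product, Time) are modelled to keep it short.

```python
import sqlite3

# Hypothetical star schema: names are illustrative, not taken from the lecture.
# Each dimension gets a surrogate key (INTEGER PRIMARY KEY) that is independent
# of any identifier used in the operational systems.
SCHEMA = """
CREATE TABLE dim_factory (factory_key INTEGER PRIMARY KEY, factory_name TEXT UNIQUE);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT UNIQUE);
CREATE TABLE dim_time    (time_key    INTEGER PRIMARY KEY, calendar_day TEXT UNIQUE);
CREATE TABLE fact_batch (
    batch_key   INTEGER PRIMARY KEY,
    factory_key INTEGER NOT NULL REFERENCES dim_factory(factory_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    time_key    INTEGER NOT NULL REFERENCES dim_time(time_key),
    units_made  INTEGER NOT NULL  -- the measure stored in the fact table
);
"""

# dimension name -> (table, surrogate key column, natural value column)
DIMENSIONS = {
    "factory": ("dim_factory", "factory_key", "factory_name"),
    "product": ("dim_product", "product_key", "product_name"),
    "time": ("dim_time", "time_key", "calendar_day"),
}


def surrogate_key(conn, dim, value):
    """Return the surrogate key for a dimension value, creating the row if needed."""
    table, key_col, name_col = DIMENSIONS[dim]
    conn.execute(f"INSERT OR IGNORE INTO {table} ({name_col}) VALUES (?)", (value,))
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {name_col} = ?", (value,)
    ).fetchone()
    return row[0]


def load_batches(conn, rows):
    """Append fact rows; warehouse data is added to, not updated in place."""
    for factory, product, day, units in rows:
        conn.execute(
            "INSERT INTO fact_batch (factory_key, product_key, time_key, units_made) "
            "VALUES (?, ?, ?, ?)",
            (
                surrogate_key(conn, "factory", factory),
                surrogate_key(conn, "product", product),
                surrogate_key(conn, "time", day),
                units,
            ),
        )
    conn.commit()


def units_by(conn, dim):
    """Total the measure from the perspective of one dimension."""
    table, key_col, name_col = DIMENSIONS[dim]
    query = (
        f"SELECT d.{name_col}, SUM(f.units_made) "
        f"FROM fact_batch f JOIN {table} d ON d.{key_col} = f.{key_col} "
        f"GROUP BY d.{name_col} ORDER BY d.{name_col}"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_batches(conn, [  # made-up sample data
    ("Factory A", "Widget", "2016-03-01", 100),
    ("Factory A", "Gadget", "2016-03-01", 40),
    ("Factory B", "Widget", "2016-03-02", 60),
])
print(units_by(conn, "factory"))  # [('Factory A', 140), ('Factory B', 60)]
print(units_by(conn, "product"))  # [('Gadget', 40), ('Widget', 160)]
```

The same fact table is summarised by factory and by product simply by joining to a different dimension table, which is the "different perspectives of the data" point made on the star schema slide. Adding Employee and Machine would follow the same pattern: one more dimension table and one more key column in fact_batch.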