Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Operational transformation wikipedia , lookup
Data Protection Act, 2012 wikipedia , lookup
Data center wikipedia , lookup
Database model wikipedia , lookup
Forecasting wikipedia , lookup
Data analysis wikipedia , lookup
3D optical data storage wikipedia , lookup
Data vault modeling wikipedia , lookup
Data Warehousing The Gap • “Drowning in data but starving for information” • Fragmented information systems – “one-thing-at-a-time” approach – Islands of information – Uncoordinated and inconsistent databases using different hw and sw platforms The Gap • Most systems support operational processing – No focus on analytical tools for decision making – Operational Processing: captures, stores, manipulates data to support daily operations of the organization – Informational processing: analysis of data Information • Kept in operational systems – Turn wheels of organization – 1 record at a time – Same tasks • Data warehouse – Watch wheels of organization turn – Thousands of records compressed into an answer set – Usage of DW changes kind of questions asked Goals • Make information easily accessible • Present information consistently • Adaptive and resilient to change • Secure and protect information • Foundation for improved decision making • Accepted by business users Components Data Staging Area Operational Source Systems Data Presentation Data Mart 1 EXTRACT TRANSFORM LOAD Data Mart 2 Data Access Tools Components • Operational source systems – Record/ capture day to day transactions – Queries are restricted – Little historical data • Data staging area (ETL) – Extract: copy source data – Transform: cleansing, combining, etc… – Load: transfer data into target data mart and index – May use relational db or flat files for storage – Mostly sorting and sequential processing Components • Data presentation – Data is organized, stored, and made available for direct querying by users, report writers, and other analytical applications – Data Marts using the dimensional model – Detailed atomic data Components • Data access tools – Capabilities that are provided to business users to use the presentation layer for analytical decision making – Tools for querying – May be an ad-hoc query tool or data mining software • Metadata – Data about the data – Used in guiding transformation Components • Data Marts – Logical and physical subset of a data warehouse’s presentation area – Flexible set of data based on the most atomic (granular) data possible to extract from an operational source and presented in a dimensional model that is most resilient when faced with unexpected queries – Data from single business process Components Data Staging Area Operational Source Systems Data Presentation Data Mart 1 EXTRACT TRANSFORM LOAD Data Mart 2 Data Access Tools Data Warehouse • Kimball Definition – Conglomeration of an organization’s data warehouse staging and presentation areas where operational data is specifically structured for query and analysis performance and ease-of-use Data Warehouse The Data Warehouse According to Kimbal Data Staging Area Operational Source Systems Data Presentation Data Mart 1 EXTRACT TRANSFORM LOAD Data Mart 2 Data Access Tools Data Warehouse What business users perceive as the data warehouse Data Staging Area Operational Source Systems Data Presentation Data Mart 1 EXTRACT TRANSFORM LOAD Data Mart 2 Data Access Tools Pitfalls • Tech over business • Fail to see importance of business sponsor • Multi-year project vs. iterative development • Allocate too much energy to construct normalized data structure and run out of budget before building presentation area Pitfalls • Backroom operational performance and ease of development over frontroom query performance and ease of use • Assume static business requirements • DW success = user acceptance