Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
DATA WAREHOUSE CONCEPTS A Definition A Data Warehouse: • Is a repository for collecting, standardizing, and summarizing snapshots of transactional data contained in an organization’s operations or production systems • provides a historical perspective of information • is most often, but not exclusively, used for decision support applications and business information queries • can be more than one database • Is not a new concept Another Definition Decision Support: • • • • is a set of tools to easily access data is becoming a critical business tool is usually graphically oriented is empowering end users with tools to access vital business information • is moving lots of data down to the end user workstation • is a rapidly expanding area because of data warehousing efforts and projects Why a warehouse? For analysis and decision support, end users require access to data captured and stored in an organization’s operational or production systems This data is stored in multiple formats, on multiple platforms, in multiple data structures, with multiple names, and probably created using different business rules Why do we want a central data store Lif e LTD V oluntary Benf its Non-Medical Individual Disability Financial Account Management Underwriting Customer Service Sales/Marketing Financial Analysis A Good Reason for a Central Data Store Interesting Statistics 85% of the Fortune 1000 companies have, are implementing, or are looking at, data warehouses (Meta Group) 90% of all information processing organizations will be pursuing a data warehouse strategy in the next three years (Meta Group) The Decision Support industry will be a $1 Billion industry by 1997 (IDC & Forrester) Data Warehouse Evolution - Stage 0 Application End User Reports No end user access to production files Production Files Application Production Files “What we print” is “what you get” End User Reports Data Warehouse Evolution - Stage 1 Application Production Files End User Reports Snapshot File Application End User Reports Production Files Snapshot Files End users denied direct access to production files Snapshots or copies of production files are made available instead Solution: Provide end users access to production systems No Integration Between Systems a System developed in 1979 c b Purchased company system Purchased Package New Application development Rebuilt Application Data Warehouse Evolution - Stage 2 Document Document Document Document Desktop computer Document Document Document Document Document A Desktop computer Document Document Document b Mainframe Document Document Document Document Document Document c 4GL Desktop computer Server or Midreange Data Characteristics Type Production Warehouse Data Use Operational Detailed Real time, Latest value Relatively brief Dynamic Application wide Capture/update Coded Mgt Reporting Summary Multiple generations Forever Static Enterprise wide Read only Decoded Level of detail Currency Longevity Stability Scope of definition Data Operations Data values Transforming the Logical Model Brand Group Brand Product Shipment SKU Day Size Order Item Week Month Type customer Class Calendar Year Sales Rep time Route market District Region Key Differences - Part 1 Key differences between “data jails” (operational database) & warehouses • Subject orientation - operational systems are applicationsegmented (i.e. banks = auto loan, demand deposit accounting or mortgages). Subject areas for banks would be customer and each financial product • Level of integration - warehouses resolve years of application inconsistency in encoding/decoding, data name rationalization, etc • Update volatility - record at a time updates in operational database vs bulk loads in data warehouse • Time variance norms include: 30-90 days of transactions for operational system, 1-10 years for data warehouses Key Differences - Part 2 Characteristic Operational Warehouse Transaction volume Response time Updating Time Period Scope Activities High Very fast High volume Current Period Internal Focused, clerical operational Predictable, periodic Low to huge Reasonable Very Low Past to Future External Exploratory, analytical, managerial Can be Unpredictable, Ad hoc Queries Types of Warehouse Configurations Enterprise Division Functional • Financial • Personnel • Engineering/Product Departmental Special Project What’s Really Involved? Data Warehouse Components DB/2 V SA M Management Reporting Sales/Marketing Customer Relations Reserve Analysis Risk Analysis Mainframe Applications IMS DB2/2 PC Applications ??? Extract Programs Data Cleansers/Scrubbers Translators/Transformers Timing Tools Data Loading File Transfer Reserves Rates Customers Combined Data Warehouse Policies External Sources Claims Premiums DB/6000 Midrange DB/400 Decision Support Tools Typical Users of a Data Warehouse Decision Support Analysts, Business Analysts • Marketing, Actuaries, Financial, Sales, Executive Grocery Store attitudes • Going to the store, not knowing what they want • Close proximity says give me “everything” Explorers • Don’t know what they want • Search on a random basis, non-repetitively • Frequently finds nothing, but when they do, there are huge rewards Farmers • Know what they want • Non random searches, finds frequent “flakes of gold” • Finds small amounts of data Advanced Warehouse Topics Metadata repositories • Information about the data in the warehouse • Like a library card catalog • Data about when the information was created, what files accessed, how much data • Data about changes in business rules, processes • Context versus Content • “What does it mean?” Data Mining • Drilling down into databases with tools to find specific anomolies Online Analysis Processing (OLAP) • Really means summary data