Sharing Enterprise Data
  -- Data administration
  -- Data downloading
  -- Data warehousing

Data administration
  -- Organization-wide activity (the DBA of a particular database is only a part of this)
  -- Challenges:
      -- Many types of data exist
      -- Basic categories of data are not obvious
      -- The same data can have many names, descriptions, and formats
      -- Data are changed, often concurrently
      -- Political and organizational issues complicate operational issues

Marketing
  -- Communicate existence of data administration to organization
  -- Explain reason for existence of standards, policies, and guidelines
  -- Describe in a positive light the services provided

Data standards and policies
  -- Establish standard means for describing data items; standards include name, definition, description, processing restrictions, etc.
  -- Establish data proponents
  -- Establish organization-wide data policy; examples are security, data proponency, and distribution

Forum for data conflict resolution
  -- Establish procedures for reporting conflicts
  -- Provide means for hearing all perspectives and views
  -- Have authority to make decision to resolve conflict

Return on organization's data investment
  -- Focus attention on value of data investment
  -- Investigate new methodologies and technologies
  -- Take proactive attitude toward information management

Downloading data for local processing

Data downloading via file-sharing systems

Data downloading via client-server systems

Downloading: potential problems
  -- Coordination
      -- Conform downloaded data to database constraints
      -- Coordinate local updates with downloads
  -- Consistency
      -- Downloaded data should not be updated
      -- Applications need features to prevent updating
      -- Warn users of possible problems
  -- Access control
      -- Data may be replicated on many computers
      -- More difficult data access control procedures
  -- Risk of computer crime
      -- Disks and modem access are easy to conceal
      -- Illegal copying is difficult to prevent

Data warehousing
  -- What if every department wants to download the organization's data?
  -- The data management problem becomes immense
  -- Data warehouse: a centralized repository to facilitate management decision making and increase the value of the enterprise data assets

Data warehouse architecture

Integrated: from various sources
  -- Operational data
      -- appln. A: m, f
      -- appln. B: male, female
      -- appln. C: x, y
      -- appln. D: 1, 0
  -- Data warehouse
      -- m, f

Data in Data Warehouse
  -- Highly summarized: National Sales by Month 85-98
  -- Lightly summarized: Regional Sales by Week 83-98
  -- Current detail: Sales Detail 1998-99
  -- Older detail data: Sales Detail 1992-98

Time Variant
  -- Operational data
      -- time horizon 60-90 days
      -- key may or may not have an element of time
      -- can be updated
  -- Data warehouse
      -- time horizon 5-10 years
      -- key contains an element of time
      -- once a snapshot is made, data cannot be updated

Non-volatile
  -- Operational data (change, replace, insert)
      -- Data is updated on a record-by-record basis
      -- Supporting record-by-record online update requires technology with a very complex foundation
  -- Data warehouse (load, access)
      -- Data is not updated
      -- At the physical design level, liberties can be taken to optimize access to the data

Data warehouse components
  -- Data extraction tools
  -- Extracted data
  -- Metadata of warehouse contents
  -- Warehouse DBMS(s)
  -- Warehouse data management tools
  -- Data delivery programs
  -- End-user analysis tools
  -- User training courses and materials
  -- Warehouse consultants

Data warehouse requirements
  -- Queries and reports with variable structure
  -- OLAP: On-Line Analytical Processing
  -- User-specified data aggregation
  -- User-specified drill down
  -- Graphical outputs
  -- Integration with domain-specific programs

OLAP
  -- to gain insight into data through fast, consistent, interactive access to a wide variety of views
  -- functionality characterized by dynamic multidimensional analysis of consolidated enterprise data

Data Extraction
  -- ability to capture, convert, & deliver data to various sources
  -- provides fast disk-to-disk transfer capabilities and automated data compression

Data Mining Tools
  -- help by focusing end-user attention on a smaller subset of data
  -- the subset is determined by the data mining "discovery" process, which is done in advance of in-depth analysis

Executive Information System
  -- for senior executives with little computing experience
  -- available on demand at whatever level of detail is needed (drill-down)
  -- adds value: improved strategic & financial control, market & economic information, better competitive analysis

Financial & Marketing Analysis
  -- provides end users with high value-added reports such as accounts receivable / payable, ledger mgmt., cost control, cost budgeting & planning
  -- in marketing: product pricing, demand analysis, estimation
  -- uses non-technical language, runs queries in a fast, reliable manner

Report & Query Tools
  -- most important & widely used
  -- emphasize generating value-added reports
  -- users have the flexibility to use either plain English or SQL
  -- support a graphical interface

Example: FINGERHUT
  -- 150 catalog mailings in 1997 based on statistically predicted consumer response
  -- 30 million customers, 14% annual growth
  -- database captures 1400 pieces of information about a household: demographics, purchasing histories

Data warehouse challenges
  -- Inconsistent data
  -- Tool integration
      -- E.g., spreadsheets versus databases…
  -- Lack of warehouse data management tools
      -- E.g., different timing, different domains...
  -- In-house software development (expensive)
  -- Ad-hoc requirements

Data warehousing
  -- Is it as good an idea as it seemed?
  -- What about the Internet?
  -- Data mart: limit the scope of the warehouse
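
Two ideas from the slides above lend themselves to a small illustration: integrating differently coded operational data into one warehouse convention (m/f, male/female, x/y, 1/0 all becoming m, f), and user-specified aggregation of sales detail into summarized levels. The Python sketch below is only one possible reading. The names `SOURCE_ENCODINGS`, `integrate`, and `roll_up` are hypothetical, the sample rows are invented, and the slides do not say which of x/y or 1/0 stands for which value, so that mapping is an assumption.

```python
from collections import defaultdict

# Hypothetical code tables, one per operational application.
# ASSUMPTION: the slides list only the raw encodings; mapping x and 1 to "m"
# (and y and 0 to "f") is a guess made purely for illustration.
SOURCE_ENCODINGS = {
    "A": {"m": "m", "f": "f"},
    "B": {"male": "m", "female": "f"},
    "C": {"x": "m", "y": "f"},
    "D": {"1": "m", "0": "f"},
}


def integrate(app, raw_value):
    """Translate one application's code into the warehouse's m/f convention."""
    key = str(raw_value).strip().lower()
    try:
        return SOURCE_ENCODINGS[app][key]
    except KeyError:
        raise ValueError(
            f"no mapping for {raw_value!r} from application {app!r}"
        ) from None


def roll_up(detail_rows, key_fields):
    """Sum the 'amount' of detail rows, grouped by the chosen key fields."""
    totals = defaultdict(int)
    for row in detail_rows:
        group = tuple(row[field] for field in key_fields)
        totals[group] += row["amount"]
    return dict(totals)


# Invented sales detail rows, standing in for the "current detail" level.
SALES_DETAIL = [
    {"region": "East", "month": "1998-01", "week": 1, "amount": 100},
    {"region": "East", "month": "1998-01", "week": 2, "amount": 50},
    {"region": "West", "month": "1998-01", "week": 1, "amount": 70},
]

if __name__ == "__main__":
    # Integration: four source encodings, one warehouse encoding.
    print([integrate(app, value) for app, value in
           [("A", "f"), ("B", "Male"), ("C", "y"), ("D", 1)]])

    # Lightly summarized: regional sales by week.
    print(roll_up(SALES_DETAIL, ["region", "week"]))

    # Highly summarized: national sales by month.
    print(roll_up(SALES_DETAIL, ["month"]))
```

Drilling down is then the reverse direction: calling `roll_up` again with a finer set of key fields over the same detail rows. In a real warehouse this work would normally be done by the extraction tools and the warehouse DBMS rather than by hand-written code.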