Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CHAPTER 4 Data Management: Warehousing, Access, and Visualization 4-1 In this Chapter, the outlines are: MSS foundation Many new concepts Object-oriented databases Intelligent databases Data warehouse Data mining Online analytical processing Multidimensionality Internet / Intranet / Web Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-2 4.1 Opening Vignette: Data Warehouse and DSS at Group Health Cooperative Group Health Cooperative is a large nonprofit HMO based in Seattle, Washington. The company owns hospital, inpatient centers, and primary care centers. The company also has contracts with many health care providers and also acts as an insurance company. In 1997, it served over 550,000 enrollees. A stream of 2-3 million records of data is processed monthly. Managing such a vast amount of data is a very difficult task. Even more difficult is the use of data for decision support. Before the use of DSS, costs were escalating, customer services were ineffective, use of resources was poor, and so was the quality of some service. The company realized that the only solution was to develop a Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-3 a comprehensive database and DSS approach to cover the whole spectrum of health services. Such an approach would allow for effective and efficient data-driven decision making. The basic idea was to create a single database (called a data warehouse) that would support DSS by linking data related to cost, efficiency of resource use, outcomes, and health status in a comprehensive corporate information system. The data come from existing data collection applications (TPS), such as clinic registration, laboratory, and pharmacy. An attempt was made to avoid data redundancy and to combine the data into meaningful information for decision making. The system was initiated in 1989 and it is constantly being updated and improved. It is used to generate periodic reports such as Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-4 – – – – – Population-based reports summarized by clinic and by practice Productivity reports Utilization management reports Reports by consumer groups or payer groups Statistical reports, such as by age and gender The data warehouse is also used for many DSS, EIS, and MIS applications such as: – Allocating costs down to the level of use by the patient – Holistic cost approach to answer queries such as how cost reduction in one area affects costs in another areas. – Cost comparisons for negotiating prices with business partners – Comprehensive query system – Creating an EIS for monitoring key indicators such as cost per patient day in a hospital Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-5 Once the data warehouse was completed and data-based reporting and access tools were in place, the possibilities were endless. For example, the number of inpatient days is an important performance indicator. It was reduced by 7% by sending patients to outpatient services, resulting in millions of dollars in savings. Performance measurements are now generated periodically. Trend analysis is performed as well as comparisons with other HMOs. Considering these comparisons, target and plans for improvement were established. An example of the success of the DSS was the winning of a military contract valued at about $1 billion over a 5-year period. A database used specifically for the bid on this contract was created in only 2 days because it was extracted from the data warehouse. The warehouse was created with the SAS data warehouse software (from the SAS Institute Inc.), which is compatible with SAS’s DSS tools. Each group gets an individually tailored report. Such reports contain data from many Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-6 sources and because of the data warehouse ,it is possible to generate the reports rapidly and accurately. What this opening vignette demonstrated is how a large company uses a centralized database to support decision making. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-7 4.2 Data Warehousing, Access, Analysis, and Visualization What to do with all the data that organizations collect, store, and use? (Information overload!) Solution Data warehousing Data access Data mining Online analytical processing (OLAP) Data visualization Data sources Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-8 4.3 The Nature and Sources of Data Data: data items about things, events, activities, and transactions are recorded, classified, and stored, but are not organized to convey any specific meaning. It can be numeric, alphanumeric, figures, sounds, or image. Information: Data organized to convey meanings. It confirms something the recipient knows or may have “surprise” value by telling something not know.The recipient interprets the meaning and draws inferences and conclusions. An applications process data items so that the results are meaningful for an intended action or decisions. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-9 4.3 The Nature and Sources of Data Knowledge: Data items organized and processed to convey understanding, experience, accumulated learning, and expertise. A set of data items processed to extract critical implications and to reflect past experience and expertise provides the recipient with organizational knowledge and has very high potential value. DSS Database, data warehouse, may include data, information, and knowledge. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-10 4.3 The Nature and Sources of Data Data forms: – – – – – – Documents Pictures Maps Sound Animation Video Can be hard or soft Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-11 4.3 The Nature and Sources of Data Internal Data are stored in one or more places. These data are about people, products, service, and process. For example, data about the employees and their pay are usually stored in the corporate database. Data about equipment and machinery may be stored in the maintenance department database. External There are many sources of external data. They range from commercial databases to data collected by sensors and satellites. Data are available on CD-ROM, on the Internets, as films, and as music or voices. Pictures, diagrams, atlases, and TV are also source of data. Government reports (either computerized or not) are a major source of external data. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-12 4.3 The Nature and Sources of Data Personal Data The MSS users or other corporate employees may contribute their own expertise by creating personal data. These include subjective estimates of sales opinions about what the competitor are like to do or interpretations of new articles. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-13 4.4 Data Collection, Problems, and Quality Methods for collecting data Raw data can collected manually or by instruments and sensors. Representative data collection methods are – time studies (during observation), – surveys (using questionnaire), – observation (e.g.using video cameras), and – soliciting information from experts (e.g. Using interview) Data Problems (Table 4.1) All computer-based systems depend on data. The quality and integrity of the data are critical for the MSS to avoid the GIGO syndrome. MSS depend on data because complied data that make up information and knowledge are the heart of any decision-making system. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-14 Data are not correct Data are not timely Data are not measured or indexed properly Fall, 2006 Raw data were entered inaccurately Develop a systematic way to ensure the accuracy of raw data data were generated carelessly Whenever derived data are carefully monitor both the data values and the manner in which the data were generated. Modify the systems for generating data The method for generating the data is not rapid enough to meet the need for the data Raw data are gathered accDevelop a system for rescaling or ording to a logic or periodirecombining the improperly city that is not consistent indexed data with the purposes of the analysis Detailed model contains Develop simpler or more highly so many coefficients aggregated models that it is difficult to develop and maintain No one ever stored data Whether or not it is useful now, needed now. store data for future use. This may be impracitical because of the cost of storing and maintaining data. The data may not be found when they are needed. Required data never existed Make an effort to generate the data or to estimate them if they concern the future. All right Reserved YAO Zhong, School of E&M, BUAA 4-15 4.4 Data Collection, Problems, and Quality Quality: determines usefulness of data – Intrinsic data quality – Accessibility data quality – Representation data quality Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-16 4.4 Data Collection, Problems, and Quality Uniformity Version Completeness check Conformity check Genealogy check (drill down) Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-17 4.5 The Internet and Commercial Database Services For external data – – The Internet: major supplier of external data, Decision maker can access the home pages of vendor, clients, and competitors, view and download information, or conduct research. Commercial Data Banks: sell access to specialized databases An online database service sell access to specialized DB. Such a service can add external information to the MSS in a timely manner and at a reasonable cost. All that is necessary to retrieve data from such a service is a company terminal, modem , telephone, password, and some service fees. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-18 4.5 The Internet and Commercial Database Services Several thousand service are currently available, most of are accessible via Internet. Example, – CompuServe and The Source – CompuStat – Dow Jones Information Service – Interactive Data Corporation – Lockheed Information Service – Mead Data Central. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-19 4.5 The Internet and Commercial Databases Servers Use Web Browsers to: – Access vital information by employees and customers – Implement executive information systems – Implement group decision support systems (GDSS) – Database management systems provide data in HTML, on Web servers directly The big three relational database management system vendorsInformix, Oracle, Sybase and DB2-have reworked their core database products to accommodate a world of client/server, browser/server and Internet/intranet application Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-20 4.5 The Internet and Commercial Databases Servers Use Web Browsers to: Besides, Powersoft is also readying software - The Internet Toolkit for Powerbuilder 5.0, and add-on to the PowerBuilder Windows application tool-for developing Web-enabled C/S application Web-site and database integration suppliers include: – – – – – Fall, 2006 Spider Technology, Haht Software, Next Software Inc. NetObject Inc. OneWave Corp. All right Reserved YAO Zhong, School of E&M, BUAA 4-21 4.6 Database Management Systems in DSS DBMS: Software program for entering (or adding) information into a database; updating, deleting, manipulating, storing, and retrieving information A DBMS + modeling language to develop DSS DBMS to handle LARGE amounts of information DSS is often working with both data and models Small DSS can be build by either enhanced DBMS or spreadsheet. Spreadsheet vs. DBMS? Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-22 4.7 Database Organization and Structure Relational databases This form is prominent in DBMS areas. 2D tabular to present the data logical structure. Easy learn and maintains. Hierarchical databases Organizing data items in a top-down form, creating logical links between related data items. It looks like a tree. Network databases Network form to represent data logical structure. Advantage is save store space. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-23 4.7 Database Organization and Structure Object-oriented databases Complex application involves complex database, such as CIMS. Any forms of above three types of databases can not satisfy the requirements. OODBMS is based on the principle of OOP. OODMBS combine the characteristics of an OOP such as C++ or SmallTalk Language with a mechanism for data storage and access. It allows one to analyze data at a conceptual level that emphasizes the natural relationships between objectives. Abstraction is used to establish inheritance hierarchies and objective encapsulation allows the database designer to store both conventional data and procedural code within the same objects. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-24 4.7 Database Organization and Structure Object-oriented databases (see Leung, 2002) OODBMS defines data as objects and encapsulates data along with their relevant structure and behaviors. The system uses a hierarchy of classes and subclasses of objects. Structure, in terms of relationships, and behavior, in terms of methods and procedures, are contained within an objects. OODMBS especially useful in distributed DSS for very complex application Multimedia-Based Database MMDBMS manage data in a variety of formats, in addition to the standard text or numeric fields. These formats include images such as digitized photos or forms of bit-mapped graphics such maps or .pic files, hypertext images, video clips, sound, and virtual reality. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-25 4.7 Database Organization and Structure At best, no more than 15 percent of all corporate information is digitized according to Gartner Group, Inc. At least 85% of all corporate information resides outside the computer in documents, maps, photos, images, and videotapes. For organizations to build application to take advantage of rich data types, the DBMS must accommodate them. Oracle, Informix, and Sybase store rich multimedia data types as binary large objects. Most PC systems (as clients) are capable of supporting the display or playback of files in these formats. It is logical, but not easy, to expand database management capabilities to include these objects into management support systems. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-26 4.7 Database Organization and Structure Document-based Database Organizations are drowning(溺死)in paper. To alleviate the paper storage and shuffling, document-based database were developed. These are also knows as electronic document management (EDM) systems. They are used for information dissemination, form storage and management, shipment, expert license processing, and workflow automation. Distributed database A group of data have been distributed physically on different sites in various computer networks and logically belong to one system. The connected sites in the network is of individual process capabilities and can execute the local applications. At the same time, each site also can running the system wide applications through the communication subsystem. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-27 4.8 Data Warehousing Definition: (SAS Institute, 1996) – Physical separation of a company’s operational decision support environments. – At the heart of many companies lies a store of operational data, usually derived from critical mainframe-based online transaction processing (OLTP) systems, such as order entry applications. The OLTP systems are built with COBOL and they operate in Customer Information Control System (CICS) environment. OLTP systems for financial and inventory management and control also produce operational data. In this case, data access, application logic tasks, and data representation logic are tightly coupled together, usually in nonrelational database. These nonrelational data stores are not very conductive to data retrieval for4-28 All right Reserved YAO Zhong, School of E&M, BUAA Fall, 2006 4.8 Data Warehousing decision support application. Deriving information for decision support analysis from an operational data store can be a self-defeating activity that requires too much time and programming expertise. It may negatively affect the performance of the critical transactional system. However, equally critical to a company’s success, decision support information must be made accessible to management. W.H.Inmon(1992)definition:“data warehouse is a set of supporting subject-oriented, integrated, time-dependent and permanent decision process. Purpose: to establish a data repository making operational data accessible in a form that is readily acceptable for decision support and EIS application. As part of this new accessibility, a process must transform detail-level operational data to a relational form, which makes them more amenable to decision right Reserved YAO Zhong, School of E&M, BUAA 4-29 processing. Fall, 2006support All 4.8 Data Warehousing Only data needed for decision support come from the TPS (operational environment). As data pass into the data warehouse, they are transformed and integrated into a consistent structure. Data warehouses allows for the storage of metadata, which can include data summaries that are easier to search for and index. The data are then placed directly into a data repository at the current level of detail, where they are eventually summarized, archived into the older detail data level, or purged. Moving DSS information off the mainframe presents a company with an opportunity to restructure its DSS strategy. The company can reinvent the way that can shape and form their DSS data. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-30 4.8 Data Warehousing Any EIS requires having good summarized data in different forms. Traditional EIS and DSS application often failed because the underlying data were difficult or impossible to access. Also, manipulation of the data has traditionally been a part of the EIS, instead of a preliminary data preparation activity. Data warehousing (information warehousing): solves the data access problem Data warehouse combines various data sources into a single resource for end-user access End users perform ad hoc query, reporting analysis and visualization Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-31 4.8 Data Warehousing Data warehousing involves combining a variety of technology vendor’s product into an integrated solution. There can be several data warehouse in one company. DW benefits: Increase in knowledge worker productivity Supports all decision makers’ data requirements Provide ready access to critical data Insulates operation databases from ad hoc processing Provides high-level summary information Provides drill down capabilities – Improved business knowledge – Competitive advantage Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-32 4.8 Data Warehousing Enhances customer service and satisfaction – Facilitates decision making – Help streamline business processes DW Architecture and Process two-tier architecture three-tier architecture (figure 4.3) Data from internal (legacy) sources and external sources are extracted, scrubbed, filtered, and summarized via special software before insertion in the data warehouse. The data then are processed again and deposited in an additional special MD database (3 tiers), organized for easy MD presentation. The DSS and EIS users can query the new server and perform analysis. In a two-tier architecture, there is no MD database or server. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-33 Data Warehouse Architecture and Process Repository EIS Legacy System DB Server EIS/DSS Server DSS External System Data warehouse MultiD DB EIS Three tiers Data Warehouse Architecture by McFadden and Watson [1996] Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-34 4.8 Data Warehousing Data Warehouse Components Fall, 2006 Large physical database: This is an actual, physical database into which all the data for the data warehouse are gathered, along with the metadata and the processing logic used to scrub, organize, package, and preprocess the data for end-user access. Logical data warehouse: contains all the metadata, business rules, and processing logic required to scrub, organize, package, and preprocess the data. In addition, it contains the information required to find and access the actual data, wherever they actually reside. All right Reserved YAO Zhong, School of E&M, BUAA 4-35 4.8 Data Warehousing Fall, 2006 Data mart: A data mart is a subset of the enterprise-wide data warehouse. Typically it performs the role of a departmental, regional, or functional data warehouse. As part of the iterative data warehouse process, the organization builds a series of data marts over time and eventually links them via an enterprise-wide logical data warehouse. The Metadata* : Data about data stored within the DW. Decision support systems (DSS) and executive information system (EIS): These are not data warehouse but applications that use the data warehouse. All right Reserved YAO Zhong, School of E&M, BUAA 4-36 4.8 Data Warehousing Data Warehouse Suitability: Data warehousing is most appropriate for organizations where: Data are stored in different systems. An information-based approach to management is in use. There is large, diverse customer base. The same data are represented differently in different systems. Data are stored in highly technical, difficult to decipher formats. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-37 4.8 Data Warehousing Characteristics of Data Warehousing 1. Data organized by detailed subject with information relevant for decision support 2. Integrated data :data in different locations may be enclosed differently. For example, gender data may be encoded 0 and 1 in one place, and m and f in another. DB2, ORACLE, Informix, Sybase, SQL Server, Access, etc. all are integrated into one DW. 3. Time-variant data: for 5-10 years data can used for tends, forecasting, and comparison. 4. Non-volatile data: Once entered into DW, data are not changed or updated. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-38 4.9 OLAP: Data Access and Mining, Querying, and Analysis Online analytical processing (OLAP) – DSS and EIS computing done by end-users in online systems – Versus online transaction processing (OLTP) OLTP is performed by end-users, whereas OLAP is done by IS professionals. OLAP’s activities are generating queries, requesting ad hoc reports, conducting statistical analyses, and building multimedia applications. To facilitate OLAP, data warehouse is necessary. OLTP with database. Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-39 4.9 OLAP: Data Access and Mining, Querying, and Analysis OLAP Activities Generating queries Requesting ad hoc reports Conducting statistical and other analyses Developing multimedia applications Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-40 4.9 OLAP: Data Access and Mining, Querying, and Analysis Corporate database OLAP Server Spreadsheets Statistical? ?package ? Web-enabled OLAP Software Fall, 2006 Multidimensional Database Client PCs Data are retrieved from corporate database and staged in an OLAP ? ? ? Multidimensional database for retrieval by front-end systems All right Reserved YAO Zhong, School of E&M, BUAA 4-41 4.9 OLAP: Data Access and Mining, Querying, and Analysis OLAP uses the data warehouse and a set of tools, usually with multidimensional capabilities Query tools Spreadsheets Data mining tools Data visualization tools Two types realized modes – – Fall, 2006 MOLAP (Multidimensional OLAP): EXPRESS (MIT) and System W (Comshare) ROLAP: (Relational OLAP): Metaphor All right Reserved YAO Zhong, School of E&M, BUAA 4-42 4.9 OLAP: Data Access and Mining, Querying, and Analysis – Using SQL for Querying – SQL (Structured Query Language) Data language English-like, nonprocedural, very user friendly language Free format Example: SELECT FROM WHERE Fall, 2006 Name, Salary Employees Salary >2000 All right Reserved YAO Zhong, School of E&M, BUAA 4-43 4.10 Data Mining Data mining is a term used to describe knowledge discovery in databases, knowledge extraction, data archaeology, data exploration, data pattern processing, data dredging, information harvesting, and software. Data Mining Applications: – – Fall, 2006 Knowledge discovery in databases Knowledge extraction Data archeology Data exploration Data pattern processing Data dredging Information harvesting All right Reserved YAO Zhong, School of E&M, BUAA 4-44 4.10 Data Mining Major Data Mining Characteristics and Objectives Data are often buried deep Client/server architecture Sophisticated new tools--including advanced visualization tools--help to remove the information “ore” End-user miner empowered by data drills and other power query tools with little or no programming skills Often involves finding unexpected results Tools are easily combined with spreadsheets, etc. Parallel processing for data mining Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-45 4.10 Data Mining Fall, 2006 Data Mining Application Areas Marketing (Retailing and sales) Banking e-Commerce Manufacturing and production Brokerage and securities trading Insurance Computer hardware and software Government and defense Airlines Health care Broadcasting Law enforcement All right Reserved YAO Zhong, School of E&M, BUAA 4-46 4.10 Data Mining Intelligent Data Mining Use intelligent search to discover information within data warehouses that queries and reports cannot effectively reveal Fall, 2006 Find patterns in the data and infer rules from them Use patterns and rules to guide decision making and forecasting Five common types of information that can be yielded by data mining: 1) association, 2) sequences, 3) classifications, 4) clusters, and 5) forecasting All right Reserved YAO Zhong, School of E&M, BUAA 4-47 4.10 Data Mining Main Tools Used in intelligent Data Mining Case-based Reasoning: Using historical cases, the case-base reasoning approach can be used to recognize patterns. Neural Computing: Neural computing is a machine learning approach by which historical data can be examined for pattern recognition. Thus, one can go through large databases identify potential customers of a new product. (a brief introduction, doesn’t requirement ) Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-48 4.10 Data Mining Intelligent Agents: one of the most promising approaches to retrieving information from databases, especially external ones, is the use of intelligent agents. As vast amounts of information are becoming available through the Internet, finding the right information is becoming more difficult. IA is an autonomous agent, which is a system situation within and a part of an environment that senses that environment and acts in on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future. Other Tools • Decision trees • Rule induction • Data visualization Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-49 4.10 Data Mining Often used technologies in data mining Associate rules Clustering Artificial Neural Networks Fall, 2006 Decision trees Multivariate regression All right Reserved YAO Zhong, School of E&M, BUAA 4-50 4.11 Data Visualization and Multidimensionality Data Visualization Technologies Fall, 2006 Digital images Geographic information systems Graphical user interfaces Multidimensions Tables and graphs Virtual reality Presentations Animation All right Reserved YAO Zhong, School of E&M, BUAA 4-51 4.11 Data Visualization and Multidimensionality Multidimensionality Fall, 2006 3-D + Spreadsheets (OLAP has this) Data can be organized the way managers like to see them, rather than the way that the system analysts do Different presentations of the same data can be arranged easily and quickly Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry Measures: money, sales volume, head count, inventory profit, actual versus forecast Time: daily, weekly, monthly, quarterly, or yearly All right Reserved YAO Zhong, School of E&M, BUAA 4-52 4.11 Data Visualization and Multidimensionality Multidimensionality Limitations – Extra storage requirements – Higher cost – Extra system resource and time consumption – More complex interfaces and maintenance Multidimensionality is especially popular in executive information and support systems Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-53 4.12 Geographic Information Systems (GIS) A computer-based system for capturing, storing, checking, integrating, manipulating, and displaying data using digitized maps Spatially-oriented databases Useful in marketing, sales, voting estimation, planned product distribution Available via the Web Can use with GPS Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-54 4.13 Virtual Reality An environment and/or technology that provides artificially generated sensory cues sufficient to engender in the user some willing suspension of disbelief Can share data and interact Can analyze data by creating a landscape Useful in marketing, prototyping aircraft designs VR over the Internet through VRML Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-55 4.14 Business Intelligence (BI) on the Web Concept: (Greene, 1996) (1) processed information (data>Information-> Intelligence); (2) Decision maker makes decisions; (3) The impact of the environment faced by the business on the business operations. Architecture: (1) Data Warehouse; (2) Analysis Tools: OLAP and Data Mining; (3) EIS. Functionality: BI Can capture and analyze data from Web Tools deployed on Web – – Fall, 2006 IBM the database for the web generation, Building and managing DB2 Warehouse and Gaining Insights CA: eBusiness Intelligence All right Reserved YAO Zhong, School of E&M, BUAA 4-56 Summary Fall, 2006 Data for decision making come from internal and external sources The database management system is one of the major components of most management support systems Familiarity with the latest developments is critical Data contain a gold mine of information if they can dig it out Organizations are warehousing and mining data Multidimensional analysis tools and new enterprisewide system architectures are useful OLAP tools are also useful All right Reserved YAO Zhong, School of E&M, BUAA 4-57 Summary (cont’d.) New data formats for multimedia DBMS Internet and intranets via Web browser interfaces for DBMS access Built-in artificial intelligence methods in DBMS References recommendation: – Building the Data Warehouse W.H.Inmon 机械工业出版社 – 数据仓库技术及联机分析处理 王珊等编著 科学出版社 Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-58 Individual Assigment Surfing on the Internet to check what the data/information sources can be provided by each of the data banks in our class discussed (Section 4.5). What is Data Warehouse, Data Mining Technology, Intelligent Agents, BI? Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-59 Supplement Materials about the data warehouse Fall, 2006 All right Reserved YAO Zhong, School of E&M, BUAA 4-60