Supporting Decision Making
A Framework for IS Management

Introduction (2)
* Most computer systems support decision making because all software programs involve automating decision steps that people would take
* Decision making is a process that involves a variety of activities, most of which handle information
* A wide variety of computer-based tools and approaches can be used to confront the problem at hand and work through its solution

Introduction (3)
* Computer technologies that support decision making
  * Decision support systems (DSSs)
  * Data mining
  * Executive information systems (EISs)
  * Expert systems (ESs)
  * Agent-based modeling
* Multidisciplinary foundations for DS technologies
  * Database research, artificial intelligence, statistical inference, human-computer interaction, simulation methods, software engineering, etc.

Case Example: A Problem-Solving Scenario
* Using an EIS to discover a sales shortfall in one region
* Investigate several possible causes
  * Economic conditions
  * Competitive analysis
  * Written sales reports
  * A data mining analysis
* Result: no clear problems revealed

Decision Support Systems: History
* Two contributing areas of research in the 1950s-1960s
  * Organizational decision making at CMU
  * Interactive computer systems at MIT
* Middle 1970s: single-user and model-oriented DSS
* Middle and late 1980s: EIS, GDSS, ODSS
* 1990s: data warehousing and OLAP
* Late 1990s-2000s
  * Data mining
  * Web-based analytical applications

What Is a DSS?
* A DSS aims to use IT to relieve humans of some decision making or to help us make more informed decisions
* Systems that support, not replace, managers in their decision-making activities
* DSSs are defined as:
  * Computer-based systems
  * That help decision makers
  * Confront ill-structured problems
  * Through direct interaction
  * With data and analysis models

DSS Architecture (1)

DSS Architecture (2)
* The Dialog Component
  * Linking the user to the system
* The Data Component
  * Data sources: use all the important data sources within and outside the organization, in the form of summarized data (DW & DM)
* The Model Component
  * Models provide the analysis capabilities for a DSS
  * Using a mathematical representation of the problem, algorithmic processes are employed to generate information to support decision making

A Taxonomy of DSS
* Using the mode of assistance as the criterion
  * A model-driven DSS
  * A communication-driven DSS
  * A data-driven (or data-oriented) DSS
  * A document-driven DSS
  * A knowledge-driven DSS

Executive Information System (1)
* The emphasis of an EIS is on graphical displays and easy-to-use user interfaces
* An EIS can be viewed as a DSS that:
  * Provides access to summary performance data
  * Uses graphics to display and visualize the data in an easy-to-use fashion, and
  * Has a minimum of analysis for modeling beyond the capability to "drill down" in summary data to examine components

Executive Information System (2)
* EISs aim to provide both internal and external information relevant to meeting the strategic goals of the organization
  * Gauge company performance
  * Scan the environment
* EIS and data warehousing technologies are converging in the marketplace
* The term EIS has lost popularity in favor of Business Intelligence

Data Mining: Motivations
* The explosive growth of data: from TB to PB
* Data collection and data availability
  * Automated data collection tools, database systems, Web, computerized society
* Major sources of abundant data
  * Business: Web, e-commerce, transactions, stocks, …
  * Science: remote sensing,
    bioinformatics, …
  * Society and everyone: news, digital cameras, YouTube
* We are drowning in data, but starving for knowledge!
* “Necessity is the mother of invention”: data mining, the automated analysis of massive data sets

What Is Data Mining?
* Data mining (knowledge discovery from data)
  * Extraction of interesting patterns or knowledge from huge amounts of data
* Alternative names
  * Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
* Watch out: Is everything “data mining”? No, for example:
  * Simple search and query processing
  * (Deductive) expert systems

Knowledge Discovery (KDD) Process
* Data mining: the core of the knowledge discovery process
* [Figure: Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]

Architecture: A Typical Data Mining System
* [Figure, top to bottom: Graphical User Interface; Pattern Evaluation; Data Mining Engine; Database or Data Warehouse Server; data cleaning, integration, and selection; data sources (Database, Data Warehouse, World-Wide Web, Other Info Repositories); with a Knowledge Base alongside]

Data Mining: Confluence of Multiple Disciplines
* [Figure: Data Mining at the center, drawing on Database Technology, Statistics, Machine Learning, Pattern Recognition, Algorithm, Visualization, and Other Disciplines]

Why Not Traditional Data Analysis?
* Tremendous amount of data
  * Algorithms must be highly scalable to handle TB of data
* High dimensionality of data
  * Micro-array may have tens of thousands of dimensions
* High complexity of data
  * Data streams and sensor data
  * Time-series data, temporal data, sequence data
  * Structured data, graphs, social networks and multi-linked data
  * Heterogeneous databases and legacy databases
  * Spatial, spatiotemporal, multimedia, text and Web data
  * Software programs, scientific simulations
* New and sophisticated applications

Multi-Dimensional View of Data Mining (1)
* Data to be mined
  * Relational, data warehouse, transactional, stream, object-oriented/relational, active, spatial, time-series, text, multi-media, heterogeneous, legacy, WWW
* Knowledge to be mined
  * Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc.
  * Multiple/integrated functions and mining at multiple levels

Multi-Dimensional View of Data Mining (2)
* Techniques utilized
  * Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, etc.
* Applications adapted
  * Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.

Data Mining Functionalities (1)
* Multidimensional concept description: characterization and discrimination
* Frequent patterns, association, correlation vs. causality
* Generalize, summarize, and contrast data characteristics, e.g., dry vs.
  wet regions
* Diaper -> Beer [0.5%, 75%] (an association rule with support 0.5% and confidence 75%)
* Classification and prediction
  * Construct models (functions) that describe and distinguish classes or concepts for future prediction
  * E.g., classify countries based on climate, or classify cars based on gas mileage
  * Predict some unknown or missing numerical values

Data Mining Functionalities (2)
* Cluster analysis
  * Class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
  * Maximizing intra-class similarity and minimizing inter-class similarity
* Outlier analysis
  * Outlier: a data object that does not comply with the general behavior of the data
  * Noise or exception? Useful in fraud detection and rare events analysis
* Trend and evolution analysis
  * Trend and deviation: e.g., regression analysis
  * Periodicity analysis

Major Issues in Data Mining (1)
* Mining methodology
  * Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
  * Performance: efficiency, effectiveness, and scalability
  * Pattern evaluation: the interestingness problem
  * Incorporation of background knowledge
  * Handling noise and incomplete data
  * Parallel, distributed and incremental mining methods
  * Integration of the discovered knowledge with existing knowledge: knowledge fusion

Major Issues in Data Mining (2)
* User interaction
  * Data mining query languages and ad-hoc mining
  * Expression and visualization of data mining results
  * Interactive mining of knowledge at multiple levels of abstraction
* Applications and social impacts
  * Domain-specific data mining and invisible data mining
  * Protection of data security, integrity, and privacy

Artificial Intelligence (1)
* AI is a group of technologies that attempts to mimic our senses and emulate certain aspects of human behavior such as reasoning and communication
* 1956, a conference at Dartmouth College
  * John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon (MIT, CMU and Stanford)
* 1965, H. A.
  Simon: "machines will be capable, within twenty years, of doing any work a man can do"
* 1967, Marvin Minsky: "Within a generation ... the problem of creating 'artificial intelligence' will substantially be solved"
* Heavily funded by DARPA

Artificial Intelligence (2)
* Early researchers had failed to recognize the difficulty of some of the problems they faced:
  * The lack of raw computing power
  * The intractable combinatorial explosion of their algorithms
  * The difficulty of representing commonsense knowledge and doing commonsense reasoning
  * The incredible difficulty of perception and motion
  * The failings of logic
* First AI Winter
  * In 1974, DARPA cut off all undirected, exploratory research in AI

Artificial Intelligence (3)
* In the early 80s, the field was revived by the commercial success of expert systems
* By 1985 the market for AI had reached more than a billion dollars
* Minsky and others warned the community that enthusiasm for AI had spiraled out of control and that disappointment was sure to follow
* Second AI Winter
  * The collapse of the Lisp Machine market in 1987

Artificial Intelligence (4)
* In the 90s AI achieved its greatest successes
* Artificial intelligence was adopted throughout the technology industry, providing the heavy lifting for
  * Data mining
  * Logistics
  * Medical diagnosis
  * …

Expert System
* An expert system is an automated type of analysis or problem-solving model that deals with a problem the way an "expert" does
* The process involves consulting a base of knowledge or expertise to reason out an answer based on the characteristics of the problem

Architecture of an ES
* [Figure: the User gives a description of a problem to the User Interface and receives advice and explanation; the User Interface works with the Inference Engine, which consults the Knowledge Base]

Knowledge Representation
* In AI, the primary aim of knowledge representation is to store knowledge so that programs can process it and achieve the verisimilitude of human intelligence
* The representation theory has its origin in cognitive science
* Knowledge can be represented in a number of ways
  * Case-based reasoning
  * Artificial
    neural networks
  * Stored as rules

Case-based Reasoning (1)
* Case-based reasoning
  * The process of solving new problems based on the solutions of similar past problems
  * A case consists of a problem, its solution, and, typically, annotations about how the solution was derived

Case-based Reasoning (2)
* Case-based reasoning as a four-step process
  * Retrieve: given a target problem, retrieve cases from memory that are relevant to solving it
  * Reuse: map the solution from the previous case to the target problem
  * Revise: test the new solution and, if necessary, revise it
  * Retain: after the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory

Supervised vs. Unsupervised Learning
* Supervised learning
  * Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
  * New data is classified based on the training set
* Unsupervised learning
  * The class labels of the training data are unknown
  * Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data

Artificial Neural Network (1)
* An interconnected group of artificial neurons
* Uses a mathematical or computational model for information processing based on a connectionist approach to computation
* An adaptive system that changes its structure based on external or internal information that flows through the network
* ANNs can be used to model complex relationships between inputs and outputs or to find patterns in data
* Non-linear statistical data modeling or decision-making tools

Artificial Neural Network (2)
* Training set:
  * (1) high salary, owns a house, has a dog, [profitable customer]
  * (2) less than 3 years on job, prior bankruptcy, owns a dog, [deadbeat]
  * ......

Rule-based Systems (1)
* Knowledge stored as rules
* The most commonly used form of rules is the if-then statement, e.g.
  IF some condition THEN some action
* A rule-based inference model: decision tree
  * Each internal node (non-leaf node) denotes a test on an attribute
  * Each branch represents an outcome of the test
  * Each leaf node holds a class label

Rule-based Systems (2)
* Training dataset for decision tree buys_computer

  age     income  student  credit_rating  buys_computer
  <=30    high    no       fair           no
  <=30    high    no       excellent      no
  31…40   high    no       fair           yes
  >40     medium  no       fair           yes
  >40     low     yes      fair           yes
  >40     low     yes      excellent      no
  31…40   low     yes      excellent      yes
  <=30    medium  no       fair           no
  <=30    low     yes      fair           yes
  >40     medium  yes      fair           yes
  <=30    medium  yes      excellent      yes
  31…40   medium  no       excellent      yes
  31…40   high    yes      fair           yes
  >40     medium  no       excellent      no

Rule-based Systems (3)
* Decision tree buys_computer

  age?
    <=30: student?
      no: no
      yes: yes
    31..40: yes
    >40: credit rating?
      excellent: no
      fair: yes

Agent-based Modeling
* Simulate the behavior that emerges from the decisions of a large number of distinct individuals
* Computer-generated agents, each making decisions typical of the decisions an individual would make in the real world
* Trying to understand the mysteries of why businesses, markets, consumers, and other complex systems behave as they do

Toward the Real-Time Enterprise
* The essence of the phrase real-time enterprise is that organizations can know how they are doing at the moment
* Digitization and automation of some crucial enterprise activities traditionally completed by people
  * Especially information analysis
* Better sense-and-response

Real-time Reporting
* Real-time reporting is occurring on a whole host of fronts including:
  * Enterprise nervous systems: to coordinate company operations
    * A network that connects people, applications and devices
  * Straight-through processing: to reduce distortion in supply chains
  * Real-time CRM: to automate decision making relating to customers, and
  * Communicating objects: to gain real-time data about the physical world, e.g.
    radio-frequency identification (RFID) devices

The Dark Side of Real Time
* Object-to-object communication could compromise privacy
  * Knowing the exact location of a company truck every minute of the day is an invasion of the driver's privacy
* In the era of speed, a situation can become very bad very fast
  * E.g., a "circuit breaker" to stop deep dives on the NYSE
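
Worked example: association rule measures
The bracketed pair in a rule such as Diaper -> Beer [0.5%, 75%] (Data Mining Functionalities) is conventionally read as [support, confidence]. The sketch below shows how the two numbers could be computed. The basket list and the function names are hypothetical and chosen only for illustration; they are not data from the slides.

```python
from fractions import Fraction

# Hypothetical toy transactions (assumption; not data from the slides).
TRANSACTIONS = [
    {"diaper", "beer", "milk"},
    {"diaper", "beer"},
    {"diaper", "beer", "bread"},
    {"diaper", "milk"},
    {"bread", "milk"},
    {"beer", "bread"},
    {"milk"},
    {"bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


rule_support = support({"diaper", "beer"}, TRANSACTIONS)
rule_confidence = confidence({"diaper"}, {"beer"}, TRANSACTIONS)
print(f"Diaper -> Beer [{float(rule_support):.1%}, {float(rule_confidence):.1%}]")
# prints: Diaper -> Beer [37.5%, 75.0%]
```

In this toy data, 3 of 8 baskets contain both items (support) and 3 of the 4 diaper baskets also contain beer (confidence). A real retail data set would typically show a much lower support, as in the slide's 0.5%.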
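
Worked example: the buys_computer decision tree as if-then rules
The decision tree from the Rule-based Systems slides can be written directly as nested if-then tests. The sketch below encodes the tree and checks it against the 14 training rows from the slide. The function name `buys_computer`, the tuple layout, and the spelling "31..40" for the middle age band are choices made for this illustration. The tree never tests `income`, so the function does not take it.

```python
# Training rows from the "Rule-based Systems (2)" slide:
# (age, income, student, credit_rating, buys_computer)
ROWS = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]


def buys_computer(age, student, credit_rating):
    """Walk the tree: each `if` is an internal node, each return a leaf."""
    if age == "<=30":
        # Internal node "student?"
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        # Leaf: every training row in this age band is labeled "yes"
        return "yes"
    # Remaining branch is age ">40": internal node "credit rating?"
    return "yes" if credit_rating == "fair" else "no"


# Count how many training rows the tree labels correctly.
correct = sum(
    1
    for age, _income, student, credit, label in ROWS
    if buys_computer(age, student, credit) == label
)
print(f"{correct} of {len(ROWS)} training rows classified correctly")
# prints: 14 of 14 training rows classified correctly
```

Perfect agreement with the training rows says nothing about how well the tree would generalize to unseen customers.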
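
Worked example: the four steps of case-based reasoning
The retrieve, reuse, revise, retain cycle from the Case-based Reasoning slides can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the `Case` class, the feature-matching similarity measure, the help-desk cases, and the `revise` rule are hypothetical, and real systems use richer similarity and adaptation methods.

```python
from dataclasses import dataclass


@dataclass
class Case:
    problem: dict    # feature name -> value
    solution: str
    notes: str = ""  # annotation about how the solution was derived


def similarity(stored, target):
    """Count the features on which two problem descriptions agree."""
    return sum(1 for key in stored if key in target and stored[key] == target[key])


def retrieve(memory, target):
    """Retrieve: pick the stored case most similar to the target problem."""
    return max(memory, key=lambda case: similarity(case.problem, target))


def solve(memory, target, revise):
    best = retrieve(memory, target)       # Retrieve
    proposed = best.solution              # Reuse: carry the old solution over
    final = revise(target, proposed)      # Revise: test and adjust if needed
    memory.append(                        # Retain: store the new experience
        Case(target, final, f"adapted from: {best.solution}")
    )
    return final


# Hypothetical help-desk case memory (assumption; not from the slides).
memory = [
    Case({"device": "printer", "symptom": "no power"}, "check the power cable"),
    Case({"device": "printer", "symptom": "paper jam"}, "open the tray and remove the paper"),
    Case({"device": "laptop", "symptom": "no power"}, "charge the battery"),
]


def revise(target, proposed):
    """Hypothetical revision rule: a cable fix does not apply to wireless devices."""
    if target.get("wireless") == "yes" and "cable" in proposed:
        return "replace the batteries"
    return proposed


answer = solve(memory, {"device": "printer", "symptom": "paper jam", "color": "yes"}, revise)
print(answer)
# prints: open the tray and remove the paper
```

After the call, the case memory holds four cases, so the next similar problem can be matched against the newly retained one.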