VIRTUAL PRESENCE

Authors:
Voislav Galić, [email protected]
Dušan Zečević, [email protected]
Đorđe Đurđević, [email protected]
Veljko Milutinović, [email protected]
http://galeb.etf.bg.ac.yu/~vm/tutorial

SUMMARY
- Introduction to Virtual Presence
- Data Mining for Virtual Presence
- A New Software Paradigm
- Selected Case Studies

INTRODUCTION TO VP
- Definitions
- VP applications
- Psychological aspects

DATA MINING FOR VP
- Definitions
- What can Data Mining do?
- Growing popularity of Data Mining
- Algorithms

SOFTWARE AGENTS
- A new software paradigm
- Standardization: FIPA specifications
- Agent management
- Agent Communication Language

CASE STUDIES
• GoodNews (CMU*)
  – Categorization of financial news articles
• iMatch (MIT**)
  – Helps students find the resources they need
  – Advanced, agent-based system architecture
• “Tourist city” in the future (ETF***)
  – Represents a qualitative step forward in maximizing customer satisfaction
  – Technologies:
    • Data Mining
    • Software Agents (mobile)
* Carnegie Mellon University, Pittsburgh, USA
** Massachusetts Institute of Technology, USA
*** Faculty of Electrical Engineering, University of Belgrade, Serbia and Montenegro

CONCLUSION
This tutorial will attempt to familiarize you with:
- The concept of VP (Virtual Presence) as a new technological challenge
- The new paradigms and technologies that will bring VP to everyday life:
  - Data Mining
  - Software Agents

INTRODUCTION
Virtual presence will arguably be one of the most important aspects of personal communication in the twenty-first century

Definition
Virtual presence is a term with various shades of meaning in different industries, but its essence remains constant: it is a new tool that enables some form of telecommunication in which individuals may substitute their physical presence with an alternate, typically electronic, presence

How to Accomplish It?
• The presence is accomplished through the Internet, video, or other communications, perhaps even psychically one day
• Technological advances will make virtual presence more sophisticated, altering the very meaning of the word “presence”
• The ability to conduct everyday tasks by being virtually or electronically present

VP Applications
• In government
  – “Sunshine laws”
  – Voting
• In business
  – Online board meetings
  – Online shareholder voting
• In education
  – Interactive lectures and courses
• In medicine
  – Telemedicine (diagnostics, remote surgery)
  – Risks (privacy)
• In everyday life
  – Telecommuting/telework
  – Software agents as our virtual “shadows”

Psychological Aspects
• Cyberspace and Mind
• Presence in Virtual Space

DATA MINING
Knowledge discovery is a non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data

Many Definitions
• Data mining is also called data or knowledge discovery
• It is a process of inferring knowledge from large oceans of data
• Search for valuable information in large volumes of data
• Analyzing data from different perspectives and summarizing it into useful information

What Can Data Mining Do?
• DM allows you to extract knowledge from historical data and predict outcomes of future situations
• Optimize business decisions and improve customers’ satisfaction with your services
• Analyze data from many different angles, categorize it, and summarize the relationships identified
• Reveal knowledge hidden in data and turn this knowledge into a crucial competitive advantage
• Predict cross-sell opportunities, make recommendations, etc.

The Power of Data Mining
• Having a database is one thing; making sense of it is quite another
• It does not rely on narrow human queries to produce results, but instead uses AI-related technology and algorithms
• Data mining usually produces more general (= more powerful) results than those obtained by traditional techniques
• Using more than one type of algorithm to search for patterns in data

Reasons for the Growing Popularity of Data Mining
• Growing data volume
• Low cost of machine learning
• Limitations of human analysis
…

Tasks Solved by Data Mining
• Predicting
• Classification
• Detection of relations
• Explicit modeling
• Clustering
• Market basket analysis
• Deviation detection
Data mining includes three major components, with corresponding algorithms:
– Clustering (Classification)
– Association Rules
– Sequential Analysis

Classification Algorithms
• Statistical algorithms
• Neural network algorithms
• Genetic algorithms
• Nearest neighbor method
• Rule induction
• Data visualization
• Decision tree building algorithms
• Parallel algorithms

Association Rule Algorithms
• An association rule implies a certain association relationship among the set of objects in a database
• These objects “occur together”, or “one implies the other”
• Formally: X ⇒ Y, where X and Y are sets of items (itemsets)
• Key terms
  – Confidence
  – Support
• The goal: to find all association rules that satisfy user-specified minimum support and minimum confidence constraints
• Apriori algorithm and its variations
• Distributed / parallel algorithms

Sequential Analysis
• Sequential patterns
• The problem: finding all sequential patterns with user-specified minimum support
• Elements of a sequential pattern need not be:
  – consecutive
  – simple items
• Algorithms for finding sequential patterns
  – “count-all” algorithms
  – “count-some” algorithms

Conclusion
• Various applications (market, banking, sports)
• Drawbacks of existing algorithms
  – Data size
  – Data noise
  – Query complexity
• The infrastructure has to be significantly enhanced to support larger applications
• Solutions
  – Adding extensive indexing capabilities
  – Using new HW architectures to achieve improvements in query time

THE NEW SOFTWARE PARADIGM
All software agents are programs, but not all programs are agents

Many Definitions
• Computational systems that inhabit some dynamic environment, sense and act autonomously, and realize a set of goals or tasks for which they are designed
• Hardware or (more usually) software-based computer system that enjoys the following properties:
  - Reactive (sensing and acting)
  - Autonomous
  - Goal-oriented (pro-active, purposeful)
  - Temporally continuous
  - Communicative (socially able)
  - Learning (adaptive)
  - Mobile
  - Flexible
  - Character

What Problems Do Agents Solve?
• The client/server network bandwidth problem
• Problems in the design of a client/server architecture
• The problems created by intermittent or unreliable network connections
• Attempts to get computers to do real thinking for us

The New Software Paradigm
• Unless special care has been taken in the design of the code, two software programs cannot interoperate
• The promise of agent technology is to move the burden of interoperability from software programmers to the programs themselves
• This can happen if two conditions are met:
  – A common language (Agent Communication Language, ACL)
  – An appropriate architecture
• Agents draw on and integrate many diverse disciplines of computer science and other areas

FIPA Specifications
• The Foundation for Intelligent Physical Agents (FIPA), established in 1996 in Geneva
• FIPA specifications:
  – Agent Management
  – Agent Communication Language
  – Agent/Software Integration
  – Agent Management Support for Mobility
  – Human-Agent Interaction
  – Agent Security Management
  – Agent Naming
  – FIPA Architecture
  – Agent Message Transport
  etc.
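
The “common language” condition can be made more concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a FIPA reference implementation: the class `AclMessage`, the helpers `reply_not_understood` and `handle`, the set `KNOWN_ACTS`, and the reason string `unknown-act` are all hypothetical names introduced here. Only the parenthesised message shape and parameter names such as :sender, :receiver, :content, :reply-with and :in-reply-to follow the conventions used by FIPA ACL.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative subset of communicative acts this toy agent accepts (an assumption).
KNOWN_ACTS = {"inform", "confirm", "disconfirm", "query-if",
              "query-ref", "refuse", "not-understood"}


@dataclass
class AclMessage:
    """Hypothetical container for an ACL-style message (not a FIPA library type)."""
    act: str
    sender: str
    receiver: str
    content: str
    language: Optional[str] = None
    ontology: Optional[str] = None
    reply_with: Optional[str] = None
    in_reply_to: Optional[str] = None

    def render(self) -> str:
        """Render the message as parenthesised text with :keyword parameters."""
        fields = [
            ("sender", self.sender),
            ("receiver", self.receiver),
            ("content", self.content),
            ("language", self.language),
            ("ontology", self.ontology),
            ("reply-with", self.reply_with),
            ("in-reply-to", self.in_reply_to),
        ]
        # Unset optional parameters are simply omitted from the output.
        body = " ".join(f":{name} {value}" for name, value in fields
                        if value is not None)
        return f"({self.act} {body})"


def reply_not_understood(msg: AclMessage, reason: str) -> AclMessage:
    """Build a not-understood reply addressed back to the original sender."""
    return AclMessage(
        act="not-understood",
        sender=msg.receiver,
        receiver=msg.sender,
        content=f"({msg.render()} ({reason}))",
        in_reply_to=msg.reply_with,
    )


def handle(msg: AclMessage) -> Optional[AclMessage]:
    """Return a not-understood reply for unknown acts, otherwise None."""
    if msg.act not in KNOWN_ACTS:
        return reply_not_understood(msg, "unknown-act")
    return None


query = AclMessage("query-if", "i", "j",
                   "(registered (server d1) (agent j))",
                   ontology="www", reply_with="r09")
print(query.render())
```

The design point the sketch tries to show is that two programs written independently can still interoperate if both agree on the message envelope and on a fallback (here, a not-understood reply) for anything outside the shared vocabulary.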

Agent Management
• Provides the normative framework within which FIPA agents exist and operate
• Establishes the logical reference model for the creation, registration, location, communication, migration and retirement of agents
  - The entities contained in the reference model are logical capability sets and do not imply any physical configuration
  - Additionally, the implementation details of individual APs (Agent Platforms) and agents are the design choices of the individual agent system developers

Components of the Model
• Agent
  - computational process
  - fundamental actor on an AP
  - as a physical software process has a life cycle that has to be managed by the AP
• Directory Facilitator
  - yellow pages services to other agents
  - supported functions are: register, deregister, modify, search
• Agent Management System
  - white pages services to other agents
  - maintains a directory of AIDs which contain transport addresses
  - supported functions are: register, deregister, modify, search, get-description, operations for underlying AP
• Message Transport Service
  - communication method between agents
• Agent Platform
  - physical infrastructure in which agents can be deployed
• Software
  - all non-agent, executable collections of instructions accessible through an agent

Agent Life Cycle
• FIPA agents exist physically on an AP and utilize the facilities offered by the AP for realising their functionalities
• In this context, an agent, as a physical software process, has a physical life cycle that has to be managed by the AP
The state transitions of agents can be described as:
- create
- invoke
- destroy
- quit
- suspend
- resume
- wait
- wake up
- move*
- execute*

Agent Communication Language
• The specification consists of a set of message types and the description of their meanings
• Requirements:
  – Implementing a subset of the pre-defined message types and protocols
  – Sending and receiving the not-understood message
  – Correct implementation of communicative acts defined in the specification
  – Freedom to use communicative acts with other names, not defined in the specification
  – Obligation of correctly generating messages in the transport form
  – Language must be able to express propositions, objects and actions
  – The use of Agent Management Content Language and ontology
• Pre-defined message parameters:
  :sender, :receiver, :content, :reply-with, :in-reply-to, :language, :ontology, :reply-by, :protocol
• Communicative acts:
  confirm, disconfirm, inform, not-understood, query-if, query-ref, refuse, etc.

Communication Examples
- Agent i asks agent j if j is registered with domain server d1:
    (query-if
      :sender i
      :receiver j
      :content (registered (server d1) (agent j))
      :reply-with r09)
- Agent j replies that it is not:
    (inform
      :sender j
      :receiver i
      :content (not (registered (server d1) (agent j)))
      :in-reply-to r09)
- Agent i, believing that agent j thinks that a shark is a mammal, attempts to change j's belief:
    (disconfirm
      :sender i
      :receiver j
      :content (mammal shark))
- Agent j refuses to reserve a ticket for i, since there are insufficient funds in i's account:
    (refuse
      :sender j
      :receiver i
      :content (
        (action j (reserve-ticket LHR, MUC, 27-sept-97))
        (insufficient-funds ac12345))
      :language sl)
- Agent i asks agent j for its available services:
    (query-ref
      :sender i
      :receiver j
      :content (iota ?x (available-services j ?x))
      …)
- Agent j replies that it can reserve trains, planes and automobiles:
    (inform
      :sender j
      :receiver i
      :content
        (= (iota ?x (available-services j ?x))
           ((reserve-ticket train)
            (reserve-ticket plane)
            (reserve automobile)))
      …)
- Agent i did not understand a query-if message because it did not recognize the ontology:
    (not-understood
      :sender i
      :receiver j
      :content ((query-if :sender j :receiver i …)
                (unknown (ontology www)))
      :language sl)
- Agent i confirms to agent j that it is, in fact, true that it is snowing today:
    (confirm
      :sender i
      :receiver j
      :content "weather( today, snowing )"
      :language Prolog)
- Auction bid:
    (inform
      :sender agent_X
      :receiver auction_server_Y
      :content (price (bid good02) 150)
      :in-reply-to round-4
      :reply-with bid04
      :language sl
      :ontology auction)

GoodNews
A system that automatically categorizes news reports that reflect positively or negatively on a company’s financial outlook

Introduction
• Correlation between news reports on a company’s financial outlook and its attractiveness as an investment
• Text categorization is a very difficult domain for the use of machine learning
  – Very large number of input features
  – High level of noise (metaphors, irony, …)
  – Large percentage of irrelevant features
• A new text classification algorithm: “Domain Experts”
• Two types of data
  – (Human-)labeled
  – Unlabeled
• The algorithm classifies financial news into five predefined categories
• FCP (Frequently Co-located Phrase): the building element for the categorization algorithm

Categorization
• The algorithm categorizes each given news article into the predefined categories
  – GOOD: strong and explicit evidence of the company’s financial status
    • …shares of ABC company rose 2 percent…
  – GOOD, UNCERTAIN: predictions and forecasts of future profitability
    • …ABC company predicts fourth-quarter earnings will be high…
  – NEUTRAL: nothing is mentioned about the financial well-being of the company
    • …ABC announced plans to focus on products based on recycled materials…
  – BAD, UNCERTAIN: predictions of future losses
    • …ABC announced today that fourth-quarter results could fall short of expectations…
  – BAD: explicitly bad evidence
    • …shares of ABC fell $0.57 to $44.65 in early NY trading…

Co-located Phrase
• The proposed algorithm labels the “unlabeled” news articles through a voting process among experts, which are FCPs
• Definition: a co-located phrase is a sequence of nearby, but not necessarily consecutive, words
  – …shares of ABC rose 8.5%… (shares, rose): GOOD
  – …ABC presented its new product… (present, product): NEUTRAL

class   selected FCP
+       “share & gains | rose”, “profit | revenue & rose”
+/?     “except | forecasts & earnings”
+/-     “alliance & company”, “deal | present & product”
-/?     “short & expectation”
-       “share & down | lost”, “profit | sales & decrease”

Conclusion
• Problems with construction of the training (i.e. labeled) data set
  – “Inter-indexer inconsistency”
• Problems with small sets of labeled (training) data
  – Labeled data are very expensive, while unlabeled data are cheaply available
• The accuracy is around 75% (on a total of 2000 news articles)
• Comparison of a few different methods (picture): Naive Bayes vs. Domain Experts

iMatch
The vision of each MIT student having a personal software agent that helps to manage its owner's academic life

Introduction
• The aim: bring together MIT students and staff who may usefully collaborate with each other
  – Completing final projects
  – Studying for exams
  – Tutoring one another
• Facilitate student and faculty matching for:
  – Research
  – Teaching
  – Internship

Ceteris Paribus Preference
• Ceteris paribus relations express a preference over sets of possible outcomes
• All possible outcomes are considered to be describable by some (large) set of binary features (true or false)
  – The specified features are instantiated to either true or false
  – Other features are ignored
• Examples: “I prefer train”, “I prefer ice cream”, “I prefer airplane”, “I prefer chocolate”, “I prefer cell phone”, “I prefer e-mail”

CPP Agent Configuration
• Specify a domain for preferences
  – Agent methods of communication and notification
  – Different security settings of different servers
• Preference statements themselves
  – How to let users easily adjust C.P. rules (graphical interface)
  – Pose hypothetical preference questions to help complete the preferences of an ambivalent user
• People will only put down their true profile if they know that the system is secure

Conclusion
• Benefit MIT students by matching them to appropriate resources
• Static interest matching
  – Group together similar users for a specific context
  – This enables viewing a human user as a resource for dynamic resource discovery (locate experts, enthusiasts, ...)
• Dynamic interest matching
  – Location- and/or time-specific resource matching
  – As students and their agents move from one physical location to another, iMatch services for matching the closest resources can be offered
• Help students manage their lives

The near future…
The focus of the research is on e-tourism after the year 2005, but the applications of the proposed infrastructure are manifold

Introduction
• The assumptions:
  – After the year 2005, each tourist in Europe will be equipped with a cell phone with computing power equal to or better than that of a Pentium IV
  – Whenever a tourism-based service or product is purchased, a mobile agent is assigned to that cell phone PC to monitor the behaviour of the customer
  – All tourist cell phone PCs create an ad hoc network around the tourist attractions, and link to a data mine that collects all information of interest

How to accomplish it?
• The information of interest is not collected by asking the customer to fill out forms, but by monitoring the behaviour of the customer
• The collected information, sorted in the data mine, is made available to other tourists as an on-line, owner-independent source of information about the given services and/or products

What can it do…
• If a tourist would like to know, at that very moment, which restaurant has good food and atmosphere and happy customers, he/she can access the data mine (via the Internet) and obtain information that is linked to that very moment and is created not by the owner of the business, but by the customers
• Accessing the given restaurant’s website has two drawbacks:
  – The information is not fresh; it is only periodically updated
  – The information is produced by the owner of the restaurant, and is therefore not completely objective

Conclusion
• Consequently, the proposed approach works better than relying on owner-maintained websites, and represents a qualitative step forward in maximizing customer satisfaction
• This may mean that the privacy of the customers is jeopardized. However, if the monitored behaviour is non-personalized, and if the customer obtains a discount in exchange for accepting mobile agents, privacy ceases to be an issue, and people will sign up voluntarily

THE END
Quatenus nobis denegatum diu vivere, relinquamus aliquid, quo nos vixisse testemur
(Since we are denied a long life, let us leave behind something to testify that we have lived)

References:
http://www.marconi.com
http://www.blueyed.com
http://www.fipa.org
http://www.rpi.edu
http://research.microsoft.com
http://imatch.lcs.mit.edu
…