CALO Project Research Progress
October 2008

Outline

Introduction
CALO System Architecture
Main Research Topics
• OAA
• SPARK
• IRIS
• PTIME
• SR/AR
Outlook

Introduction (1)

Project background
• DARPA, 2003: PAL (Perceptive Assistant that Learns, 2003~2008)
• SRI: CALO (Cognitive Assistant that Learns and Organizes)
  - The name comes from the Latin word "calonis", which means "soldier's servant".
Project goal
• The goal of the project is to create cognitive software systems, that is, systems that can reason, learn from experience, be told what to do, explain what they are doing, reflect on their experience, and respond robustly to surprise.

Introduction (2)

Research areas
• Artificial Intelligence, Machine Learning, Natural Language Processing, Knowledge Representation, Human-computer Interaction, Flexible Planning, and Behavioral Studies
Organization
• Stanford Research Institute International (SRI International), USA
• HTTP://www.ai.sri.com/project/CALO, HTTP://caloproject.sri.com/
• 22 research institutions, 250 researchers

Introduction (3), Introduction (4) [figure slides]

CALO System Architecture (1), CALO System Architecture (2) [figure slides]

CALO System Architecture (3)

ORGANIZE AND MANAGE INFORMATION
• By collecting many kinds of user information (email, calendar, files, projects, contacts, and so on), CALO learns a model of the latent relationships in the user's environment, which lays the foundation for higher-level learning.
PREPARE INFORMATION PRODUCTS
• CALO automatically packages project-related material such as emails, documents, and web pages for the user to use in meetings.
OBSERVE AND MEDIATE INTERACTIONS
• Covers email interaction, meeting interaction, and multimodal human-computer interaction.
  - Email interaction: summarizing and classifying messages and prioritizing replies.
  - Meeting interaction: annotating meeting records.
  - Multimodal human-computer interaction: combining speech, handwriting, pen gestures, GUI manipulation, and other modes.

CALO System Architecture (4)

MONITOR AND MANAGE TASKS
• Coordinates and manages complex tasks that involve multiple subsystems and participants.
SCHEDULE AND ORGANIZE IN TIME
• Helps the user plan a schedule, detects time conflicts and suggests resolutions, and negotiates meeting times with other people on the user's behalf. It also learns the user's habits and has adjustable autonomy (the degree to which the user takes part in scheduling decisions).
ACQUIRE AND ALLOCATE RESOURCES
• Discovers new information sources; learns and reasons about roles and expertise.

Core Technologies (bottom-up)

OAA
SPARK
IRIS
PTIME
SR/AR

OAA (1)

OAA (Open Agent Architecture)
http://www.openagent.com
http://www.ai.sri.com/oaa/

A Case

Scenario:
• Perrault tells the CALO system through a microphone: "Notify me immediately when an email about security arrives."
• Cheyer writes an email with the subject "security alert" to Perrault.
• Perrault receives a phone call in his office; a voice prompt tells him that a new email has arrived and asks him to enter his password.
• After Perrault enters the password on the phone keypad, the system plays the content of the email over the phone.
DEMO

Collaboration Process (1) to (6) [figure slides]

OAA (2)

Characteristics [Martin, AAI99] [Cheyer, AAMAS01]
• Open
  - Agents can be created in many languages and interface with existing systems.
• Extensible
  - Agents can be added or replaced on the fly.
• User friendly
  - High-level, natural expression of delegated tasks.
• Developer friendly
  - Unified approach to service provision, data management, and task monitoring.
• Multimodal
  - Handwriting, speech, gestures, and direct manipulation can be combined.
• Reusable
  - Unanticipated sharing across many applications.

OAA (3)

ICL (Interagent Communication Language)
• A conversational protocol layer defined by event types, similar to KQML.
• A content layer consisting of the specific goals, triggers, and data elements, similar to KIF.
• Based on an extension of the Prolog language.
Event
• All communication between agents occurs in the form of events.
Trigger
• Provides a general mechanism for specifying an action to be taken when some set of conditions is met.

OAA (4)

Facilitation
• Delegation, optimization, interpretation
Declarations of solvables
• solvable(GoalTemplate, Parameters, Permissions)
• solvable(send_message(email, +ToPerson, +Params), [type(procedure), callback(send_mail)], [])
• solvable(last_message(email, -MessageId), [type(data), single_value(true)], [write(true)])

SPARK (1)

SPARK (SRI Procedural Agent Realization Kit)
http://www.ai.sri.com/~spark/
• Builds on PRS and shares the same Belief Desire Intention (BDI) model of rationality.
• Supports the construction of large-scale, practical agent systems, and contains sophisticated mechanisms for encoding and controlling agent behavior.
• Has a well-defined semantic model that is intended to support reasoning about the agents' knowledge and execution.
There is a need for agent systems that can scale to real world applications, yet retain the clean semantic underpinning of more formal agent frameworks. [Morley, AAMAS04] [Morley, AAAI04]

SPARK (2)

Overall Architecture for a SPARK Agent [figure slide]

SPARK (3)

Belief
• A knowledge base (KB) of beliefs about the world and the agent itself, updated both by sensory input from the external world and by internal events.
Procedures
• Provide declarative representations of activities for responding to events and for decomposing complex tasks into simpler tasks.
Intentions
• At any given time the agent has a set of intentions, which are the procedure instances that it is currently executing.

SPARK (4)

Executor
• The core of SPARK. Its role is to manage the execution of intentions.
• It does this by repeatedly selecting one of the current intentions to process and performing a single step of that intention.
• Steps generally involve activities such as performing tests on and changing the KB, adding tasks, decomposing tasks by applying procedures, or executing primitive actions.

IRIS (1)

IRIS: Integrate. Relate. Infer. Share.
http://www.openiris.org/
• Semantic Desktop [Cheyer, Semantic Web05]
• CALO is an artificial intelligence application for which IRIS serves as the semantic desktop user interface.
Integrate
• Information resources
• A knowledge base
• User interface framework

IRIS (2) [figure slide]

IRIS (3)

Relate
• IRIS is used to semantically integrate the tools of knowledge work.
• CLib (the Component Library Specification)
  - CALO's ontology.
  - Consists of definitions for everyday objects and events.
  - Uses OWL as the data representation.

IRIS (4)

Infer
• One of the key differentiators of IRIS, compared to many semantic desktop systems, is the emphasis on machine learning and the implementation of a plug-and-play learning framework.
• A typical use case
  - Email Harvesting
  - Contact/Expertise Discovery
  - Learn from Files
  - Project Creation
  - Classification According to Project
  - Higher-level Reasoning

IRIS (5)

Share
• Shared structures are essential both for end-user applications, such as team decision making and project management, and for infrastructural components such as machine learning algorithms, which improve when given larger data sets to work on.

PTIME (1)

PTIME (Personalized Time Management) [Berry, AAAI05]
• PTIME will unobtrusively learn user preferences through a combination of passive learning, active learning, and advice-taking.
• As a result, over time the user will become more confident in PTIME's ability and will let it make more decisions autonomously.
• As autonomy increases, PTIME will learn when to involve the user in its decisions.

PTIME (2) [Berry, AAMAS06] [figure slide]

PTIME (3)

Three components of PTIME
• Process Controller (the heart of PTIME)
  - A SPARK agent that captures possible interactions.
  - Manages PTIME's processes, tasking and coordinating the activities of the Constraint Reasoner and the Preference Learner.
• Constraint Reasoner
  - Explores conflict resolution options using relaxation, event bumping, and explanation techniques.
• Preference Learner
  - An unobtrusive, online learner in which the user's selections from suggested alternatives provide feedback to the learning algorithm.

PTIME (4)

Research Directions [Berry, AAAI05]
• Soft CSP design [Venable, IJCAI05]
  - Simple Temporal Problem (STP)
  - Disjunctive Temporal Problem (DTP)
  - Simple Temporal Problem with Uncertainty (STPU)
  - Disjunctive Temporal Problem with Uncertainty (DTPU)
• Negotiation: process design for conflict resolution
• Learning for adjustable autonomy

SR/AR (1)

SR/AR (Situation Assessment / Activity Recognition) [Hung 05]
• Empower CALO with the ability to interpret and make sense of what is going on in its environment.
Technical Challenges
• Large, dynamic, and relational state space.
• Large sources of temporal and multi-modal data.
• Semantic gaps, uncertainty.
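
The challenges listed above (temporal data, uncertainty) are the usual setting for hidden Markov models in activity recognition. As a rough illustration only, and not CALO's implementation, the Python sketch below runs forward filtering on a two-state HMM: after each observation it updates the probability of each hidden activity. The activity names, observation names, and every probability are made-up assumptions.

```python
def forward_filter(observations, states, start_p, trans_p, emit_p):
    """Return P(state_t | obs_1..obs_t) for each time step t (HMM forward filtering)."""
    beliefs = []
    prev = None
    for obs in observations:
        unnormalized = {}
        for s in states:
            if prev is None:
                # First step: use the initial state distribution.
                predicted = start_p[s]
            else:
                # Later steps: push the previous belief through the transition model.
                predicted = sum(prev[r] * trans_p[r][s] for r in states)
            # Weight the prediction by how well state s explains the observation.
            unnormalized[s] = predicted * emit_p[s][obs]
        total = sum(unnormalized.values())
        if total == 0.0:
            raise ValueError("observation sequence has zero probability under the model")
        prev = {s: p / total for s, p in unnormalized.items()}
        beliefs.append(prev)
    return beliefs


# Hypothetical toy model: hidden activity vs. the application seen on the desktop.
STATES = ("email", "coding")
START = {"email": 0.5, "coding": 0.5}
TRANS = {
    "email": {"email": 0.8, "coding": 0.2},
    "coding": {"email": 0.2, "coding": 0.8},
}
EMIT = {
    "email": {"mail_client": 0.9, "editor": 0.1},
    "coding": {"mail_client": 0.2, "editor": 0.8},
}

beliefs = forward_filter(["mail_client", "editor", "editor"], STATES, START, TRANS, EMIT)
for step, belief in enumerate(beliefs, start=1):
    best = max(belief, key=belief.get)
    print(step, best, round(belief[best], 3))
```

Each step costs time quadratic in the number of states, which is one reason fast inference in large HMMs is a research topic in its own right.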
SR/AR (2)

Research Work
• T1: Methods for state estimation in relational domains, including dealing with an unknown number of objects and their identity, relevance determination, and focus of attention.
• T2: Methods for inference and learning in continuous-time complex dynamic processes.
• T3: Methods for active learning, strategic user querying, and fast inference in large HMMs.
• T4: Methods for learning and recognizing hierarchical activity models from desktop activity traces.
• T5: Methods for location-based activity recognition.

SR/AR (3)

Research Work
• T6: Methods for learning and recognizing activities, gestures, and relevant objects from low-level physical sensors.
• T7: Methods for state estimation in communicative activities.
• T8: Methods for tracking the progress of the CALO plan, including possible failures and missed deadlines.

SR/AR (4)

T1: Situation assessment in relational domains
• Develop a language for representing domains in which the number of objects and their identity are unknown: BLOG (Bayesian LOGic) and DBLOG (Dynamic BLOG).
• Propose an approach based on probabilistic relational models that does not insist on a complete propositionalization of the domain at inference time.
T2: Continuous-time modeling in complex dynamic processes
• From DBN to CTBN (Continuous Time Bayesian Network).

SR/AR (5)

T3: Active learning, strategic user querying, and fast inference in large HMMs
• Have implemented active learning for HMMs and obtained promising results on user activity data from an instrumented desktop.
• Will extend these results to general graphical models, including DBNs.
T4: Learning and recognizing the user's activities from desktop traces
• Typical user activities have an inherent hierarchical structure.
• The main challenge for CALO is to chain related events together and infer the hidden sub-activity and the high-level activity.
• Efficient inference algorithms and a semi-supervised learning approach for abstract and hierarchical hidden Markov models, combined with continuous time Bayesian networks.

SR/AR (6)

T5: Location-based activity recognition
• Develop techniques that can reliably estimate location (location information is extracted from WiFi signal strength).
• Develop methods for learning and inferring higher-level patterns of movement and activities from the data generated by a location-aware CALO.
• From RMNs (Relational Markov Networks) to RFGs (Relational Factor Graphs).
T6, T7 and T8
• HHMM (Hierarchical Hidden Markov Models) [Nguyen, CVPR05]
• ProPL (Probabilistic Process Language)

SR/AR (7) [figure slide]

Outlook (1)

Transfer Learning [Dietterich 05]
• Replacing an employee
  - Employee A is leaving an organization and being replaced by employee B. Can B's CALO demonstrate transfer based on learning that took place in A's CALO?
• Moving to a new job
  - An employee leaves organization A and moves to a new organization B. Can his CALO demonstrate transfer learning from experiences in A to capabilities in B?
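
One way to picture the "replacing an employee" question is to let B's CALO start from a discounted version of what A's CALO observed. The Python sketch below is only an illustration under stated assumptions, not the method used in CALO: it treats a single hypothetical quantity (the rate at which a user accepts a suggestion) as a Beta-Bernoulli estimate, and the function names, the counts, and the discount `weight` are all invented for the example.

```python
def transferred_prior(source_accepts, source_rejects, weight):
    """Turn the source user's counts into Beta pseudo-counts, discounted by weight.

    weight = 0.0 ignores the source entirely (uniform Beta(1, 1) prior);
    weight = 1.0 trusts the source's observations as if they were the target's own.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return (1.0 + weight * source_accepts, 1.0 + weight * source_rejects)


def posterior_mean(prior, accepts, rejects):
    """Posterior mean of the acceptance rate after the target user's own data."""
    alpha, beta = prior
    return (alpha + accepts) / (alpha + beta + accepts + rejects)


# Hypothetical numbers: A's CALO saw 40 accepted and 10 rejected suggestions;
# B's CALO has seen only 2 accepted and 2 rejected so far.
prior_from_a = transferred_prior(40, 10, weight=0.25)
with_transfer = posterior_mean(prior_from_a, accepts=2, rejects=2)
without_transfer = posterior_mean(transferred_prior(0, 0, weight=0.0), accepts=2, rejects=2)
print(prior_from_a, with_transfer, without_transfer)
```

With little data of its own, B's estimate is pulled toward A's experience; as B's own observations accumulate, they outweigh the transferred pseudo-counts. Choosing the discount is the hard part and is left open here.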
Outlook (2)

Some learning mechanisms for transfer learning
• Hierarchical Bayesian learning
• Shared parameter models
• Instance weighting
• Abstraction regularization
• Cascading classifiers
• Attribute weights and low-dimensional representations

Outlook (3: CALO Learning)

• Relational: learn relationships among entities
• Sequential: learn the dynamic structure of the user's ongoing activity
• Category: learn relevant groupings for observed information
• Language: learn new information from text and utterances
• Procedural: learn to handle new tasks through planning
• Factual: reason to learn new facts
• Perceptual: learn to associate images and sounds with other knowledge
• Advice: learn from the user
[Figure labels: Observation, Reflection, Inference, Interaction, Long-Term Memory, Situational/Episodic Memory]

Outlook (4: Using CALO Learning)

• Learn when to interact
• Learn important relationships
• Learn to handle new tasks
• Associate people with roles and places
• Learn to adapt to new situations
[Figure labels: Jean, Mary, Harry, John; Timeline, Now, Inference, Interact, MMTM, Notice, Plan, Anticipate, Act]

Outlook (5: Technical Challenges)

• Robust mixed-initiative multitasking in a changing environment
• Enduring improvement through learning
• Integration of heterogeneous cognitive components
• Establishing and maintaining trust
• Knowing what's out there
• Seamless use across platforms
[Figure labels: Timeline, Now, Introspect, Interact, MMTM, Plan, Notice, Anticipate, Act]

Thanks!

References (1)

[Morley, AAMAS04] Morley, D. and Myers, K. The SPARK Agent Framework. In Proc. of the Third Int. Joint Conf. on Autonomous Agents and Multi Agent Systems (AAMAS-04), New York, NY, pp. 712-719, July 2004.
[Morley, AAAI04] Morley, D. and Myers, K. Balancing Formal and Practical Concerns in Agent Design. In Proc. of the AAAI Workshop on Intelligent Agent Architectures: Combining the Strengths of Software Engineering and Cognitive Systems, 2004.
[Cheyer, Semantic Web05] Cheyer, A. and Park, J. and Giuli, R. IRIS: Integrate. Relate. Infer. Share. In 1st Workshop on The Semantic Desktop, 4th International Semantic Web Conference, p. 15, Nov 2005.
[Berry, AAMAS06] Berry, P. and Conley, K. and Gervasio, M. and Peintner, B. and Uribe, T. and Yorke-Smith, N. Deploying a Personalized Time Management Agent. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multi Agent Systems (AAMAS'06) Industrial Track, Hakodate, Japan, May 2006.

References (2)

[Berry, AAAI05] Berry, P. and Gervasio, M. and Uribe, T. and Pollack, M. and Moffitt, M. A Personalized Time Management Assistant. In AAAI 2005 Spring Symposium Series, Stanford, CA, Mar 2005.
[Venable, IJCAI05] Venable, K. B. and Yorke-Smith, N. Disjunctive Temporal Planning with Uncertainty. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI'05), Edinburgh, UK, pp. 1385-1386, Aug 2005.
[Nguyen, CVPR05] Nguyen, N. and Phung, D. and Venkatesh, S. and Bui, H. Learning and detecting activities from movement trajectories using the hierarchical hidden Markov model. In IEEE International Conference on Computer Vision and Pattern Recognition, 2005.
[Duong, CVPR05] Duong, T. and Bui, H. and Phung, D. and Venkatesh, S. Activity recognition and abnormality detection with the switching hidden semi-Markov model. In IEEE International Conference on Computer Vision and Pattern Recognition, 2005.

References (3)

[Hung 05] Hung Bui. Situation Assessment and Activity Recognition. Technical Report, SRI International, 2005.
[Dietterich 05] Tom Dietterich, Girish Acharya. Transfer Learning Activity for Years 3-5. Technical Report, SRI International, 2005.
[Martin, AAI99] Martin, David L. and Cheyer, Adam J. and Moran, Douglas B. The Open Agent Architecture: A Framework for Building Distributed Software Systems. Applied Artificial Intelligence, vol. 13, no. 1-2, pp. 91-128, January-March 1999.
[Cheyer, AAMAS01] Cheyer, Adam and Martin, David. The Open Agent Architecture. Journal of Autonomous Agents and Multi-Agent Systems, vol. 4, no. 1, pp. 143-148, March 2001.
