A Dissertation Report on
AVATAR SIMULATED BOT: WITH SEMANTIC
MEMORY AND COGNITIVE APPROACH
SAI - SIMULATED ARTIFICIAL INTELLIGENCE BOT
PARUL INSTITUTE OF ENGINEERING AND
TECHNOLOGY (MCA)
VADODARA
(2011–2012)
By:
095250693053  Anitha K. Swamy
095250693058  Patel Nikisha A.
ACKNOWLEDGEMENT
The 5th Semester Dissertation is a golden opportunity for learning, doing research work, understanding and implementing new technologies, and self-development. We consider ourselves very lucky and honoured to have had so many wonderful people lead us through the completion of this dissertation.
Mr. V. N. Acharya, HOD, MCA Dept., and our dissertation guide, Falguni Ranadive, monitored our progress and arranged all facilities to make the dissertation easier. She was always involved in the entire process, shared her knowledge, and encouraged us to think. Thank you, dear Madam; we take this moment to acknowledge her contribution gratefully.
Last but not least, there were many others who shared valuable information that helped in the successful completion of this dissertation. We would like to thank all of them on behalf of both team members.
TABLE OF CONTENTS
GLOSSARY OF IMPORTANT TERMS AND
ABBREVIATIONS
AVATAR
SAI
ARTIFICIAL INTELLIGENCE
EMBODIED AGENTS
AIML: ARTIFICIAL INTELLIGENCE MARKUP LANGUAGE
AIML, or Artificial Intelligence Markup Language, is an XML dialect for creating natural
language software agents.
1.) LIST OF FIGURES AND CHARTS
The following figures show the concept of backpropagation.
A simple agent program maps every possible percept sequence to a possible action the agent can perform, or to a coefficient, feedback element, function or constant that affects eventual actions.
The agent function is an abstract concept, as it could incorporate various principles of decision making, such as calculation of the utility of individual options, deduction over logic rules, fuzzy logic, etc.
The agent program, instead, maps every possible percept to an action.
Artificial embodied agents are often described schematically as above.
Simple embodied agents
Simple reflex agents act only on the basis of the current
percept, ignoring the rest of the percept history. The
agent function is based on the condition-action rule: if
condition then action. This agent function only succeeds
when the environment is fully observable.
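As an illustration, here is a minimal simple reflex agent sketched in Python; the percept names and condition-action rules are hypothetical, chosen only to show the if-condition-then-action mapping, not taken from the dissertation's implementation.

# Simple reflex agent: maps only the current percept to an action.
# The rule table below is an illustrative assumption, not SAI's actual rules.
def simple_reflex_agent(percept):
    rules = {
        "user_greets": "greet_back",
        "user_asks_name": "say_name",
        "user_silent": "prompt_user",
    }
    # Percept history is ignored; only the current percept is consulted.
    return rules.get(percept, "default_response")

print(simple_reflex_agent("user_greets"))  # -> greet_back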
Self-Learning Embodied agents
Learning has the advantage that it allows agents to initially operate in unknown environments and to become more competent than their initial knowledge alone might allow. The most important distinction is between the "learning element", which is responsible for making improvements, and the "performance element", which is responsible for selecting external actions.
The learning element uses feedback from the "critic" on
how the agent is doing and determines how the
performance element should be modified to do better in
the future. The performance element is what we have
previously considered to be the entire agent: it takes in
percepts and decides on actions.
The last component of the learning agent is the "problem
generator". It is responsible for suggesting actions that
will lead to new and informative experiences.
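A minimal Python skeleton may clarify how these four roles fit together; the class, rule store, and method names are illustrative assumptions, not the dissertation's code.

# Learning agent: performance element, critic, learning element, problem generator.
class LearningAgent:
    def __init__(self):
        self.rules = {}  # knowledge used by the performance element

    def performance_element(self, percept):
        # The part previously considered "the entire agent": percept -> action.
        return self.rules.get(percept, "default_response")

    def critic(self, feedback):
        # Scores how well the agent is doing (here, simply external feedback).
        return feedback

    def learning_element(self, percept, action, score):
        # Uses the critic's score to modify the performance element.
        if score > 0:
            self.rules[percept] = action
        else:
            self.rules.pop(percept, None)

    def problem_generator(self):
        # Suggests actions leading to new and informative experiences.
        return "explore_new_topic"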
2.) MAIN TEXT
2.1 ABSTRACT:
This dissertation introduces a unified data acquisition, processing and synthesis framework for the creation of biomechanically correct human avatars.
2.2 KEY TERMS :
Avatar, Simulated Artificial Intelligence bot, artificial intelligence, bot, chatbot assistant, virtual assistant, semantic memory, embodied conversational agent.
2.3 INTRODUCTION:
One of the most significant roles played by
technology is connecting people and mediating their
communication with one another. Remote conversations were once unthinkable but are now routinely conducted with devices ranging from two-way personal computers to videoconference systems.
Building technology that mediates conversation
presents a number of challenging research and design
questions. Apart from the fundamental issue of what
exactly gets mediated, two of the more crucial questions
are how the person being mediated interacts with the
mediating layer and how the receiving person
experiences the mediation (see Figure 1). This introduction provides a framework for mediated conversation by means of automated avatars.
Much of the effort in constructing natural-language interfaces has been devoted to creating simulated bots that make the user feel that the bot understands the meaning of words. Since Weizenbaum's famous "Eliza" program, chatterbots have attempted to discover keywords and sustain dialog by asking pre-prepared questions, without understanding the subject of conversation or the meaning of individual words. This is quite evident from the Loebner Prize chatterbot competition, the popularity of bots based on the AIML language, and the general lack of progress in text understanding and natural language dialogue systems.
Cheating obviously has its limitations, and it is doubtful that good natural language interfaces may be built this way. The alternative approach used by humans requires various types of memory systems to facilitate concept recognition, building episodic relations among concepts, and storing basic information about the world: descriptions of objects, concepts, relations and possible actions in the associative semantic memory. Although the properties of semantic memory may be partially captured by semantic networks, so far this has been demonstrated only in narrow domains, and it is not easy to see how to create a large-scale semantic network that could be used in an unrestricted dialog with a chatterbot.
In this paper cognitive inspirations are drawn upon to make a first step towards the creation of avatars equipped with semantic memory that will be able to use language in an intelligent way. This requires the ability to ask questions relevant to the subject of discourse, questions that constrain and narrow down possible ambiguities. Very ambitious projects that use sophisticated frame-based knowledge representation have been pursued for decades and can potentially become useful in natural language processing, although this has yet to be demonstrated. However, the complexity of knowledge-based reasoning in large systems makes them unsuitable for real-time tasks, such as quick analysis of large amounts of text found on web pages, or simultaneous interactions with many users.
An alternative strategy followed here is to start from
the simplest knowledge representation for semantic
memory and to find applications where such
representation is sufficient. Drawing on its semantic memory, an avatar may formulate and answer many questions that would require an exponentially large number of templates in AIML or other such languages.
Endowing avatars with linguistic abilities involves two major tasks: building a semantic memory model, and providing all necessary means for natural communication.
This paper describes our attempts to create a Humanized Interface based on a 3D human face model, with speech synthesis and recognition, which is used to interact with Web pages and local programs, making the interaction much more natural than typing. These actions are based primarily on the information in its semantic memory. Building such a memory is not a simple task and requires the development of automatic and manual data collection and retrieval algorithms, using various tools for the analysis of natural language sources.
Avatar: SAI - SIMULATED ARTIFICIAL INTELLIGENCE BOT
In computing, an avatar is the graphical representation of the user or the user's alter ego or character. It may take either a three-dimensional form, as in games or virtual worlds, or a two-dimensional form as an icon in Internet forums and other online communities. It can also refer to a text construct found on early systems such as MUDs. It is an object representing the user. The term "avatar" can also refer to the personality connected with the screen name, or handle, of an Internet user.
An avatar chatbot is a fictional character not controlled by the person who created it; this usually means a character controlled by the computer through artificial intelligence. In artificial intelligence, an embodied agent, also sometimes referred to as an interface agent, is an intelligent agent that interacts with the environment through a physical body within that environment. Agents that are represented graphically with a body, for example a human or a cartoon animal, are also called embodied agents. Embodied conversational agents are embodied agents (usually with a graphical front-end as opposed to a robotic body) that are capable of engaging in conversation with one another and with humans, employing the same verbal and nonverbal means that humans do (such as gesture, facial expression, and so forth).
One of the trends of recent years has been the humanizing of digital channels: giving a face to things which are not human. This has led to the creation of avatars (also known as bots or chatter-bots), artificial intelligences with which users can "converse". The success of such bots varies greatly; there are few which respond in a convincingly human way, and it is no great mystery why they are commonly referred to as "bots", often resulting in a stilted, mechanical interaction where straying off a recognized path can lead to poor responses.
However, this has not stopped their spread across the commercial world, with several high-profile companies adopting them as part of their customer services. The idea of an avatar such as SAI Bot, an artificial intelligence able to respond in an intelligent manner to your questions, is indeed an exciting one. However, do these bots really manage it? Or are they just human-faced avatars disguising a search engine beneath? An intelligent virtual assistant may basically consist of a dialog system, an avatar, and an expert system that provides specific expertise to the user; the scope of the expert system depends on the purpose of the assistant. Servers and other maintenance systems that keep the automated assistant online may also be regarded as components of it.
History
An avatar is a user’s visual embodiment in a virtual
environment. The term, borrowed from Hindu mythology
where it is the name for the temporary body a god
inhabits while visiting earth, was first used in its modern
sense by Chip Morningstar who along with Randall
Farmer created the first multi-user graphical online world
Habitat in 1985 (Damer 1998). Habitat was a recreational environment where people could gather in a virtual town to chat, trade virtual props, play games and solve quests. Users could move their avatars around the graphical environment using cursor keys and could communicate with other online users by typing short messages that would appear above their avatar. Habitat
borrowed many ideas from the existing text-based MUD
environments, but the visual dimension added a new
twist to the interactions and attracted a new audience
(Morningstar and Farmer 1990). Avatar-based systems
since Habitat have been many and varied, the
applications ranging from casual chat and games to
military training simulations and online classrooms.
In electronic media such as chat bots, this usually
means a character controlled by the computer
through artificial intelligence.
Web 2.0 Avatars, powered by Digital Conversations,
provide a level of immersion not found in these bots.
Why? Because Digital Conversations are scripted just like
any good book. And like books they are designed to
guide a user, through high quality dialogue and
interactions, to an outcome. Along with this, the ability
to understand user interactions through DecisionMetrics
means that these Web 2.0 Avatars can be adapted to
emergent demands as they appear. The dialogue can be
improved and built up as and when needed. High quality
dialogue, clear and concise options for a user to choose, and a humanized avatar all combine to create an immersive experience, with the psychological appeal of interacting with a character or object.
The key to immersion and believability is high quality
dialogue, and it is high quality dialogue that Digital
Conversations has been created for.
FUNDAMENTAL CONCEPTS
ARTIFICIAL INTELLIGENCE:
Artificial intelligence (AI) is the intelligence of
machines and the branch of computer science that aims
to create it. Traditionally it is defined as the field of "the
study and design of intelligent agents" where an
intelligent agent is a system that perceives its
environment and takes actions that maximize its chances
of success. John McCarthy, who coined the term in
1956, defines it as "the science and engineering of
making intelligent machines."
Classification and statistical learning methods employed in the functioning of the avatar
Neural networks
Main articles: Neural network and Connectionism
A neural network is an interconnected group of
nodes, akin to the vast network of neurons in the human
brain.
The study of artificial neural networks began in the decade before the field of AI research was founded, in the work of Walter Pitts and Warren McCulloch. Other important early researchers were Frank Rosenblatt, who invented the perceptron, and Paul Werbos, who developed the backpropagation algorithm.
The main categories of networks are acyclic
or feedforward neural networks (where the signal passes
in only one direction) and recurrent neural
networks (which allow feedback). Among the most
popular feedforward networks are perceptrons, multilayer perceptrons and radial basis networks. Among
recurrent networks, the most famous is the Hopfield net,
a form of attractor network, which was first described
by John Hopfield in 1982. Neural networks can be applied to the problem of intelligent control (for robotics) or learning, using such techniques as Hebbian learning and competitive learning. Hierarchical temporal memory is an approach that models some of the structural and algorithmic properties of the neocortex.
Intelligent agent paradigm
An intelligent agent is a system that perceives its environment and takes actions which maximize its chances of success. The simplest intelligent agents are programs that respond to questions about real-life entities; such agents are symbolic and logical, while others, such as neural networks, may use newer approaches. The paradigm also gives researchers and users the ability to communicate with such embodied agents.
Artificial Neural Networks (ANNs) are an approach that follows a different path from traditional computing methods to solve problems. Since conventional computers use an algorithmic approach, if the specific steps that the computer needs to follow are not known, the computer cannot solve the problem. That means traditional computing methods can only solve problems that we already understand and know how to solve. ANNs, however, are in some ways much more powerful, because they can solve problems that we do not exactly know how to solve. That is why, of late, their usage has been spreading over a wide range of areas including virus detection, robot control, intrusion detection systems, pattern recognition (image, fingerprint, noise) and so on.
ANNs have the ability to adapt, learn, generalize, cluster and organize data. There are many ANN architectures, including the Perceptron, Adaline, Madaline, Kohonen networks, Backpropagation networks and many others. The Backpropagation ANN is probably the most commonly used, as it is very simple to implement and effective. In this work, we deal with Backpropagation ANNs.
Backpropagation is used here because we use AIML to create a pattern-recognizing chatbot. After a long duration of consistent input of data by users, this will lead to unsupervised learning: the output weights of the backpropagation network are fed back as inputs to the neural net's layers. We also make use of genetic algorithm techniques.
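To make the mechanism concrete, here is a minimal backpropagation sketch in Python (NumPy only). The two-layer architecture, random toy data and learning rate are illustrative assumptions; the feedback of output weights into the network's inputs and the genetic-algorithm stage described above are not shown.

import numpy as np

# Toy data: 8 training patterns with 4 features each, binary targets.
rng = np.random.default_rng(0)
X = rng.random((8, 4))
y = rng.integers(0, 2, (8, 1)).astype(float)

W1 = rng.normal(0, 0.5, (4, 6))  # input -> hidden weights
W2 = rng.normal(0, 0.5, (6, 1))  # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(1000):
    # Forward pass.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # Backward pass: propagate the output error back toward the input layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates.
    W2 -= 0.5 * h.T @ d_out
    W1 -= 0.5 * X.T @ d_h

print("final training error:", float(np.mean((out - y) ** 2)))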
Digital Conversation
A Digital Conversation is a scripted dialogue (in other words, dialogue written by a human, just like the script of a movie) which takes place between a person and a computer via any digital medium, from web browsers and PDAs to mobile phones and interactive television.
2.4 LITERATURE SURVEY
2.5 MAJOR THESES AND HYPOTHESES PRESENTED
Reference from : Towards Avatars with Artificial Minds: Role of
Semantic Memory
The first step towards creating avatars with human-like artificial minds is to give them human-like memory structures with access to general knowledge about the world. This type of
knowledge is stored in semantic memory. Although many approaches to modeling of semantic memories have been proposed, they are not very useful in real-life applications because they lack knowledge comparable to the common sense that humans have, and they cannot be implemented in a computationally efficient way. The most drastic simplification of semantic memory, leading to the simplest knowledge representation that is sufficient for many applications, is based on Concept Description Vectors (CDVs), which store, for each concept, the information of whether a given property is applicable to this concept or not. Unfortunately, even such simple information
about real objects or concepts is not available. Experiments with automatic creation of concept
description vectors from various sources, including ontologies, dictionaries, encyclopedias and
unstructured text sources are described. A Haptek-based talking head that has access to this memory has been created as an example of a humanized interface (HIT) that can interact with web pages and exchange information in a natural way. A few examples of applications of an
avatar with semantic memory are given, including the twenty questions game and automatic
creation of word puzzles.
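The CDV idea and its use in the twenty questions game can be sketched in a few lines of Python; the concepts, properties and the even-split scoring below are toy assumptions based only on the paper's description.

# Each concept maps to a binary vector: does the property apply or not?
PROPERTIES = ["has_fur", "can_fly", "lays_eggs", "is_pet"]
CDV = {
    "dog":    [1, 0, 0, 1],
    "eagle":  [0, 1, 1, 0],
    "canary": [0, 1, 1, 1],
    "cat":    [1, 0, 0, 1],
}

def best_question(candidates):
    # Pick the property splitting the remaining candidates most evenly,
    # i.e. the most informative yes/no question to ask next.
    def imbalance(i):
        yes = sum(CDV[c][i] for c in candidates)
        return abs(2 * yes - len(candidates))  # 0 means a perfect 50/50 split
    return min(range(len(PROPERTIES)), key=imbalance)

candidates = list(CDV)
i = best_question(candidates)
print("Ask about:", PROPERTIES[i])
# An answer of "yes" then filters the candidate concepts:
candidates = [c for c in candidates if CDV[c][i] == 1]
print("Remaining:", candidates)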
Reference: Avatar Augmented Online Conversation (MapChat application), for thesis and hypotheses
Major thesis (ABSTRACT)
This thesis is concerned with both of these questions and proposes a theoretical framework of mediated
conversation by means of automated avatars. This new approach relies on a model of face-to-face
conversation, and derives an architecture for implementing these features through automation. First the
thesis describes the process of face-to-face conversation and what nonverbal behaviors contribute to its
success. It then presents a theoretical framework that explains how a text message can be automatically
analyzed in terms of its communicative function based on discourse context, and how behaviors, shown
to support those same functions in face-to-face conversation, can then be automatically performed by a
graphical avatar in synchrony with the message delivery. An architecture, Spark, built on this framework
demonstrates the approach in an actual system design that introduces the concept of a message
transformation pipeline, abstracting function from behavior, and the concept of an avatar agent,
responsible for coordinated delivery and continuous maintenance of the communication channel. A
derived application, MapChat, is an online collaboration system where users represented by avatars in a
shared virtual environment can chat and manipulate an interactive map while their avatars generate
face-to-face behaviors. A study evaluating the strength of the approach compares groups collaborating
on a route-planning task using MapChat with and without the animated avatars. The results show that
while task outcome was equally good for both groups, the group using these avatars felt that the task
was significantly less difficult, and the feeling of efficiency and consensus were significantly stronger. An
analysis of the conversation transcripts shows a significant improvement of the overall conversational
process and significantly fewer messages spent on channel maintenance in the avatar groups. The
avatars also significantly improved the users’ perception of each others’ effort. Finally, MapChat with
avatars was found to be significantly more personal, enjoyable, and easier to use. The ramifications of
these findings with respect to mediating conversation are discussed.
Contributions and Organization of Thesis
First the background about human face-to-face conversation is reviewed in
chapter 2 along with a review of computer mediated communication and
computational models of conversation. Then the theoretical framework
describing the augmentation of online conversation based on a model of
face-to-face conversation is introduced in chapter 3. This model lists the
essential processes that need to be supported and how nonverbal behavior
could be generated to fill that role. The theory is taken to a practical level
through the engineering of an online conversation system architecture
called Spark described in chapter 4 and then an actual implementation of
this architecture in the form of a general infrastructure and a programming
interface is presented in chapter 5. A working application for online
collaborative route planning is demonstrated in chapter 6 and the
implementation and approach evaluated in chapter 7. Possible follow-up
studies, application considerations and interesting issues are discussed in
Chapter 8. Finally future work and conclusions in chapters 9 and 10 place
the approach in a broader perspective, reflecting on general limitations as
well as on the kinds of augmented communication this work makes
possible.
Results and analysis
This thesis on MapChat application makes contributions to several different fields of study:
To the field of computer mediated communication, the thesis presents a theory of how textual real-time
communication can be augmented by carefully simulating visual face-to-face behavior in animated
avatars. The thesis demonstrates the theory in an implemented architecture and
evaluates it in a controlled study. To the field of human modeling and simulation, the thesis presents a
set of behaviors that are essential to the modeling of conversation. It is shown
how these behaviors can be automatically generated from an analysis of the text to be spoken and the
discourse context. To the field of HCI, the thesis presents a novel approach to augmenting an
online communication interface through real-time discourse processing and automated avatar control.
For avatar-based systems, it provides an alternative to manual control and performance control of
avatars. For any communication system it introduces the idea of a communication proxy in
the form of a personal conversation agent remotely representing
participants. To the field of systems engineering, the thesis presents a powerful way to
represent, transmit and transform messages in an online real-time messaging application.
Design
While the first study addresses the question how seeing the animated
avatar bodies affected the communication, follow-up studies could ask
other but related questions. Here three different study designs are
suggested, all using the task and outcome measures described above.
Study I
Hypothesis: “Groups using the new face-to-face paradigm do
better on a judgment task than those groups that use the state-of-the-art in online collaboration.”
Goal: Compare the face-to-face avatar paradigm with a
shared-workspace paradigm currently representing the state-of-the-art in online collaboration. This comparison has the
potential to demonstrate the power of a new paradigm and
uses a system most people are familiar with as a reference.
Method: Two sets of groups solve the judgment task, one
using a popular collaboration system such as NetMeeting that
integrates a text chat with a shared whiteboard and another
using a Spark based system. In NetMeeting the inventions to
be ordered would be images on the whiteboard. In the 3D
avatar environment, they would be objects on the table in
front of the avatars.
Study II
Hypothesis: “Groups that use avatars modeling conversational
behavior do better on a judgment task than groups
that use minimally behaving avatars.”
Goal: To show the impact of modeling appropriate behavior
by demonstrating that the effects found with the new
animated avatars so far are not due to their mere presence but
to their carefully crafted behavior.
Method: Two sets of groups solve the judgment task, both
using a Spark based system, but for one group all behaviors
are turned off except for lip movement when speaking and
random idle movement.
Study III
Hypothesis: “Groups that use the animated avatars and
groups that interact face-to-face show improvement in
judgment task performance over groups that use text-only
chat. Groups interacting face-to-face show the greatest
improvement.”
Goal: To show that avatars that animate typical face-to-face
behavior actually move the performance of online
collaboration closer to that of actual face-to-face interaction.
Method: Three sets of groups solve the judgment task, one
face-to-face, one using the Spark based system with the
avatars visible and one with no avatars visible.
The follow-up studies should be between-subject studies to get around a
possible learning effect. A between-subject study would also alleviate
problems associated with scheduling groups of subjects for return visits
and the possibly shifting group membership.
Implementation
For the study that has been conducted, the implemented system and
animation was deemed “good enough” by experts to represent the
theoretical model. As mentioned in the MapChat technical evaluation,
there were still a few issues, especially with time lag. Furthermore, the
animations themselves felt a little “stick-figure-like” because Pantomime
is currently only capable of rendering joint rotations of stiff segments with
no natural deformation of the body.
All of these technical issues are under constant improvement. Lag times
improve as computers get faster, and several parts of the MapChat
implementation are being fixed and optimized as a result of running the
user study. Using Pantomime to control a skinned character animation
rendering engine instead of Open Inventor has been successfully tested, so
future animations in Pantomime may see drastic improvements.
2.6 DATA AND ANALYSIS
2.7 RESULTS AND DISCUSSIONS
Overview
The goal of the work presented in this dissertation is to augment online
conversation by employing avatars that model face-to-face behavior.
The goal can be divided into three parts:
o understand what a person means to communicate (input),
o define a set of processes crucial for successful interaction and the set of behaviors that support them (model), and
o coordinate those behaviors in a real-time performance (output).
Future work can expand on each of these parts.
9.2 Input and interpretation
Speech
While text will most likely continue to be the most popular messaging
and chat medium, voice-over-IP technology is providing increasingly
higher quality voice conferencing for applications ranging from shared
whiteboards to games. There are certainly situations where voice is the
best option, such as when hands are not free to type. It is therefore
important to consider what it would take to augment a speech stream using
the approach presented here.
Discussion
The long-term goal of our project is to create an avatar
that will be able to use natural language in a meaningful
way. An avatar based on the Haptek head has been
created, equipped with semantic memory, and used as an
interface to interactive web pages and software programs
implementing word games. Many technologies have to be
combined to achieve this. Such avatars may be used as humanized interfaces for natural communication with chatterbots that may be placed in virtual environments.
Several novel ideas have been developed and presented
here to achieve this:
1) Knowledge representation schemes should be optimized
for different tasks. While general semantic memories
have been around for some time [9, 10], they have not led to large-scale NLP applications. For some applications the much simpler representation afforded by concept description vectors is sufficient and computationally much more efficient.
2) CDV representation facilitates some applications,
such as generation of word puzzles, while graph representation
of general semantic memory cannot be easily used
to identify a subset of features that should be used in a
question. Binary CDV representation may be systematically
extended towards frame-based knowledge representation,
but it would be better to use multiple knowledge
representation schemes optimized for different applications.
Although reduction of all relations stored in semantic memory to the binary CDV matrix is a drastic simplification, some applications benefit from the ability to quickly evaluate the information content in concept/feature subspaces. Careful analysis is needed to find
the simplest representations that facilitate efficient analysis
for different applications. Computational problems
due to a very large number of keywords and concepts
with the number of relations growing into millions are
not serious if sparse semantic matrix representation is
used.
3) An attempt to create semantic memory in an automatic
way, using definitions and assertions from Wordnet
and ConceptNet dictionaries, Sumo/Milo ontologies
and other information sources has been made. Although
analysis of that information was helpful, creating a full description even for simple concepts, such as finding all
properties of animals, proved to be difficult. Only a small
number of relevant features have been found, despite the
large sizes of databases analyzed.
Thus we have identified an important challenge for lexicographers:
creation of a full description of basic concepts.
Information found in dictionaries is especially brief
and without extensive prior knowledge it would not be
possible to learn much from them. The quality of the
data retrieval (search) depends strongly on the quality
of the data itself. Despite using machine-readable dictionaries with verification based on dictionary glosses, spurious information may still appear, for example strange relations between keywords and concepts which do not appear in the real world. This happens in our semantic memory
in about 20% of all entries and is much more common
if context vectors are generated by parsing general texts.
In most cases it is still possible to retrieve sensible data
from such semantic memory. One way to reduce this effect
is to parse texts using phrases and concepts rather
than single words. The quality of semantic memory increases
gradually as new dictionaries and other linguistic
resources are added. It is also designed to be fine-tuned
2.8 CONCLUSION AND FUTURE DIRECTIONS
Conclusions
This paper describes the use of personalisation and a
Personal Assistant application in a platform which aims
to support an open active mobile multimedia
environment, with a range of applications and services
specifically geared to young people. An important
aspect of the platform is the provision of context-aware
services, including location awareness. The importance
of the latter has been recognised in several studies.
The effectiveness of any system of personalisation
depends to a large extent on the information captured in
the user profile. To this end the Personal Assistant plays
a key role in maintaining the user profile.
To improve the interface with the user, exchangeable
avatars are included, which can be controlled by the
user. This provides a further level of personalisation
and, it is hoped, will make the task of interaction more
enjoyable.
In conclusion, the Youngster approach combines
innovative technology and service developments in a
platform, which has been developed for young people.
Future directions
An avatar based on the Haptek head has been equipped with semantic memory and used as an interface to chatterbots, interactive web pages and software programs implementing word games. This avatar may be used as a humanized interface for natural communication with
text-based web pages, or placed in virtual environments
in cyberspace. Semantic memory stored in a relational database is used efficiently in many applications after reduction to a sparse CDV matrix. A chatterbot used with an avatar equipped with semantic memory may ask intelligent questions, knowing the properties of objects mentioned in the dialog. Most chatterbots try to change the topic of conversation when they get lost in the conversation.
USE OF AVATARS
One particular device that is being used to improve the
interface with the user is the use of avatars. Ideally an
avatar may be viewed as an interface to a system with a
virtual representation of personal presence and personal
characteristics. In this way it provides a user-friendly
environment and helps the user to interact in an easier
way with the system.
Avatars can provide more than just an interface to a
system. More advanced avatars can be adaptive to
behave differently in different situations and hence are
better able to attract and hold the user's attention. In
some cases they could even take on the characteristics
of a pet, or more appropriately in this case, a
tamagotchi-like entity.
There are a number of instances of the use of avatars in
applications with varying degrees of success. The most
well known (and least liked!) example is the Microsoft
Office Assistant that pops up when you least expect it to
offer you help in writing a letter. Similar examples
include Prody Parrot and Bonzi Buddy, both of which
interact with the user in a variety of ways (including
telling jokes, playing games, reading email and
providing reminders). More advanced avatars such as
Haptek's Virtual Friend are now beginning to appear
although the resources required for this are beyond
those currently available on mobile phones and PDAs.
In the case of youngsters the use of avatars is
potentially very appealing. However, once again the
importance of personalisation comes into this.
Providing a standard avatar for all users is unlikely to
be effective. Instead in the Youngster platform a more
flexible mechanism is provided by which the
youngsters can select their own avatars and insert them
at appropriate points in the interface. They may even
develop their own, exchange them with friends, or
possibly even buy new avatars to use in the system.
Thus instead of a paper clip, a parrot or even a Homer Simpson caricature, they may include their own realisation of Britney Spears or the like.
This notion of exchangeable avatars not only provides
the opportunity for personalisation by the user but also
can be used to handle avatars on different devices.
Within the Youngster project we have focused on
producing avatars for both Java phones and PDAs.
Because of the limitations of Java phones, the avatars are more restricted than on PDAs, making it all the more important to allow youngsters to create their own.
The idea of having multiple profiles for different modes
of operation or identities of the user can be coupled
with the use of different avatars for different profiles
thereby adding to the interest level for youngsters.
3) REFERENCES
http://en.wikipedia.org/wiki/Avatar_(computing)
http://en.wikipedia.org/wiki/Embodied_agent
http://en.wikipedia.org/wiki/Automated_online_assistant
http://en.wikipedia.org/wiki/Artificial_intelligence
http://en.wikipedia.org/wiki/Natural_language_processing
http://en.wikipedia.org/wiki/Digital_conversation
http://www.existor.com/
http://alice.pandorabots.com/
http://www.pandorabots.com/botmaster/en/home
http://www.alicebot.org/documentation/ptags.html
http://www.chatbots.org/
http://www.chatbots.org/research/
http://www.chatbots.org/journals/
http://www.chatbots.org/papers/
http://www.chatterbotcollection.com/category_contents.php?id_cat=20
http://www.chatterbotcollection.com/item_display.php?id=6
http://www.alicebot.org/documentation/
http://www.alicebot.org/TR/2001/WD-aiml/
http://alicebot.blogspot.com/2009/03/basics-ii-simple-wildcard-reductions.html
http://www.pandorabots.com/botmaster/en/aiml-converter-intro.html
http://pandorabots.com/pandora/pics/spellbinder/desc.html
http://www-05.ibm.com/de/media/downloads/beyond-advertising.pdf
https://www.virtualeternity.com/
http://intellitar.com/
http://www.intellitar.com/Arwyn.php
http://en.wikipedia.org/wiki/ELIZA
http://www.evolver.com/
See www.loebner.net/Prizef/loebner-prize.html
www.haptek.com
See www.microsoft.com/speech/default.mspx
See www.a-i.com
See www.20q.net
R. Wallace, The Elements of AIML Style, ALICE
A. I. Foundation (2003), see www.alicebot.org
Towards Avatars with Artificial Minds: Role of Semantic Memory. Włodzisław Duch, Julian Szymański, and Tomasz Sarnatowicz, School of Computer Engineering, Nanyang Technological University, Singapore.
J. Weizenbaum, Computer Power and Human Reason:
From Judgment to Calculation. W. H. Freeman
& Co. New York, NY, 1976.
Image-Based Avatar Reconstruction by Maria-Cruz Villa-Uriol, Alba Pérez, Falko Kuester and Nader Bagherzadeh, Visualization and Interactive Systems Group / Robotics and Automation Laboratory, School of Engineering, University of California.
Avatar Augmented Online Conversation by Hannes Högni Vilhjálmsson
Submitted to the Program in Media Arts and Sciences, School of Architecture and Planning, on
May 2, 2003 in Partial Fulfillment of the Requirements of the Degree of Doctor of Philosophy at
the Massachusetts Institute of Technology
Allen, J. (1995). Natural Language Understanding. Redwood City, CA,
The Benjamin/Cummings Publishing Company, Inc.
Andre, E., T. Rist, et al. (1998). "Integrating reactive and scripted
behaviors in a life-like presentation agent." Proceedings of AGENTS'98:
261-268.
M.-K. Hu, “Visual pattern recognition by moment invariants,” IRE Transactions on Information Theory, vol. 8, no. 2, pp. 179-187, 1962.
J. P. Barreto, K. Daniilidis, N. Kelshikar, R. Molana, and X. Zabulis. EasyCal, http://www.cis.upenn.edu/~sequence/research/downloads/EasyCal/, January 2004.
D. Friedman, A. Steed, and M. Slater. Spatial social behavior in Second Life. In Intelligent Virtual Agents.
Smith, Roger D. 1999. Simulation: The Engine Behind the Virtual World. eMatter 1, Simulation 2000 series (12). Also available at http://www.modelbenders.com/papers/sim2000/SimulationEngine.PDF.
A. M. Collins and M. R. Quillian, Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8, 240-247, 1969.
Hubal, R. C., Frank, G. A., and Guinn, C. I. AVATALK Virtual Humans for Training with Computer Generated Forces.
The aim of this study has been to describe what the avatar is, how it structures our play and
our participation with a fictional world, and how avatar-based singleplayer computer games
are different from other kinds of singleplayer games; the avatar exploits the concretising
realism of the computer as a simulating machine, and situates us in a gameworld via
prosthetic and fictional embodiment.
4) APPENDICES
http://www.intellitar.com/
Intellitar™ is the developer of the Intelligent Avatar Platform™ (IAP) and creator of Virtual Eternity™. Founded in
2008, Intellitar is delivering the first intelligent and interactive technology platform for building and creating life-like
avatars or an "Intellitar". The IAP allows a user to create and train his or her Intellitar to accurately reflect the
personality, voice, look, knowledge, and life experiences of its creator. These Intellitars can improve and expand the
online experience for businesses and individual users in a more realistic and life-like manner.
Photo fitting of core avatar face
Core face model for avatar
Wireframe view before adding skin and with features
Wireframe view after adjustment of features
Flat shading mode view after adjustment of features
View with LoRes skin
Incorporating the model's features, such as eyes, teeth, skin, etc.
Model with MedRes skin: the complete 3D face
Generating a distinctive face for the avatar by morphing according to basic races; here the Asian race has been given more priority for the avatar's look
Rendering of the avatar's face from the core avatar face to give a realistic look; a genetic algorithm is employed as a means of morphing, exploiting the genetic algorithm's randomness technique
Basic shape morphing of the avatar's features
Basic colour morphing for the avatar
Model with closed-smile morphing
Model with eye blinking and open-smile morphing
Recitation of the phoneme "aah"
The avatar's core 3D face model with added features, importing skin and hair for realistic avatar creation, ready to be used with channelized phonetic recitation capability and morphing of basic facial features, able to converge with the speech-enabled avatar
ABSTRACT
Our goal involves a baseline AI engine and the necessary tools to train and modify the artificial intelligence to reflect a more personal, uniquely created AI brain for the user. The more content a user provides to their AI brain, the more dynamic the responses generated by the AI become, and the more they begin to reflect the personality of the user. The graphical web user interface used for the avatar supports the creation of an emotional speech database. Stimuli to elicit emotions can be provided by the interface, for example by reading a set of emotional sentences, which facilitates a real experience of the emotions. The sentences can also be personalized, so as to help the reader better immerse into emotional states. This procedure reduces the effort of building a prototypical personalized emotion recognizer to just a few minutes.
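The following Python sketch illustrates the idea of fitting a personalized emotion recognizer from a few labelled recordings; the feature extraction is a placeholder and the nearest-centroid classifier is our assumption, not SAI's actual recognizer.

import numpy as np

def extract_features(recording):
    # Placeholder: a real system would compute pitch, energy, MFCCs, etc.
    return np.asarray(recording, dtype=float)

# Recordings collected by prompting the user with labelled emotional sentences.
database = {
    "happy": [extract_features(v) for v in ([1.0, 0.9], [0.9, 1.0])],
    "sad":   [extract_features(v) for v in ([0.1, 0.2], [0.2, 0.1])],
}

# Fit one centroid per emotion; this takes seconds, in line with the
# "few minutes" claim for building a prototypical recognizer.
centroids = {label: np.mean(vecs, axis=0) for label, vecs in database.items()}

def recognize(recording):
    f = extract_features(recording)
    return min(centroids, key=lambda lbl: np.linalg.norm(f - centroids[lbl]))

print(recognize([0.95, 0.95]))  # -> happy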
Framework
Autonomous Facial Expression Synthesis by FaceGenModeller and VirtualEternity’s Speech and
Expression Technology
SAI Bot is based on the classical Evie avatar bot framework, owned by the Existor portal. Its processing pipeline is:
Sensory Input → Processing by Brain → Actuator Response
The model translates into the following:
Sensory Input: Text, Speech, Music, Video, Still Images, Touch.
Processing: Recognizing Input and synthesizing appropriate expression.
Actuator Response: Showing selected expression on simulated face model.
o The sensory input is any kind of stimulus provided to the Avatar bot by text.
o The brain is where the input is processed and a response is synthesized.
o The actuator is responsible for manifestation of the response.
3.2.2 Brain
The function of SAI’s brain is to
o Recognize and match input with patterns stored in the memory.
o Synthesize a response based on the match found.
o Send signals to the actuators to manifest the selected expression.
The technology behind the brain is described in depth in section 3.3.2.
Training and Storage:
- The brain is trained on and stores a large number of input patterns.
- Input patterns will differ corresponding to different stimuli; for example, text input patterns will be language elements.
- All such patterns are then mapped to responses in the form of output responses, which basically works on the unsupervised learning element of the neural net employing backpropagation.
- Expressions are quantified and stored in the robot's memory.
- Training is thus provided in terms of pattern creation for various stimuli, and expression creation and storage for responses.
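A minimal Python sketch of this pattern-to-response mapping follows; the stored patterns and expression labels are illustrative assumptions, with the A.I.M.L. matching and neural stages omitted.

# The brain: match input against stored patterns, return text + expression.
memory = {
    "HELLO": ("Hi there!", "smile"),
    "HOW ARE YOU": ("I'm fine, thank you.", "smile"),
    "BYE": ("Goodbye!", "wave"),
}

def brain(stimulus):
    key = stimulus.strip().upper()
    text, expression = memory.get(key, ("I do not understand.", "neutral"))
    # The expression label is the signal sent to the facial actuators.
    return text, expression

print(brain("hello"))  # -> ("Hi there!", "smile")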
3.2.3 Response
Quantification of a Facial Expression
The face is a structure made of bones, muscles and skin.
A graphical simulation uses the following replacements:
Bones: A wireframe model to define facial structure
Muscles: Contraction and Expansion motion provided to the wireframe
Skin: Texture map on top of the wireframe
Any expression is created by a particular orientation of bones and manipulation of
skin as driven by muscles.
Thus, the actuators are the facial muscles.
Hence a facial expression can be quantified by storing the amount of contraction
or expansion of each of the muscles of the face.
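For illustration, a stored expression can be represented in Python as a vector of per-muscle contraction values; the muscle names and values below are hypothetical.

# A facial expression quantified as muscle contractions in [0, 1].
SMILE = {
    "zygomaticus_left":  0.8,
    "zygomaticus_right": 0.8,
    "orbicularis_oculi": 0.3,
    "frontalis":         0.0,
}
NEUTRAL = {m: 0.0 for m in SMILE}

def blend(expr_a, expr_b, t):
    # Interpolate between two stored expressions for smooth morphing.
    muscles = set(expr_a) | set(expr_b)
    return {m: (1 - t) * expr_a.get(m, 0.0) + t * expr_b.get(m, 0.0)
            for m in muscles}

print(blend(NEUTRAL, SMILE, 0.5))  # halfway into a smile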
Philosophy and Goals
o Create conversationally interactive and environment sensitive animatronic
characters
o Progress towards the next generation of animatronic character technology
Inputs: Visual, auditory, and proprioceptive sensory inputs
Outputs: Vocalizations, facial expressions, communicative cues
Hardware
o Processors:
Operating Systems
o Windows NT
Software
o SolidWorks digital design software via FaceGenModeller 3.5
o VirtualEternity commercial portal's technology, used to encode a non-linear interactive narrative
o Sphinx: open-source voice recognition software, together with the VirtualEternity portal's inbuilt MyEvoice speech-to-text synthesis technology
o Visual perception system via FaceGenModeller and CharacterBuilder
Features incorporated in the avatar
Avatar
There are no limits. Artificial Intelligence is communication. Natural language is universal.
Face
As with an instant messenger, purely textual conversation can be very effective, but the avatar also has speech, and the effect is more powerful.
Avatar SAI Bot can provide information in a natural and pleasing way.
The avatar is created from a portfolio of images according to the user's requirements.
Character
SAI, with dynamic, AI-controlled reactions and emotions, adds another layer of engagement.
Voice
The voices we use with our avatars are licensed from our partners, such as MyEvoice, a partner of the VirtualEternity portal; the speech recognition technology used is licensed similarly.
o Movable eyelids and eyebrows, and speaking phonemes
o Visual perception concept
o The software enables the creation of a complete environment and stage designed for an interactive show.
Learning
AI
The unique, universal contextual machine learning techniques are key. Memory of what happened when, where and why enables predictions of what should come next.
From past conversations, whether with the public or specialist trainers, the semantic memory predicts the thing most appropriate for the Bot to say. In the same way it predicts the timing, reactions and emotions for an avatar to display, and a virtual typing style to present.
With enough learning, these techniques enable lifelike, entertaining interactions on any subject, in any language.
And all that is needed for the learning to work is interaction itself, setting up a positive feedback loop.
Scripted AI
SAI Bot, when used in commercial applications, calls for more than lifelike probabilities. It calls for certainty in the delivery and gathering of information, and the completion of processes. So we can write the script.
Any software can provide a branching tree of possibilities, loops, asides, searches and more. Any software could
provide fixed outputs along the way. Our outputs - what Evie says - are highly dynamic, reflective of the users'
needs and their language style.
The real difference, however, is understanding. People can say things in nearly infinite numbers of ways, yet we
humans understand seemingly without effort. Machines have usually failed at this task - for example, picking up on
a few words and all too often getting the sense wrong.
Emotional AI
You will see Evie expressing emotions as you type, and as she responds. Our AI has learned how to do this from
user feedback over several years. Words describing reactions to input, and emotions felt as she replies, including
intensity, are converted to dynamic, subtly changing motion in the avatar.
3.3.2 Text Based Stimulus using A.I.M.L.
3.3.2.1 Introduction to A.I.M.L.
In its current implementation, SAI uses a modified version of Artificial Intelligence Markup Language (A.I.M.L.) for text input recognition, processing and response synthesis.
A.I.M.L. lies at the center of the robot’s brain. All intelligence is quantified, stored
and processed using A.I.M.L. as the common format for intelligence data
exchange.
Here is a block diagram of the functioning of A.I.M.L.
Block Diagram of functioning of A.I.M.L.
The responder acts as an interface between the human and the A.I.M.L. engine, whose function is to interpret the data from the responder and generate a response which can be sent back to the human via the responder.
SAI uses a modified version of A.I.M.L. which has been enhanced to include
expression data along with text response. The expression data is parsed by the 3D
simulation and appropriate expression is manifested.
The A.I.M.L. program and the facial simulation program are thus connected by a
pipe.
Mapping to the classical Evie (Existor) bot framework
Elements of AIML
AIML contains several elements. The most important of these are described in further detail below.
Categories
Categories in AIML are the fundamental unit of knowledge. A category consists of at least two further
elements: the pattern and template elements. Here is a simple category:
<category>
<pattern>WHAT IS YOUR NAME</pattern>
<template>My name is John.</template>
</category>
When this category is loaded, an AIML bot will respond to the input "What is your name" with the
response "My name is John."
Patterns
A pattern is a string of characters intended to match one or more user inputs. A literal pattern like
What Is your name
will match only one input, ignoring case: "what is your name". But patterns may also contain wildcards,
which match one or more words. A pattern like
What is your *
will match an infinite number of inputs, including "what is your name", "what is your shoe size", "what is
your purpose in life", etc.
The AIML pattern syntax is a very simple pattern language, substantially less complex than regular
expressions and as such not even of level 3 in the Chomsky hierarchy. To compensate for the
simple pattern matching capabilities, AIML interpreters can provide preprocessing functions to expand
abbreviations, remove misspellings, etc.
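A minimal Python sketch of this style of matching, with case-folding preprocessing and a trailing "*" wildcard, is shown below; real AIML interpreters use a more elaborate graph-based matcher, so this is only an illustration.

def matches(pattern, user_input):
    # Preprocessing: fold case and split into words.
    p = pattern.upper().split()
    w = user_input.upper().split()
    if p and p[-1] == "*":
        # The wildcard matches one or more trailing words.
        return len(w) >= len(p) and w[:len(p) - 1] == p[:-1]
    return w == p

print(matches("WHAT IS YOUR NAME", "what is your name"))    # True
print(matches("WHAT IS YOUR *", "what is your shoe size"))  # True
print(matches("WHAT IS YOUR *", "who are you"))             # False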
Template
A template specifies the response to a matched pattern. A template may be as simple as some literal text, or it may use variables such as <bot name="name"/> or <get name="user-age"/>, which substitute stored information (if known) into the sentence.
Template elements include basic text formatting, conditional response (if-then/else), and random
responses.
Templates may also redirect to other patterns, using an element called srai. This can be used to
implement synonymy, as in this example (where CDATA is used to avoid the need for XML escaping):
<category>
<pattern>WHAT IS YOUR NAME</pattern>
<template><![CDATA[My name is <bot name="name"/>.]]></template>
</category>
<category>
<pattern>WHAT ARE YOU CALLED</pattern>
<template>
<srai>what is your name</srai>
</template>
</category>
A hello world example in A.I.M.L.
<category>
<pattern>How are you today</pattern>
<template>
I'm fine, Thank you.
</template>
</category>
<category> tag encloses all nodes. There can be many <category> tags in an
A.I.M.L. file.
<pattern> tag indicates what the user is expected to type.
<template> is the corresponding response.
Extension of A.I.M.L for Expressions
A new custom tag <xp> is used to encode expression data into an A.I.M.L. file.
Example:
<category>
<pattern>How are you today</pattern>
<template>
I'm fine, Thank you. <xp>smile</xp>
</template>
</category>
The A.I.M.L. engine code was also modified to parse out the expression data and send it to the facial simulation program.
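A sketch of this parsing step in Python follows; the <xp> tag format matches the example above, while the regular expressions are our assumption about how the split could be done.

import re

def parse_response(template_text):
    # Separate the spoken text from the <xp> expression data.
    expressions = re.findall(r"<xp>\s*(\w+)\s*</xp>", template_text)
    text = re.sub(r"<xp>.*?</xp>", "", template_text).strip()
    return text, expressions

text, xp = parse_response("I'm fine, Thank you. <xp>smile</xp>")
print(text)  # I'm fine, Thank you.
print(xp)    # ['smile']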
Thus an extensive A.I.M.L. set with corresponding facial expressions is created as
the knowledge of the robot.
The A.I.M.L. based pattern matching and response is the robot’s intelligence.
The output is a facial expression corresponding to the text response, both of which are driven by the initial text input.
A.I.M.L.-related information can be found at the A.L.I.C.E. AI Foundation website.
Creators’ Thoughts
Our goal is to simulate life, or provide the illusion of life. To that end, we are doing just that by trying to simulate a human (or anthropomorphic) entertainment experience through an Avatar: SAI (Simulated Artificial Intelligence Bot).
Text-to-Speech Processing
Overview of a typical TTS system
A text-to-speech system (or "engine") is composed of two parts:[3] a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end, often referred to as the synthesizer, then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations),[4] which is then imposed on the output speech.
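The two front-end stages can be sketched in Python as follows; the tiny number expander and phoneme dictionary are toy assumptions standing in for real text-normalization rules and a pronunciation lexicon.

# Stage 1: text normalization - expand digits into written-out words.
NUMBERS = {"2": "two", "3": "three"}
# Stage 2: grapheme-to-phoneme conversion via dictionary lookup.
PHONEMES = {"i": "AY", "have": "HH AE V", "two": "T UW", "cats": "K AE T S"}

def normalize(raw):
    return [NUMBERS.get(tok, tok.lower()) for tok in raw.split()]

def to_phonemes(words):
    return [PHONEMES.get(w, w.upper()) for w in words]

words = normalize("I have 2 cats")
print(words)               # ['i', 'have', 'two', 'cats']
print(to_phonemes(words))  # ['AY', 'HH AE V', 'T UW', 'K AE T S']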