The Symbol Grounding Problem has been solved. So what's next?

Luc Steels
Sony Computer Science Laboratory Paris and VUB AI Lab, Vrije Universiteit Brussel
November 5, 2006

In the nineteen eighties, a lot of ink was spent on the question of symbol grounding, largely triggered by Searle's Chinese Room story. Searle's article had the advantage of stirring up discussion about when and how symbols could be about things in the world, whether intelligence involves representations or not, what embodiment means and under what conditions cognition is embodied, etc. But almost twenty years of philosophical discussion have shed little light on the issue, partly because the discussion has been mixed up with arguments about whether artificial intelligence is possible or not. Today I believe that sufficient progress has been made in cognitive science and AI that we can move forward and study the processes involved in representations instead of worrying about the general framework with which this should be done.

1 Symbols about the world

As suggested by Glenberg, De Vega and Graesser [4], let us start from Peirce and the (much longer) semiotic tradition, which makes a distinction between a symbol, the objects in the world with which the symbol is associated, for example for purposes of reference, and the concept (interpretant) associated with the symbol (see figure 1). Here we are interested primarily in triads for which there is a method that constrains the use of a symbol for the objects with which it is associated.

Figure 1: The semiotic triad relates a symbol, an object, and a concept applicable to the object. The method is a procedure to decide whether the concept applies or not.

The method could be a classifier, a perceptual/pattern recognition process that operates over sensori-motor data to decide whether the object 'fits' with the concept. If an effective method is available, then we call the symbol grounded. There are a lot of symbols which are not about the real world but about abstractions of various sorts, and so they will never be grounded this way. Some concepts are also about other concepts.

The Semiotic Map

Together with these semiotic relations among the entities in the triad, there are some additional relations that provide more pathways for semantic navigation between concepts, objects, and symbols. Objects occur in a context and have various domain relationships with each other, so that they can be grouped into sets. Symbols co-occur with other symbols in texts and speech, and this statistical structure can be picked up using inductive methods (as in the Landauer-Dumais [9] Latent Semantic Analysis proposal). Concepts may have semantic relations among each other, for example because they tend to apply to the same objects. There are also relations between methods, for example because they both use the same feature of the environment or use the same technique for classification. Humans effortlessly use all these links and make jumps far beyond what would be logically warranted. The various links between objects, symbols, concepts and their methods constitute a semiotic map in the form of a giant network, which gets dynamically expanded and reshuffled every time the individual interacts with the world or others.
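As a minimal sketch of this terminology (my own illustration in Python, not code from the paper), the triad can be written down as a small data structure: a symbol counts as grounded exactly when an effective classifier method is attached to its concept. All names here (SemioticTriad, looks_red, the RGB percept) are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A "method" in the sense used above: a classifier over sensori-motor data
# that decides whether the concept applies to a perceived object.
Classifier = Callable[[Sequence[float]], bool]

@dataclass
class SemioticTriad:
    symbol: str                          # the outward form, e.g. the word "red"
    concept: str                         # the interpretant associated with the symbol
    method: Optional[Classifier] = None  # classifier grounding the concept, if any

    def is_grounded(self) -> bool:
        # A symbol is grounded if an effective method is available.
        return self.method is not None

    def applies_to(self, percept: Sequence[float]) -> bool:
        if not self.is_grounded():
            raise ValueError(f"'{self.symbol}' is not grounded")
        return self.method(percept)

# Hypothetical classifier: decide "red" from an (R, G, B) percept in [0, 1].
def looks_red(rgb: Sequence[float]) -> bool:
    r, g, b = rgb
    return r > 0.6 and g < 0.4 and b < 0.4

red = SemioticTriad(symbol="red", concept="RED", method=looks_red)
justice = SemioticTriad(symbol="justice", concept="JUSTICE")  # abstract, ungrounded

print(red.applies_to([0.9, 0.2, 0.1]))   # True
print(justice.is_grounded())             # False: no perceptual method attached
```

The semiotic map is then simply a network over many such triads, together with the additional object-object, symbol-symbol, concept-concept and method-method links described above.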
Individuals navigate through this network for purposes of communication. When a speaker wants to draw the attention of an addressee to an object, he can use a concept whose method applies to the object, then choose the symbol associated with this concept and render it in speech or some other medium. The listener gets the symbol, uses his own vocabulary to retrieve the concept and hence the method, and applies the method to decide which object might be intended. Thus if the speaker says "Could you pass me the wine please", then "the wine" refers to a bottle on the table, and the interaction (the language game) is a success if the addressee indeed takes up the bottle and gives it - or maybe pours some wine into the speaker's glass.

Clearly, grounded symbols play a crucial role in communication about the real world, but the other links may also play important roles. For example, the sentence above was "Could you pass me the wine please", whereas in fact the speaker wanted the bottle that contained the wine. The speaker could also have said "Could you pass me the Bordeaux please", and nobody would think that she requested to pass the city of Bordeaux. In this paper, however, we will focus only on grounded symbols, as this is the main issue in the symbol grounding debate.

Symbols in Computer Science

The term symbol has had a venerable history in philosophy and linguistics, and I have tried to sketch its meaning. But it was hijacked in the late fifties by computer scientists (more specifically AI researchers) and adopted in the context of programming language research. The notion of a symbol in (so-called symbolic) programming languages like LISP or Prolog is quite precise. It is a pointer (i.e. an address in computer memory) to a list structure which involves a string known as the Print Name (used to read and write the symbol), possibly a value bound to this symbol, possibly a definition of a function associated with this symbol, and a list of properties and values associated with the symbol. The function of symbols can be recreated in other programming languages (like C++), but then the programmer has to take care of reading and writing strings and turning them into internal pointers, and he has to introduce other data structures to associate information with the symbol. In a symbolic programming language all that is done automatically. So a symbol is a very useful computational abstraction (like the notion of an array or structure), and a typical LISP program might involve hundreds of thousands of symbols, created on the fly and reclaimed (garbage collected) when the need arises. Programs that use symbols are called symbolic programs, and almost all sophisticated AI technology, but also a lot of web technology in search engines, rests on this elegant and enormously powerful concept.

But clearly this notion of symbol is not related to anything I discussed in the previous paragraphs. Unfortunately, when philosophers and cognitive scientists who are not knowledgeable about computer programming read the AI literature, they come with the baggage of their own field and assume immediately that if one uses a symbolic programming language, one subscribes to all sorts of philosophical doctrines. All this has given rise to what is probably the greatest terminological confusion in the history of science.
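As a rough illustration of the computer-science notion just described (a hypothetical mock-up, not an excerpt from any LISP system), the sketch below imitates in Python what a symbolic language provides automatically: an interned symbol object carrying a print name, an optional value, an optional function definition, and a property list.

```python
# Minimal, hypothetical mock-up of LISP-style symbols in Python.
# Real symbolic languages provide all of this automatically; the names are invented.

class Symbol:
    def __init__(self, print_name: str):
        self.print_name = print_name   # string used to read and write the symbol
        self.value = None              # value bound to the symbol, if any
        self.function = None           # function definition, if any
        self.plist = {}                # property list: properties and their values

    def __repr__(self):
        return self.print_name

_symbol_table = {}  # interning: the same print name always yields the same object

def intern(name: str) -> Symbol:
    """Return the unique symbol with this print name, creating it on the fly."""
    if name not in _symbol_table:
        _symbol_table[name] = Symbol(name)
    return _symbol_table[name]

# Usage: two occurrences of "wine" denote one and the same symbol object.
wine = intern("wine")
wine.value = "bottle-23"                 # bind a value
wine.plist["category"] = "beverage"      # attach a property
assert intern("wine") is wine            # pointer identity, as in LISP
```

Nothing in this data structure is 'about' wine in the philosophical sense; that is exactly the terminological gap described above.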
The debate about the role of symbols in cognition or intelligence is totally unrelated to whether one uses a symbolic programming language or not, and the rejection of so-called 'traditional AI' is mostly based on misunderstandings. A neural network, for example, may be implemented technically using symbols, but these symbols are computer-science symbols and not the philosophical type of symbols.

The Symbol Grounding Problem

Let me return now to the question originally posed by Searle: Can a robot deal with grounded symbols? More precisely, is it possible to build an artificial system that has a body, sensors and actuators, signal and image processing and pattern recognition processes, and information structures to store and use semiotic maps, and that uses all of this, for example, in communication about the real world? Without hesitation, my answer is yes. Already in the early seventies, AI experiments like Winograd's SHRDLU or Nilsson's Shakey achieved this. Shakey was a robot moving around and accepting commands to go to certain places in its environment. To act appropriately upon a command like 'go to the next room and approach the big pyramid standing in the corner', it is necessary to perceive the world, construct a world model, parse sentences and interpret them in terms of this world model, and then make a plan and execute it. Shakey could do all this. It was slow, but then it used a computer with roughly the same power as the processors we find in modern-day hotel door knobs, and about as much memory as we find in yesterday's mobile phones.

So what was all the fuss created by Searle about? Was Searle's paper (and the subsequent philosophical discussion) based on ignorance or on a lack of understanding of what was going on in these experiments? Probably partly. It has always been popular to bash AI because that puts one in the glorious position of defending humanity. But one part of the criticism was justified: everything was programmed by human designers. The semiotic relations were not autonomously established by the artificial agent but carefully mapped out and then coded by human programmers. In the case of natural intelligence, this is clearly not the case. The mind/brain is capable of autonomously developing an enormous repertoire of concepts to deal with the environment, and of associating them with symbols which are invented, adopted, and negotiated with others. This early AI research did, however, make it very clear that the methods needed to ground symbols had to be hugely complicated, domain-specific, and context-sensitive if they were to work at all. Continuing to program these methods by hand was therefore out of the question, and that route was more or less abandoned. So the key question for symbol grounding is actually another question, well formulated by Harnad [7]: if someone claims that a robot can deal with grounded symbols, we expect that this robot autonomously establishes the semiotic map that it is going to use to relate symbols with the world.

Semiotic Dynamics

Very recent developments in AI have now shown that this second question can also be resolved. We now know how to conceive of artificial systems that autonomously set up and coordinate semiotic landscapes, and there is a growing number of concrete realisations achieving this [11]. That is why I believe we can now say that the symbol grounding problem is solved.
The breakthrough idea is to set up a particular kind of 'semiotic' dynamics. Each agent is endowed with the capacity to invent new concepts, which initially act like hooks. The concepts get progressively more meaning by associating methods with them that relate the concepts to the world, or by associating symbols with them to coordinate their use within the group. Agents interact with each other, for example through language games, and as part of the interaction they adapt both their grounding methods and the relation to symbols. One way to implement this adaptation is to have weighted links that get updated and changed based on success in interactions. If all this is done right, we see a gradual convergence towards shared semiotic systems among the agents, in the sense that their concepts become sufficiently coordinated to make successful communication possible.

It has been shown convincingly that interaction is crucial for coordinating the conceptual repertoires of agents. There is no global control, no telepathy, and no prior (innate) programming of concepts for the agents, and hence coordination can only take place through adaptation based on local one-on-one interaction (see for example figure 2 from [15]).

Figure 2: Results of experiments showing the coordination of color categories in a cultural evolution process. Agents play guessing games naming color distinctions and coordinate both their lexicons and their color categories through the game. The graph shows category variance with (bottom graph) and without (top graph) language: category variance is much lower when agents use language to coordinate their categories. The color prototypes of all agents are shown in the top right corner with (right) and without (left) the use of language.

Amazingly, it is possible to orchestrate the interaction between groups of agents so as to spontaneously self-organise communication systems without human intervention. Research has been steadily pushing up the complexity of these communication systems, from simple lexical one-word utterances to complex grammatical multi-word utterances with rich underlying conceptualisations [14].
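The flavour of this adaptive, weighted-link dynamics can be conveyed with a toy version of a language game. The sketch below is my own minimal illustration of a basic naming game with lateral inhibition, not the code used in the cited experiments; for brevity the objects are given directly rather than perceived through grounding classifiers, which is where the methods of the previous paragraphs would plug in, and all parameter values and names are arbitrary.

```python
import random
from collections import defaultdict

OBJECTS = ["obj-%d" % i for i in range(5)]      # shared world of objects
N_AGENTS = 10
DELTA = 0.1                                     # score adjustment step

def new_agent():
    # lexicon: object -> {word: score}; starts empty, so words act as "hooks"
    return defaultdict(dict)

def invent_word():
    return "".join(random.choice("bdgklmnpstvwz") for _ in range(4))

def best_word(agent, obj):
    words = agent[obj]
    return max(words, key=words.get) if words else None

def interpret(agent, word):
    # hearer's guess: the object for which this word has the highest score
    candidates = [(scores[word], obj) for obj, scores in agent.items() if word in scores]
    return max(candidates)[1] if candidates else None

def play_game(speaker, hearer):
    topic = random.choice(OBJECTS)
    word = best_word(speaker, topic)
    if word is None:                            # speaker invents a new symbol
        word = invent_word()
        speaker[topic][word] = 0.5
    guess = interpret(hearer, word)
    if guess == topic:                          # success: reinforce, inhibit competitors
        for agent in (speaker, hearer):
            agent[topic][word] = min(1.0, agent[topic].get(word, 0.5) + DELTA)
            for other in list(agent[topic]):
                if other != word:
                    agent[topic][other] = max(0.0, agent[topic][other] - DELTA)
        return True
    else:                                       # failure: punish, hearer adopts the word
        speaker[topic][word] = max(0.0, speaker[topic][word] - DELTA)
        hearer[topic][word] = 0.5
        return False

agents = [new_agent() for _ in range(N_AGENTS)]
outcomes = []
for game in range(1, 20001):
    s, h = random.sample(agents, 2)
    outcomes.append(play_game(s, h))
    if game % 5000 == 0:
        print(f"games={game:6d}  recent success rate={sum(outcomes[-1000:]) / 1000:.2f}")
```

In runs of this sketch the measured success climbs towards 1.0 as a dominant word wins out for each object, purely through local one-on-one adaptation; this is the toy analogue of the convergence reported for the real experiments.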
Alignment in dialogue

These fascinating developments are in sync with observations gaining ground (again) in various cognitive studies, showing that conceptualisations as used for language are not only culture- and individual-specific but also imposed on reality rather than statistically derived from it. More specifically: studies of perceptually grounded categories, like color or sound distinctions, have shown that there are important differences between cultures [3] and that these differences have co-evolved with the languages used in these cultures. Similar studies have shown that there are also tremendous individual differences, both in categorisation and in the use of symbols, even within the same community [17]. So the idea that there is an objective, universally shared, unique 'meaning' for human symbols is a fallacy. This may explain why neural networks, which usually implement some form of statistical induction, have never been able to go beyond toy examples with carefully prepared data sets. It also explains why projects like CYC, which try to define 'the' basic human concepts appearing in language or thought once and for all, chase a holy grail that can never be found.

If conceptualisations and the use of symbols are really individual and culture-based as well as domain- and situation-specific, how come we humans are ever able to communicate and share thoughts at all? The answer lies in a phenomenon that psycholinguists are currently studying intensely, namely alignment (see for example [2], [10]). For the purposes of a specific dialogue or interaction, humans adapt their conceptualisations and language surprisingly quickly, so that efficient communication becomes possible. They do this at all levels: phonetic (the structure of the sign), lexical, grammatical, but also conceptual. The experiments with semiotic dynamics on autonomous robots or software agents can be seen as a way to operationalise the cognitive mechanisms that are needed to explain how the alignment phenomena observed by psycholinguists are computationally possible.

It is perhaps better to talk about co-ordination instead of alignment. The idea is that agents do not really align (in the sense of make equal) their semiotic systems - which is basically impossible because there is no telepathic way in which one can inspect the brain states of the other, and the semiotic map of each individual is based on a history of millions of situated interactions - but that they co-ordinate their classifiers and use of symbols sufficiently and temporarily to make joint action and dialogue possible. The crucial insight is that the grounding of symbols is not a once-and-for-all affair, but a matter of progressive and continuous coordination. Classifiers operationalising the use of a symbol are constantly adapted by the agents, both to deal better with the environment and to yield more success in communication with others within the same community.

Another fascinating development is that we currently see systems emerging on the web that try to orchestrate and support human semiotic dynamics. They take the form of social tagging sites (such as Flickr, del.icio.us, Connotea, Last.fm, etc.) that create peer-to-peer networks among users and allow them to associate tags (i.e. symbols) with objects that they upload and thus make available to others. The semiotic dynamics seen in the collaborative interaction among hundreds of thousands of users resembles the semiotic dynamics seen in human populations or in populations of artificial agents: winner-take-all phenomena in which one tag becomes dominant, new tags emerging, shifts in tag use, etc. [5], [16].

2 Representations and Meanings

We now come to the second issue: representations. The notion of representation has had a venerable history in philosophy, art history, etc. before it was hijacked by computer scientists, where it became used without all the baggage that philosophers associate with the notion of representation. But then representations returned into cognitive science through the backdoor of AI, resulting in some rather confusing debates. Let me try to trace this shift of meaning, starting with the original pre-computational notion of representation, as we find it for example in the famous essay on the hobbyhorse by Gombrich [6] and as we still find it in cultural studies.

Defining Representations

A representation is a stand-in for something else, so that it can be made present again: 're'-'present'-ed. Anything can be a representation of anything by fiat. For example, a pen can be a representation of a boat or a person or upward movement, or whatever. The magic of representations happens simply when one person decides to establish that x is a representation for y, and others could possibly agree with this or understand that this representational relation holds. There is nothing in the nature of an object that makes it a representation or not; it is rather the role the object plays in subsequent interaction.
Of course it helps a lot if the representation has some properties that help others to guess what it might be a representation of. But a representation is seldom a picture-like copy of what is represented. It is clear that symbols are a particular type of representation, one where the physical object used has a rather arbitrary relation to what is represented. It therefore follows that all the remarks made earlier about symbols are valid for representations as well. The term symbol seems more often used for a building block of representations, whereas a representation can be more complex, as it is typically a combination of symbols which compositionally express more global structured meanings.

Human representations (and more complex symbols like grammatical sentences) have some additional features. First of all, they seldom represent physical things, but rather meanings. A meaning is a feature which is relevant in the interaction between the person and the thing being represented. For example, a child may represent a fire engine by a square painted red. Why red? Because this is a distinctive, important feature of the fire engine for the child. To represent something we select a number of important meanings and we select ways to invoke these meanings in ourselves or others. The invocation process is always indirect and requires inference. Without knowing the context, personal history, prior use of representations, etc., it is often very difficult to decipher representations. Second, human representations typically involve perspective. Things are seen and represented from particular points of view, and often these perspectives are mixed together.

A nice example of a representation is shown in figure 3. Perhaps you thought that this drawing represents a garden, with the person in the middle watering the plants. Someone else might have interpreted it as a kitchen, with pots and pans and somebody cooking a meal. As a matter of fact, the drawing represents a British double-decker bus (the word "baz" is written on the drawing). Once you adopt this interpretation, it is easy to guess that the bus driver must be sitting on the right-hand side in his enclosure. The conductor is shown in the middle of the picture. The top shows a set of windows of increasing size and the bottom a set of wheels. Everything is enclosed in a big box which covers the whole page. This drawing was made by a child, Monica, when she was 4,2 years old.

Figure 3: Drawing by a child called Monica. Can you guess what meanings this representation expresses?

Clearly the drawing is not a realistic depiction of a bus. Instead it expresses some of the important meanings of a bus from the viewpoint of Monica. Some of the meanings have to do with recognising objects - in this case, recognising whether a bus is approaching so that you can get ready to get on it. A bus is huge. That is perhaps why the drawing fills the whole page, and why the windows at the top and the wheels at the bottom are drawn as far apart as possible. A bus has many more windows than an ordinary car, so Monica has drawn many windows. The same goes for the wheels: a bus has many more wheels than an ordinary car, and so many wheels are drawn. The exact number of wheels or windows does not matter; there must only be enough of them to express the concept of 'many'. The shape of the windows and the wheels has been chosen by analogy with their shape in the real world. They are positioned at the top and the bottom, as in a normal bus viewed sideways.
The concept of 'many' is expressed in the visual grammar invented by this child. Showing your ticket to the conductor, or buying a ticket, must have been an important event for Monica. It is of such importance that the conductor, who plays a central role in this interaction, is put in the middle of the picture and drawn prominently to make her stand out as foreground. Once again, features of the conductor have been selected that are meaningful, in the sense of being relevant for recognising the conductor. The human figure is schematically represented by drawing essential body parts (head, torso, arms, legs). Nobody fails to recognise that this is a human figure, which cannot be said for the way the driver has been drawn. The conductor carries a ticketing machine. Monica's mother, with whom I corresponded about this picture, wrote to me that this machine makes a loud "ping", so it would be impossible not to notice it. There is also something drawn on the right side of the head, which is most probably a hat, another characteristic feature of the conductor. The activity of the conductor is also represented: the right arm is extended, as if ready to accept money or check a ticket.

Composite objects are superimposed, and there is no hesitation to mix different perspectives. This happens quite often in children's drawings. If they draw a table viewed from the side, they typically draw all the forks, knives, and plates as separate objects 'floating above' the table. In figure 3, the bird's-eye perspective is adopted, so that the driver is located on the right-hand side in a separate box and the conductor in the middle. The bus is at the same time viewed from the side, so that the wheels are at the bottom and the windows near the top. And then there is the third perspective of a sideways view (as if seated inside the bus), which is used to draw the conductor as a standing figure. Even for drawing the conductor, two perspectives are adopted: the sideways view for the figure itself and the bird's-eye view for the ticketing machine, so that we can see what is inside. Multiple perspectives in a single drawing are later abandoned as children try to make their drawings 'more realistic' and hence less creative. But they may reappear in artists' drawings. For example, many of Picasso's paintings play around with different perspectives on the same object in the same image.

The fact that the windows get bigger towards the front expresses another aspect of a bus which is most probably very important to Monica. At the front of a double-decker bus there is a huge window, and it is great fun to sit there and watch the street. The increasing size of the windows reflects the desirability of being near the front. Size is a general representational tool used by many children (and also in medieval art) to emphasise what is considered to be important. Most representations express not only facts but above all attitudes towards and interpretations of facts. The representations of very young children, which seem to be random marks on a piece of paper, are for them almost purely emotional expressions of attitudes and feelings, like anger or love.

Human representations are clearly incredibly rich and serve multiple purposes. They help us to engage with the world and share world views and feelings with others. But let us now turn to the computer science notion of representation.
C-representations and m-representations

In the fifties, when higher-level computer programming started to develop, computer scientists began to adopt the term 'representation' for data structures that hold information for an ongoing computational process. For example, in order to calculate the amount of water left in a tank with a hole in it, you have to represent the initial amount of water, the flow rate, the amount of water after a certain time, etc., so that calculations can be done over these representations. Information processing can be understood as designing representations and orchestrating the processes for the creation and manipulation of these representations. Clearly computer scientists (and ipso facto AI researchers) are adopting only one aspect of representations: the creation of a 'stand-in' for something in the physical world of the computer, so that it can be transformed, stored, transmitted, etc. They are not thinking of the much more complex processes of meaning selection, representational choice, composition, perspective taking, inferential interpretation, etc., which are all an important part of human representation-making. Let me use the term c-representations to mean representations as used in computer science, and m-representations for the original meaning-oriented use of representations in the humanities.

In building AI systems there can be no doubt that c-representations are needed to do anything at all, and also among computational neuroscience researchers it is now accepted that there must be something like c-representations used in the brain, i.e. information structures for visual processing, motor control, and so on. Mirror neurons are a clear example of c-representations for actions. So the question is not whether or not c-representations are used in cognition, but rather what their nature might be.

As AI research progressed, a debate arose between advocates of 'symbolic' c-representations and 'non-symbolic' c-representations. Simplified, symbolic c-representations use building blocks that correspond to categories, and they are the hallmark of the logic-based approach to AI, whereas non-symbolic c-representations use continuous values and are therefore much more directly related to the real world. For example, we could have a sensory channel picking up the wavelength of light, and numerical values on this channel (a non-symbolic c-representation), or we could have categories like 'red', 'green', etc. (a symbolic c-representation). Similarly, we could have an infrared or sonar channel that reflects numerically the distance of the robot to obstacles (a non-symbolic c-representation), or we could have a discrete symbol representing 'obstacle-seen' (a symbolic c-representation).

When the connectionist movement came up in the nineteen-eighties, it de-emphasised symbolic c-representations in favor of the propagation of (continuous) activation values in neural networks (although some researchers later used these networks in an entirely symbolic way, as pointed out in the discussion by [4] of Rogers et al.). When Brooks wrote his paper on 'Intelligence without representation' [1] in the nineties, he wanted to argue that real-time robotic behavior could often be better achieved without symbolic c-representations. For example, instead of having a rule that says 'if obstacle, then move back and turn away from the obstacle', we could have a dynamical system that couples the change in sensory values directly to a change on the actuators: the more infrared reflection from obstacles, the slower the robot moves or the more it turns away.
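The contrast can be illustrated with a small, hypothetical sketch of my own (Braitenberg-style, not code from Brooks or from this paper): the first controller goes through a discrete symbolic c-representation ('obstacle-seen'), the second couples the continuous infrared readings directly to the wheel speeds with no category anywhere in between.

```python
# Hypothetical illustration of the two styles of c-representation.
# ir_left / ir_right: infrared reflection readings in [0, 1]; higher means closer obstacle.

OBSTACLE_THRESHOLD = 0.5

def symbolic_controller(ir_left: float, ir_right: float):
    """Rule over a discrete symbol: categorise first, then act."""
    obstacle_seen = max(ir_left, ir_right) > OBSTACLE_THRESHOLD  # symbolic c-representation
    if obstacle_seen:
        return (-0.2, 0.2)      # back off and turn (left wheel speed, right wheel speed)
    return (1.0, 1.0)           # full speed ahead

def dynamical_controller(ir_left: float, ir_right: float):
    """Direct sensor-to-actuator coupling: graded response, no discrete category."""
    left_speed = 1.0 - ir_right   # more reflection on the right slows the left wheel: veer left
    right_speed = 1.0 - ir_left   # more reflection on the left slows the right wheel: veer right
    return (left_speed, right_speed)

# Both controllers avoid obstacles, but only the first manipulates a symbol.
print(symbolic_controller(0.8, 0.1))   # (-0.2, 0.2): 'obstacle-seen' triggered the rule
print(dynamical_controller(0.8, 0.1))  # (0.9, 0.2): graded slowing and turning away
```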
But does this mean that all symbolic c-representations have to be rejected? This does not seem warranted, particularly not if we are considering natural language processing, expert problem solving, conceptual memory, etc. At least from a practical point of view, it makes much more sense to design and implement such systems using symbolic c-representations, and to mix symbolic and non-symbolic c-representations when building embodied AI systems.

But what about m-representations? There is no question that humans are able to create m-representations, and judging from the great importance of text, art, film, dialogue, etc. in all human societies, representation-making and interpreting appears to be of great importance to us. We have seen that models are beginning to appear of how representations about the world are made and interpreted. But the fact that a system produces m-representations or is able to interpret them does not necessarily mean that this system uses them internally. The process of producing representations (including symbols and language) need not be a mere translation process. In fact it seems clear that it is a true construction process of which the outcome is uncertain, with a strong interaction between the final form and the conceptualisations that are being expressed. This is similar to the way a bird does not have a preconceived image of its nest (like an architectural design for a building) but goes through the movements of nest building, which result in a particular nest embedded in its specific environment.

However, as I argued elsewhere [12], there is a further possibility by which the brain (or an artificial system) might be able to construct and entertain m-representations, namely by internalising the process of creating m-representations. Rather than producing the representation in terms of external physical symbols (sounds, gestures, lines on a piece of paper), an internal image is created and re-entered and processed as if it were perceived externally. The inner voice that we each hear while thinking is a manifestation of this phenomenon. In such a case, we see the emergence of an 'intelligence with representation' [13], in the full, rich sense of m-representations. The power that such representations bring to the community, in terms of memory, aid in communication, etc., then becomes available to the individual brain and to the different components within the brain.

3 Notions of Embodiment

We have seen that both the notion of symbol and that of representation are used by computer scientists in a quite different way than by cognitive scientists in general, muddling the debate. This has also happened, I think, with the term embodiment. In the technical literature (e.g. in patents), the term embodiment refers to the implementation, i.e. the physical realisation, of a method or idea. Computer scientists more commonly use the word implementation. An implementation of an algorithm is a definition of that algorithm in a form such that it can be physically instantiated on a computer and actually run, i.e. such that the various steps of the algorithm find their analog in physical operations.
It is an absolute truth in computer science that any algorithm can be implemented in many different physical media, on many different types of computer architectures, and in different programming languages, even though of course it may take much longer (either to implement or to execute) in one medium than in another. It is therefore natural that computer scientists apply this way of thinking to the brain as well. Their goal is to identify the algorithms and study them through various kinds of implementations, which may not at all be brain-like. They might also talk about neural implementation or neural embodiment, meaning the physical instantiation of a particular process with the hardware components and processes available to the brain. But most computer scientists know that the distance between an algorithm (even a simple one) and its embodiment, for example in an electronic circuit, is huge, with many layers of complexity intervening. And the translation through all these layers is never done by hand but by compilers and assemblers. It is very well possible that we will have to follow the same strategy to bridge the gap between high-level cognitive models and low-level neural implementations.

There has been considerable resistance, both from biologists and from philosophers, to accepting this notion of implementation, i.e. the idea that the same process (or algorithm) can be instantiated (embodied) in many different media. They seem to believe that the (bio-)physics of the brain must have unique characteristics with unique causal powers. For example, Penrose has argued that intelligence is due to certain (unknown) quantum processes which are unique to the brain, and that therefore any other type of implementation can never attain the same functionality. It is of course possible in theory to build physical systems that can implement computational steps much more efficiently, or that can implement something which is not implementable in another medium. But so far there is no serious evidence that cognitive processes might require them.

There is another notion of embodiment which is also relevant in this discussion. It refers quite literally to 'having a body' for interacting with the world, i.e. having a physical structure, dotted with sensors and actuators, and the necessary signal processing and pattern recognition to bridge the gap from reality to symbol use. Embodiment in this sense is a clear precondition for symbol grounding, and as soon as we step outside the realm of computer simulations and embed computers in physical robotic systems, we achieve some form of embodiment, even if that embodiment is not the same as human embodiment. It is entirely possible, and in fact quite likely, that the human body and its biophysical properties for interacting with the world make certain behaviors possible and allow certain forms of interaction that are unique. This puts a limit on how similar artificial intelligence systems can be to human intelligence. But it only means that artificial intelligence is necessarily of a different nature due to the differences in embodiment, not that artificial intelligence itself is impossible.

4 So What's Next?

The main goal of this essay was to clarify some terminology and to attempt to indicate where we are with respect to some of the most fundamental questions in cognition.
I boldly stated that the symbol grounding problem is solved. By that I meant that we now understand enough to create systems in which groups of agents self-organise a symbolic communication system that is grounded in their interactions with the world, and that these systems may act as models for understanding how humans manage to self-organise their communication systems. But this does not mean that we know all there is to know; on the contrary. The discussion of representations made it clear that symbols are a special case of representations, and that representation-making is a much richer process than just having internal data structures holding information about the world. Much is still to be learned about the enormously complex processes that humans effortlessly use to build up and navigate their semiotic networks in the process of creating and interpreting representations, and I believe that our research effort should now be focused on understanding all this better. This can be done either through psychological observations of representation-making in action (for example in dialogue or drawing) or through AI models of the cognitive processes that appear necessary to deal with representations.

Acknowledgement

This research was funded and carried out at the Sony Computer Science Laboratory in Paris, with additional funding from the EU FET ECAgents Project IST-1940, and at the University of Brussels (VUB AI Lab). I am indebted to the participants of the Garachico workshop on symbol grounding for their tremendously valuable feedback, and to the organisers Manuel de Vega, Art Glenberg and Arthur Graesser for orchestrating such a remarkably fruitful workshop.

References

[1] Brooks, R. (1991) Intelligence Without Representation. Artificial Intelligence Journal, 47, pp. 139-159.
[2] Clark, H. and S. Brennan (1991) Grounding in communication. In: Resnick, L., J. Levine and S. Teasley (eds.) Perspectives on Socially Shared Cognition. APA Books, Washington. pp. 127-149.
[3] Davidoff, J. (2001) Language and perceptual categorisation. Trends in Cognitive Sciences, 5(9):382-387.
[4] Glenberg, A., M. De Vega and A. Graesser (2005) Framing the debate. Discussion paper for the Garachico Workshop on Symbols, Embodiment, and Meaning.
[5] Golder, S. and B. Huberman (2006) The Structure of Collaborative Tagging. Journal of Information Science. To appear.
[6] Gombrich, E.H. (1969) Art and Illusion: A Study in the Psychology of Pictorial Representation. Princeton, NJ: Princeton University Press.
[7] Harnad, S. (1990) The Symbol Grounding Problem. Physica D, 42: 335-346.
[8] Kay, P. and T. Regier (2003) Resolving the question of color naming universals. Proceedings of the National Academy of Sciences, 100(15):9085-9089.
[9] Landauer, T.K. and S.T. Dumais (1997) A solution to Plato's problem: the Latent Semantic Analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104, 211-240.
[10] Pickering, M.J. and S. Garrod (2004) Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27, 169-225.
[11] Steels, L. (2003) Evolving grounded communication for robots. Trends in Cognitive Sciences, 7(7), 308-312.
[12] Steels, L. (2003) Language Re-entrance and the Inner Voice. Journal of Consciousness Studies, 10(4-5):173-185.
[13] Steels, L. (2004) Intelligence with Representation. Phil. Trans. Royal Soc. Lond. A, Volume 361, Number 1811 (October 15, 2003), pp. 2381-2395.
[14] Steels, L. (2005) The Emergence and Evolution of Linguistic Structure: From Minimal to Grammatical Communication Systems. Connection Science, 17(3).
[15] Steels, L. and T. Belpaeme (2005) Coordinating Perceptually Grounded Categories through Language: A Case Study for Colour. Behavioral and Brain Sciences, 24(6), 469-489.
[16] Steels, L. and P. Hanappe (2006) A Semiotic Dynamics Approach to Emergent Semantics. Journal of Data Semantics. To appear.
[17] Webster, M.A. and P. Kay (2005) Variations in color naming within and across populations. Behavioral and Brain Sciences, 28(4), 512-513.