UPMC Master, Sciences de l'Ingénieur (Engineering Sciences), SSIR specialty
TO BE RETURNED AS AN ATTACHED FILE BEFORE 15 NOVEMBER 2008 TO: [email protected] AND [email protected]

Internship No.: (leave this box blank)

- Name of the laboratory or company: INRIA Bordeaux Sud-Ouest
- Full address: 351, cours de la libération, 33405 Talence
- Contact person: Pierre-Yves Oudeyer
- Tel: 0524574023
- Fax: …………………..……
- Email: [email protected]

Internship title: Development and evaluation of a competence-based curiosity-driven learning framework, and linkage with the theory of options.

- Does the internship require prerequisites? Yes
- Can the topic be handled by a student of foreign nationality? Yes
- Can the internship lead on to a doctorate? Maybe
- Is the internship paid? Yes

Description:

Prerequisites: good knowledge of reinforcement learning, scientific programming (Matlab, Scilab, …), C++.

Context: Intrinsic motivations, associated with curiosity and spontaneous exploration, have been identified by psychologists as crucial for cognitive development in humans (Deci and Ryan, 1985). In recent years, a growing number of artificial intelligence and robotics researchers have tried to implement intrinsic motivation systems in robots. One of the main objectives is to enable the autonomous, incremental and progressive formation of new skills that were neither pre-specified nor pre-programmed by a human engineer. This work takes place within the framework of developmental/epigenetic robotics (Lungarella et al., 2003), which studies the mechanisms that can allow for the life-long learning of various tasks in unknown, changing environments.

Objectives: Intelligent Adaptive Curiosity (IAC), developed in (Oudeyer et al., 2007a), was a first attempt at implementing an intrinsic motivation system that functions in high-dimensional continuous sensorimotor spaces (see http://playground.csl.sony.fr).
The basic idea was to make the robot interested in situations in which its predictions about what may happen improve maximally fast. Following the terminology proposed in (Oudeyer and Kaplan, 2007b), this was a knowledge-based predictive curiosity-driven learning architecture.

The goal of this internship is to develop and evaluate a new kind of curiosity-driven learning architecture, called “competence-based”. In this kind of architecture, the robot is motivated by activities in which its mastery or competence at doing certain things improves. Furthermore, this algorithm shall be formulated in the “options framework” developed in reinforcement learning by (Sutton et al., 1999), and shall be regarded and evaluated as an automatic option creation algorithm. Indeed, the autonomous creation of options is a crucial and difficult research question which might benefit from competence-based curiosity-driven learning.

A first prototype of the algorithm will be developed and tested in a simple abstract simulated sensorimotor space. Once it has proven to be functional in this space, an experiment and evaluation will be conducted using a real robot based on the Bioloid kit.

References:

Deci, E. and Ryan, R. (1985). Intrinsic Motivation and Self-Determination in Human Behavior. Plenum Press.

Lungarella, M., Metta, G., Pfeifer, R. and Sandini, G. (2003). Developmental robotics: a survey. Connection Science, 15(4):151-190.

Oudeyer, P-Y., Kaplan, F. and Hafner, V. (2007a). Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation, 11(2). DOI: 10.1109/TEVC.2006.890271

Oudeyer, P-Y. and Kaplan, F. (2007b). What is intrinsic motivation? A typology of computational approaches. Frontiers in Neurorobotics. http://www.csl.sony.fr/~py/oudeyer-kaplan-neurorobotics.pdf

Sutton, R., Precup, D. and Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181-211.

Other remarks: