Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Master in Artificial Intelligence (UPC-URV-UB) Master’s Thesis proposal General Information Master’s Thesis Title: Orientation: A study of symbolic techniques for regular grammatical inference and its applications professional research M.Sc. Th. Advisor’s Dept. LSI – UPC & University: René Alquézar and Alberto Sanfeliu M.Sc. Th. Advisor: M.Sc. Th. Advisor e-mail: [email protected] Observations: Toni Cebrián Student Name: M.Sc. Thesis Description Main issues / Brief Description: This thesis will study the state of the art in the regular grammatical inference (RGI) field and perform a comparative analysis of the main algorithms, with a particular interest in the case of probabilistic deterministic automata (PDFA). In order to accomplish this task, a survey of the existing RGI techniques will be conducted followed by a qualitative and/or quantitative analysis whenever a implementation may be easily obtained or developed. This study will test the implemented algorithms against existing reference databases or against a newly created dataset for testing purposes. From the conclusions of this study we hope to devise new ideas and improvements over the existing techniques. Another aim will be to seek and identify some real applications of the RGI techniques, mainly in the area of robotics. Detailed Description: Regular grammatical inference (RGI) is the field that covers the theory and methods for learning regular languages in any of their possible representations (regular grammars, deterministic finitestate automata, non-deterministic finite-state automata) from string samples (either only positive or both positive and negative) and/or queries to a teacher [1]. An interesting extension is the inference of stochastic regular languages represented by probabilistic deterministic finite-state automata [2-4], which allows the introduction of probabilities in the language modelling. In this thesis, a study of the existing symbolic RGI techniques will be conducted followed by a qualitative and/or quantitative analysis of the most relevant algorithms proposed for learning stochastic regular languages (e.g. ALERGIA [2], ALISA [4]). Connectionist-based RGI techniques [5-6] will be excluded from the study. In order to carry out an experimental comparison, the implementation of the algorithms will have to be obtained or developed. We rely also on the collaboration of two research groups in the field: the GRFIA of the Universitat d’Alacant led by Jose Oncina and the LARCA group of the UPC led by Ricard Gavaldà. The experimental study will test the implemented algorithms against existing reference databases or against a newly created dataset for testing purposes. Therefore, the search and selection of the benchmarks is one of the former tasks to accomplish. From the conclusions of this study we hope to devise new ideas and improvements over the existing techniques, which could settle the foundations for a further deeper research in a subsequent Ph.D. thesis. Despite its theoretical importance and methodological developments, few real applications of the RGI techniques have been reported, mainly in the areas of speech recognition and natural language processing. We are specially interested in finding some real application in the area of robotics; hence, the search of such application will be another goal of the thesis. For that purpose, the possibility of incorporating active reinforcement learning techniques might be investigated. References: [1] Higuera, C. d. l. (2005). A bibliography study of grammatical inference. Pattern Recognition, 38, 1332 – 1348. [2] Carrasco, R.C., Oncina, J., (1994). Learning Stochastic Regular Grammars by Means of a State Merging Method, Lecture Notes in Artificial Intelligence, vol. 862, pp. 130-152. [3] Castro, J., Gavaldà, R. (1998). Towards feasible PAC-learning probabilistic deterministic finite automata. Proc. ICGI’98, Saint Malo, France. [4] Thollard, F., Clark A. (2004). Learning Stochastic Deterministic Regular Languages. Proc. ICGI’04, Athens, Greece, [5] Alquezar, R., & Sanfeliu, A. (1995). An algebraic framework to represent finite state machines in single-layer recurrent neural networks. Neural Computation, 7 (5), 931-949. [6] Alquézar, R. (1997). Symbolic and connectionist learning techniques for grammatical inference. Ph.D. thesis Univ. Politècnica Catalunya. Other comments: Barcelona, March 15th 2010