Download Master in Artificial Intelligence (UPC-URV-UB)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Field research wikipedia , lookup

Transcript
Master in Artificial Intelligence (UPC-URV-UB)
Master’s Thesis proposal
General Information
Master’s Thesis Title:
Orientation:
A study of symbolic techniques for regular grammatical inference
and its applications
professional
research
M.Sc. Th. Advisor’s Dept.
LSI – UPC
& University:
René Alquézar and Alberto Sanfeliu
M.Sc. Th. Advisor:
M.Sc. Th. Advisor e-mail: [email protected]
Observations:
Toni Cebrián
Student Name:
M.Sc. Thesis Description
Main issues / Brief Description:
This thesis will study the state of the art in the regular grammatical inference (RGI) field and
perform a comparative analysis of the main algorithms, with a particular interest in the case of
probabilistic deterministic automata (PDFA). In order to accomplish this task,
a survey of the existing RGI techniques will be conducted followed by a qualitative and/or
quantitative analysis whenever a implementation may be easily obtained or developed.
This study will test the implemented algorithms against existing reference databases or
against a newly created dataset for testing purposes. From the conclusions of this study we hope to
devise new ideas and improvements over the existing techniques. Another aim will be to seek and
identify some real applications of the RGI techniques, mainly in the area of robotics.
Detailed Description:
Regular grammatical inference (RGI) is the field that covers the theory and methods for learning
regular languages in any of their possible representations (regular grammars, deterministic finitestate automata, non-deterministic finite-state automata) from string samples (either only positive
or both positive and negative) and/or queries to a teacher [1]. An interesting extension is the
inference of stochastic regular languages represented by probabilistic deterministic finite-state
automata [2-4], which allows the introduction of probabilities in the language modelling.
In this thesis, a study of the existing symbolic RGI techniques will be conducted followed by a
qualitative and/or quantitative analysis of the most relevant algorithms proposed for learning
stochastic regular languages (e.g. ALERGIA [2], ALISA [4]). Connectionist-based RGI techniques
[5-6] will be excluded from the study.
In order to carry out an experimental comparison, the implementation of the algorithms will have
to be obtained or developed. We rely also on the collaboration of two research groups in the field:
the GRFIA of the Universitat d’Alacant led by Jose Oncina and the LARCA group of the UPC led
by Ricard Gavaldà.
The experimental study will test the implemented algorithms against existing reference databases
or against a newly created dataset for testing purposes. Therefore, the search and selection of the
benchmarks is one of the former tasks to accomplish. From the conclusions of this study we hope
to devise new ideas and improvements over the existing techniques, which could settle the
foundations for a further deeper research in a subsequent Ph.D. thesis.
Despite its theoretical importance and methodological developments, few real applications of the
RGI techniques have been reported, mainly in the areas of speech recognition and natural language
processing. We are specially interested in finding some real application in the area of robotics;
hence, the search of such application will be another goal of the thesis. For that purpose, the
possibility of incorporating active reinforcement learning techniques might be investigated.
References:
[1] Higuera, C. d. l. (2005). A bibliography study of grammatical inference.
Pattern Recognition, 38, 1332 – 1348.
[2] Carrasco, R.C., Oncina, J., (1994).
Learning Stochastic Regular Grammars by Means of a State Merging Method,
Lecture Notes in Artificial Intelligence, vol. 862, pp. 130-152.
[3] Castro, J., Gavaldà, R. (1998).
Towards feasible PAC-learning probabilistic deterministic finite automata.
Proc. ICGI’98, Saint Malo, France.
[4] Thollard, F., Clark A. (2004).
Learning Stochastic Deterministic Regular Languages.
Proc. ICGI’04, Athens, Greece,
[5] Alquezar, R., & Sanfeliu, A. (1995). An algebraic framework to represent
finite state machines in single-layer recurrent neural networks.
Neural Computation, 7 (5), 931-949.
[6] Alquézar, R. (1997). Symbolic and connectionist learning techniques for grammatical
inference. Ph.D. thesis Univ. Politècnica Catalunya.
Other comments:
Barcelona, March 15th 2010