Fundamentals of Speech
Recognition
• Goal
– Automatic recognition of speech by machine
• Disciplines applied to most speech recognition problems:
– Signal processing: the process of extracting relevant
information from the speech signal in an efficient and
robust manner.
– Physics: the science of understanding the relationship
between the physical speech signal and the physiological
mechanisms that produce speech and with which
speech is perceived.
– Pattern recognition: the research area that studies the
operation and design of systems that recognize
patterns in data.
– Communication and information theory: the methods for
detecting the presence of particular speech patterns.
– Linguistics: the relationships between sounds (phonology), words
in a language (syntax), meaning of spoken words (semantics),
and sense derived from the meaning (pragmatics).
– Physiology: understanding of the mechanisms within the human
central nervous system that account for speech production and
perception in human beings.
– Computer science: the study of efficient algorithms for
implementing, in software and hardware, the various methods used
in a practical speech recognition system.
– Psychology: the science of understanding the factors that
enable a technology to be used by human beings in
practical tasks.
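
To make the first and third disciplines concrete, here is a minimal sketch, not a real recognizer. It assumes a toy feature (log energy per frame) for the signal processing step and nearest-template matching for the pattern recognition step. The function names, frame length, and sample values are all hypothetical, and the feature vectors are assumed to have equal length.

```python
import math

def frame_energies(samples, frame_len=4):
    """Signal processing step: reduce a waveform to one log-energy value per frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return feats

def nearest_template(features, templates):
    """Pattern recognition step: return the label of the closest stored pattern."""
    def dist(a, b):
        # Squared Euclidean distance; assumes both vectors have the same length.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(templates, key=lambda label: dist(features, templates[label]))

# Hypothetical toy "recordings": eight samples each, two frames of four samples.
loud_then_quiet = [0.9, -0.8, 0.7, -0.9, 0.01, -0.02, 0.01, 0.0]
quiet_then_loud = [0.01, 0.0, -0.02, 0.01, 0.8, -0.9, 0.9, -0.7]

templates = {
    "loud-quiet": frame_energies(loud_then_quiet),
    "quiet-loud": frame_energies(quiet_then_loud),
}

# An unseen signal with the same loud-then-quiet shape but different amplitudes.
test_signal = [0.7, -0.6, 0.8, -0.7, 0.02, 0.0, -0.01, 0.01]
label = nearest_template(frame_energies(test_signal), templates)
print(label)
```

Practical systems use far richer features and statistical models; the sketch only shows the division of labor between extracting features and matching patterns.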
The Paradigm for Speech Recognition
• Word recognition model: the user's spoken output is recognized. The speech
signal is decoded into a series of words that are meaningful
according to syntax, semantics, and pragmatics.
• Higher-level processor: obtains the meaning of the recognized
words. The processor uses a dynamic knowledge
representation to modify the syntax, semantics, and
pragmatics according to the context of what it has previously
recognized.
• Feedback from the higher-level processor limits the search for
valid input sentences from the user.
• The system responds to the user in the form of a voice output.
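
The loop described above can be sketched as follows. This is a hedged toy example: the grammar, the per-word scores, and every function name are assumptions for illustration, and a text string stands in for the voice output.

```python
# Hypothetical task grammar: which words are valid after a given context.
GRAMMAR = {
    "start": {"call", "play"},
    "call": {"alice", "bob"},
    "play": {"music", "news"},
}

def recognize_word(candidates, allowed):
    """Word recognition: best-scoring candidate among the currently valid words."""
    valid = {word: score for word, score in candidates.items() if word in allowed}
    return max(valid, key=valid.get) if valid else None

def respond(words):
    """System response (text standing in for voice output)."""
    if len(words) == 2:
        return f"OK: {words[0]} {words[1]}"
    return "Sorry, please repeat."

def run_dialogue(utterance_scores):
    context = "start"
    words = []
    for candidates in utterance_scores:
        # Feedback: the current context limits the search to valid words.
        allowed = GRAMMAR.get(context, set())
        word = recognize_word(candidates, allowed)
        if word is None:
            break
        words.append(word)
        # Higher-level processing: update the context from what was recognized.
        context = word
    return words, respond(words)

# Hypothetical scores for a two-word utterance. Taken alone, "music" scores
# highest for the second word, but it is not valid after "call".
scores = [
    {"call": 0.6, "play": 0.4},
    {"music": 0.5, "alice": 0.45, "bob": 0.05},
]
words, reply = run_dialogue(scores)
print(words, reply)
```

In this run the second word is recognized as "alice" rather than the higher-scoring "music", because the context set by "call" removes "music" from the search.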
Go through a Brief History