Speech Recognition
Christian Schulze

Design of a speech recognition system that distinguishes the figures 0 to 9 and the words yes/no

Applications:
- speech input of telephone numbers for cellular phones (necessary in cars)
- announcement of the different floors in the elevator

Problem
- Storing all patterns requires too much memory
- An algorithm that compares each word with all stored patterns requires a lot of computing power
=> too costly

Instead of storing the whole signal, store representative features of the signal
=> One possibility: formants

What are formants?
- Speech consists of different tones which are combined with each other
- Every tone has a characteristic spectrum in the frequency domain
- The maxima of the contour of the spectrum are called formants
- Every tone has its own representative formants (especially vowels)

Data collection
- Recording of 50 analog samples per word
- Division of the signal into parts of 10 ms length
- Calculation of the spectrum using the Discrete Fourier Transform
  [Figure: figure 8 (500 ms)]
- Storage of the first two maxima => 2-Formants-Recognition-System
- Assign the signal to 1 of 12 classes
- (98 × 1) vector used as input vector for training of an MLP network
- Smoothing of the spectrum using the cepstral algorithm

Network and results
- MLP using the back propagation algorithm
- 3 hidden layers, each with 12 hidden neurons
- Learning rate = 0.01, momentum = 0.1
- 100000 epochs
- Best solution so far: learning success rate = 86.11%, testing success rate = 61.67%
=> has to be improved upon
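
The data collection steps (10 ms frames, a DFT per frame, two spectral maxima per frame) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the slides: the function names, the 8 kHz sample rate, and the choice of the two largest local maxima as a stand-in for "the first two maxima" are all assumptions, and the cepstral smoothing step is not implemented. How the slides arrive at a (98 × 1) vector is not stated, so the sketch simply returns two values per frame.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame via a direct DFT (bins 0 .. N/2)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def two_largest_peaks(mags, sample_rate, frame_len):
    """Frequencies (Hz) of the two largest local maxima, in ascending order."""
    peaks = [
        (mags[k], k)
        for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    top = sorted(peaks, reverse=True)[:2]
    freqs = sorted(k * sample_rate / frame_len for _, k in top)
    # Pad with 0.0 if fewer than two local maxima were found.
    return freqs + [0.0] * (2 - len(freqs))


def formant_features(signal, sample_rate=8000, frame_ms=10):
    """Concatenate two peak frequencies per non-overlapping frame."""
    frame_len = sample_rate * frame_ms // 1000  # assumed: 80 samples at 8 kHz
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mags = dft_magnitudes(frame)
        features.extend(two_largest_peaks(mags, sample_rate, frame_len))
    return features


# Synthetic 500 ms test tone with components at 500 Hz and 1500 Hz.
rate = 8000
tone = [
    math.sin(2 * math.pi * 500 * t / rate)
    + 0.5 * math.sin(2 * math.pi * 1500 * t / rate)
    for t in range(rate // 2)
]
features = formant_features(tone, sample_rate=rate)
```

On real speech the raw spectrum has many small local maxima, which is presumably why the slides smooth it first. A production version would pick the peaks from the smoothed envelope rather than from the raw magnitudes.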
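
The training settings (learning rate 0.01, momentum 0.1) describe how each weight is moved after back propagation has produced a gradient. The slides do not say which momentum formulation was used, so the sketch below assumes the classical form, where a velocity term carries over a fraction of the previous update. The function name and the toy quadratic loss are hypothetical and serve only to show the update rule.

```python
def momentum_step(weights, grads, velocity, learning_rate=0.01, momentum=0.1):
    """One gradient-descent update with classical momentum (assumed form)."""
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Toy example: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
weights, velocity = [0.0], [0.0]
history = []
for _ in range(2000):
    grads = [2.0 * (weights[0] - 3.0)]
    weights, velocity = momentum_step(weights, grads, velocity)
    history.append(weights[0])
```

In an MLP the same rule is applied to every weight and bias, with the gradients supplied by back propagation instead of the closed-form derivative used here.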