The aim of this project was to investigate the receptive fields of a neural
network in order to compare a computational model with actual cortical-level
auditory processing. The receptive fields are also compared against
traditional methods of characterizing neural models, such as tuning curves.
Figure 3. Schematic of the neural network, showing the input units and the output unit.
Figure 1. The basilar membrane and frequency-selectivity.
Image taken from
Layout of the ear
The ear is the earliest stage of auditory processing. The ear is divided
into three main areas: the outer ear, the middle ear, and the inner ear.
Transduction, the process of converting mechanical signals into electrical
potentials, takes place in the inner ear. The vibrations in the inner ear
selectively cause hair cells along the basilar membrane in the cochlea to
move. The motion of the hair cells generates electrical potentials that travel
along the auditory nerve to be processed by the brain. Hair cells are
theorized to be frequency-selective: specific pitches excite specific areas
of the basilar membrane. This layout of the ear makes it convenient to
represent sound as a function of frequency and time, instead of a function
of pressure and time.
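As an illustration of the frequency-time representation described above, the following minimal sketch computes a magnitude spectrogram with a short-time Fourier transform. The sampling rate `fs`, the frame length, the hop size, and the 1 kHz test tone are assumptions chosen for the example; the poster does not state how its spectrograms were computed.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Illustrative values only: one second of a 1 kHz pure tone.
fs = 11025                                # assumed sampling rate in Hz
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / fs)    # centre frequency of each row in Hz
peak_hz = freqs[spec.mean(axis=1).argmax()]
```

For a pure tone, the energy concentrates in the row whose centre frequency is closest to the tone, so `peak_hz` lands within one bin width of 1 kHz.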
Spectro-temporal receptive fields (STRFs)
Neural networks mimic biological processing by joining layers of
artificial neurons in a meaningful way. The neural network employed in
this project is a two-layer model that responds to spectrograms,
frequency vs. time distributions of sound.
This model utilizes time-delayed inputs to approximate temporal
processing. The first layer of the neural network contains the information
from three consecutive timesteps. The auditory information from the first
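layer gets relayed to the second layer through a network of weights. The
value of a second-layer artificial neuron is:

$$y(t) = W \begin{bmatrix} x(t) \\ x(t - 12\,\mathrm{ms}) \\ x(t - 24\,\mathrm{ms}) \end{bmatrix}$$

where $W$ is the weight matrix, $x(t)$ is the input from the current timestep,
$x(t - 12\,\mathrm{ms})$ is the timestep input that appeared 12 ms ago, and
$x(t - 24\,\mathrm{ms})$ is the timestep input that appeared 24 ms ago.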
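The time-delayed input can be sketched as follows, assuming a purely linear second layer and a spectrogram in which one column corresponds to one 12 ms timestep. The function name `second_layer_output`, the layer sizes, and the random stand-in data are hypothetical and are not taken from the poster.

```python
import numpy as np

def second_layer_output(W, spec, t, delay=1):
    """Linear response at frame t to the current frame and two delayed frames."""
    # Stack x(t), x(t - delay), x(t - 2 * delay) into one input vector.
    x = np.concatenate([spec[:, t], spec[:, t - delay], spec[:, t - 2 * delay]])
    return W @ x

rng = np.random.default_rng(0)
n_freq, n_frames, n_out = 8, 10, 4        # hypothetical sizes
spec = rng.random((n_freq, n_frames))     # stand-in spectrogram, one column per 12 ms
W = rng.normal(scale=0.1, size=(n_out, 3 * n_freq))

y = second_layer_output(W, spec, t=5)     # one value per second-layer unit
```

Because the three frames are concatenated, each second-layer unit has a separate weight for every frequency channel at every delay, which is what lets it respond to the direction of a frequency sweep.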
Animals can hear only a limited range of frequencies. In this
project, the neural network simulates a limited hearing range,
between 0.172 and 5.512 kHz, to match biological hearing more closely.
Unsupervised training
The neural network was trained on a simple set of frequency-modulated
(FM) sweeps and pure tones. The model modified its own weights
according to Oja's rule after each presentation of a timestep in the
training set. Only one or two artificial neurons were trained at one time
depending on the initial frequency of the training stimulus.
Moving ripples
Figure 2. STRFs from Mexican free-tailed bats. The image on the left
shows the STRF from a neuron with blocked inhibition and the image on
the right shows the STRF from a neuron with inhibition.
Image taken from "Spectrotemporal Receptive Fields in the Inferior Colliculus Revealing
Selectivity for Spectral Motion in Conspecific Vocalizations" by Andoni et al.
STRFs represent the linear properties of primary auditory processing
neurons and depict a neuron's impulse response as a function of
frequency and time. STRFs are generated by collecting a neuron's responses
to different moving ripple stimuli. Since these stimuli are approximate
components of complex sounds, the STRFs characterize the neuron's
response to spectro-temporally rich sound stimuli. Since STRFs describe
the neuronal responses in both the spectral and temporal dimensions,
they are hypothesized to be more useful than traditional methods of
describing neurons such as tuning curves.
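To make the "linear impulse response" reading concrete, the sketch below predicts a response by summing the stimulus spectrogram weighted by the STRF over frequency and over past time lags. This is the standard linear-filter interpretation of an STRF, not code from the project; `predict_response` and the toy arrays are hypothetical.

```python
import numpy as np

def predict_response(strf, spec):
    """Linear response r(t) = sum over f and tau of strf[f, tau] * spec[f, t - tau]."""
    n_freq, n_lags = strf.shape
    n_frames = spec.shape[1]
    r = np.zeros(n_frames)
    for tau in range(n_lags):
        # Contribution of the stimulus tau frames in the past.
        r[tau:] += strf[:, tau] @ spec[:, : n_frames - tau]
    return r

# Toy case: the unit listens to frequency channel 2 with a one-frame delay.
strf = np.zeros((4, 3))
strf[2, 1] = 1.0
spec = np.zeros((4, 6))
spec[2, 3] = 1.0                          # brief energy in channel 2 at frame 3

r = predict_response(strf, spec)          # responds one frame later, at frame 4
```

A tuning curve would capture only which channel the unit prefers; the STRF also records the delay at which that channel drives the response.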
Tuning curves
Tuning curves have been used extensively in both biological and
computational applications because they allow researchers to
quantitatively analyze the frequencies at which a specific auditory
neuron responds best. To generate these curves, the neuronal responses to
pure tones spanning the frequency range are collected. The
maximum response to each tone is plotted in an intensity vs. frequency
plot, and the peak of the plotted curve denotes the best frequency (BF) of
the artificial neuron. The neurons respond with the greatest intensity to
tones that match their BF and with decreasing intensity to tones away
from their BF.
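The procedure above can be sketched for a single linear unit. The sketch assumes each pure tone excites a narrow Gaussian bump of channels on a log-frequency axis and that the unit's weights peak near 1 kHz; the function name, the bandwidths, and the 64-channel layout are illustrative assumptions, with only the 0.172 to 5.512 kHz range taken from the text.

```python
import numpy as np

def tuning_curve(weights, tone_freqs, channel_freqs, bandwidth=0.1):
    """Response of a linear unit to a pure tone at each frequency in tone_freqs."""
    responses = []
    for f in tone_freqs:
        # Assumed tone model: Gaussian excitation, width given in octaves.
        tone = np.exp(-0.5 * (np.log2(channel_freqs / f) / bandwidth) ** 2)
        responses.append(weights @ tone)
    return np.array(responses)

channel_freqs = np.geomspace(172.0, 5512.0, 64)   # Hz, log-spaced channels
# Hypothetical unit whose weights peak near 1 kHz.
weights = np.exp(-0.5 * (np.log2(channel_freqs / 1000.0) / 0.3) ** 2)

curve = tuning_curve(weights, channel_freqs, channel_freqs)
best_frequency = channel_freqs[curve.argmax()]    # peak of the curve is the BF
```

The response falls off smoothly on either side of `best_frequency`, matching the description of decreasing intensity for tones away from the BF.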
Oja's rule
Unsupervised learning paradigms allow neural network models to
dynamically modify their own weighted connections between nodes,
analogous to the changes in synaptic plasticity between neurons. Oja's
rule, one type of unsupervised learning algorithm, can be written as:

$$\Delta w = \eta\, y\, (x - y\, w)$$

where $\Delta w$ represents the weight change between two units, $w$ is the
current weight, $\eta$ is the learning rate, and $x$ and $y$ are the activation
values of the pre-synaptic and post-synaptic neurons, respectively.
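A minimal sketch of Oja's rule for one linear unit follows. The toy two-dimensional data, the learning rate, and the number of presentations are assumptions for illustration and do not reproduce the project's FM-sweep training set.

```python
import numpy as np

def oja_update(w, x, lr=0.002):
    """One Oja step for a single linear unit: dw = lr * y * (x - y * w)."""
    y = w @ x                             # post-synaptic activation
    return w + lr * y * (x - y * w)

rng = np.random.default_rng(0)
# Toy inputs with most of their variance along the direction (1, 1) / sqrt(2).
data = rng.normal(size=(5000, 2)) * np.array([3.0, 0.5])
rot = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2.0)
data = data @ rot.T

w = rng.normal(scale=0.1, size=2)         # small random initial weights
for x in data:
    w = oja_update(w, x)

alignment = abs(w @ np.array([1.0, 1.0]) / np.sqrt(2.0))
```

The `- y * w` term is what distinguishes Oja's rule from plain Hebbian learning: it keeps the weight vector close to unit length while the weights rotate toward the direction of greatest input variance.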
Figure 4. Generation of STRFs through ripples.
The moving ripple stimuli are complex, broadband noises that are used
to determine the STRFs of artificial neurons. They are composed of
hundreds of densely packed, log-spaced pure tones that are sinusoidally
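modulated in the spectral and temporal domains. The ripple equation is
given as:

$$S(x, t) = 1 + \Delta A \sin\big(2\pi(\omega t + \Omega x) + \phi\big)$$

where $S(x, t)$ is the intensity at frequency-time points (with $x$ the position
on the log-frequency axis in octaves and $t$ the time), $\Delta A$ is the modulation
depth, $\omega$ is the ripple velocity (Hz), $\Omega$ is the ripple frequency
(cycles/octave), and $\phi$ is the phase shift (radians). These ripple stimuli
were varied across two parameters separately, the ripple velocity (Hz)
and the ripple frequency (cycles/octave).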
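The ripple envelope can be sketched as below, assuming the standard sinusoidal form with a unit baseline. The parameter values, the 5-octave axis, and the 12 ms frame spacing are illustrative choices, and `moving_ripple` is a hypothetical helper rather than the project's stimulus generator.

```python
import numpy as np

def moving_ripple(x_oct, t_sec, depth=0.9, velocity=4.0, ripple_freq=0.5, phase=0.0):
    """Envelope S(x, t) = 1 + depth * sin(2*pi*(velocity*t + ripple_freq*x) + phase)."""
    X, T = np.meshgrid(x_oct, t_sec, indexing="ij")
    return 1.0 + depth * np.sin(2 * np.pi * (velocity * T + ripple_freq * X) + phase)

x_oct = np.linspace(0.0, 5.0, 100)        # log-frequency axis in octaves (assumed span)
t_sec = np.arange(0.0, 1.0, 0.012)        # time axis in seconds, 12 ms frames (assumed)

S = moving_ripple(x_oct, t_sec)           # rows: frequency, columns: time
```

Each row of `S` would scale the amplitude of one of the log-spaced pure tones. Sweeping `velocity` and `ripple_freq` separately gives the two-parameter stimulus set described above.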
The transfer function (TF) is a broad characterization of an artificial
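neuron's response to the various ripple stimuli and is defined by:

$$T(\omega, \Omega) = M(\omega, \Omega)\, e^{i \Phi(\omega, \Omega)}$$

where $\Phi(\omega, \Omega)$ is the response phase (radians) and $M(\omega, \Omega)$
is the response magnitude at ripple velocity $\omega$ and ripple frequency $\Omega$.
A two-dimensional inverse Fourier transform was applied to the transfer
function in order to generate the desired STRF.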
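The final step can be sketched with NumPy's FFT routines, assuming the magnitude and phase have been measured on a complete, regularly spaced grid of ripple velocities and ripple frequencies. The round trip below uses a random stand-in STRF rather than project data, and `strf_from_transfer_function` is a hypothetical name.

```python
import numpy as np

def strf_from_transfer_function(magnitude, phase):
    """Inverse 2-D FFT of T = M * exp(i * Phi), sampled on a full FFT grid."""
    tf = magnitude * np.exp(1j * phase)
    # The result is real-valued when the transfer function is conjugate-symmetric.
    return np.fft.ifft2(tf).real

# Round-trip check: forward-transform a stand-in STRF, then recover it.
rng = np.random.default_rng(0)
true_strf = rng.normal(size=(16, 16))
tf = np.fft.fft2(true_strf)

recovered = strf_from_transfer_function(np.abs(tf), np.angle(tf))
```

In practice only some ripple combinations are presented, so the remaining grid entries would have to be filled in, for example by conjugate symmetry, before the inverse transform is taken.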