Randomness in Neural Networks
Presented by Paul Melman
Neural networks

Made up of many connected units, each one a non-linear classifier

Train them by adjusting weights based on error, with backpropagation or other algorithms

Deep networks with many layers can produce high classification accuracy, but have very long training times

Smaller networks are quick to train, but do not form rich representations of the training data
Random weights

Assign a subset of the weights randomly instead of training

Three families of random weight models:
 - Feedforward networks with random weights (RW-FNN)
 - Recurrent networks with random weights (i.e., reservoir computing)
 - Randomized kernel approximations
Basic premise

Random weights are used to define a feature map that transforms the input into a high-dimensional space

The resulting optimization problem is linear least-squares (see the sketch below)

“Randomization is… cheaper than optimization” – Rahimi & Recht
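A minimal numpy sketch of this premise (the sizes, toy data, and tanh nonlinearity are illustrative assumptions, not taken from the talk): fixed random weights define the feature map, and only a linear least-squares problem is solved for the readout.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_inputs, B = 200, 5, 100          # B random features >> input dimensions
X = rng.normal(size=(n_samples, n_inputs))    # toy inputs
y = np.sin(X).sum(axis=1)                     # toy regression target

W = rng.normal(size=(n_inputs, B))            # random input weights, never trained
b = rng.normal(size=B)                        # random biases, never trained
H = np.tanh(X @ W + b)                        # fixed random feature map

# The only "optimization": a linear least-squares fit of the readout weights.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)
y_hat = H @ beta                              # predictions of the random-weight network
```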
RW-FNN basic architecture
(Architecture diagram: dashed lines are fixed connections; solid lines are trainable.)
RW-FNN architecture cont.

B (the number of random hidden features) is typically much larger than the number of input dimensions, potentially by an order of magnitude

Weights w_m are drawn from a predefined probability distribution
RW-FNNs cont.

Additive methods: each feature applies a nonlinearity to a random affine projection of the input

RBF methods: each function is chosen as a radial basis function centered at a random point (both families are sketched below)
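A hedged sketch of the two families of feature functions named above; the sigmoid nonlinearity, the Gaussian form, and the width `sigma` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, B = 5, 100
x = rng.normal(size=n_inputs)                     # a single input vector

# Additive method: a nonlinearity applied to a random affine projection.
W = rng.normal(size=(B, n_inputs))                # random weights w_m
b = rng.normal(size=B)                            # random biases b_m
h_additive = 1.0 / (1.0 + np.exp(-(W @ x + b)))   # sigmoid(w_m . x + b_m)

# RBF method: each feature is a radial basis function around a random center.
centers = rng.normal(size=(B, n_inputs))          # random centers c_m
sigma = 1.0                                       # shared width, chosen arbitrarily here
sq_dists = ((x - centers) ** 2).sum(axis=1)       # ||x - c_m||^2 for each feature
h_rbf = np.exp(-sq_dists / (2 * sigma ** 2))      # Gaussian RBF features
```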
Kernel approximation

Random sampling can be used for kernel approximation

Kernel methods are often expensive in terms of time and memory; random methods reduce these costs

Sample randomly from the kernel matrix (e.g., a random subset of its columns)

Design a stochastic approximation of the kernel function (sketched below)
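Random Fourier features (from the Rahimi & Recht line of work quoted earlier) are a standard instance of designing a stochastic approximation of the kernel function. A hedged numpy sketch for the Gaussian (RBF) kernel; the sizes and the `gamma` parameter are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, D = 300, 4, 500          # samples, input dimensions, number of random features
gamma = 0.5                    # RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)
X = rng.normal(size=(n, d))

# Exact kernel matrix (the expensive object we want to avoid forming in practice).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-gamma * sq_dists)

# Random Fourier features: sample frequencies from the kernel's spectral density
# (Gaussian with standard deviation sqrt(2 * gamma)) and phases uniformly.
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0, 2 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)

K_approx = Z @ Z.T             # inner products of random features approximate the kernel
print(np.abs(K_exact - K_approx).max())   # small for large D
```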
Recurrent networks

Dynamic data with a temporal component is difficult for a feedforward network to learn

Recurrent neural networks (RNNs) have connections that feed back to earlier units, allowing for temporal processing

Units get information about prior states of other units in the network via these connections
Reservoir computing

A recurrent layer of fixed, randomly generated nonlinearities (the reservoir); only a linear readout over the reservoir states is trained (a sketch follows)
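A hedged sketch of a basic reservoir in the echo-state-network style; the reservoir size, scaling constants, toy input, and tanh update are illustrative assumptions. The input and recurrent weights stay fixed and random, and only a linear readout over the collected states would be trained (e.g., by least squares, as in the feedforward case).

```python
import numpy as np

rng = np.random.default_rng(4)
n_in, n_res, T = 1, 100, 200                         # input dims, reservoir size, sequence length

W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))    # fixed random input weights
W_res = 0.05 * rng.normal(size=(n_res, n_res))       # fixed random recurrent weights, scaled
                                                     # down for stability (see echo state property)

u = np.sin(np.linspace(0, 8 * np.pi, T)).reshape(T, n_in)   # toy input signal
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    # Recurrent update: the new state mixes the current input with the previous state.
    x = np.tanh(W_in @ u[t] + W_res @ x)
    states[t] = x

# Only a linear readout over `states` is trained, e.g. with least squares.
```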
Reservoir computing cont.

RC architectures are very successful in tasks that require relatively short memory processing, including:
 - Grammatical inference
 - Stock price prediction
 - Speech recognition
 - Robotic control
 - Acoustic modeling
Echo state property

Reservoirs with random weights can be unstable; they may oscillate or behave chaotically

The effects of any given input state must vanish over time, so that it does not persist indefinitely or, worse, become amplified

Having the reservoir gradually “forget” prior states over time prevents these problems; a common heuristic is to rescale the random recurrent weights so their spectral radius is below 1 (see the sketch below)
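A sketch of that rescaling heuristic (a common practice, assumed here rather than stated in the original slides): draw the recurrent weights at random, then rescale them so the spectral radius is below 1, so that the influence of past inputs decays instead of persisting or being amplified.

```python
import numpy as np

rng = np.random.default_rng(5)
n_res = 100
W_res = rng.normal(size=(n_res, n_res))            # raw random recurrent weights

rho = np.max(np.abs(np.linalg.eigvals(W_res)))     # current spectral radius
W_res *= 0.9 / rho                                 # rescale to a target radius of 0.9 < 1
```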
New techniques

Lateral inhibition
 - Biologically inspired process by which the activity of one unit inhibits adjacent units
 - Can be implemented by having multiple smaller reservoirs which inhibit adjacent reservoirs

Intrinsic plasticity
 - Add adaptable parameters to the nonlinearity function of the reservoir (sketched below)
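A heavily simplified, hypothetical sketch of intrinsic plasticity: each reservoir unit gets its own gain `a` and bias `b` inside the nonlinearity, adapted online from the unit's own activity. The moment-matching update below is an illustrative stand-in, not the canonical intrinsic-plasticity rule (which matches the output distribution to a target such as a Gaussian).

```python
import numpy as np

rng = np.random.default_rng(6)
n_res = 50
a = np.ones(n_res)                    # per-unit gain of the nonlinearity (adaptable)
b = np.zeros(n_res)                   # per-unit bias of the nonlinearity (adaptable)
eta, target_std = 1e-3, 0.5           # learning rate and target activation spread

for _ in range(1000):
    net = rng.normal(size=n_res)      # stand-in for each unit's net input at one time step
    y = np.tanh(a * net + b)          # the adaptable nonlinearity of the reservoir
    b -= eta * y                      # nudge mean activation toward zero
    a += eta * (target_std ** 2 - y ** 2)   # nudge output variance toward the target
```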
References

Scardapane, Simone, and Dianhui Wang. "Randomness in neural networks: an overview." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 7.2 (2017).