Presentation by Dushyant Arora

Caching & Replacement of
Multimedia Streaming Objects
using Soft Computing in Fixed
Networks
Neural Networks: Introduction
• A simplified model of the human brain.
• A massively parallel distributed processing system.
• The ability to learn and thereby acquire knowledge.
Characteristics of neural networks
• Generally appropriate for problems where the
final answer depends heavily on combinations
of many input features.
• They exhibit mapping capabilities i.e. they can
map input patterns to their associated output
patterns.
Characteristics of neural networks
• They can be trained with known examples of a
problem before they are tested for their
‘inference’ capability on unknown instances of
the problem.
• Robust and fault tolerant.
• Can recall full patterns from incomplete,
partial or noisy patterns.
Model of an artificial neuron
Thresholding Unit
Activation functions
• Common activation function:
– Thresholding function: the weighted sum is compared to a threshold value Ф.
– Here Ф = 0.
Neural network architectures
• Single layer feedforward network
• Multilayer feedforward network
• Recurrent network.
Single layer Feedforward Network
• There are only two layers
– Input layer
– Output layer
• Only output layer performs the computation.
• Input layer merely transmits the signals.
Multilayer Feedforward network
In a feedforward network, information always moves in one direction; it never goes
backwards.
Recurrent Networks
• There is at least one feedback loop from
output to input.
Learning Methods
Where Do The Weights Come From?
The weights in a neural network are the most important factor in determining its
function.
Learning is the act of presenting the network with some sample data and
modifying the weights to better approximate the desired function.
There are two main types of training:
1. Supervised Training
• Supplies the neural network with inputs and the desired outputs, to
determine the error.
• The weights are modified to reduce the difference between
the actual and desired outputs.
2. Unsupervised Training
• Only supplies inputs.
• The neural network adjusts its own weights so that similar inputs
cause similar outputs.
• The network identifies the patterns and differences in the inputs
without any external assistance.
Epoch
• One iteration through the process of providing the network
with an input and updating the network's weights.
• Typically many epochs are required to train the neural
network.
Perceptrons
• First neural network with the ability to learn
• Made up of only input neurons and output neurons
• Input neurons typically have two states: ON and OFF
yk = f(netk) = 1, if netk > 0
             = 0, otherwise
where netk = ∑i xi wik
[Diagram: input neurons connected to output neurons by weights 0.5, 0.2, 0.8]
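The output rule above can be sketched as follows (assuming the 0.5, 0.2, 0.8 from the slide's diagram are the three input weights):

```python
# Perceptron output rule from the slide:
# net_k = sum_i x_i * w_ik; y_k = 1 if net_k > 0, else 0.

def perceptron_output(inputs, weights):
    """Weighted sum followed by a hard threshold at 0."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net > 0 else 0

weights = [0.5, 0.2, 0.8]          # weights from the slide's diagram
print(perceptron_output([1, 0, 1], weights))  # ON, OFF, ON inputs -> 1
```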
How do perceptrons learn?
• Perceptrons cannot handle tasks which are not linearly separable.
• Two sets of points in a two-dimensional space are linearly separable if the sets
can be separated by a straight line.
• A perceptron cannot find weights for problems that are not linearly
separable. An example is the XOR problem.
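The slides do not spell out the learning rule itself; one standard form (the classic perceptron rule, w += lr·(target − output)·input) is sketched here on the linearly separable AND function:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Classic perceptron rule: w += lr * (target - output) * x.
    A bias is folded in as a constant extra input of 1."""
    w = [0.0, 0.0, 0.0]            # two input weights + bias weight
    for _ in range(epochs):
        for (x1, x2), t in samples:
            x = (x1, x2, 1)
            y = 1 if sum(xi * wi for xi, wi in zip(x, w)) > 0 else 0
            for i in range(3):
                w[i] += lr * (t - y) * x[i]
    return w

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(AND)
predict = lambda x1, x2: 1 if x1 * w[0] + x2 * w[1] + w[2] > 0 else 0
print([predict(a, b) for (a, b), _ in AND])  # learns AND: [0, 0, 0, 1]
```

Running the same loop on the XOR samples never converges to a correct weight vector, since no straight line separates the two classes.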
XOR: Not linearly separable
XOR and its negation are the
only Boolean functions of two
arguments that are not linearly
separable
The XOR problem can be solved by multilayer
feedforward network
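A hand-wired illustration of this claim; the particular weights and thresholds below are chosen for illustration, not taken from the slides:

```python
def step(u):
    return 1 if u > 0 else 0

def xor_net(x1, x2):
    """A 2-2-1 feedforward net of threshold units that computes XOR.
    h1 fires when at least one input is on, h2 when both are on;
    the output fires when h1 is on but h2 is off."""
    h1 = step(x1 + x2 - 0.5)
    h2 = step(x1 + x2 - 1.5)
    return step(h1 - h2 - 0.5)

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```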
Backpropagation Networks
• Single neurons can perform certain simple pattern
detection functions; the power of neural
computation comes from neurons connected in a
network structure.
• For many years there was no theoretically sound
algorithm for training multilayer artificial neural
networks.
• Backpropagation was one of the first general
techniques developed to train multilayer networks
• Backpropagation networks use a gradient descent
method to minimize the total squared error of the
output
• A form of supervised training
• Simple, Slow, Prone to local minima issues
• The most common measure of error is the squared error
E = ∑p (tp − yp)²
i.e., the sum over all points p in our data set of the squared difference
between the target value tp and the model's prediction yp, computed from
the input value xp.
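The error measure described above, as a minimal sketch (the sample targets and predictions are illustrative):

```python
def total_squared_error(targets, predictions):
    """E = sum over all points p of (t_p - y_p)**2."""
    return sum((t - y) ** 2 for t, y in zip(targets, predictions))

# Differences are 0.1, -0.2 and 0.3, so E is close to 0.14.
print(total_squared_error([1, 0, 1], [0.9, 0.2, 0.7]))
```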
Minimizing the error
• The gradient of E gives us the direction in which the error function, at the current
setting of the weights w, has the steepest slope. In order to decrease E, we take a small
step in the opposite direction, −G.
• By repeating this over and over, we move "downhill" in E until we reach a
minimum, where G = 0, so that no further progress is possible.
• In order to train neural networks by gradient descent, we need to be able to
compute the gradient G of the error function with respect to each weight wij of the
network. It tells us how a small change in that weight will affect the overall error E.
• A small step (scaled by the learning rate) in the opposite direction will result in the
maximum decrease of the (local) error function:
wnew = wold − α ∂E/∂wold
where α is the learning rate.
• Calculation of the derivatives flows backwards through the network, hence the
name backpropagation.
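A toy illustration of the update wnew = wold − α ∂E/∂wold; the one-dimensional error surface E(w) = (w − 3)² is an assumption made for the example, not part of the slides:

```python
def gradient_descent(w, grad, alpha=0.1, steps=100):
    """Repeat w_new = w_old - alpha * dE/dw to walk downhill in E."""
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w

# Toy error surface E(w) = (w - 3)**2, so dE/dw = 2*(w - 3); minimum at w = 3.
w_star = gradient_descent(w=0.0, grad=lambda w: 2 * (w - 3))
print(round(w_star, 6))  # converges to 3.0
```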
For a weight wij feeding output unit i, this derivative takes the form
∂E/∂wij = − (ti − yi) yj
where yj is the output of unit j and ti is the target for output unit i.
An important consideration is the learning rate α, which determines by how
much we change the weights w at each step. If α is too small, the algorithm will
take a long time to converge.
Conversely, if α is too large, we may end up bouncing around the error surface
out of control: the algorithm diverges.
Backpropagation Learning
If the squashing function is the sigmoid function σ(u) = 1/(1 + e^(−u)), the
derivative has the convenient form
σ'(u) = σ(u)(1 − σ(u))
• Another popular choice of squashing function is tanh, which
takes values in the range (−1, 1) rather than (0, 1):
tanh'(u) = 1 − tanh²(u)
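Both derivative identities, σ'(u) = σ(u)(1 − σ(u)) and tanh'(u) = 1 − tanh²(u), can be checked numerically with central differences:

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

# Compare each closed-form derivative against a central difference.
h = 1e-6
for u in (-2.0, 0.0, 1.5):
    num_sig = (sigmoid(u + h) - sigmoid(u - h)) / (2 * h)
    num_tanh = (math.tanh(u + h) - math.tanh(u - h)) / (2 * h)
    assert abs(num_sig - sigmoid(u) * (1 - sigmoid(u))) < 1e-8
    assert abs(num_tanh - (1 - math.tanh(u) ** 2)) < 1e-8
print("identities hold")
```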
Adding a momentum term
• Tends to aid convergence
• Add a fraction of the previous weight change
to the current weight change.
Δwnew = βΔwold - α ∂E/∂wold
β is the momentum coefficient
wnew = wold + Δwnew
Addition of such a term smoothes out the descent path by preventing
extreme changes in the gradients due to local anomalies.
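The momentum update above, sketched on the same kind of toy quadratic surface as before (the surface and the particular α, β values are illustrative):

```python
def descend(grad, alpha=0.05, beta=0.8, steps=200):
    """Gradient descent with momentum, as on the slide:
    dw_new = beta * dw_old - alpha * dE/dw;  w_new = w_old + dw_new."""
    w, dw = 0.0, 0.0
    for _ in range(steps):
        dw = beta * dw - alpha * grad(w)
        w = w + dw
    return w

# Toy error surface E(w) = (w - 3)**2, minimum at w = 3.
w_star = descend(grad=lambda w: 2 * (w - 3))
print(round(w_star, 4))  # settles at 3.0
```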
Replacement Issues
• Current replacement algorithms usually make
a binary decision on the caching of an atomic
object.
• The object is cached or flushed in its entirety
based on time or frequency.
• The optimal caching algorithm caches objects
partially (i.e., only some portion of the frames
is cached).
Replacement Issues
• A replacement algorithm which uses just time
or frequency would not suffice.
• The algorithm should take size into account,
because size is the most important factor in
the case of multimedia objects.
• In the case of video, popularity is essential:
once a video becomes popular, requests for it
grow exponentially, while less popular videos
are almost ignored.
Neural Network Proxy Cache
Replacement
• In the neural replacement policy, whenever
the cache is full and a cache miss occurs, the
NN algorithm determines the video frames to
evict by computing a mathematical merit,
called the cache metric, for each video,
depending on certain parameters, viz. size,
frequency and access recency.
Neural Network Proxy Cache
Replacement
• Parameters
1) Access Frequency (Highest priority)
2) Size
3) Access Recency (Lowest priority)
Neural Network Proxy Cache
Replacement
• Access frequency:
The lower the frequency, the higher the probability
of that object being replaced.
• Size:
Larger objects are given less priority so
that more objects can fit inside the
cache.
Neural Network Proxy Cache
Replacement
• Access Recency:
Every video object has an access recency
field T(r). Every time a request is made, this
field is recomputed by subtracting the proxy
start time from the current time. A higher
T(r) value therefore indicates a more recently
accessed object.
Neural Network Proxy Cache
Replacement
• A multilayer feed-forward artificial neural
network handles the web proxy cache
replacement decisions.
• The weights of the network are adjusted using
back-propagation.
• The sigmoid function is used as the activation
function for the neural network.
Neural Network Proxy Cache
Replacement
• For approximating bounded continuous
functions, as in our case, one hidden layer is
sufficient.
• If the number of input neurons is j, then the
number of neurons in the hidden layer should
be between j and 2j.
• The exact number of hidden-layer neurons can
be determined from the error and computation
time during training.
Cache Metric Function
• The neural network is used to approximate the following cache
metric function:
H = 1 / (1 + exp(−F))
where F = fi^f · Ti^r · si^s
fi = frequency of the i-th video object/frame
Ti = access recency time of the i-th video object/frame
si = size of the i-th video object/frame
• All these values are normalized.
• The indices f, r and s are integer constants; they signify the relative
priority given to the three parameters.
• For this cache metric, the values of these indices determined after
training are f = 5, r = 2 and s = −4.
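The cache metric with the trained indices f = 5, r = 2, s = −4, as a minimal sketch (the sample input values are illustrative):

```python
import math

def cache_metric(f, T, s, pf=5, pr=2, ps=-4):
    """H = 1 / (1 + exp(-F)) with F = f**pf * T**pr * s**ps.
    f, T, s are the normalized frequency, access recency and size;
    the exponents (5, 2, -4) are the trained values from the slides."""
    F = (f ** pf) * (T ** pr) * (s ** ps)
    return 1.0 / (1.0 + math.exp(-F))

# A frequently accessed, recently used, small object scores higher
# (i.e. is kept) than a rarely accessed, stale, large one.
print(cache_metric(f=0.9, T=0.8, s=0.3) > cache_metric(f=0.2, T=0.3, s=0.9))  # True
```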
Neural Network Proxy Cache
Replacement
• The neural network has a single output (tag
value), which is assigned to each video object
and also to each of its frames.
• First, the video object with the lowest tag value is
selected; then the frames of that video with lower
tag values are identified and evicted.
Training of neural network
• (3,4,1) Multilayer feedforward NN with
backpropagation.
• Initially, randomized weights are assigned to
each interconnection.
[Diagram: inputs Frequency, Access Recency and Size feed the neural network, whose output is the tag value]
Training of neural network
• Training Set of 100 video objects
• Each video's frequency, access recency and
size are given as inputs to the neural network.
• Each input was normalized before being fed
to the NN.
Training of neural network
• The learning rate and momentum coefficient were
found to be optimal at 0.2 and 0.8
respectively.
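A compact sketch of the described setup: a (3,4,1) feedforward net trained by backpropagation with learning rate 0.2 and momentum 0.8. The training data below is synthetic (an assumption): random normalized triples labelled by the cache metric from the earlier slide, since the original 100-video training set is not available.

```python
import math
import random

def sigmoid(u):
    u = max(-60.0, min(60.0, u))        # clamp to avoid math.exp overflow
    return 1.0 / (1.0 + math.exp(-u))

class MLP341:
    """(3,4,1) feedforward net; backpropagation with momentum."""

    def __init__(self, rng):
        # w1[i][j]: weight from input i (3 inputs + bias) to hidden unit j
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(4)]
        # w2[j]: weight from hidden unit j (4 hidden + bias) to the output
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(5)]
        self.dw1 = [[0.0] * 4 for _ in range(4)]   # previous weight changes
        self.dw2 = [0.0] * 5

    def forward(self, x):
        xb = list(x) + [1.0]
        self.h = [sigmoid(sum(xb[i] * self.w1[i][j] for i in range(4)))
                  for j in range(4)]
        hb = self.h + [1.0]
        self.y = sigmoid(sum(hb[j] * self.w2[j] for j in range(5)))
        return self.y

    def backward(self, x, t, alpha=0.2, beta=0.8):
        xb = list(x) + [1.0]
        hb = self.h + [1.0]
        d_out = (t - self.y) * self.y * (1 - self.y)          # sigmoid' = y(1-y)
        d_hid = [d_out * self.w2[j] * self.h[j] * (1 - self.h[j]) for j in range(4)]
        for j in range(5):    # dw_new = beta*dw_old - alpha*dE/dw, as in the slides
            self.dw2[j] = beta * self.dw2[j] + alpha * d_out * hb[j]
            self.w2[j] += self.dw2[j]
        for i in range(4):
            for j in range(4):
                self.dw1[i][j] = beta * self.dw1[i][j] + alpha * d_hid[j] * xb[i]
                self.w1[i][j] += self.dw1[i][j]

# Synthetic stand-in for the training set: 100 normalized
# (frequency, recency, size) triples labelled by the cache metric H.
rng = random.Random(0)
target = lambda f, T, s: 1.0 / (1.0 + math.exp(-(f ** 5) * (T ** 2) * (s ** -4)))
data = [[rng.uniform(0.2, 1.0) for _ in range(3)] for _ in range(100)]

net = MLP341(rng)
for _ in range(200):                    # epochs
    for x in data:
        net.forward(x)
        net.backward(x, target(*x))

err = sum((target(*x) - net.forward(x)) ** 2 for x in data) / len(data)
print("final mean squared error:", round(err, 4))
```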
Pseudo code for replacement
• Initially, we find the cache size required to fulfil all
requests without a replacement policy. Next, we
simulate the algorithm for different cache sizes.
• At the start of the program the proxy server cache is
empty, and we fix the cache size and cutoff.
• The client requests are stored in a text file. The
file containing the frame sizes of each video
object is stored on the origin server.
• The proxy server starts reading the client requests
one by one.
Results
• Neural network structure obtained after training (output written to a file)
Results
• Output of the neural network for the training set
The Proxy Cache
• The proxy server contains the following items:
1) Cached data of video objects
2) A binary tree storing an entry corresponding to
each cached object
3) A stack storing the frames of each video
Each node of the binary tree holds the following values:
1) Frequency
2) Size
3) Time stamp
4) Neural network output (tag value)
The Proxy Cache
• The ordering parameter for the binary tree is
the tag value.
• For each cache miss a new node is inserted in
the tree.
• For each cache hit, we search the video in the
tree and update its parameters.
Replacement Algorithm
• As caching of the videos using the optimal caching
(OC) algorithm happens frame by frame, replacement
should also happen frame by frame.
• Once the victim video is selected, frames are deleted
starting from the last frame. If a request
for the video being deleted arrives, the video can be
locked and another file chosen for deletion;
the remaining initial frames can then serve a
future request, at least partially.
• Cache Hit:
Start delivering the video file to the client and update all
parameters related to the video object.
• Cache Miss:
If current_cache_size + current_request_size <
Max_cache_size:
Send the request to the origin server. Start caching the video using
the optimal caching algorithm and transfer the data to the client.
Else:
Run the replacement algorithm. Find the victim video and
remove it from the cache frame by frame until sufficient space is
created for the new video.
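A minimal sketch of the hit/miss handling above. All names are illustrative, and a flat dict of per-video frame lists stands in for the binary tree and per-video stacks of the real design:

```python
def handle_request(cache, video, size, max_cache_size, tag):
    """cache maps video id -> list of cached frame sizes (a stack).
    tag(video) returns the neural network's tag value for that video."""
    if video in cache:                        # cache hit
        return "hit"
    # Cache miss: evict frame by frame until the new video fits.
    while sum(sum(frames) for frames in cache.values()) + size > max_cache_size:
        victim = min(cache, key=tag)          # lowest tag value is evicted first
        cache[victim].pop()                   # delete from the last frame backwards
        if not cache[victim]:
            del cache[victim]
    cache[video] = [size]                     # cache the new video (one frame here)
    return "miss"

cache = {"a": [3, 3], "b": [2]}
tags = {"a": 0.9, "b": 0.1}
print(handle_request(cache, "c", 4, 10, tags.get))  # "b" has the lowest tag -> evicted
print(sorted(cache))
```

In the real design the victim lookup would be the leftmost node of the tag-ordered binary tree rather than a linear `min()` scan.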