DD2432 Artificial Neural Networks and Other Learning Systems
Exam 2013-03-15 at 14.00-19.00
Use a separate sheet for each question. Brief answers are preferred. Do not give several mutually
conflicting answers to a question (no hedging your bets). If you do, the alternative with the lowest
score will be chosen. Allowed tools: a calculator and a standard English-to-other-language dictionary
may be used.
Good luck! / Erik and Örjan
Question 1
(4p)
Match each term (A-H) with the right description (1-8).
A) synapse efficacy
B) neuron output frequency
C) dendrite
D) action potential of a neuron
E) spatial and temporal summation properties
F) excitatory and inhibitory potentials
G) axon
H) soma
1) corresponds to ANN weight between two units
2) a pulse that is sent out along the axon of a neuron
3) corresponds to the transfer function of a node
4) corresponds to how weights can be positive or negative
5) corresponds to ANN node output value
6) corresponds to an input line of an ANN node
7) the summation component of a neuron
8) the output line of a neuron
Question 2
(4p)
A bipolar (-1, 1) perceptron is initialized with all weights and the bias set to 0. The learning rate
is 1 and the threshold is 0. Set the output to 0 if net = 0. Compute the weight matrix after 1 epoch
of training using the Perceptron learning rule given the inputs and targets below. Show your
calculations, not just the final answer.
input:      target:
( 1  1)      1
( 1 -1)     -1
(-1  1)     -1
(-1 -1)     -1
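The bookkeeping for one epoch can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the update rule is taken to be "on a misclassification, add learning rate times target times input to the weights, and learning rate times target to the bias", which is one common textbook form of the Perceptron rule, and the names `output` and `train_one_epoch` are hypothetical helpers, not part of the exam.

```python
# Assumed update rule (one common form of the Perceptron rule):
#   if y != t:  w <- w + eta * t * x,  b <- b + eta * t
def output(net, theta=0):
    """Bipolar output with a dead zone: 0 when net lies in [-theta, theta]."""
    if net > theta:
        return 1
    if net < -theta:
        return -1
    return 0

def train_one_epoch(patterns, targets, eta=1):
    """Present every pattern once, in order, starting from zero weights and bias."""
    w = [0, 0]
    b = 0
    for x, t in zip(patterns, targets):
        net = b + sum(wi * xi for wi, xi in zip(w, x))
        y = output(net)
        if y != t:
            # Misclassified: move the weights toward the target.
            w = [wi + eta * t * xi for wi, xi in zip(w, x)]
            b = b + eta * t
        print(f"x={x} t={t:+d} net={net:+d} y={y:+d} -> w={w} b={b}")
    return w, b

patterns = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
targets = [1, -1, -1, -1]
w, b = train_one_epoch(patterns, targets)
```

The printed trace shows one line per pattern, which mirrors the step-by-step calculation the question asks for. A different variant of the rule, for example one using the error (t - y) as the step, would give a different trace.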
Question 3
(2p)
The Perceptron learning rule is said to stop unnecessarily early compared to e.g. the Delta rule.
Describe what is meant by/behind this statement.
Question 4
(4p)
Which of these statements are correct when the number of hidden units increases in a two-layer
feed-forward network trained with back propagation? Give a short motivation for each answer.
a) More training patterns are required.
b) The net will be capable of approximating more complex functions.
c) The net will generalize better.
d) Training will be faster.
Question 5
(3p)
What is meant by principal components? In what situations can it be expedient (useful) to
use/compute these? Describe how a network can be used to find principal components by training,
give algorithm name and weight update formula.
Question 6
(3p)
a) In the Hopfield lab, you performed several tests to evaluate the storage capacity (in terms of how
many patterns can be correctly retrieved). When you store 2-D images (simple clip-art-like
binary images), how does the capacity compare with what would be expected? What is the reason for
this?
b) How can the storage capacity of Hopfield nets be improved beyond the classical O(0.14N)
capacity limit?
Question 7
(4p)
a) The matrix below was suggested to be a weight matrix of a Hopfield net storing 3 patterns and
it was said to be able to reliably recall/retrieve these 3 patterns. Why would you even before
doing any calculations be skeptical about this?
     0  -1   1  -1  -1
     1   0  -1  -1   1
    -1  -1   0  -1   1
    -1  -1   1   0   1
     1  -1   1   1   0
b) Assume a Hopfield network with bipolar {-1, 1} nodes has the following weight matrix:
     0   2  -1  -1  -1
     2   0  -1  -1  -1
    -1  -1   0   2   2
    -1  -1   2   0   2
    -1  -1   2   2   0
Is the following pattern x=[-1 1 -1 1 -1] a fixed point (a stored pattern)?
Show your calculations.
Set output to 1 for sum=0.
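A check of this kind can be sketched in Python as below. It is a minimal illustration, assuming the usual Hopfield update in which each node takes the sign of its weighted input sum, with the tie rule from the question (output 1 when the sum is 0). The helper names `sign` and `update` are hypothetical.

```python
# Weight matrix and test pattern from Question 7b.
W = [
    [ 0,  2, -1, -1, -1],
    [ 2,  0, -1, -1, -1],
    [-1, -1,  0,  2,  2],
    [-1, -1,  2,  0,  2],
    [-1, -1,  2,  2,  0],
]
x = [-1, 1, -1, 1, -1]

def sign(s):
    """Bipolar threshold; a sum of exactly 0 maps to +1, as the question states."""
    return 1 if s >= 0 else -1

def update(W, x):
    """One update of every node computed from the same input state."""
    return [sign(sum(wij * xj for wij, xj in zip(row, x))) for row in W]

# Weighted input sum for each node, shown so the calculation can be followed.
sums = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
x_next = update(W, x)
is_fixed_point = (x_next == x)

print("sums:  ", sums)
print("x_next:", x_next)
print("fixed point:", is_fixed_point)
```

A pattern is a fixed point exactly when every node keeps its value, so comparing `x_next` with `x` is sufficient regardless of whether updates are done one node at a time or all at once.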
Question 8
(2p)
The bias-variance dilemma can be visualized in a 2-D graph. Sketch such a graph, including the
curves, and label and explain what is on each axis and what the curves represent. The graph
contains a point of interest: what is it, and where is it located?
Question 9
(2p)
Give the name of an algorithm that can be used to process temporal data. Briefly describe the
network algorithm/topology.
Question 10
(6p)
In each case below, what type of problem/processing is this? For your solution, give the algorithm
name, the topology, the input and output, how training is done, etc.
a) A company selling music on the internet wants to give customers hints on what to buy. The idea
of the project is to provide the customer with suggestions of music bought by a set of other
customers. That set of customers should be selected so as to be as similar as possible to the customer.
The company saves information for each customer about the titles the customer has bought. These
titles are described by a fairly large set of attributes (where each attribute describes the music, e.g.
hard rock, happy, fast beat etc). So, a customer can be described by a vector in the space of all those
attributes and with a magnitude along each dimension equal to the number of songs bought with that
attribute (and with the total length of the vector normalized to unity). Your task is to construct the
network that produces the set of similar customers for a customer.
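The customer representation described above can be made concrete with a short Python sketch. It only illustrates the input encoding, not the network asked for. It assumes that "normalized to unity" means unit Euclidean length, and the attribute names, the sample purchases, and the function `customer_vector` are made up for illustration.

```python
import math
from collections import Counter

def customer_vector(purchases, attributes):
    """Count how many bought titles carry each attribute, then scale to unit length.

    purchases:  list of attribute sets, one set per bought title (hypothetical format)
    attributes: fixed ordering of all attributes, one per vector dimension
    """
    counts = Counter(attr for title_attrs in purchases for attr in title_attrs)
    raw = [counts[a] for a in attributes]
    norm = math.sqrt(sum(v * v for v in raw))
    if norm == 0:
        # A customer with no purchases has no direction; return the zero vector.
        return [0.0 for _ in raw]
    return [v / norm for v in raw]

# Illustrative data only.
attributes = ["hard rock", "happy", "fast beat"]
purchases = [
    {"hard rock", "fast beat"},
    {"happy", "fast beat"},
    {"fast beat"},
]
vec = customer_vector(purchases, attributes)
print(vec)
```

Because every customer vector has the same length, only its direction in attribute space distinguishes one customer from another.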
b) In a modern combustion engine, a number of set-parameters (valve timing, ignition timing etc)
can be changed during running to optimize performance. The optimal setting of these parameters
depends on both command signals (e.g. how much the gas pedal is pushed down) and
environmental/intrinsic factors (e.g. engine temperature). To set these parameters during actual
running in real life is different from setting them in the lab, so the manufacturer needs real-life data.
They have therefore equipped engines with data logging equipment that saves command signals,
environmental factors, set-parameters used as well as the resulting fuel consumption, car
acceleration and other performance variables. During the regular service visit, the company
downloads all this data and can off-line search for parameter settings that resulted in optimal
performance. Doing this, they have produced a set of optimal data points described by command
signals, environmental factors, set-parameters. Describe your network that produces output to get
optimal performance.