DD2432 Artificial Neural Networks and Other Learning Systems
Exam 2013-03-15 at 14.00-19.00

Use a separate sheet for each question. Brief answers are preferred. Do not give several mutually conflicting answers to a question (no hedging your bets, "ingen helgardering"). If you do, the alternative with the lowest score will be chosen.

Allowed tools: a calculator and a standard English-to-other-language dictionary may be used.

Good luck! / Erik and Örjan

Question 1 (4p)
Match each term with the correct description.
A) synapse efficacy
B) neuron output frequency
C) dendrite
D) action potential of a neuron
E) spatial and temporal summation properties
F) excitatory and inhibitory potentials
G) axon
H) soma

1) corresponds to the ANN weight between two units
2) a pulse that is sent out along the axon of a neuron
3) corresponds to the transfer function of a node
4) corresponds to how weights can be positive or negative
5) corresponds to the ANN node output value
6) corresponds to an input line of an ANN node
7) the summation component of a neuron
8) the output line of a neuron

Question 2 (4p)
A bipolar (-1, 1) perceptron is initialized with all weights and the bias set to 0. The learning rate is 1 and the threshold is 0. Set the output to 0 if net = 0. Compute the weight matrix after 1 epoch of training using the Perceptron learning rule, given the inputs and targets below. Show your calculations, not just the final answer.

input:   ( 1  1)   ( 1 -1)   (-1  1)   (-1 -1)
target:     1        -1        -1        -1

Question 3 (2p)
The Perceptron learning rule is said to stop unnecessarily early compared to, e.g., the Delta rule. Describe what is meant by this statement and what lies behind it.

Question 4 (4p)
Which of these statements are correct when the number of hidden units increases in a two-layer feed-forward network trained with back propagation? Give a short motivation for each answer.
a) More training patterns are required.
b) The net will be capable of approximating more complex functions.
c) The net will generalize better.
d) Training will be faster.
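As an illustration of the procedure described in Question 2, the following is a minimal sketch of one training epoch for a bipolar perceptron. It assumes one common form of the Perceptron learning rule (weights and bias change by learning rate times target times input, and only when the output differs from the target); other textbooks use a slightly different update, so the rule taught in the course may differ. The function names `bipolar_step` and `train_one_epoch` are hypothetical.

```python
def bipolar_step(net, threshold=0.0):
    """Bipolar output: +1 above the threshold, -1 below it, 0 otherwise."""
    if net > threshold:
        return 1
    if net < -threshold:
        return -1
    return 0


def train_one_epoch(patterns, targets, lr=1.0, threshold=0.0):
    """Present every pattern once, starting from zero weights and bias.

    Assumed update rule (an assumption, not taken from the exam text):
    if output != target, then w += lr * t * x and b += lr * t.
    """
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    trace = []
    for x, t in zip(patterns, targets):
        net = sum(w_i * x_i for w_i, x_i in zip(weights, x)) + bias
        y = bipolar_step(net, threshold)
        if y != t:
            weights = [w_i + lr * t * x_i for w_i, x_i in zip(weights, x)]
            bias += lr * t
        # Record the state after this pattern so each step can be inspected.
        trace.append((x, t, net, y, list(weights), bias))
    return weights, bias, trace


# Inputs and targets as listed in Question 2.
patterns = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
targets = [1, -1, -1, -1]

weights, bias, trace = train_one_epoch(patterns, targets)
for x, t, net, y, w, b in trace:
    print(f"x={x} t={t:+d} net={net:+.0f} y={y:+d} -> w={w} b={b:+.0f}")
```

Printing the trace step by step mirrors the "show your calculations" requirement: each line gives the net input, the output, and the weights and bias after the pattern has been presented.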

Question 5 (3p)
What is meant by principal components? In what situations can it be expedient (useful) to compute and use them? Describe how a network can be trained to find principal components; give the algorithm name and the weight update formula.

Question 6 (3p)
a) In the Hopfield lab, you perform several tests to evaluate the storage capacity (in terms of how many patterns can be correctly retrieved). When you store 2-D images (simple clip-art-like binary images), how does the capacity compare to what would be expected? What is the reason for this?
b) How can the storage capacity of Hopfield nets be improved beyond the classical O(0.14N) capacity limit?

Question 7 (4p)
a) The matrix below was suggested to be the weight matrix of a Hopfield net storing 3 patterns, and it was said to be able to reliably recall/retrieve these 3 patterns. Why would you be skeptical about this even before doing any calculations?

     0  -1   1  -1  -1
     1   0  -1  -1   1
    -1  -1   0  -1   1
    -1  -1   1   0   1
     1  -1   1   1   0

b) Assume a Hopfield network with bipolar {-1, 1} nodes has the following weight matrix:

     0   2  -1  -1  -1
     2   0  -1  -1  -1
    -1  -1   0   2   2
    -1  -1   2   0   2
    -1  -1   2   2   0

Is the pattern x = [-1 1 -1 1 -1] a fixed point (a stored pattern)? Show your calculations. Set the output to 1 for sum = 0.

Question 8 (2p)
The bias-variance dilemma can be visualized in a 2-D graph. Sketch such a graph, including the curves, and state and explain what is on the axes and what the curves represent. The graph contains a point of interest: which point is it, and where is it?

Question 9 (2p)
Give the name of an algorithm that can be used to process temporal data. Briefly describe the network algorithm/topology.

Question 10 (6p)
In each case below, what type of problem/processing is this? For your solution, give the algorithm name, the topology, what the input and output are, how training is done, etc.
a) A company selling music on the internet wants to give customers hints on what to buy.
The idea of the project is to provide the customer with suggestions of music bought by a set of other customers. That set of customers should be selected to be as similar as possible to the customer. The company saves information for each customer about the titles the customer has bought. These titles are described by a fairly large set of attributes (where each attribute describes the music, e.g. hard rock, happy, fast beat, etc.). So, a customer can be described by a vector in the space of all those attributes, with a magnitude along each dimension equal to the number of songs bought with that attribute (and with the total length of the vector normalized to unity). Your task is to construct the network that produces the set of similar customers for a customer.
b) In a modern combustion engine, a number of set-parameters (valve timing, ignition timing, etc.) can be changed during running to optimize performance. The optimal setting of these parameters depends on both command signals (e.g. how far the gas pedal is pushed down) and environmental/intrinsic factors (e.g. engine temperature). Setting these parameters during real-life operation differs from setting them in the lab, so the manufacturer needs real-life data. They have therefore equipped engines with data logging equipment that saves the command signals, the environmental factors, and the set-parameters used, as well as the resulting fuel consumption, car acceleration, and other performance variables. During the regular service visit, the company downloads all this data and can search offline for parameter settings that resulted in optimal performance. Doing this, they have produced a set of optimal data points described by command signals, environmental factors, and set-parameters. Describe your network that produces output to get optimal performance.
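
The customer representation described in Question 10 a) can be made concrete with a small sketch. It only builds the unit-length attribute vector from the text and compares two customers; it is not the network the question asks for. The function names, the toy purchase data, and the use of the dot product as a similarity measure are all assumptions made for illustration.

```python
import math


def customer_vector(purchases, attributes):
    """Build a unit-length attribute vector for one customer.

    purchases: list of sets, one set of attribute names per bought title.
    attributes: ordered list of all attribute names (the vector space axes).
    """
    # Count, per attribute, how many bought titles carry that attribute.
    counts = [sum(1 for title in purchases if a in title) for a in attributes]
    length = math.sqrt(sum(c * c for c in counts))
    if length == 0:
        # A customer with no purchases has no direction; return the zero vector.
        return [0.0] * len(attributes)
    return [c / length for c in counts]


def similarity(u, v):
    """Dot product; for unit-length vectors this is the cosine of the angle."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy data, using the attribute examples named in the text.
attributes = ["hard rock", "happy", "fast beat"]
alice = customer_vector([{"hard rock", "fast beat"}, {"hard rock"}], attributes)
bob = customer_vector([{"hard rock", "fast beat"}], attributes)
carol = customer_vector([{"happy"}], attributes)

print(similarity(alice, bob), similarity(alice, carol))
```

Because every vector is normalized to unit length, a customer who has bought many titles and one who has bought few are compared by taste profile rather than by purchase volume.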