Pattern Recognition and Machine Learning
Institute of Empirical Research in Economics (IEW)
Laboratory for Social & Neural Systems Research (SNS)
Computational Neuroeconomics and Neuroscience, 22-09-2010

Course schedule
- 13-10-2010, Ch. 2: Density Estimation, Bayesian Inference (Adrian Etter, Marco Piccirelli, Giuseppe Ugazio)
- 20-10-2010, Ch. 3: Linear Models for Regression (Susanne Leiberg, Grit Hein)
- 27-10-2010, Ch. 4: Linear Models for Classification (Friederike Meyer, Chaohui Guo)
- 03-11-2010, Ch. 6: Kernel Methods I: Gaussian Processes (Kate Lomakina)
- 10-11-2010, Ch. 7: Kernel Methods II: SVM and RVM (Christoph Mathys, Morteza Moazami)
- 17-11-2010, Ch. 8: Probabilistic Graphical Models (Justin Chumbley)
- 24-11-2010, Ch. 9: Mixture Models and EM (Bastiaan Oud, Tony Williams)
- 01-12-2010, Ch. 10: Approximate Inference I: Deterministic Approximations (Falk Lieder)
- 08-12-2010, Ch. 11: Approximate Inference II: Stochastic Approximations (Kay Brodersen)
- 15-12-2010, Ch. 12: Inference on Continuous Latent Variables: PCA, Probabilistic PCA, ICA (Lars Kasper)
- 22-12-2010, Ch. 13: Sequential Data: Hidden Markov Models, Linear Dynamical Systems (Chris Burke, Yosuke Morishima)

Chapter 1: Probability, Decision, and Information Theory
Presenter: Sandra Iglesias

Outline
- Introduction
- Probability Theory
  - Probability Rules
  - Bayes' Theorem
  - Gaussian Distribution
- Decision Theory
- Information Theory

Pattern recognition
Computer algorithms for the automatic discovery of regularities in data, and the use of these regularities to take actions such as classifying the data into different categories. Data (patterns) are classified based either on a priori knowledge or on statistical information extracted from the patterns.
Machine learning
"How can we program systems to automatically learn and to improve with experience?" The machine is programmed to learn from an incomplete set of examples (the training set); the core objective of a learner is to generalize from its experience.

Polynomial Curve Fitting
Example: fit data generated from the function sin(2πx) (plus noise) with a polynomial
  y(x, w) = w0 + w1 x + w2 x^2 + ... + wM x^M

Sum-of-Squares Error Function
  E(w) = (1/2) Σn { y(xn, w) − tn }^2   (sum over the N data points)

Over-fitting
Root-Mean-Square (RMS) error: E_RMS = sqrt( 2 E(w*) / N ). A high-order polynomial (e.g. M = 9) can drive the training error to zero while generalizing poorly.

Regularization
Penalize large coefficient values by minimizing
  E~(w) = (1/2) Σn { y(xn, w) − tn }^2 + (λ/2) ||w||^2
Comparing fits for M = 9 with different values of λ shows how the penalty controls over-fitting.

Probability Theory
Noise on measurements and the finite size of data sets lead to uncertainty. Probability theory provides a consistent framework for the quantification and manipulation of uncertainty.

Consider two variables X (values xi, i = 1, ..., M) and Y (values yj, j = 1, ..., L) observed over N trials, and let
- nij: number of trials in which X = xi and Y = yj
- ci: number of trials in which X = xi irrespective of the value of Y
- rj: number of trials in which Y = yj irrespective of the value of X

Then:
- Joint probability: p(X = xi, Y = yj) = nij / N
- Marginal probability: p(X = xi) = ci / N
- Conditional probability: p(Y = yj | X = xi) = nij / ci
The Rules of Probability
- Sum rule: p(X) = ΣY p(X, Y)
- Product rule: p(X, Y) = p(Y | X) p(X)

Bayes' Theorem
From the symmetry p(X, Y) = p(Y, X) and the product rule,
  p(Y | X) = p(X | Y) p(Y) / p(X)
(T. Bayes, 1702-1761; P.-S. Laplace, 1749-1827)

Applied to the polynomial curve fitting problem:
  p(w | D) = p(D | w) p(w) / p(D)
  posterior ∝ likelihood × prior

Probability Densities
For a continuous variable, p(x ∈ (a, b)) = ∫ab p(x) dx, with p(x) ≥ 0 and ∫ p(x) dx = 1.

Expectations
The expectation of f(x) is the average value of some function f(x) under a probability distribution p(x):
- discrete distribution: E[f] = Σx p(x) f(x)
- continuous distribution: E[f] = ∫ p(x) f(x) dx

The Gaussian Distribution
  N(x | μ, σ^2) = (2πσ^2)^(−1/2) exp{ −(x − μ)^2 / (2σ^2) }

Gaussian Parameter Estimation
For i.i.d. data x = (x1, ..., xN), the likelihood function is
  p(x | μ, σ^2) = Πn N(xn | μ, σ^2)

Maximum (Log) Likelihood
Maximizing the log-likelihood
  ln p(x | μ, σ^2) = −(1/(2σ^2)) Σn (xn − μ)^2 − (N/2) ln σ^2 − (N/2) ln 2π
gives
  μML = (1/N) Σn xn,   σ^2ML = (1/N) Σn (xn − μML)^2

Curve Fitting Re-visited
Maximum Likelihood: assuming t is Gaussian-distributed around y(x, w), determine wML by minimizing the sum-of-squares error.
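The closed-form ML estimates for the Gaussian can be checked numerically; a small sketch (illustrative, not from the slides):

```python
import numpy as np

def gaussian_ml(x):
    """Closed-form ML estimates: mu_ML = sample mean,
    sigma2_ML = biased (1/N) sample variance."""
    mu = np.mean(x)
    sigma2 = np.mean((x - mu) ** 2)
    return mu, sigma2

def log_likelihood(x, mu, sigma2):
    """ln p(x | mu, sigma2) for i.i.d. univariate Gaussian data."""
    n = x.size
    return (-0.5 / sigma2 * np.sum((x - mu) ** 2)
            - 0.5 * n * np.log(sigma2) - 0.5 * n * np.log(2 * np.pi))

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=0.5, size=10_000)
mu_ml, sigma2_ml = gaussian_ml(x)
# The ML estimates maximize the log-likelihood:
# perturbing either parameter away from them lowers it.
```

The estimates recover the generating parameters (μ = 2, σ² = 0.25) up to sampling error, and no nearby parameter values achieve a higher log-likelihood.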
Decision Theory
- Used with probability theory to make optimal decisions
- Input vector x, target vector t
- Regression: t is continuous; classification: t consists of class labels
- The summary of the associated uncertainty is given by p(x, t)
- Inference problem: obtain p(x, t) from data
- Decision problem: make a specific prediction for the value of t and take specific actions based on t

Inference step: determine p(t | x) or p(x, t).
Decision step: for a given x, determine the optimal t.

Medical Diagnosis Problem
- X-ray image of a patient; decide whether the patient has cancer or not
- Input vector x: the set of pixel intensities
- Output variable t: whether cancer or not (C1 = cancer, C2 = no cancer)
- The general inference problem is to determine p(x, Ck), which gives the most complete description of the situation
- In the end we need to decide whether to give treatment or not; decision theory helps do this

Bayes' Decision
How do probabilities play a role in making a decision? Given input x and classes Ck, Bayes' theorem gives
  p(Ck | x) = p(x | Ck) p(Ck) / p(x)
The quantities in Bayes' theorem can be obtained from p(x, Ck) either by marginalizing or conditioning with respect to the appropriate variable.

Minimum Expected Loss
Example: classify medical images as 'cancer' or 'normal'.
- Mistakes are of unequal importance
- The loss (cost) function is given by a loss matrix Lkj (truth k, decision j); utility is the negative of loss
- Minimize the average loss
    E[L] = Σk Σj ∫Rj Lkj p(x, Ck) dx
- The decision regions Rj are chosen to minimize E[L]
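The minimum-expected-loss rule is short in code: for each candidate decision j, choose the one minimizing Σk Lkj p(Ck | x). In this sketch the loss values and posterior probabilities are invented for illustration, not taken from the lecture:

```python
import numpy as np

# Loss matrix L[truth, decision]; rows: true class, columns: chosen class.
# Illustrative values: missing a cancer (truth = cancer, decide = normal)
# costs 1000, a false alarm costs 1, correct decisions cost 0.
L = np.array([[0, 1000],    # truth = cancer (C1)
              [1,    0]])   # truth = normal (C2)

def decide(posterior):
    """Pick the decision j minimizing sum_k L[k, j] * p(C_k | x)."""
    expected_loss = np.asarray(posterior) @ L   # one entry per decision j
    return int(np.argmin(expected_loss))
```

With such an asymmetric loss, even a small posterior probability of cancer tips the decision toward treating: the decision boundary moves far from the 0.5 posterior threshold that a symmetric (0/1) loss would give.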
Why Separate Inference and Decision?
The classification problem breaks into two separate stages:
- Inference stage: training data is used to learn a model for p(Ck | x) = p(x | Ck) p(Ck) / p(x)
- Decision stage: posterior probabilities are used to make optimal class assignments

Three distinct approaches to solving decision problems:
1. Generative models
2. Discriminative models
3. Discriminant functions

Generative models
1. Solve the inference problem of determining the class-conditional densities p(x | Ck) for each class separately, and the prior probabilities p(Ck)
2. Use Bayes' theorem to determine the posterior probabilities p(Ck | x) = p(x | Ck) p(Ck) / p(x)
3. Use decision theory to determine class membership

Discriminative models
1. Solve the inference problem of determining the posterior class probabilities p(Ck | x) directly
2. Use decision theory to determine class membership
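The three generative-model steps can be sketched end to end for a toy 1-D problem (the synthetic data and the choice of Gaussian class-conditionals are my illustrative assumptions, not the lecture's example):

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic 1-D training data for two classes
x0 = rng.normal(-2.0, 1.0, size=500)   # class C1
x1 = rng.normal(+2.0, 1.0, size=500)   # class C2

def fit_gaussian(x):
    return np.mean(x), np.var(x)

def gauss_pdf(x, mu, s2):
    return np.exp(-(x - mu) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

# 1. Inference: class-conditional densities p(x|Ck) and priors p(Ck)
params = [fit_gaussian(x0), fit_gaussian(x1)]
priors = np.array([0.5, 0.5])

def posterior(x):
    """2. Bayes' theorem: p(Ck|x) = p(x|Ck) p(Ck) / p(x)."""
    joint = np.array([gauss_pdf(x, m, s2) * pr
                      for (m, s2), pr in zip(params, priors)])
    return joint / joint.sum()           # dividing by p(x) normalizes

def classify(x):
    """3. Decision: pick the class with the larger posterior."""
    return int(np.argmax(posterior(x)))
```

A discriminative model would instead fit p(Ck | x) directly (e.g. logistic regression), skipping the class-conditional densities entirely.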
Discriminant functions
1. Find a function f(x) that maps each input x directly to a class label; e.g. for a two-class problem f(·) is binary valued, with f = 0 representing C1 and f = 1 representing C2. Probabilities play no role.

Decision Theory for Regression
Inference step: determine p(t | x).
Decision step: for a given x, make an optimal prediction y(x) for t.
Loss function: E[L] = ∫∫ L(t, y(x)) p(x, t) dx dt, e.g. the squared loss L = { y(x) − t }^2.

Information Theory
- Quantification of information, based on probability theory
- Degree of surprise: a highly improbable event carries a lot of information, a highly probable one carries less, and a certain event carries no information
- Most important quantity: entropy

Entropy
Entropy is the average amount of information expected, weighted by the probability of the random variable; it quantifies the uncertainty involved when we encounter this random variable:
  H[x] = −Σx p(x) log2 p(x) ≥ 0

The Kullback-Leibler Divergence
A non-symmetric measure of the difference between two probability distributions, also called relative entropy:
  KL(p‖q) = −∫ p(x) ln{ q(x) / p(x) } dx

Mutual Information
For two sets of variables x and y: if they are independent, p(x, y) = p(x) p(y); if not, their mutual dependence can be measured by the mutual information
  I[x, y] = KL( p(x, y) ‖ p(x) p(y) )
which quantifies the shared information and is related to the conditional entropy by
  I[x, y] = H[x] − H[x | y] = H[y] − H[y | x]
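The entropy, KL divergence, and mutual information defined above are easy to compute for discrete distributions. A small sketch using natural logarithms (the entropy in bits would use log2 instead):

```python
import numpy as np

def entropy(p):
    """H[x] = -sum p ln p, with 0 ln 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def kl_divergence(p, q):
    """KL(p||q) = sum p ln(p/q); non-negative, zero iff p == q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / q[nz]))

def mutual_information(pxy):
    """I[x,y] = KL(p(x,y) || p(x)p(y)) for a joint table pxy[i,j]."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)   # marginal over y
    py = pxy.sum(axis=0, keepdims=True)   # marginal over x
    return kl_divergence(pxy.ravel(), (px * py).ravel())

# Independent joint: mutual information is zero
indep = np.outer([0.3, 0.7], [0.5, 0.5])
# Perfectly dependent joint: I equals the marginal entropy (ln 2 here)
dep = np.array([[0.5, 0.0], [0.0, 0.5]])
```

The two example tables bracket the range of dependence: I = 0 when the joint factorizes, and I = H[x] when x determines y exactly.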