Normative models of human inductive inference
Tom Griffiths
Department of Psychology, Cognitive Science Program
University of California, Berkeley

Perception is optimal (Körding & Wolpert, 2004)

Cognition is not

Optimality and cognition
• Can optimal solutions to computational problems shed light on human cognition?
• Can we explain aspects of cognition as the result of sensitivity to natural statistics?
• What kind of representations are extracted from those statistics?

Joint work with Josh Tenenbaum

Natural statistics
Images of natural scenes provide the natural statistics; sparse coding provides the neural representation (Olshausen & Field, 1996).

Predicting the future
How often is Google News updated?
t = time since last update
t_total = time between updates
What should we guess for t_total given t?

Reverend Thomas Bayes

Bayes' theorem
p(h | d) = p(d | h) p(h) / Σ_{h' ∈ H} p(d | h') p(h')
posterior probability ∝ likelihood × prior probability
h: hypothesis, d: data; the denominator sums over the space of hypotheses H
Equivalently: p(h | d) ∝ p(d | h) p(h)

Bayesian inference
p(t_total | t) ∝ p(t | t_total) p(t_total)
posterior probability ∝ likelihood × prior
Assume t is a random sample from (0, t_total), so the likelihood is p(t | t_total) = 1/t_total and
p(t_total | t) ∝ (1/t_total) p(t_total)

The effects of priors

Evaluating human predictions
• Different domains with different priors:
– a movie has made $60 million [power-law]
– your friend quotes from line 17 of a poem [power-law]
– you meet a 78 year old man [Gaussian]
– a movie has been running for 55 minutes [Gaussian]
– a U.S. congressman has served for 11 years [Erlang]
• Prior distributions derived from actual data
• Use 5 values of t for each
• People predict t_total
[Figure: people's predictions compared with Gott's rule, the empirical prior, and a parametric prior]

Predicting the future
• People produce accurate predictions for the duration and extent of everyday events
• People are sensitive to the statistics of their environment in making these predictions
– form of the prior (power-law or exponential)
– distribution given that form (parameters)

Optimality and cognition
• Can optimal solutions to computational problems shed light on human cognition?
• Can we explain aspects of cognition as the result of sensitivity to natural statistics?
• What kind of representations are extracted from those statistics?
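As a concrete illustration of the prediction analysis above, here is a minimal numerical sketch: given elapsed time t and the uniform-sampling likelihood 1/t_total, it returns the posterior median of t_total under an assumed power-law prior. The exponent gamma = 1.5 and the grid bounds are illustrative assumptions; with an uninformative prior (gamma = 1) the posterior median reduces to Gott's rule, t_total = 2t.

```python
import numpy as np

def predict_total(t, prior, grid):
    """Posterior median of t_total given elapsed time t, assuming t is a
    random sample from (0, t_total), so p(t | t_total) = 1/t_total."""
    posterior = np.where(grid >= t, prior(grid) / grid, 0.0)
    posterior /= posterior.sum()                    # normalize over the grid
    return grid[np.searchsorted(np.cumsum(posterior), 0.5)]

grid = np.linspace(1.0, 1000.0, 100_000)            # assumed support for t_total
power_law = lambda x: x ** -1.5                     # assumed prior p(t_total), gamma = 1.5

for t in [10.0, 30.0, 60.0]:
    # analytic posterior median under this prior is t * 2**(1/1.5), about 1.59 * t
    print(f"t = {t:5.1f}  ->  predicted t_total = {predict_total(t, power_law, grid):6.1f}")
```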
Joint work with Adam Sanborn

Categories are central to cognition

Sampling from categories
[Figure: a frog category distribution p(x|c) over stimulus space]

Markov chain Monte Carlo
• Sample from a target distribution P(x) by constructing a Markov chain for which P(x) is the stationary distribution
• The Markov chain converges to its stationary distribution, providing outcomes that can be used similarly to samples

Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970)
Step 1: propose a state (we assume a symmetric proposal), Q(x(t+1) | x(t)) = Q(x(t) | x(t+1))
Step 2: decide whether to accept, with probability given by either the
– Metropolis acceptance function: A(x(t), x(t+1)) = min[1, p(x(t+1)) / p(x(t))]
– Barker acceptance function: A(x(t), x(t+1)) = p(x(t+1)) / [p(x(t+1)) + p(x(t))]
[Animation: the algorithm stepping along a density p(x), with proposals accepted with probability A(x(t), x(t+1)), e.g. A = 0.5 and A = 1 at two of the illustrated steps]

A task
Ask subjects which of two alternatives comes from a target category: "Which animal is a frog?"

A Bayesian analysis of the task
Assume the two alternatives are a priori equally likely to come from the target category c; the posterior probability that the new alternative x(t+1) is the one from the category is then p(x(t+1)|c) / [p(x(t+1)|c) + p(x(t)|c)]. If people probability match to the posterior, the response probability is equivalent to the Barker acceptance function for target distribution p(x|c).

Collecting the samples
[Figure: a sequence of "Which is the frog?" choices (Trial 1, Trial 2, Trial 3), each chosen stimulus supplying the next state of the chain]

Verifying the method
Training: subjects were shown schematic fish of different sizes and trained on whether they came from the ocean (uniform) or a fish farm (Gaussian).
[Figure: between-subject conditions]
Choice task: subjects judged which of the two fish came from the fish farm (Gaussian) distribution.
[Figure: examples of subject MCMC chains]

Estimates from all subjects
• Estimated means and standard deviations are significantly different across groups
• Estimated means are accurate, but standard deviation estimates are high
– the result could be due to perceptual noise or response gain

Sampling from natural categories
Examined distributions for four natural categories: giraffes, horses, cats, and dogs. Stimuli were presented as nine-parameter stick figures (Olman & Kersten, 2004).
[Figure: choice task; samples from Subject 3, projected onto a plane from LDA; mean animals (giraffe, horse, cat, dog) for subjects S1-S8]

Marginal densities (aggregated across subjects)
• Giraffes are distinguished by neck length, body height, and body tilt
• Horses are like giraffes, but with shorter bodies and nearly uniform necks
• Cats have longer tails than dogs

Markov chain Monte Carlo with people
• Normative models can guide the design of experiments to measure psychological variables
• Markov chain Monte Carlo (and other methods) can be used to sample from subjective probability distributions
– category distributions
– prior distributions

Conclusion
• Optimal solutions to computational problems can shed light on human cognition
• We can explain aspects of cognition as the result of sensitivity to natural statistics
• We can use optimality to explore representations extracted from those statistics

Relative volume of categories
Convex hull content divided by the content of the minimum enclosing hypercube:
Giraffe 0.00004 | Horse 0.00006 | Cat 0.00003 | Dog 0.00002

Discrimination method (Olman & Kersten, 2004)
The parameter space for discrimination was restricted so that most random draws were animal-like.
[Figure: comparison of MCMC and discrimination means]
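A minimal sketch of the MCMC-with-people logic described above, with a simulated subject standing in for a participant: the subject's category distribution p(x|c) is assumed Gaussian, and on each trial the subject chooses between the current stimulus and a symmetric proposal by probability matching. That choice rule is exactly the Barker acceptance function, so the sequence of chosen stimuli forms a Metropolis-Hastings chain whose stationary distribution is p(x|c). All parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def p(x, mu=5.0, sigma=1.0):
    """Simulated subject's category distribution p(x|c), unnormalized Gaussian."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def subject_chooses_proposal(x_cur, x_prop):
    """Probability matching to the posterior = Barker acceptance:
    A = p(x') / (p(x') + p(x))."""
    return rng.random() < p(x_prop) / (p(x_prop) + p(x_cur))

x = 2.0                                   # arbitrary starting stimulus
samples = []
for trial in range(5000):
    x_prop = x + rng.normal(0.0, 1.0)     # symmetric proposal Q
    if subject_chooses_proposal(x, x_prop):
        x = x_prop                        # the chosen stimulus becomes the new state
    samples.append(x)

chain = np.array(samples[500:])           # discard burn-in
print(f"estimated mean {chain.mean():.2f}, sd {chain.std():.2f}")  # about 5.0 and 1.0
```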
Iterated learning (Kirby, 2001)
Each learner sees data, forms a hypothesis, and produces the data given to the next learner. With Bayesian learners, the distribution over hypotheses converges to the prior (Griffiths & Kalish, 2005).

Explaining convergence to the prior
[Diagram: at each generation, a hypothesis is sampled from the learner's posterior PL(h|d) and new data are produced from PP(d|h)]
• Intuitively: the data act once, the prior acts many times
• Formally: iterated learning with Bayesian agents is a Gibbs sampler on P(d,h) (Griffiths & Kalish, in press)

Iterated function learning (Kalish, Griffiths, & Lewandowsky, in press)
• Each learner sees a set of (x, y) pairs (the data)
• Makes predictions of y for new x values (the hypothesis)
• The predictions are the data for the next learner

Function learning experiments
[Figure: experiment display with stimulus, response slider, and feedback]
Examine iterated learning with different initial data.
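To make the convergence claim concrete, here is a minimal simulation sketch using a beta-binomial model of my own choosing (the prior, chain counts, and data size are all illustrative assumptions): each Bayesian learner samples a hypothesis theta from its posterior and generates the data seen by the next learner. Because sample-from-posterior followed by generate-data is one Gibbs sweep on P(d,h), the distribution of hypotheses across chains converges to the Beta prior regardless of the initial data.

```python
import numpy as np

rng = np.random.default_rng(0)

a, b = 2.0, 5.0                # assumed Beta(a, b) prior over the coin bias theta
n = 10                         # coin flips transmitted per generation
chains, generations = 5000, 200

theta = rng.uniform(0.0, 1.0, size=chains)   # arbitrary initial hypotheses
for _ in range(generations):
    k = rng.binomial(n, theta)               # each learner produces data from its hypothesis
    theta = rng.beta(a + k, b + n - k)       # the next learner samples from its posterior

# Across chains, theta is now distributed as the prior, not as the initial hypotheses
print(f"empirical mean {theta.mean():.3f} vs prior mean {a / (a + b):.3f}")
print(f"empirical var  {theta.var():.4f} vs prior var  {a * b / ((a + b)**2 * (a + b + 1)):.4f}")
```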