Searching for Principles of Brain Computation

Author: Wolfgang Maass
Affiliation: Graz University of Technology, Institute for Theoretical Computer Science, Inffeldgasse 16b/I, A-8010 Graz, Austria; e-mail: maass@igi.tugraz.at

Highlights:
Computational role of transient network states
Simultaneous recordings from many neurons constrain computational models
Stochastic state transitions point to Markov chain computations
Ongoing network rewiring and compensation through synaptic sampling

Abstract
Experimental methods in neuroscience, such as Ca-imaging and recordings with multielectrode arrays, are advancing at a rapid pace. They produce insight into the activity of large numbers of neurons and plasticity processes in the brains of awake and behaving animals. These data provide new constraints for modelling neural computations and learning processes that underlie perception, cognition, and behaviour. I will discuss in this short review four such constraints: inherent recurrent network activity and heterogeneous dynamic properties of neurons and synapses, stereotypical spatio-temporal activity patterns, high trial-to-trial variability of network responses, and functional stability in spite of permanently ongoing changes in the network. I will explain that these constraints provide hints to underlying principles of brain computation and learning.

Constraint/Principle 1: Neural circuits are highly recurrent networks of neurons and synapses with diverse dynamic properties
Many concepts and computational models in computational neuroscience are strongly influenced by paradigms for the organization of computation in our current generation of digital computers. There one typically finds feedforward networks of computational modules that each carry out a precisely specified subcomputation, and transmit their result to the next processing stage. Such networks are silent in the absence of network inputs.
In contrast, experimental data suggest that evolution has discovered a completely different way of organizing computations, one that takes place in recurrent rather than feedforward networks of neurons. From the evolutionarily oldest brains, such as brains of hydra (Dupre and Yuste, 2015), C. elegans (Kato et al., 2015), and zebrafish larvae (Portugues et al., 2014), to brains of humans (Raichle, 2015) one finds a rich repertoire of temporally extended activity patterns, both spontaneous and stimulus-evoked, that are characteristic for recurrent networks. Therefore recent reviews in systems neuroscience (Singer, 2013), (Yuste, 2015) emphasize the need to understand how highly recurrent networks of neurons in the brain compute. The theory of dynamical systems provides interesting concepts, such as criticality, attractors, and state-trajectories, that may help with that. But the traditional focus of this theory is on deterministic low-dimensional dynamical systems that consist of simple and homogeneous dynamic components. In contrast, recurrent neural networks in the brain are stochastic, very high-dimensional, and consist of different types of neurons and synapses with diverse dynamical features (Gupta et al., 2000), (Harris and Shepherd, 2015). These experimental findings have the unpleasant consequence that one would not even be able to implement standard computing paradigms from computer science and artificial neural networks in realistic models for neural circuits, since these paradigms usually require that the network units are homogeneous and have no inherent temporal dynamics. Hence a new theory of computation in stochastic high-dimensional dynamical systems with heterogeneous dynamic components is needed for elucidating brain computations. The liquid computing paradigm (Maass et al., 2002), (Maass et al., 2004), (Legenstein and Maass, 2007), (Buonomano and Maass, 2009) can be viewed as a first step in that direction.
Theorem 1 in (Maass et al., 2002) points to a benefit of diverse dynamic components for computations that require a fading memory. This computing paradigm also exploits specific computational benefits of high-dimensional networks: The expressive capability of a linear readout, for example a projection neuron, is boosted if network inputs are projected nonlinearly into a high-dimensional state space, see the notion of a kernel in Support Vector Machines (Bishop, 2006). Remarkably, if one views a recurrent network of neurons as such a kernel-like generic computational preprocessing stage, it is not relevant which nonlinear operations are carried out by the neurons. It is only required that saliently different network inputs are mapped onto linearly independent network states. According to this computational model a recurrent network of neurons carries out two types of generic preprocessing steps for readouts: fading memory and generic nonlinear preprocessing (Fig. 1a). Obviously this model is very different from the traditional feedforward processing schemes that I sketched at the beginning of this section. It also differs from traditional artificial neural network approaches. But a related computing paradigm for artificial neural networks had been developed independently in (Jaeger, 2001). In contrast to the liquid computing model it requires that network units are homogeneous and noise-free, but it also employs random connectivity and has a focus on readouts rather than internal network computations. An obvious benefit of such a processing strategy is that a single recurrent network can accumulate and preprocess simultaneously a large amount of incoming information, from which different projection neurons learn to select those features that are relevant for their specific projection target, see Fig. 1b for a simple demo and (Chen et al., 2013) for related experimental data.
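The two generic preprocessing steps described above (fading memory and nonlinear projection into a high-dimensional state space, followed by task-specific linear readouts) can be illustrated with a minimal sketch. Note that it uses noise-free rate units with a tanh nonlinearity, so it is closer to the echo state variant than to a spiking liquid computing model. The network size, the weight scaling, the two toy tasks, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def run_reservoir(u, n_units=100, spectral_radius=0.9, seed=0):
    """Drive a random recurrent rate network with the scalar input sequence u.

    Returns the network states as an array of shape (len(u), n_units).
    """
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-1.0, 1.0, size=n_units)   # random input weights
    bias = rng.uniform(-0.5, 0.5, size=n_units)   # heterogeneous unit offsets
    w = rng.normal(size=(n_units, n_units))       # random recurrent weights
    # Rescale so that the largest eigenvalue magnitude equals spectral_radius;
    # values below 1 are a common heuristic for obtaining a fading memory.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    x = np.zeros(n_units)
    states = np.empty((len(u), n_units))
    for t, u_t in enumerate(u):
        x = np.tanh(w @ x + w_in * u_t + bias)
        states[t] = x
    return states

def fit_readout(states, target, ridge=1e-6):
    """Fit a linear readout (with bias term) by ridge regression."""
    X = np.hstack([states, np.ones((len(states), 1))])
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ target)

def apply_readout(states, weights):
    """Apply a fitted linear readout to a sequence of network states."""
    X = np.hstack([states, np.ones((len(states), 1))])
    return X @ weights

rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, size=2000)
states = run_reservoir(u)

washout, delay = 100, 3          # discard the initial transient
S = states[washout:]
# Two different readouts are trained on the SAME network states.
target_memory = u[washout - delay:len(u) - delay]    # u(t - 3): needs memory
target_nonlinear = u[washout:] * u[washout - 1:-1]   # u(t) * u(t - 1): needs nonlinearity

w_mem = fit_readout(S, target_memory)
w_nl = fit_readout(S, target_nonlinear)
corr_memory = np.corrcoef(apply_readout(S, w_mem), target_memory)[0, 1]
corr_nonlinear = np.corrcoef(apply_readout(S, w_nl), target_nonlinear)[0, 1]
print("memory readout correlation:   ", round(corr_memory, 3))
print("nonlinear readout correlation:", round(corr_nonlinear, 3))
```

Only the readout weights are fitted here; the recurrent weights stay random and fixed. The correlations are computed on the training data, so they indicate what the states make linearly accessible rather than generalization performance.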
Mathematical results (Maass et al., 2007) imply that the capability of this computational model can be substantially enhanced through readout neurons --each trained for specific tasks-- which project their output through axon collaterals also back into the recurrent network. In fact, if there is no noise in the network, and readouts can compute arbitrary continuous functions, then a recurrent network can acquire in this way the computational power of a universal Turing machine. Such extreme computational capabilities are not realistic in the presence of noise, but computer simulations demonstrate that readouts with feedback can definitely enhance the computational capability of the model: The model can learn to maintain internal network states (see Fig. 1c, d), and switch these states as well as network computations in response to external cues (Fig. 1e).

Figure 1: a) Computational model consistent with constraint 1; b) Demonstration that a generic spike-based network model with 135 neurons can support diverse linear readouts that are trained to extract different pieces of information about two time-varying firing rates f1(t) and f2(t) represented by spiking activity of the first 2 and last 2 input neurons shown at the top (see (Maass and Markram, 2007) for details). This provides a new paradigm for multiplexing of network computations. c) - e): Demonstration of nonfading memory and context-dependent switching of computation on 4 input streams with time-varying Poisson firing rates r1(t),...,r4(t) shown in c). One readout with feedback was trained to represent which of r1(t), r2(t) had last exhibited a burst (d; times when r1(t) had the most recent burst are marked in blue), and another readout was trained to switch between r3(t) + r4(t) and |r3(t) - r4(t)| in dependence of the internal state represented in d).

Readouts with feedback tend to be quite parameter sensitive and require well-tuned learning methods.
Two methods for supervised training of readouts with feedback have been proposed, (Maass et al., 2007) and (Sussillo and Abbott, 2009); both build on preceding related work for artificial neural networks (Jaeger and Haas, 2004). These learning methods work well in simulations, but do not aspire to be biologically realistic. A biologically more realistic learning method based on reinforcement learning has more recently been proposed by (Hoerzer et al., 2014). A number of experimental studies have confirmed predictions of the liquid computing model, such as an inherent fading memory and generic nonlinear preprocessing (Nikolic et al., 2009), (Klampfl et al., 2012), (Stokes, 2015), generic projection into high-dimensional network states (Rigotti et al., 2013) and state-dependent switching of computational operations (Mante et al., 2013).

Constraint/Principle 2: Activity in neural networks is dominated by variations of a few stereotypical patterns
Many computational models are based on the assumption that neurons tend to encode specific features of sensory stimuli or internal variables, like a filter bank. This neuron-centric view is contradicted by data from simultaneous recordings from large numbers of neurons, which suggest that their joint activity, both spontaneous and stimulus-evoked, typically consists of variations of a rather small repertoire of spatio-temporal firing patterns that engage assemblies of hundreds or more neurons. This has been shown for the primary sensory areas A1 (Luczak et al., 2009), (Bathellier et al., 2012) and V1 (Sadovsky and MacLean, 2014), (Miller et al., 2014). It also has been shown as a result of learning for PFC (Fujisawa et al., 2008), and PPC (Harvey et al., 2012). This feature of generic network activity obviously constrains models for neural computation and coding. Compelling theoretical models for the computational role of such cell assemblies are still missing, but many interesting ideas have been proposed (Buzsáki, 2010).
It is not even trivial to produce a neural network model whose response to external inputs is similarly dominated by variations of a small number of stereotypical patterns. One way of achieving that is to engage STDP for synaptic connections between pyramidal cells in a network model that induces competition for firing (and hence for STDP) via strong lateral inhibition, see (Klampfl and Maass, 2013) for a network model with simplified lateral inhibition, and Fig. 2 for a newer model where lateral inhibition is carried out by data-based interactions with inhibitory neurons (Pokorny et al., 2016). Viewed as a computational principle, this phenomenon facilitates the learning for readout neurons. Without the occurrence of clear activity patterns for different stimuli, a linear readout has to be trained by supervised learning or reinforcement learning to recognize and classify incoming stimuli to the network. With such patterns, this task can be carried out by readout neurons that learn to recognize without supervision or rewards which of a small repertoire of internal activity patterns is currently on (Fig. 2c, d, e). A closer look at the experimental data (and the simulation results of Fig. 2a, b) shows that these stereotypical activity patterns tend to have a clear temporal evolution. Hence they are reminiscent of the "phase sequences" proposed by (Hebb, 1949) as a means to integrate temporally or spatially distributed features into a concept.

Figure 2: a) Emergence of stereotypical assembly responses to repeating external input patterns (blue and green spike patterns, superimposed by noise spikes shown in black), embedded into a Poisson spike input stream with the same rate.
After about 100 occurrences of each pattern, stereotypical network responses (assembly sequences) emerge through STDP for synaptic connections between pyramidal cells in a model for generic networks of pyramidal cells and PV+ inhibitory neurons that is based on data from the Petersen Lab (see (Avermann et al., 2012)). Shown are simulation results from (Pokorny et al., 2016) that corroborate earlier results from (Klampfl and Maass, 2013) for a simplified model. b) Mean firing times of each neuron in one of the two assemblies marked by a white dot, with a histogram of relative firing times in color coding on the same row. c) - e) Emergent stereotypical network responses (d and e) to repeated input patterns (spoken digits, c) become sufficiently different so that a WTA readout circuit of 4 neurons (bottom panel of d) learns without any supervision to classify the spoken digits.

Constraint/Principle 3: Networks of neurons in the brain are spontaneously active and have high trial-to-trial variability
Virtually all neural recordings show that network responses vary substantially from trial to trial. This is no surprise, since not only channel kinetics in dendrites but also synaptic transmission (Branco and Staras, 2009), (Kavalali, 2015) appear to be highly stochastic processes. Furthermore, networks of neurons in the brain represent ambiguous sensory stimuli not through superimposed interpretations, but through spontaneous flips between corresponding network states that represent alternative interpretations (Leopold and Logothetis, 1999), (Jezek et al., 2011). Spontaneous activity of networks of neurons also exhibits seemingly stochastic changes between highly structured network states, similar to those seen in response to stimuli (Luczak et al., 2009), (Berkes et al., 2011), (Miller et al., 2014). These data force us to add stochasticity to the set of constraints for brain computation. Liquid computing models, as discussed in the context of constraint 1, are compatible with noise.
But a key question is whether models for computation with network states could benefit from noise. In the computational sciences there are in fact well-known paradigms of computational models that not only tolerate noise, but require noise for their computational strategy (Maass, 2014). For example, Markov chains are stochastic dynamical systems whose computational function or program consists of their stationary distribution of network states, which they approximate from any initial state after some finite mixing time. Boltzmann machines are Markov chains that can be "embodied" by networks of perceptrons with stochastic rules for their switching between states 0 and 1. This class of computational models is good at solving constraint satisfaction problems (Aarts and Korst, 1988), at probabilistic inference through Markov chain Monte Carlo (MCMC) sampling (Andrieu et al., 2003), (Bishop, 2006), and at learning generative models of complex distributions from examples. One can also consider related stochastic models for brain computation, where recurrent neural networks are viewed as Markov chains that sample from their stationary distribution of network states, conditioned on the current input e to the network (termed evidence in the language of probabilistic inference), see Fig. 3. It has been rigorously shown (Habenschuss et al., 2013) that even quite realistic models for such networks with noise have a unique stationary distribution p, where the firing or non-firing of each neuron is modelled by a separate random variable. The inherent stochastic dynamics enables such a model to estimate through MCMC sampling posterior marginals such as

p(z_1 | e) = \sum_{z_2, \ldots, z_m} p(z_1, z_2, \ldots, z_m | e),

where e could represent for example sensory evidence and internal goals, the sum runs over all possible values of the random variables z_2, ..., z_m that are irrelevant for the current computational task, and a binary variable z_1 could represent for example the choice between two sources of food.
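How a posterior marginal can be read off from a sampling process may be illustrated with a small Boltzmann machine, sketched below under stated assumptions. It uses plain Gibbs sampling over abstract binary units, not the spiking microcircuit model of (Habenschuss et al., 2013). The weights W, biases b, the choice of the clamped "evidence" unit, and all function names are hypothetical and chosen for illustration. The sampled estimate is the fraction of time the queried unit is in state 1, which plays the role of a firing rate, and it is compared with the exact sum over all irrelevant variables.

```python
import itertools
import numpy as np

def gibbs_marginal(W, b, clamped, query, n_sweeps=10000, burn_in=1000, seed=0):
    """Estimate p(z_query = 1 | clamped units) by Gibbs sampling.

    Assumes p(z) proportional to exp(0.5 * z^T W z + b^T z) with symmetric W
    and zero diagonal; `clamped` maps unit index -> fixed value (the evidence).
    """
    rng = np.random.default_rng(seed)
    n = len(b)
    z = rng.integers(0, 2, size=n).astype(float)
    for i, v in clamped.items():
        z[i] = v
    free = [i for i in range(n) if i not in clamped]
    on_count = 0.0
    for sweep in range(n_sweeps + burn_in):
        for i in free:
            # Conditional probability of unit i being on, given all other units.
            p_on = 1.0 / (1.0 + np.exp(-(W[i] @ z + b[i])))
            z[i] = float(rng.random() < p_on)
        if sweep >= burn_in:
            on_count += z[query]      # "firing rate" of the queried unit
    return on_count / n_sweeps

def exact_marginal(W, b, clamped, query):
    """Compute the same marginal by explicit summation over all states."""
    n = len(b)
    num = den = 0.0
    for bits in itertools.product([0.0, 1.0], repeat=n):
        z = np.array(bits)
        if any(z[i] != v for i, v in clamped.items()):
            continue                  # inconsistent with the evidence
        weight = np.exp(0.5 * z @ W @ z + b @ z)
        den += weight
        num += weight * z[query]
    return num / den

rng = np.random.default_rng(42)
A = rng.normal(scale=0.8, size=(5, 5))
W = (A + A.T) / 2.0                   # symmetric couplings
np.fill_diagonal(W, 0.0)              # no self-connections
b = rng.normal(scale=0.5, size=5)

evidence = {4: 1.0}                   # unit 4 is clamped to the evidence
sampled = gibbs_marginal(W, b, evidence, query=0)
exact = exact_marginal(W, b, evidence, query=0)
print("sampled estimate:", round(sampled, 3), " exact value:", round(exact, 3))
```

The exact summation is only feasible here because the network has five units; its cost doubles with every additional variable, whereas the sampling estimate only requires watching one unit over time.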
This posterior marginal can be estimated according to a sampling model by simply observing the firing rate of a neuron whose firing or non-firing represents the values of this binary random variable. A network of neurons in the brain that exhibits for a choice task fluctuating network states as proposed by MCMC sampling has recently been discovered in area OFC (Rich and Wallis, 2015).

Figure 3: Model for probabilistic inference through sampling in generic cortical microcircuits, see (Habenschuss et al., 2013) for details. a) Synaptic weights and other parameters of a simple data-based generic cortical microcircuit model encode a unique stationary distribution of network states. The network state at time t can be defined for example as a binary vector that records which neuron fires in a small time window around t. Generic circuit computation can be viewed as an estimate of a posterior --conditioned on external input e-- through sampling from the posterior distribution of network states. b) Instead of traditional measures for computation time, the time needed to converge from an initial state to the stationary distribution of the network becomes relevant. Shown is a common heuristic estimate (Gelman-Rubin ratio) for the proximity of current sampling to the stationary distribution (when the ratio enters the gray zone below 1.1 one usually says that convergence has taken place). Computer simulations suggest quite fast convergence of estimates of a posterior marginal (solid lines: mean; dashed lines: worst case), independently of network size.

One interesting implication of sampling-based models for probabilistic inference is that one has to re-analyze the concept of the required computing time (response time). A sampling-based model can provide a rough estimate very fast, but needs to look at more samples for a more reliable answer. In addition, the samples that the stochastic system produces initially are still influenced by its initial state.
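The Gelman-Rubin ratio mentioned in the caption of Figure 3 can be computed from several sampling runs that start from different initial states. The following sketch uses the standard textbook form of the statistic (within-run versus between-run variance). The function name and the synthetic Gaussian test data are assumptions for illustration and are not taken from the simulations in (Habenschuss et al., 2013).

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin ratio for an array of shape (n_chains, n_samples).

    Values close to 1 indicate that the runs have become indistinguishable;
    1.1 is the commonly used threshold for declaring convergence.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()        # mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)     # variance of the chain means
    pooled = (n - 1) / n * within + between / n       # pooled variance estimate
    return np.sqrt(pooled / within)

rng = np.random.default_rng(0)
# Four runs that all sample from the same distribution ...
mixed = rng.normal(size=(4, 1000))
# ... and four runs of which two are still stuck near a different value.
stuck = mixed + np.array([[0.0], [0.0], [5.0], [5.0]])

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print("well-mixed runs:", round(r_mixed, 3))
print("stuck runs:     ", round(r_stuck, 3))
```

Applied to a sliding window of samples, the same function yields a convergence curve over time of the kind described in the caption.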
Hence the mixing time of a Markov chain is a key component of the computing time for a stochastic computational model. According to first simulation results this mixing time could lie for data-based models of cortical circuitry within the range of behavioural response times (Fig. 3c). Theoretical results show that in principle every Boltzmann machine can be approximately represented through MCMC sampling of network states in a network of spiking neurons. The term "neural sampling" has been coined for the resulting model of brain computation (Buesing et al., 2011). A suitable structure of cortical microcircuits enables networks of spiking neurons in principle to even go beyond spike-based emulations of Boltzmann machines with symmetric weights (which limits their capability to sample efficiently from distributions with higher than 2nd order dependencies), and to represent (Pecevski et al., 2011) and learn (Pecevski and Maass, 2016) more general types of distributions over many discrete variables in their stationary distribution of network states. Such distributions with higher order dependencies are needed for example to include the "explaining away" effect in probabilistic inference that has been shown to be essential for visual perception (Knill and Kersten, 1991). The hypothesis that the human brain encodes substantial amounts of knowledge in the form of probability distributions has subsequently been explored in cognitive science (Tenenbaum et al., 2011).

Constraint/Principle 4: Network configurations remain transient even in the adult brain
Experimental data show that local network connectivity (Holtmaat and Svoboda, 2009), (Stettler et al., 2006), (Loewenstein et al., 2015) and neural codes (Ziv et al., 2013) are subject to permanently ongoing changes, even in the adult brain.
This constraint forces us to consider the hypothesis that the brain samples not only network states (neural sampling) on the fast time scale of producing a behavioural response, but also network configurations on the slower time scale of learning (synaptic sampling). This hypothesis was recently explored in (Kappel et al., 2015). The resulting model suggests that brain networks sample continuously --through rewiring and synaptic plasticity-- from a stationary distribution of network configurations (Fig. 4). This distribution has most of its probability mass on a low-dimensional submanifold of a very high-dimensional space that is distinguished through structural constraints or priors (e.g., sparse connectivity), and good network performance. The Fokker-Planck equation provides clear links between local stochastic rules for rewiring and synaptic plasticity on one hand, and the resulting stationary distribution of network configurations and parameter settings on the other hand. An immediate benefit of such an alternative organisation of network plasticity is that a network can immediately compensate for internal or external changes, since no clever supervisor is needed to unlock a permanently wandering network configuration from a previously found solution. Instead, such compensation can be viewed as probabilistic inference on the basis of new evidence e, only on the slower time scale of MCMC sampling from network configurations. This model suggests that an approach to understand the structure and organization of brain computations on the basis of compilation of connectivity data and neuron parameters from many individuals faces a systematic problem: These data just represent snapshots from somewhat unrelated dynamic processes in different individuals (Marder and Goaillard, 2006), and one would need to collect and compare their statistics from large numbers of specimens.
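The link between a local stochastic update rule and the stationary distribution it samples from can be made concrete with a one-parameter toy example, given here as a sketch under stated assumptions. A single parameter theta follows a noisy gradient step on the log-posterior (Langevin dynamics with Euler-Maruyama discretization), and its long-run histogram approximates the posterior. The Gaussian prior, the Gaussian likelihood term, the step size, and all names are assumptions for illustration; the model of (Kappel et al., 2015) operates on many synaptic parameters and is not reproduced here.

```python
import numpy as np

def langevin_samples(grad_log_post, theta0, step=0.01, n_steps=100000, seed=0):
    """Sample a scalar parameter with discretized Langevin dynamics.

    Update: theta <- theta + step * d/dtheta log p(theta) + sqrt(2 * step) * noise.
    For small step sizes the stationary distribution approximates p(theta).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n_steps)
    theta = float(theta0)
    trace = np.empty(n_steps)
    for t in range(n_steps):
        theta += step * grad_log_post(theta) + np.sqrt(2.0 * step) * noise[t]
        trace[t] = theta
    return trace

# Assumed toy posterior: prior N(0, 1) times a likelihood term N(2, 0.25).
# Its log-gradient is -(theta - 0) / 1 - (theta - 2) / 0.25 = -5 * theta + 8,
# so the posterior is Gaussian with mean 8 / 5 = 1.6 and variance 1 / 5 = 0.2.
def grad_log_post(theta):
    return -5.0 * theta + 8.0

trace = langevin_samples(grad_log_post, theta0=-3.0)
samples = trace[1000:]                # drop the initial transient
print("sample mean:", round(samples.mean(), 3), "(posterior mean 1.6)")
print("sample std: ", round(samples.std(), 3), "(posterior std about 0.447)")
```

The parameter never settles on a single value. If the likelihood term changes, the same update rule drifts towards the new region of high posterior probability without any separate reset step, which is the compensation property described above.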
Salient invariant properties and rules of neural circuits that are in permanent transition are likely to be encoded more directly in the underlying genetic machinery.

Figure 4: Replacing the traditional maximum likelihood (ML) learning paradigm by sampling from a posterior of network configurations. Upper 3 plots illustrate the common ML learning paradigm for a neural network modelled as a generative model. The goal of learning is to reach a local optimum, which generally depends on the initial state of the learning system. Lower 4 plots show the integration of a prior on generally desirable network configurations and suggest continuous sampling of network configurations from the resulting posterior through synaptic sampling (see (Kappel et al., 2015) for details).

Conclusions
The four constraints/principles for models of brain computation that I have discussed are compatible with each other. And they have to be compatible, since experimental data tell us that they are all --more or less-- present in networks of neurons in the brain. But obviously there are tradeoffs between them from the functional perspective (see e.g. Fig. 12 in (Klampfl and Maass, 2013) for a tradeoff between the first two principles). Hence one may conjecture that the relative weight of each principle is regulated by the brain for each area and developmental stage in a task-dependent manner. Altogether I have argued that the currently available experimental data provide useful guidance for understanding how cognition and behaviour are implemented and regulated by networks of neurons in the brain. A scientific approach that integrates experimental and theoretical neuroscience looks like a straightforward strategy. But it may be seen as contradicting the strategy proposed by (Marr and Poggio, 1976), to distinguish three levels of understanding brain computations:
-- the computational (behavioural) level
-- the algorithmic level
-- the biological implementation level.
This partition, and an emergent focus on the upper two levels, was certainly fruitful a few decades ago, when a good understanding on the computational level and functionally powerful models on the algorithmic level were still missing. However, in view of the tremendous progress since then in algorithm design, theoretical neuroscience, and machine learning, we are now faced with numerous competing models on the algorithmic level that are all attractive from the functional perspective. The currently available experimental data can now help us to select and refine these models on the basis of constraints from the biological implementation level.

Acknowledgements: I would like to thank David Kappel, Robert Legenstein, and Zhaofei Yu for helpful comments. This review was written with partial support by the European Union project #604102.

References
Aarts, Emile, and Jan Korst. Simulated Annealing and Boltzmann machines. (1988).
Andrieu, C., De Freitas, N., Doucet, A., & Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1-2), 5
Avermann, M., Tomm, C., Mateo, C., Gerstner, W., & Petersen, C. C. (2012). Microcircuits of excitatory and inhibitory neurons in layer 2/3 of mouse barrel cortex. Journal of Neurophysiology, 107(11), 3116-3134.
Bathellier, B., Ushakova, L., Rumpel, S. Discrete neocortical dynamics predict behavioral categorization of sounds. Neuron 2012; 76:435-449.
(*) Berkes, P., Orbán, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013), 83-87. Introduced a new paradigm for the analysis of experimental data that aims at elucidating the probability distribution of network states.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
T. Branco and K. Staras. The probability of neurotransmitter release: variability and feedback control at single synapses. Nature Reviews Neuroscience, 10(5):373-383, 2009.
(*) L. Buesing, J. Bill, B. Nessler, and W. Maass. Neural dynamics as sampling: A model for stochastic computation in recurrent networks of spiking neurons. PLoS Computational Biology, 7(11):e1002211, 2011. Introduced a mathematically rigorous framework for stochastic computation with spiking neurons through neural sampling.
D. Buonomano and W. Maass. State-dependent computations: Spatiotemporal processing in cortical networks. Nature Reviews Neuroscience, 10(2):113-125, 2009.
(*) Buzsáki, G. (2010). Neural syntax: cell assemblies, synapsembles, and readers. Neuron, 68(3), 362-385. Reviews experimental data on assemblies and assembly sequences, and relates them to computational models in cognitive science.
J. L. Chen, S. Carta, J. Soldado-Magraner, B. L. Schneider, and F. Helmchen. Behaviour-dependent recruitment of long-range projection neurons in somatosensory cortex. Nature 499:336–340, 2013.
C. Dupre and R. Yuste. Calcium imaging reveals multiple conduction systems in hydra. Abstract 94.21, Annual Meeting of the Society for Neuroscience, 2015.
(*) Fujisawa, S., Amarasingham, A., Harrison, M. T., & Buzsáki, G. (2008). Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. Nature Neuroscience, 11(7), 823-833. Shows that assembly sequences encode learned behaviour in PFC.
Gupta, A., Wang, Y., & Markram, H. (2000). Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex. Science, 287(5451), 273-278.
(*) S. Habenschuss, Z. Jonke, and W. Maass. Stochastic computations in cortical microcircuit models. PLoS Computational Biology, 9(11):e1003311, 2013. Proposed a model for stochastic computation in data-based cortical microcircuit models.
(*) Harvey, C. D., Coen, P., & Tank, D. W. (2012). Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature, 484(7392), 62-68. Demonstrated the significance of assembly sequences for learning behaviours.
S. Haeusler and W. Maass. A statistical analysis of information processing properties of lamina-specific cortical microcircuit models. Cerebral Cortex, 17(1):149-162, 2007.
K. D. Harris and G. M. G. Shepherd. The neocortical circuit: themes and variations. Nature Neuroscience, 18(2):170-181, 2015.
D. O. Hebb. The Organization of Behavior. New York: Wiley; 1949.
G. M. Hoerzer, R. Legenstein, and W. Maass. Emergence of complex computational structures from chaotic neural networks through reward-modulated Hebbian learning. Cerebral Cortex, 24:677-690, 2014.
Holtmaat, A., & Svoboda, K. (2009). Experience-dependent structural synaptic plasticity in the mammalian brain. Nature Reviews Neuroscience, 10(9), 647-658.
(*) Jaeger, H. The "echo state" approach to analysing and training recurrent neural networks. GMD Report 148, GMD - German National Research Institute for Computer Science (2001). Introduced the echo state network paradigm.
Jaeger, H., & Haas, H. (2004). Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science, 304(5667), 78-80. Examined an artificial neural network model (echo state networks) that had been introduced simultaneously with the liquid computing model.
(*) Jezek, K., Henriksen, E. J., Treves, A., Moser, E. I., & Moser, M. B. (2011). Theta-paced flickering between place-cell maps in the hippocampus. Nature, 478(7368), 246-249. Demonstrates sampling from different network states that represent alternative interpretations of sensory cues.
(*) D. Kappel, S. Habenschuss, R. Legenstein, and W. Maass. Network plasticity as Bayesian inference. PLoS Computational Biology, 11(11):e1004485, 2015. Introduced a new learning paradigm based on transient network configurations and immediate compensation for disturbances.
E. T. Kavalali. The mechanisms and functions of spontaneous neurotransmitter release. Nature Reviews Neuroscience, 16(1):5-16, 2015.
(*) Kato, S., Kaplan, H. S., Schrödel, T., Skora, S., Lindsay, T. H., Yemini, E., ... & Zimmer, M. (2015). Global brain dynamics embed the motor command sequence of Caenorhabditis elegans. Cell, 163(3), 656-669. Elucidated the perspective of a brain as a dynamical system.
(*) S. Klampfl and W. Maass. Emergence of dynamic memory traces in cortical microcircuit models through STDP. The Journal of Neuroscience, 33(28):11515-11529, 2013. Demonstrated emergence of assembly sequences through STDP and explored computational implications.
S. Klampfl, S. V. David, P. Yin, S. A. Shamma, and W. Maass. A quantitative analysis of information about past and present stimuli encoded by spikes of A1 neurons. Journal of Neurophysiology, 108:1366-1380, 2012.
Knill, D. C., & Kersten, D. (1991). Apparent surface curvature affects lightness perception. Nature, 351(6323), 228-230.
R. Legenstein and W. Maass. Edge of chaos and prediction of computational performance for neural circuit models. Neural Networks, 20(3):323-334, 2007.
Leopold, D. A., & Logothetis, N. K. (1999). Multistable phenomena: changing views in perception. Trends in Cognitive Sciences, 3(7), 254-264.
Loewenstein, Y., Yanover, U., & Rumpel, S. (2015). Predicting the dynamics of network connectivity in the neocortex. The Journal of Neuroscience, 35(36), 12535-12544.
(*) Luczak, A., Barthó, P., & Harris, K. D. (2009). Spontaneous events outline the realm of possible sensory responses in neocortical populations. Neuron, 62(3), 413-425. Focused attention on data which suggest that network activity during behaviour emerges from a low-dimensional subspace that is also explored during spontaneous activity.
(*) W. Maass, T. Natschlaeger, and H. Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14(11):2531-2560, 2002. Introduced the liquid computing paradigm, and provided theoretical support for benefits of having heterogeneous network components.
(*) W. Maass, T. Natschlaeger, and H. Markram. Fading memory and kernel properties of generic cortical microcircuit models. Journal of Physiology -- Paris, 98(4-6):315-330, 2004. Explained the kernel property, role of high dimensionality, and multiplexing in the liquid computing model.
(*) W. Maass, P. Joshi, and E. D. Sontag. Computational aspects of feedback in neural circuits. PLoS Computational Biology, 3(1):e165, 2007. Provided theoretical results on the computational power of readouts with feedback in the liquid computing model.
W. Maass. Noise as a resource for computation and learning in networks of spiking neurons. Special Issue of the Proc. of the IEEE on "Engineering Intelligent Electronic Systems based on Computational Neuroscience", 102(5):860-880, 2014.
V. Mante, D. Sussillo, K. V. Shenoy, and W. T. Newsome. Context-dependent computation by recurrent dynamics in prefrontal cortex. Nature 503:78–84, 2013.
(**) Marder, E., & Goaillard, J. M. (2006). Variability, compensation and homeostasis in neuron and network function. Nature Reviews Neuroscience, 7(7), 563-574. Provided experimental data that point to an ongoing variability of network configurations and parameters.
Marr, David, and Tomaso Poggio. "From understanding computation to understanding neural circuitry." (1976). MIT Tech Report.
(*) Miller, J. E. K., Ayzenshtat, I., Carrillo-Reid, L., & Yuste, R. (2014). Visual stimuli recruit intrinsically generated cortical ensembles. Proceedings of the National Academy of Sciences, 111(38), E4053-E4061. Demonstrated how network activity in area V1 differs from predictions of traditional models based on feedforward processing (filter banks).
D. Nikolic, S. Haeusler, W. Singer, and W. Maass. Distributed fading memory for stimulus properties in the primary visual cortex. PLoS Biology, 7(12):1-19, 2009.
D. Pecevski and W. Maass. Learning probabilistic inference through STDP, 2016. Under review.
C. Pokorny, G. Griesbacher, W. Maass (2016). Emergence of assembly sequences through STDP in data-based cortical microcircuit models (in preparation).
Portugues, R., Feierstein, C. E., Engert, F., & Orger, M. B. (2014). Whole-brain activity maps reveal stereotyped, distributed networks for visuomotor behavior. Neuron, 81(6), 1328-1343.
Raichle, M. E. (2015). The restless brain: how intrinsic activity organizes brain function. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 370(1668), 20140172.
E. L. Rich, J. D. Wallis. Competing neural representations of choice alternatives in orbitofrontal cortex during value-based decisions. Abstract 20.06 of the Annual Meeting of the Society for Neuroscience, 2015.
M. Rigotti, O. Barak, M. R. Warden, X. J. Wang, N. D. Daw, E. K. Miller, and S. Fusi. The importance of mixed selectivity in complex cognitive tasks. Nature 497:585–590, 2013.
Sadovsky, A. J., & MacLean, J. N. (2014). Mouse visual neocortex supports multiple stereotyped patterns of microcircuit activity. The Journal of Neuroscience, 34(23), 7769-7777.
(*) W. Singer. Cortical dynamics revisited. Trends in Cognitive Sciences, 17(12):616-626, 2013. Emphasizes the need for new models for network computation.
Stettler, D. D., Yamahachi, H., Li, W., Denk, W., & Gilbert, C. D. (2006). Axons and synaptic boutons are highly dynamic in adult visual cortex. Neuron, 49(6), 877-887.
Stokes, M. G. (2015). 'Activity-silent' working memory in prefrontal cortex: a dynamic coding framework. Trends in Cognitive Sciences, 19(7), 394-405.
Sussillo, D., & Abbott, L. F. (2009). Generating coherent patterns of activity from chaotic neural networks. Neuron, 63(4), 544-557.
(*) Tenenbaum, J. B., Kemp, C., Griffiths, T. L., & Goodman, N. D. (2011). How to grow a mind: statistics, structure, and abstraction. Science (New York, NY), 331(6022), 1279-1285. Highlights the role of probabilistic computation and internal representations of probability distributions in the human brain.
(*) Yuste, R. (2015). On testing neural network models. Nature Reviews Neuroscience, 16(12), 767-767. Reviews the paradigm shift from single neuron data and models to network data and models.
(*) Y. Ziv, L. D. Burns, E. D. Cocker, E. O. Hamel, K. K. Ghosh, L. J. Kitch, A. E. Gamal and M. J. Schnitzer. Long-term dynamics of CA1 hippocampal place codes. Nature Neuroscience 16: 264–266, 2013. Demonstrates that neural codes of place cells drift over time.