NETADIS Research Project Overview

The first list below gives the academic (public sector) partner institutions, called “beneficiaries”, that will host the NETADIS research projects. The second list outlines for each ESR (early stage researcher, i.e. PhD student) the scientific content envisaged for the relevant research project. Candidates interested in applying for a NETADIS studentship should contact the person listed at the relevant host institution, both for queries about the scientific content and for details of the application procedure for that institution.

NETADIS Host Institutions

Beneficiary number | Beneficiary name | Beneficiary short name | City & Country | Contact person
1 (Coordinator) | King's College London | KCL | London, U.K. | Prof. Peter Sollich [email protected]
2a | CNRS – Ecole Normale Supérieure | ENS | Paris, France | Prof. Guilhem Semerjian [email protected]
2b | CNRS – Université Paris-Sud | Orsay | Orsay, France | Prof. Silvio Franz [email protected]
3 | Technische Universität Berlin | TUB | Berlin, Germany | Prof. Manfred Opper [email protected]
4 | Politecnico di Torino | Torino | Torino, Italy | Prof. Riccardo Zecchina [email protected]
5 | CNR – Università degli Studi di Roma "La Sapienza" | Rome | Roma, Italy | Dr. Luca Leuzzi [email protected]
6 | Norges Teknisk-Naturvitenskapelige Universitet | NTNU | Trondheim, Norway | Dr. Yasser Roudi [email protected]
7 | Kungliga Tekniska Högskolan | KTH | Stockholm, Sweden | Prof. Erik Aurell [email protected]
8 | The Abdus Salam International Centre for Theoretical Physics | ICTP | Trieste, Italy (International organisation) | Dr. Matteo Marsili [email protected]

NETADIS Research Projects

Position: ESR1 (KCL-1)
Title: Sub-network analysis using projection methods
Research Objectives: Even for well-studied protein interaction networks in systems biology, e.g.
the ones underlying the operation of important signalling receptors such as epidermal growth factor receptor (EGFR), which plays a key role in cancer, much uncertainty remains in the identification of the network of molecular ingredients and pathways: we cannot assume that the entire network is known. Even if it were, analysing the dynamics mathematically requires restricting attention to a small enough sub-network. This project will study what can be said about the dynamics on the known part of the network in such a setting.
Description of work and research methodology: The project will exploit projection techniques from statistical physics, which were developed for the similar purpose of narrowing the description of a large system down to a smaller set of observable quantities. Once qualitative insights have been established, including e.g. the response to perturbations that can be mediated via the larger network, the ESR will explore the reverse direction: can statistical inference be used to infer from observations of the sub-network dynamics something about the unknown network parts?
Planning: The generic form of the projected equations of motion may be too complicated to work with. In this case techniques inspired by machine learning, e.g. expectation propagation, could be used to approximate further. Additional support for the IRP will arise from strong complementarities with research projects at TUB, but also from ongoing work at Torino, on determining maximally informative drug combination experiments for inferring signalling networks.
Position: ESR2 (KCL-2)
Title: Contagion dynamics across credit networks
Research Objectives: This project will integrate credit derivatives and liquidity dynamics as crucially relevant degrees of freedom into interacting models of systemic risk across credit networks (see “Finance and socio-economic systems” above).
Studying systemic risk created by credit default swaps (CDS) requires extending existing dynamical models of counter-party risk to include the three-vertex interactions generated by CDS contracts, while including effects of liquidity dynamics requires formulating feedback via additional degrees of freedom. Models will initially be studied in a schematic stochastic setting using generating functional methods. More realistic networks of dependencies will be investigated once inference results on financial networks from the ICTP node, and further insights from network inference in non-equilibrium states (NTNU), become available. Implications for the control and regulation of financial networks to optimize their resilience will be explored.
Description of work and research methodology: The project will use generating functional methods in schematic stochastic settings to analyse contagion and liquidity dynamics. Work will proceed from large-connectivity networks (where statistical limit theorems help to simplify the analysis) to low-connectivity/large-heterogeneity situations. Analytic approaches will be complemented by simulation studies, which are expected to play a larger role when studying micro-realistic networks of financial dependencies.
Planning: For networks with high degrees of heterogeneity the resulting macroscopic equations of motion may become too complex to solve analytically. In such situations an intermediate approach pioneered by Eissfeller and Opper, which allows the analytically derived macroscopic equations of motion to be investigated with simulation methods, will be employed. Finally, if all else fails, relying to a larger extent on numerical simulations of microscopic models is always a safe fall-back position that can be used to explore the systems that we propose to study.
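To give a flavour of what a numerical simulation of a microscopic contagion model might look like, the following is a minimal sketch only. It assumes a toy rule that is not taken from the project description: a bank defaults once its losses on defaulted counterparties reach its capital buffer. The function name, the example banks and all numbers are hypothetical, and the sketch covers pairwise exposures only, not the three-vertex CDS interactions or liquidity feedback that the project targets.

```python
def default_cascade(exposures, capital, initially_defaulted):
    """Propagate defaults until no further bank fails (toy threshold rule).

    exposures[i][j] is the loss bank i suffers if counterparty j defaults.
    capital[i] is the loss-absorbing buffer of bank i.
    Both the rule and the data layout are illustrative assumptions.
    """
    defaulted = set(initially_defaulted)
    changed = True
    while changed:
        changed = False
        for bank, buffer in capital.items():
            if bank in defaulted:
                continue
            # Total loss from counterparties that have already defaulted.
            loss = sum(
                amount
                for counterparty, amount in exposures.get(bank, {}).items()
                if counterparty in defaulted
            )
            if loss >= buffer:
                defaulted.add(bank)
                changed = True
    return defaulted


# Hypothetical four-bank network: B is exposed to A, C to A and B, D to C.
EXPOSURES = {"B": {"A": 5.0}, "C": {"A": 1.0, "B": 3.0}, "D": {"C": 1.0}}
CAPITAL = {"A": 2.0, "B": 4.0, "C": 4.0, "D": 2.0}

if __name__ == "__main__":
    print(sorted(default_cascade(EXPOSURES, CAPITAL, {"A"})))
```

In this toy network the failure of A spreads to B and then C, while D's buffer absorbs its loss; a failure of D, to which no other bank is exposed, does not spread at all.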
Position: ESR3 (ENS)
Title: Epidemic processes on networks, viral marketing and optimal vaccination
Research Objectives: This project will study extremal properties of processes defined on networks. Consider e.g. an epidemic model where contaminated nodes can propagate an illness to their neighbours. A well-studied problem concerns the determination of the threshold p_c for the percolation of contamination when, in the initial configuration, each node is ill or not independently with some probability p. An optimization version of this problem, for which much less is known, corresponds to determining the minimal fraction of nodes that has to be activated in order to trigger an avalanche of contamination. As there is now freedom in the choice of the initial set of nodes, in general this threshold is smaller than the typical one, p_c. Finding efficient algorithms for determining the optimal set to be contaminated has crucial applications for viral marketing, where the epidemic models the adoption of a new product by customers. A dual optimization problem consists in finding the minimal set of nodes to vaccinate in order to block the propagation of the epidemic. The objectives are:
1) to find analytical expressions for these optimized thresholds on random graph models with simple contagion rules;
2) to develop algorithms able to find a close-to-optimal (in terms of size) set of initially infected/vaccinated nodes, valid for any single graph;
3) to extend these results to more complicated contamination rules.
Description of work and research methodology: The first part of the work will consist in reformulating the above questions on the final state of a dynamical process as a static optimization problem, where the degrees of freedom are the nodes to be initially infected. The latter will then be tackled with the powerful analytical (replica and cavity method) and algorithmic (message-passing) tools that have been developed for optimization problems on networks.
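As a toy illustration of the optimization problem described in the Research Objectives, the sketch below assumes one particular simple contagion rule (a node becomes contaminated once at least `threshold` of its neighbours are contaminated) and finds a smallest seed set by brute force. The rule, the function names and the small grid graph are assumptions made for illustration. Exhaustive search is feasible only for tiny graphs, which is why the project aims at analytical and message-passing tools instead.

```python
from itertools import combinations


def final_contaminated(neighbours, seeds, threshold):
    """Run the threshold contagion rule to its fixed point."""
    contaminated = set(seeds)
    while True:
        # Healthy nodes with enough contaminated neighbours fall ill.
        newly = {
            node
            for node, nbrs in neighbours.items()
            if node not in contaminated
            and sum(n in contaminated for n in nbrs) >= threshold
        }
        if not newly:
            return contaminated
        contaminated |= newly


def minimal_seed_set(neighbours, threshold):
    """Smallest initial set whose avalanche reaches every node (brute force)."""
    nodes = sorted(neighbours)
    for size in range(len(nodes) + 1):
        for seeds in combinations(nodes, size):
            if len(final_contaminated(neighbours, seeds, threshold)) == len(nodes):
                return set(seeds)
    return None


# Hypothetical 2 x 3 grid graph:  0-1-2
#                                 | | |
#                                 3-4-5
GRID = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5], 3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}

if __name__ == "__main__":
    print(final_contaminated(GRID, {0, 4}, threshold=2))
    print(minimal_seed_set(GRID, threshold=2))
```

The search treats the final state of the dynamics as a function of the initial set alone, which is the static reformulation the description refers to: the degrees of freedom are the nodes chosen to be initially infected.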
Planning: In case the intended research methodology runs into trouble it would be possible 1) to rely on exhaustive numerical simulations in order to unveil the scaling laws governing the neighbourhood of the transition, 2) to derive bounds on the optimal fraction of infected nodes via probabilistic methods.
Position: ESR4 (Orsay-1)
Title: Decentralized network control and optimization
Research Objectives: The objective of this project is to leverage tools from the realms of statistical physics, stochastic processes and Bayesian statistics to develop a powerful methodology for decentralized and efficient network control and optimization and to achieve fundamental advances in computing and communication networks. Such methods are already having substantial impact on computer science, coding theory and hard computational problems. Networks consisting of multiple heterogeneous devices inherently resemble a statistical physics disordered system in their complex structure, with randomness of interactions at a microscopic level, and unpredictability in the macroscopic outcome of these interactions.
Description of work and research methodology: Statistical physics methods, such as the cavity method and message passing algorithms, together with theoretical and computational tools from disordered systems and random matrix theory, will shed new light on the design of self-managed algorithms and will enable novel applications that depend on scheduling and routing, resource allocation, decentralized content distribution, inference and decision making amidst uncertainty.
Planning: Message passing algorithms are extremely effective for solving problems on networks without loops. In the presence of loops, as in real-world networks, there is no guarantee of convergence to exact solutions. In case of lack of convergence, other strategies combining analytic methods and numerical simulations will be used to get good approximations.
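The statement that message passing works well on loop-free networks can be checked on a small example. The sketch below assumes an Ising-type chain with hypothetical fields and couplings (none of these values or function names come from the project description). It computes single-spin marginals by sum-product message passing and compares them with brute-force enumeration. On a chain, which has no loops, the two agree up to rounding.

```python
import itertools
import math

SPINS = (-1, 1)


def chain_marginals_bp(fields, couplings):
    """P(s_i = +1) on an open chain via forward/backward messages.

    Assumed model: P(s) proportional to
    exp(sum_i fields[i]*s_i + sum_i couplings[i]*s_i*s_{i+1}).
    """
    n = len(fields)
    fwd = [{s: 1.0 for s in SPINS} for _ in range(n)]  # messages from the left
    bwd = [{s: 1.0 for s in SPINS} for _ in range(n)]  # messages from the right
    for i in range(n - 1):
        for t in SPINS:
            fwd[i + 1][t] = sum(
                fwd[i][s] * math.exp(fields[i] * s + couplings[i] * s * t)
                for s in SPINS
            )
    for i in range(n - 2, -1, -1):
        for s in SPINS:
            bwd[i][s] = sum(
                bwd[i + 1][t] * math.exp(fields[i + 1] * t + couplings[i] * s * t)
                for t in SPINS
            )
    marginals = []
    for i in range(n):
        weights = {s: fwd[i][s] * math.exp(fields[i] * s) * bwd[i][s] for s in SPINS}
        marginals.append(weights[1] / sum(weights.values()))
    return marginals


def chain_marginals_exact(fields, couplings):
    """Same marginals by summing over all 2**n configurations."""
    n = len(fields)
    up = [0.0] * n
    z = 0.0
    for config in itertools.product(SPINS, repeat=n):
        log_weight = sum(h * s for h, s in zip(fields, config))
        log_weight += sum(j * config[i] * config[i + 1] for i, j in enumerate(couplings))
        w = math.exp(log_weight)
        z += w
        for i, s in enumerate(config):
            if s == 1:
                up[i] += w
    return [u / z for u in up]


if __name__ == "__main__":
    fields, couplings = [0.3, -0.2, 0.1, 0.5], [0.8, -0.4, 0.6]
    print(chain_marginals_bp(fields, couplings))
    print(chain_marginals_exact(fields, couplings))
```

The message-passing version needs a number of operations linear in the chain length, whereas the enumeration grows as 2^n. On graphs with loops the same local updates can still be iterated, but, as noted above, without a guarantee of convergence to the exact marginals.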
Position: ESR5 (Orsay-2)
Title: Inference of gene regulation networks by comparative genome analysis
Research Objectives: Regulatory DNA regions that control the expression of a gene can be located kilobases away from it, especially in eukaryotes. Furthermore, binding sites for transcription factors are not necessarily strong and easily recognizable. The signal in the genome is often encoded in clusters of relatively weak binding sites. The kinetics of the genome exploration and the search for the binding sites by the transcription factors are also quite nontrivial. All these problems call for the development of new computational tools and theoretical models.
Description of work and research methodology: In this project we will exploit ideas borrowed from information theory and statistical physics, including message passing, optimization theory, genetic algorithms etc. This work will be enriched by comparative analyses of the many genomes that are now sequenced. This opens up challenging problems such as graph alignment and inference of evolutionary scenarios, creating connections also with population evolutionary genetics where statistical physicists have traditionally been active.
Planning: If the number of genomes in the databases is too low, there is a risk of insufficient predictive power; we thus might be limited in our choice of regulatory subsystems. We expect that simultaneous use of transcriptome data will allow us to overcome this potential difficulty.
Position: ESR6 (TUB)
Title: Approximate inference for stochastic dynamics in large biochemical networks
Research Objectives: Stochastic dynamical models describing biochemical networks on a molecular level could be used in order to decide which molecular pathway or network structure is more likely to describe a certain biological function in light of a limited amount of experimental data.
A Bayesian inference approach in which all unobserved quantities (reaction parameters, states) conditioned on the observations are treated as random variables would provide the necessary likelihood of a given model structure. Unfortunately, for large networks, such an inference approach is computationally infeasible. The project will develop efficient approximations to this task.
Description of work and research methodology: We will pursue a combination of dynamical functional methods of statistical physics (to average over parameters) with approximate inference methods for the resulting intractable model. Approximations based on machine learning methods (variational techniques, expectation propagation) will be combined with ideas from statistical physics such as variational perturbation theory.
Planning: For very large systems, a full variational treatment could become too time-consuming. Simpler approximations of saddle-point type could provide tractable alternatives. Additional support for the IRP will come from IRPs on non-equilibrium statistical physics methods at NTNU and KTH.
Position: ESR7 (Torino)
Title: Directed and undirected network inference by message passing and applications to gene regulation
Research Objectives: This project deals with inference of gene regulation networks. Major challenges are (i) sparse-model learning, since most regulatory networks are sparse, with one variable being directly influenced only by a small set of other variables, (ii) combinatorial inference, since the regulation of one variable by others may include nontrivial combinatorial effects in combining the single regulators, (iii) the handling of missing variables, i.e. of components of the system which are not determined by the measured data. Given the large size of real biological networks, approximate algorithmic approaches are needed; exact algorithms for network reconstruction are restricted to very small problems.
Recent advances in the statistical physics of disordered systems have led to initial promising results. This project will exploit these for the full benefit of applications, here specifically inference of signal-transduction networks in cancer cell lines from multiple-perturbation data, and of residue-contact networks in proteins and protein complexes.
Description of work and research methodology: We plan to apply techniques developed within the framework of the physics of disordered systems (with specific emphasis on the cavity methods) to study network reconstruction problems. In particular, the work will be focused on the development of distributed algorithms (design of parallel codes) suitable to effectively study large-scale problems. A thorough investigation of the computational efficiency of the devised algorithms will be performed.
Planning: In order to minimize the risk connected to focusing on a single approach, which might lead to limitations in the treatable instances, we plan to pursue three approaches in parallel. In addition to the one described above, linear programming and Monte Carlo methods will be considered. This in turn will allow for comparisons among the results obtained with the three techniques and for an integrated approach, useful to overcome the intrinsic weaknesses of each method.
Position: ESR8 (Rome-1)
Title: Inference of regulatory controls in biochemical reaction networks
Research Objectives: This project will study inference of regulatory controls in biochemical reaction networks, most notably in genome-scale reconstructions of cellular metabolism (where the interacting units are genes coding for reaction-catalyzing enzymes). This requires overcoming limitations of the available techniques, primarily message passing algorithms, to achieve (a) scalability to genome-scale graphs and (b) good performance on loopy graphs: biochemical networks are usually rich in loops, which hinder the convergence of message passing.
An important byproduct would lie in defining more robust and biologically sound criteria for gene essentiality in metabolic networks, by combining topological and dynamical aspects (both already accounted for in the literature) with the regulatory element. Good essentiality predictors are key not only for pharmacological applications but also for effective and scalable algorithms for the analysis of epistasis, i.e. interaction between genes that, when masked at the phenotypic level, can hinder the discovery of pathologies.
Description of work and research methodology: The technical toolbox required for this project lies essentially in message-passing algorithms (for the analysis of topological properties of large networks), models for flux- and energy-balance analysis (to predict reaction rates and chemical potentials in genome-scale networks), and Gillespie-like algorithms (to simulate the stochastic dynamics of small reaction modules). Continuous cross-reference to genomic, proteomic and metabolic databases will be crucial, since the project will focus strongly on the analysis of real cellular biochemical networks.
Planning: The type of message-passing algorithms that are most effective on random graphs (e.g. belief propagation) may not converge on the loopy architecture of real metabolic networks. Less ambitious alternatives (e.g. warning propagation) are however known to be excellent substitutes in such cases, at the price of slightly more CPU time.
Position: ESR9 (Rome-2)
Title: Inference of coupling of waves in nonlinear disordered media
Research Objectives: In random lasers, interactions among competing modes depend on the mutual spatial overlap of their electromagnetic fields modulated by a non-linear susceptibility.
So far, localized mode distributions and nonlinear susceptibilities have never been successfully recovered from the analysis of the measurements of random laser spectra; this is a fundamental goal both conceptually (with the aim of experimentally testing theories of disordered systems that are in widespread use) and technologically (e.g., to determine the tolerance threshold to random impurities in photonic crystals). Different statistical mechanics methodologies will be adopted in this investigation: approaches to disordered systems (cavity method, random graphs, replica symmetry breaking), Monte Carlo simulation algorithms, and computational techniques for the inverse problem of reconstructing the mode network from the analysis of experimental measurements of phases and intensities in random laser systems.
Description of work and research methodology: Cavity method, random graph theory and replica symmetry breaking theory approaches to disordered systems; Monte Carlo simulation algorithms for equilibrium and out-of-equilibrium systems. Finite-size scaling techniques for data analysis and quantitative estimates of finite-size effects. Computational techniques for the inverse problem of reconstructing the mode network from the analysis of experimental measurements of phases and intensities in random laser systems.
Planning: The step of reconstructing the mode interactions by means of statistical inference strongly depends on the experimental data available at the time the ESR will have acquired the skill to perform such analysis. Where those data are too scarce, the direct numerical simulation of networks with microscopic features taken from the behaviour of experimentally studied compounds under laser pumping would be an alternative goal in its own right and lead to a satisfactory modeling of experimental properties, e.g. to determine the tolerance threshold to random impurities in photonic crystals or test theories of disordered systems.
Position: ESR10 (NTNU)
Title: Efficient inference of interactions from non-equilibrium data and applications to multi-electrode neural recording
Research Objectives: With the advent of neural multi-electrode arrays and gene microarrays, we are now entering the stage where the data required for reverse engineering large neuronal and genetic networks are becoming available. Current techniques for this reverse engineering, however, are mostly restricted to equilibrium/static models that usually do not give meaningful results, and/or cannot exploit all the information available in the data, e.g. temporal patterns. Here, we will first develop inference methods for kinetic models, taking into account crucial features of biological networks and, importantly, limitations of real data, e.g. noise and experimental access to only small parts of a network at a time. Second, we will apply these methods to multi-neural data collected at the Kavli Institute to understand the circuitry that underlies spatial navigation in mammals.
Description of work and research methodology: We will construct efficient inference methods for fitting models such as the Generalized Linear Model and kinetic Ising networks with memory. For this we will use methods from non-equilibrium statistical physics, mainly the generating functional method and mean-field methods for network reconstruction recently developed for simpler models (Roudi & Hertz 2011). Using these methods on data from the hippocampus and entorhinal cortex, we will find how different neuronal types in the mammalian navigation system interact with each other. Various cell types have recently been discovered in these areas, e.g. grid and border cells, but the way they interact and work together is unknown. Using the methods that we develop, we aim to understand the microcircuitry that these neurons form.
Planning: In real life one can only observe a part of the network (hundreds of cells out of millions in a local cortical network), and even this is noisy.
To compensate for this, we will use Bayesian inference and regularization methods, taking into account prior knowledge, e.g. graph sparsity and known connections. Significant complementarities with network reconstruction work in Berlin and Torino will help in implementing these Bayesian techniques.
Position: ESR11 (KTH)
Title: Cavity method for non-equilibrium states
Research Objectives: The stationary probability distributions of dynamics on graphs are important in many areas, ranging from distributed information systems to chemical reactions in discrete geometries. In physical processes obeying detailed balance these stationary distributions are of known Boltzmann-Gibbs form, but in general no such universal description exists. Approximate methods for calculations of (marginals of) equilibrium distributions by Belief Propagation (BP), closely linked to the cavity method of statistical physics, have therefore attracted considerable attention, and underlie very important technical systems (iterative decoding). This project will study to what extent BP can be generalized to the description of non-equilibrium states, and applications thereof.
Description of work and research methodology: The project will develop dynamic BP methods, delineate their ranges of applicability in theoretically well-understood physical models, and quantitatively compare them to high-temperature expansions known as naïve mean-field and dynamic TAP. Once foundational insights have been established the ESR will explore applications to models of disease spreading, to communication systems, or to other areas where non-equilibrium stationary states appear naturally and can be addressed by dynamic BP.
Planning: Dynamic BP might turn out not to be convenient beyond the synchronously updated physical models which have been sketched in the literature. In this case the project has to be limited to such synchronously updated models, which instead will be investigated in more depth.
Convenient applications may be hard to develop independently. We will here leverage the strong complementarities with research projects Orsay-1, ENS, ICTP, KCL-2 & NTNU.
Position: ESR12 (ICTP)
Title: Inference in finance and socio-economic networks
Research Objectives: The project will address the effects of interactions in financial and socio-economic systems, by specifically focusing on the following topics:
i) optimal portfolio management and optimal execution strategies in models of illiquid financial markets;
ii) statistical physics approach to systemic financial risk: stability of risk neutral measures, reconstruction of interbank exposure matrices with message passing techniques and stress tests;
iii) inference of dynamic models of ensembles of prices with modern techniques of statistical learning (e.g. Boltzmann learning);
iv) application of inference techniques for the reconstruction of social networks or behavioural patterns from incomplete data.
We expect strong interaction with the financial partner involved, CFM, and possibly with Medialab on the network analysis of scientific communities.
Description of work and research methodology: This project involves rather sophisticated techniques from the theory of disordered systems in statistical mechanics (e.g. replica method, message passing techniques). Part of the objectives will be addressed by numerical simulation methods. Empirical analysis of financial and network data will also be performed.
Planning: The project foresees a range of subjects which is broader than the scope of a typical PhD thesis. This is intended to allow for a minimal degree of independence of the ESR to orientate his/her work in a particular direction, and to stimulate a critical attitude towards research. So we expect that the ESR will focus only on some of the subjects above, while others may be developed in collaboration with other partners' ESRs through secondments.