NETADIS Research Project Overview
The first list below gives the academic (public sector) partner institutions, called “beneficiaries”, that will host the
NETADIS research projects. The second list outlines for each ESR (early stage researcher, i.e. PhD student) the
scientific content envisaged for the relevant research project.
Candidates interested in applying for a NETADIS studentship should contact the person listed at the relevant host
institution, both for queries about the scientific content and for details of the application procedure for that institution.
NETADIS Host Institutions
Beneficiary number | Beneficiary name | Beneficiary short name | City & Country | Contact person
1 (Coordinator) | King's College London | KCL | London, U.K. | Prof. Peter Sollich, [email protected]
2a | CNRS – Ecole Normale Supérieure | ENS | Paris, France | Prof. Guilhem Semerjian, [email protected]
2b | CNRS – Université Paris-Sud | Orsay | Orsay, France | Prof. Silvio Franz, [email protected]
3 | Technische Universität Berlin | TUB | Berlin, Germany | Prof. Manfred Opper, [email protected]
4 | Politecnico di Torino | Torino | Torino, Italy | Prof. Riccardo Zecchina, [email protected]
5 | CNR – Università degli Studi di Roma "La Sapienza" | Rome | Roma, Italy | Dr. Luca Leuzzi, [email protected]
6 | Norges Teknisk-Naturvitenskapelige Universitet | NTNU | Trondheim, Norway | Dr. Yasser Roudi, [email protected]
7 | Kungliga Tekniska Högskolan | KTH | Stockholm, Sweden | Prof. Erik Aurell, [email protected]
8 | The Abdus Salam International Centre for Theoretical Physics | ICTP | Trieste, Italy (International organisation) | Dr. Matteo Marsili, [email protected]
NETADIS Research Projects
Position: ESR1 (KCL-1)
Title: Sub-network analysis using projection methods
Research Objectives: Even for well-studied protein interaction networks in systems biology, e.g. the ones underlying
the operation of important signalling receptors such as epidermal growth factor receptor (EGFR), which plays a key
role in cancer, much uncertainty remains in the identification of the network of molecular ingredients and pathways:
we cannot assume that the entire network is known. Even if it were, analysing the dynamics mathematically requires
restricting attention to a small enough sub-network. This project will study what can be said about the dynamics on the
known part of the network in such a setting.
Description of work and research methodology: The project will exploit projection techniques from statistical
physics, which were developed for a similar purpose: reducing the description of a large system to a smaller set of observable quantities. Once
qualitative insights have been established, including e.g. the response to perturbations that can be mediated via the
larger network, the ESR will explore the reverse direction: can statistical inference be used to infer from observations
of the sub-network dynamics something about the unknown network parts?
Planning: The generic form of the projected equations of motion may be too complicated to work with. In this case
techniques inspired by machine learning, e.g. expectation propagation, could be used to approximate further.
Additional support for the IRP will arise from strong complementarities with research projects at TUB, but also from
ongoing work at Torino, on determining maximally informative drug combination experiments for inferring signalling
networks.
Position: ESR2 (KCL-2)
Title: Contagion dynamics across credit networks
Research Objectives: This project will integrate credit derivatives and liquidity dynamics as crucially relevant
degrees of freedom into interacting models of systemic risk across credit networks (see “Finance and socio-economic
systems” above). Studying systemic risk created by credit default swaps (CDS) requires extending existing dynamical
models of counter-party risk to include the three-vertex interactions generated by CDS contracts, while including
effects of liquidity dynamics requires formulating feedback via additional degrees of freedom. Models will initially be
studied in a schematic stochastic setting using generating functional methods. More realistic networks of dependencies
will be investigated, once inference results on financial networks from the ICTP node, and further insights from
network inference in non-equilibrium states (NTNU) become available. Implications for the control and regulation of
financial networks to optimize their resilience will be explored.
Description of work and research methodology: The project will use generating functional methods in schematic
stochastic settings to analyse contagion and liquidity dynamics. Work will proceed from large connectivity networks
(where statistical limit theorems help to simplify the analysis) to low-connectivity/large-heterogeneity situations.
Analytic approaches will be complemented by simulation studies, which are expected to play a larger role when
studying micro-realistic networks of financial dependencies.
Planning: For networks with high degrees of heterogeneity the resulting macroscopic equations of motion may
become too complex to solve analytically. In such situations an intermediate approach pioneered by Eissfeller and
Opper, which uses simulation methods to investigate the analytically derived macroscopic equations of
motion, will be employed. Finally, if all else fails, relying to a larger extent on numerical simulations of microscopic
models is always a safe fall-back position that can be used to explore the systems that we propose to study.
Position: ESR3 (ENS)
Title: Epidemic processes on networks, viral marketing and optimal vaccination
Research Objectives: This project will study extremal properties of processes defined on networks. Consider e.g. an
epidemic model where contaminated nodes can propagate an illness to their neighbours. A well-studied problem
concerns the determination of the threshold p_c for the percolation of contamination when in the initial configuration
each node is ill or not independently with some probability p. An optimization version of this problem, for which
much less is known, corresponds to determining the minimal fraction of nodes that has to be activated in order to
trigger an avalanche of contamination. As there is now freedom in the choice of the initial set of nodes, in general this
threshold is smaller than the typical one, p_c. Finding efficient algorithms for determining the optimal set to be
contaminated has crucial applications for viral marketing, where the epidemic models the adoption of a new product
by customers. A dual optimization problem consists in finding the minimal set of nodes to vaccinate in order to block
the propagation of the epidemic. The objectives are: 1) to find analytical expressions for these optimized thresholds on
random graph models with simple contagion rules; 2) to develop algorithms able to find a close-to-optimal (in terms of
size) set of initially infected/vaccinated nodes, valid for any single graph; 3) to extend these results to more complicated
contamination rules.
Description of work and research methodology: The first part of the work will consist in reformulating the above
questions on the final state of a dynamical process as a static optimization problem, where the degrees of freedom are
the nodes to be initially infected. The latter will then be tackled with the powerful analytical (replica and cavity method)
and algorithmic (message-passing) tools that have been developed for optimization problems on networks.
Planning: In case the intended research methodology runs into trouble it would be possible 1) to rely on exhaustive
numerical simulations in order to unveil the scaling laws governing the neighbourhood of the transition, 2) to derive
bounds on the optimal fraction of infected nodes via probabilistic methods.
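The seed-set optimization problem described above can be illustrated on a small random graph. The following is a minimal sketch under stated assumptions, not the replica, cavity or message-passing approach the project envisages: it assumes a threshold contagion rule (a node becomes infected once at least `threshold` of its neighbours are infected) and uses a plain greedy heuristic to pick seeds. The function names `random_graph`, `cascade` and `greedy_seeds` are hypothetical.

```python
import random


def random_graph(n, p, seed=0):
    """Erdos-Renyi random graph as an adjacency dict: node -> set of neighbours."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def cascade(adj, seeds, threshold=2):
    """Final infected set under the assumed threshold rule.

    A healthy node becomes infected as soon as at least `threshold`
    of its neighbours are infected; infected nodes never recover.
    """
    infected = set(seeds)
    frontier = list(infected)
    count = {v: 0 for v in adj}  # number of infected neighbours seen so far
    while frontier:
        u = frontier.pop()
        for v in adj[u]:
            if v in infected:
                continue
            count[v] += 1
            if count[v] >= threshold:
                infected.add(v)
                frontier.append(v)
    return infected


def greedy_seeds(adj, threshold=2):
    """Greedy heuristic (an illustrative assumption, not an optimal method):
    repeatedly add the healthy node whose seeding yields the largest cascade,
    until the whole graph is infected."""
    seeds = set()
    infected = set()
    while len(infected) < len(adj):
        best = max(
            (v for v in adj if v not in infected),
            key=lambda v: len(cascade(adj, seeds | {v}, threshold)),
        )
        seeds.add(best)
        infected = cascade(adj, seeds, threshold)
    return seeds


if __name__ == "__main__":
    g = random_graph(60, 0.1, seed=1)
    chosen = greedy_seeds(g, threshold=2)
    print(len(chosen), "seeds infect all", len(g), "nodes")
```

The greedy set only gives an upper bound on the minimal fraction of nodes to be activated; comparing it with the fraction needed when seeds are drawn independently at random gives a rough feel for the gap between the optimized and the typical threshold.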
Position: ESR4 (Orsay-1)
Title: Decentralized network control and optimization
Research Objectives: The objective of this project is to leverage tools from the realms of statistical physics,
stochastic processes and Bayesian statistics to develop a powerful methodology for decentralized and efficient
network control and optimization and to achieve fundamental advances in computing and communication networks.
Such methods are already having substantial impact on computer science, coding theory and hard computational
problems. Networks consisting of multiple heterogeneous devices inherently resemble a statistical physics disordered
system in their complex structure, with randomness of interactions at a microscopic level, and unpredictability in the
macroscopic outcome of these interactions.
Description of work and research methodology: Statistical physics methods, such as the cavity method and message
passing algorithms, theoretical and computational tools from disordered systems and random matrix theory will shed
new light on the design of self-managed algorithms and will enable novel applications that depend on scheduling and
routing, resource allocation, decentralized content distribution, inference and decision making amidst uncertainty.
Planning: Message passing algorithms are extremely effective for solving problems on networks without loops. In the
presence of loops – as in real-world networks – there is no guarantee of convergence to exact solutions. In case of lack
of convergence, other strategies combining analytic methods and numerical simulations will be used to get good
approximations.
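The statement that message passing is exact on networks without loops can be checked on a toy example. The following is a minimal sketch, assuming an Ising model with a uniform coupling `J` and local fields `h` on a small tree (an illustrative choice, not a model specified by the project); the names `bp_magnetisations` and `exact_magnetisations` are hypothetical. It runs sum-product message passing and compares the resulting magnetisations with brute-force enumeration.

```python
import itertools
import math


def bp_magnetisations(n, edges, J, h, sweeps=None):
    """Sum-product estimates of <s_i> for P(s) ~ exp(J sum_ij s_i s_j + sum_i h_i s_i)."""
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msg[(i, j)][s] is the message from i to j evaluated at s_j = s
    msg = {(i, j): {1: 0.5, -1: 0.5} for i in nbrs for j in nbrs[i]}
    for _ in range(sweeps or n):  # n synchronous sweeps suffice on a tree
        new = {}
        for (i, j) in msg:
            out = {}
            for sj in (1, -1):
                total = 0.0
                for si in (1, -1):
                    w = math.exp(J * si * sj + h[i] * si)
                    for k in nbrs[i]:
                        if k != j:
                            w *= msg[(k, i)][si]
                    total += w
                out[sj] = total
            z = out[1] + out[-1]
            new[(i, j)] = {s: out[s] / z for s in out}
        msg = new
    mags = []
    for i in range(n):
        belief = {}
        for si in (1, -1):
            w = math.exp(h[i] * si)
            for k in nbrs[i]:
                w *= msg[(k, i)][si]
            belief[si] = w
        mags.append((belief[1] - belief[-1]) / (belief[1] + belief[-1]))
    return mags


def exact_magnetisations(n, edges, J, h):
    """Brute-force <s_i> by summing over all 2^n configurations."""
    z = 0.0
    acc = [0.0] * n
    for s in itertools.product((1, -1), repeat=n):
        energy = sum(J * s[i] * s[j] for i, j in edges)
        energy += sum(h[i] * s[i] for i in range(n))
        w = math.exp(energy)
        z += w
        for i in range(n):
            acc[i] += w * s[i]
    return [a / z for a in acc]


if __name__ == "__main__":
    tree = [(0, 1), (1, 2), (1, 3), (3, 4)]
    fields = [0.3, -0.2, 0.1, 0.0, 0.5]
    print(bp_magnetisations(5, tree, 0.7, fields))
    print(exact_magnetisations(5, tree, 0.7, fields))
```

On the tree the two results agree to numerical precision. Adding an edge that closes a loop, e.g. `(0, 2)`, generally makes the message-passing output only an approximation, which is the situation the fall-back strategies above are meant to address.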
Position: ESR5 (Orsay-2)
Title: Inference of gene regulation networks by comparative genome analysis
Research Objectives: Regulatory DNA regions that control the expression of a gene can be located kilobases away
from it, especially in eukaryotes. Furthermore, binding sites for transcription factors are not necessarily strong and
easily recognizable. The signal in the genome is often encoded in clusters of relatively weak binding sites. The
kinetics of the genome exploration and of the search for binding sites by the transcription factors are also quite non-trivial. All these problems call for the development of new computational tools and theoretical models.
Description of work and research methodology: In this project we will exploit ideas borrowed from information
theory and statistical physics, including message passing, optimization theory, genetic algorithms etc. This work will
be enriched by comparative analyses of the many genomes that are now sequenced. This opens up challenging
problems such as graph alignment and inference of evolutionary scenarios, creating connections also with population
evolutionary genetics where statistical physicists have traditionally been active.
Planning: If the number of genomes in the databases is too low, there is a risk of insufficient predictive power; we
thus might be limited in our choice of regulatory subsystems. We expect that simultaneous use of transcriptome data
will allow us to overcome this potential difficulty.
Position: ESR6 (TUB)
Title: Approximate inference for stochastic dynamics in large biochemical networks
Research Objectives: Stochastic dynamical models describing biochemical networks on a molecular level could be
used in order to decide which molecular pathway or network structure is more likely to describe a certain biological
function in light of a limited amount of experimental data. A Bayesian inference approach in which all unobserved
quantities (reaction parameters, states) conditioned on the observations are treated as random variables would provide
the necessary likelihood of a given model structure. Unfortunately, for large networks, such an inference approach is
computationally infeasible. The project will develop efficient approximations to this task.
Description of work and research methodology: We will pursue a combination of dynamical functional methods of
statistical physics (to average over parameters) with approximate inference methods for the resulting intractable
model. Approximations based on machine learning methods (variational techniques, expectation propagation) will be
combined with ideas from statistical physics such as variational perturbation theory.
Planning: For very large systems, a full variational treatment could become too time-consuming. Simpler saddle point
type of approximations could provide tractable alternatives. Additional support for the IRP will come from IRPs on
non-equilibrium statistical physics methods at NTNU and KTH.
Position: ESR7 (Torino)
Title: Directed and undirected network inference by message passing and applications to gene regulation
Research Objectives: This project deals with inference of gene regulation networks. Major challenges are (i) sparse-model learning, since most regulatory networks are sparse, with one variable being directly influenced only by a small
set of other variables, (ii) combinatorial inference, since the regulation of one variable by others may include non-trivial combinatorial effects in combining the single regulators, (iii) the handling of missing variables, i.e. of
components of the system which are not determined by the measured data. Given the large size of real biological
networks, approximate algorithmic approaches are needed; exact algorithms for network reconstruction are restricted
to very small problems. Recent advances in the statistical physics of disordered systems have led to initial promising
results. This project will exploit these for the full benefit of applications, here specifically inference of signal-transduction networks in cancer cell lines from multiple-perturbation data, and of residue-contact networks in proteins
and protein complexes.
Description of work and research methodology: We plan to apply techniques developed within the framework of
the physics of disordered systems (with specific emphasis on the cavity methods) to study network reconstruction
problems. In particular, the work will be focused on the development of distributed algorithms (design of parallel
codes) suitable to effectively study large-scale problems. A thorough investigation of the computational efficiency of
the devised algorithms will be performed.
Planning: In order to minimize the risk connected to focusing on a single approach, which might lead to limitations in
the treatable instances, we plan to pursue three approaches in parallel. In addition to the one described above, linear
programming and Monte Carlo methods will be considered. This in turn will allow for comparisons among the results
obtained with the three techniques and for an integrated approach, useful to overcome the intrinsic weaknesses of each
method.
Position: ESR8 (Rome-1)
Title: Inference of regulatory controls in biochemical reaction networks
Research Objectives: This project will study inference of regulatory controls in biochemical reaction networks, most
notably in genome-scale reconstructions of cellular metabolism (where the interacting units are genes coding for
reaction-catalyzing enzymes). This requires overcoming limitations of the available techniques, primarily message
passing algorithms, to achieve (a) scalability to genome-scale graphs and (b) good performance on loopy graphs:
biochemical networks are usually rich in loops, which hinder the convergence of message passing. An important by-product would lie in defining more robust and biologically sound criteria for gene essentiality in metabolic networks,
by combining topological and dynamical aspects (both already accounted for in the literature) with the regulatory
element. Good essentiality predictors are key not only for pharmacological applications but for effective and scalable
algorithms for the analysis of epistasis, i.e. interaction between genes that, when masked at the phenotypic level, can
hinder the discovery of pathologies.
Description of work and research methodology: The technical toolbox required for this project lies essentially in
message-passing algorithms (for the analysis of topological properties of large networks), models for flux- and energy-balance analysis (to predict reaction rates and chemical potentials in genome-scale networks), and Gillespie-like
algorithms (to simulate the stochastic dynamics of small reaction modules). Continuous cross-reference to genomic,
proteomic and metabolic databases will be crucial, since the project will focus strongly on the analysis of real cellular
biochemical networks.
Planning: The type of message-passing algorithms that are most effective on random graphs (e.g. belief propagation)
may not converge on the loopy architecture of real metabolic networks. Less ambitious alternatives (e.g. warning
propagation) are however known to be excellent substitutes in such cases, at the price of slightly higher
CPU time.
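The Gillespie-like algorithms mentioned in the description of work can be sketched on the simplest possible reaction module. The example below is a minimal illustration, assuming a single species X with constant production (rate `k_prod`) and first-order degradation (rate `k_deg` per molecule); this module and the function names are hypothetical and are not taken from the project.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Exact stochastic simulation of 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x).

    Returns the event times and the copy number after each event.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod      # propensity of the production reaction
        a_deg = k_deg * x    # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break            # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # choose which reaction fires, with probability proportional to its propensity
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted mean copy number of the piecewise-constant trajectory."""
    total = 0.0
    for i, x in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += x * (t_next - times[i])
    return total / t_end


if __name__ == "__main__":
    ts, xs = gillespie_birth_death(10.0, 1.0, 10, 500.0, seed=1)
    print("time-averaged copy number:", time_average(ts, xs, 500.0))
```

For this module the stationary copy-number distribution is Poisson with mean `k_prod / k_deg`, so the time average gives a simple sanity check of the simulation.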
Position: ESR9 (Rome-2)
Title: Inference of coupling of waves in nonlinear disordered media
Research Objectives: In random lasers, interactions among competing modes depend on the mutual spatial overlap of
their electromagnetic fields modulated by a non-linear susceptibility. So far, localized mode distributions and non-linear susceptibilities have never been successfully recovered from the analysis of the measurements of random laser
spectra; doing so is a fundamental goal both conceptually (to test experimentally theories of disordered
systems that are in widespread use) and technologically (e.g., to determine the tolerance threshold to random
impurities in photonic crystals). Different statistical mechanics methodologies will be adopted in this investigation:
approaches to disordered systems (cavity method, random graphs, replica symmetry breaking), Monte Carlo
simulation algorithms, and computational techniques for the inverse problem of reconstructing the mode network from
the analysis of experimental measurements of phases and intensities in random laser systems.
Description of work and research methodology: Cavity method, random graph theory, replica symmetry breaking
theory approaches to disordered systems; Monte Carlo simulation algorithms for equilibrium and out of equilibrium
systems. Finite size scaling techniques for data elaboration and quantitative estimates of finite size effects.
Computational techniques for the inverse problem of reconstructing the mode network from the analysis of
experimental measurements of phases and intensities in random laser systems.
Planning: The step of reconstructing the mode interactions by means of statistical inference strongly depends on
experimental data available at the time the ESR will have acquired the skill to perform such analysis. Where that data
is too scarce, the direct numerical simulation of networks with microscopic features taken from the behaviour of
experimentally studied compounds under laser pumping would be an alternative goal in its own right and lead to a
satisfactory modeling of experimental properties, e.g. to determine the tolerance threshold to random impurities in
photonic crystals or test theories of disordered systems.
Position: ESR10 (NTNU)
Title: Efficient inference of interactions from non-equilibrium data and applications to multi-electrode neural
recording
Research Objectives: With the advent of neural multi-electrode arrays and gene microarrays, we are now entering the
stage where data required for reverse engineering large neuronal and genetic networks is becoming available. Current
techniques for this reverse engineering, however, are mostly restricted to equilibrium/static models that usually do not
give meaningful results, and/or cannot exploit all the information available in the data, e.g. temporal patterns. Here, we
first develop inference methods for kinetic models, taking into account crucial features of biological networks and,
importantly, limitations of real data, e.g. noise and experimental access to only small parts of a network at a time.
Second, we apply these methods to multi-neural data collected at the Kavli Institute to understand the circuitry that
underlies spatial navigation in mammals.
Description of work and research methodology: We construct efficient inference methods for fitting models such as
the Generalized Linear Model and kinetic Ising networks with memory. For this we will use methods from non-equilibrium statistical physics, mainly the generating functional method and mean-field methods for network
reconstruction recently developed for simpler models (Roudi & Hertz 2011). Using these methods on data from
Hippocampus and Entorhinal Cortex, we will find how different neuronal types in the mammalian navigation system
interact with each other. Various cell types have been recently discovered in these areas, e.g. grid and border cells, but
the way they interact and work together is unknown. Using the methods that we develop, we aim to understand the
microcircuitry that these neurons form.
Planning: In real life one can only observe a part of the network (hundreds of cells out of millions in a local cortical
network), and even this is noisy. To compensate for this, we will use Bayesian inference and regularization methods,
taking into account prior knowledge, e.g. graph sparsity and known connections. Significant complementarities with
network reconstruction work in Berlin and Torino will help in implementing these Bayesian techniques.
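The inverse problem for a kinetic Ising network can be illustrated on a toy system. The following is a minimal sketch under stated assumptions: it uses a memoryless kinetic Ising model with parallel Glauber updates and no external fields, and fits the couplings by plain gradient ascent on the exact log-likelihood. This is a swapped-in baseline, not the generating functional or mean-field methods referred to above, and the function names and parameter values are hypothetical.

```python
import math
import random


def simulate(J, T, seed=0):
    """Parallel Glauber dynamics: P(s_i(t+1) = +1 | s(t)) = 1 / (1 + exp(-2 H_i(t))),
    with local field H_i(t) = sum_j J[i][j] * s_j(t)."""
    rng = random.Random(seed)
    n = len(J)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    traj = [s]
    for _ in range(T):
        fields = [sum(J[i][j] * s[j] for j in range(n)) for i in range(n)]
        s = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * H)) else -1
             for H in fields]
        traj.append(s)
    return traj


def log_likelihood(J, traj):
    """Log-likelihood per time step of the observed trajectory."""
    n = len(J)
    total = 0.0
    for s, s_next in zip(traj, traj[1:]):
        for i in range(n):
            H = sum(J[i][j] * s[j] for j in range(n))
            total += s_next[i] * H - math.log(2.0 * math.cosh(H))
    return total / (len(traj) - 1)


def infer_couplings(traj, steps=80, lr=0.3):
    """Gradient ascent on the log-likelihood, starting from J = 0.

    The gradient with respect to J[i][j] is the time average of
    (s_i(t+1) - tanh(H_i(t))) * s_j(t).
    """
    n = len(traj[0])
    T = len(traj) - 1
    J = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        grad = [[0.0] * n for _ in range(n)]
        for s, s_next in zip(traj, traj[1:]):
            for i in range(n):
                H = sum(J[i][j] * s[j] for j in range(n))
                r = s_next[i] - math.tanh(H)
                for j in range(n):
                    grad[i][j] += r * s[j]
        for i in range(n):
            for j in range(n):
                J[i][j] += lr * grad[i][j] / T
    return J
```

A typical use is to generate data with `simulate` from known couplings, run `infer_couplings` on the trajectory, and compare the result with the true matrix. With partial or noisy observations this direct fit degrades, which is where the Bayesian and regularization methods described in the planning come in.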
Position: ESR11 (KTH)
Title: Cavity method for non-equilibrium states
Research Objectives: The stationary probability distributions of dynamics on graphs are important in many areas,
ranging from distributed information systems to chemical reactions in discrete geometries. In physical processes
obeying detailed balance these stationary distributions are of known, Boltzmann-Gibbs, form but in general no such
universal description exists. Approximate methods for calculations of (marginals of) equilibrium distributions by
Belief Propagation (BP), closely linked to the cavity method of statistical physics, have therefore attracted
considerable attention, and underlie very important technical systems (iterative decoding). This project will study to
what extent BP can be generalized to the description of non-equilibrium states, and applications thereof.
Description of work and research methodology: The project will develop dynamic BP methods, delineate their
ranges of applicability in theoretically well-understood physical models, and quantitatively compare to high-temperature expansions known as naïve mean-field and dynamic TAP. Once foundational insights have been
established the ESR will explore applications to models of disease spreading, to communication systems, or to other
areas where non-equilibrium stationary states appear naturally and can be addressed by dynamic BP.
Planning: Dynamic BP might turn out not to be convenient beyond the synchronously updated physical models which
have been sketched in the literature. In this case the project has to be limited to such synchronously updated models,
which instead will be investigated in more depth. Convenient applications may be hard to develop independently. We
will here leverage the strong complementarities with research projects Orsay-1, ENS, ICTP, KCL-2 & NTNU.
Position: ESR12 (ICTP)
Title: Inference in finance and socio-economic networks
Research Objectives: The project will address the effects of interactions in financial and socio-economic systems, by
specifically focusing on the following topics: i) optimal portfolio management and optimal execution strategies in
models of illiquid financial markets; ii) statistical physics approach to systemic financial risk: stability of risk neutral
measures, reconstruction of interbank exposure matrices with message passing techniques and stress tests; iii)
inference of dynamic models of ensembles of prices with modern techniques of statistical learning (e.g. Boltzmann
learning); iv) application of inference techniques for the reconstruction of social networks or behavioural patterns
from incomplete data. We expect strong interaction with the financial partner involved, CFM, and possibly with
Medialab on the network analysis of scientific communities.
Description of work and research methodology: This project involves techniques from the theory of disordered
systems in statistical mechanics (e.g. replica method, message passing techniques) that are rather sophisticated. Part of
the objectives will be addressed by numerical simulation methods. Empirical analysis of financial and network data
will also be performed.
Planning: The project foresees a range of subjects which is broader than the scope of a typical PhD thesis. This is
intended to allow for a minimal degree of independence of the ESR to orientate his/her work in a particular direction,
and to stimulate a critical attitude towards research. So we expect that the ESR will focus only on some of the subjects
above, while others may be developed in collaboration with other partners' ESRs through secondments.