Briefing paper: Bioinformatics projects and activities in King’s Health Partners
KHP has considerable expertise in Bioinformatics and Computational Biomedicine, spread over different
campuses. The Bioinformaticians located at the Computer Science Department, Strand Campus are mostly
focused on development of algorithms, with strength in Search algorithms, String Theory applied to pattern
recognition, Data Mining and Classification. At the IoP there is a strong focus on omics analysis including next
generation sequencing, systems biology approaches, biomarker studies, data mining and classification. On
the Guy's Campus there are different groups focused on: a) Genome Informatics, mostly in the Genetics
Department and the cBRC; b) Cancer Bioinformatics at the Division of Cancer Studies, focused on
microarrays, miRNA studies and biomarkers; c) the Institute for Mathematical and Molecular
Biomedicine, focused on statistical and mathematical analyses, systems biomedicine and structural
bioinformatics.
This paper aims to (1) provide a vision for the future of bioinformatics at KHP and (2) describe current
capabilities that will form one part of a wider survey of informatics capability across KHP. Inevitably it will
remain a work in progress and we recognise that many other KHP researchers are critically involved in the
projects described briefly below. The paper was initiated by Richard Dobson and Franca Fraternali under
instruction from Mike Denis, lead for the KHP Informatics Grand Challenge. We welcome corrections and
additions, and encourage wide dissemination of the paper within King's Health Partners.
Section 1. Vision for the future:
Richard Dobson, Franca Fraternali, Michael Simpson, Thomas Schlitt
King's Health Partners is impressive in terms of its current bioinformatics capabilities; however, it suffers in
organization and coordination of initiatives, due to its fragmented spread of resources and people. We have a
number of small groups, each of which displays considerable expertise and academic excellence.
The vision for future development is to unify and consolidate this expertise and wider health informatics
initiatives across KHP by creating a single presence. This presence will be steered by a coordinating
executive committee, but will require centralised activities and resources. We aim to establish this presence in
a number of areas and coordinated research focuses. We currently fall short in service led bioinformatics,
training provision and a central physical presence.
Below we lay out a number of key components for our vision that we feel will provide bioinformatics at KHP with
a global presence. This strategic plan is in part inspired by models implemented by leading institutes in the
field of translational bioinformatics, including Vanderbilt University and the Dana-Farber Cancer Institute. These
institutes have developed models that include a core of bioinformaticians, clinical informaticians, biomedical
informaticians and biostatisticians providing analytical service support, administration and software development
whilst integrated with academic research groups. Specifically, our focus for improvement is in four areas:
1-Physical Hub: Moving forward, a key component for bioinformatics within KHP is a unified physical
presence, or hub. We propose an appropriately badged site that will house training facilities, service support
and research space. We envisage a modern, fit-for-purpose site that would provide a world leading
environment for training and research and would aid in the recruitment of high calibre staff and students. The
hub will house a core of non-academic, service led bioinformaticians and developers and also provide
opportunity to house existing research groups. Such collocation would create a critical mass of expertise
through which collaborations would be formed, academic excellence maintained and service delivered.
2-Computing infrastructure: Ideally the proposed hub will house current computing infrastructure, presently
managed separately in different locations and administered by isolated systems administrators. By
consolidating this infrastructure into a single data centre, there will be multiple benefits: fewer systems
administrators, with greater coverage; better use of storage and distribution of resource and greater
purchasing power. There may be logistical challenges including requirements for certain infrastructure to be
housed at specific locations, to access patient records for example. In such situations we would recommend
fast dedicated connections between infrastructure nodes and the physical hub where administration will be
delivered. This resource will be delivered as autonomously as possible.
3-Training: With the widely accepted challenge of the data deluge bearing down upon us, we must ensure
that all staff and students passing through or resident within KHP have the core informatics skills that are
increasingly essential for contemporary biomedical research. We would extend our current offerings and
develop targeted training programs to fit within the current graduate and postgraduate educational
Draft
1
Date 11/12/11
Page 1 of 9
frameworks. These programs would provide basic science and clinical students with the essential skills
required to design and perform statistically and computationally sound analyses independently. In addition, a
dedicated PhD training programme specifically in bioinformatics with co-supervision provided by clinical
researchers would provide huge opportunity for fostering new collaborations and would be an ideal platform
for establishing training and research excellence in critical areas. Such studentships located within the
proposed hub would be highly prized and would provide world-leading training in a world-leading
environment.
4-Leadership: Within KHP there are a number of emerging leaders in informatics research; these individuals
are not only excelling in niche areas but are opinion formers within the informatics field in general. We
therefore do not feel it necessary to recruit a single figurehead, but rather that current staff can implement
these proposals jointly and autonomously. Researchers will be far more willing to contribute if they feel
empowered and we therefore feel the shared vision of the group would encourage participation from all
involved, more so than having to subscribe to a vision provided by a single figurehead. The most effective
way forward would be the creation of an executive committee who would oversee the consolidation of
bioinformatics within KHP (a unified hub), organise specialised teams in critical areas, recruit specialised
dedicated technical posts and lead the development of in house software that would be dedicated to the
needs of KHP and would allow for flexibility and adaptability. We feel that successful realisation of this vision
would make KHP very attractive to academics at all levels.
Section 2. Current capabilities
Group: Kathleen Steinhöfel, affiliation Computer Science Department, Strand Campus, KCL
CT Image Classification: computer assisted radiology, and in particular, development of CT image
classification algorithms to support the diagnosis of focal lesions in liver tissue.
EPSRC project on Stochastic Local Search Algorithms for Structural Proteomics (2006 - 2009) to
investigate prediction algorithms for protein folding and, in particular, local minima of folded structures and
suboptimal foldings of proteins with applications to drug design.
Analysis of Mammographic Images: development of algorithms for the analysis of mammographic images.
This research was supported by the Royal Society and Dr Sonia Tangaro visited me at King's for 4 weeks.
Personalised Medicine and microRNA target Prediction: more recent research aims at personalised
medicine, focusing on disease pathways induced by microRNAs, a new class of RNA that controls many
cellular functions.
Key Contacts
Kathleen Steinhöfel ([email protected])
Group: Sophia Tsoka, affiliation Computer Science Department, Strand Campus, KCL
Analysis of biological network properties: Biological entities organise in clusters/modules according to
functional and evolutionary constraints. This work aims to develop algorithms to partition a network into
modules through optimisation methods. Algorithms have been developed for binary networks, weighted
graphs, as well as overlapping communities. Currently, extension of these methods is underway for networks
that may change through time (dynamic networks).
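As a rough illustration of what "partitioning a network into modules through optimisation" can mean, the sketch below scores a candidate partition with Newman's modularity, one widely used objective. The paper does not say which objective the group optimises, so the choice of modularity, the function name `modularity` and the toy graph are all assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q for a partition of an undirected, unweighted
    graph without self-loops (illustrative helper, not the group's code).

    edges        -- list of (u, v) node pairs, each edge listed once
    community_of -- dict mapping every node to a community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # sum of node degrees within the community

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (fraction of edges inside the community)
    #     minus (expected fraction under random rewiring with fixed degrees)
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy network (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {node: "A" for node in range(6)}

q_two = modularity(edges, two_modules)
q_one = modularity(edges, one_module)
```

An optimisation method would search over partitions for the one with the highest score; here the two-triangle split scores higher than lumping all nodes into a single module.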
Analysis of the effect of gut bacteria in rats: Analysis of transcriptomics data to look into the effects of gut
bacteria. Microarray data from several tissues (e.g. liver, colon, ileum etc) are being analysed to find
differences between conventional and germ-free rats. Analyses will also be complemented by metabolomics
measurements.
Machine learning methods for data classification in biomedical data: Given a microarray experiment or
other types of high-throughput biochemical characterisation of patient samples, we develop strategies to
classify samples into appropriate phenotypes (e.g. disease type) according to observable features (e.g. gene
expression intensities). This not only helps to characterise unknown samples, but can also serve to
distinguish which genes may be more important to a particular phenotype. Methods are based on decision
trees or hyperbox classification principles. These methods are applied to skin disease data (psoriasis) and
breast cancer.
Analysis of expression in psoriasis skin disease data: Expression data from a mouse model of psoriasis
are analysed to reveal which pathways control the disease and its treatment with relevant cytokines.
Network analysis in cardiovascular medicine: Relevant networks from physiological and pathological
cardiac hypertrophy have been constructed from biological experiments and analysed to derive their
topological properties. Comparative analyses showed significant differences that were used to guide
proteomics experiments.
Key Contacts
Sophia Tsoka ([email protected])
Group: Leonard Schalkwyk, SGDP, IoP
Research interests: genomics, epigenetics, genetics, especially mouse.
Computing facilities: 132-processor Linux cluster with aggregate 376 GB RAM and 29 TB storage
Group: Senior postdoc, system manager, PhD student in informatics part of program. We offer a yearly
course in the R statistical computing environment, this year as part of the SGDP
summer school: http://www.kcl.ac.uk/schools/summerschool/si/sgdp/
Key Contacts
Leo Schalkwyk ([email protected])
Group: Eric Blanc, BMS MRC Centre for Developmental Neurobiology, Guy's Campus
Research interests: statistical methods to process and extract signal from large data sets, especially
expression micro-arrays. In particular, clustering methods and co-expression network reconstruction are
among principal interests. Eric is also involved in the automation of the recording of climbing assays in flies.
Experienced in microarray data processing, programming (including R), and Bayesian statistics.
Suggested developments for bioinformatics across KHP:
a) for meeting present bioinformatics needs in the college: central computing facilities with staff (technical for
system admin & scientific for bioinformatics support)
b) for improving research: PhD studentship programme.
Key Contacts
Eric Blanc ([email protected])
Group: Institute for Mathematical and Molecular Biomedicine (IMMB), Guy's Campus
Led by: ACC Coolen (Maths), F Fraternali (Randall Division)
The IMMB spearheads the College’s research and teaching activities at the interface between biology,
medicine, mathematics and computation. It aims to become a leading research centre devoted to the
development of quantitative tools for biomedical problems, and an efficient source of mathematical and
computational expertise for biomedical researchers. It will contribute to training a new generation of
biomedical researchers with a strong theoretical background, and organise workshops, conferences, and
short courses.
The IMMB’s research is centred around mathematical and computational aspects of the following themes:
1. Signalling and cooperation in complex intra-cellular and inter-cellular networks (PIs Fraternali and
Coolen)
current projects:
(a) Information-theoretic analysis of cellular signalling networks
Development of rigorous mathematical and computational tools for quantifying structure and complexity of
protein-protein interaction networks (PPIN) and gene regulation networks (GRN), with applications in (i)
comparative interactomics, (ii) the generation of unbiased null model networks for hypothesis testing, (iii)
decontamination of PPIN and GRN data for method-specific experimental bias, and (iv) Bayes-optimal
detection of modularity. Collaboration with NIMR (Kleinjung) and Universita di Roma La Sapienza (De
Martino). Funded by EPSRC and BBSRC.
(b) Proteomic reaction equations
Mathematical and numerical analysis and parameter exploration of the EGFR signalling system.
Development of nonequilibrium statistical mechanical methods for the analysis of complex formation and
dissociation dynamics in large protein interaction networks, based on generating functional analysis.
Collaboration with various partners at KCL, RAL, and the University of Oxford. Funded by BBSRC (LOLA).
(c) Analysis of inter-cellular communication networks
Mathematical modeling of neural information processing systems (based on synaptic communication) and
of immune networks (based on communication via cytokines), including adaptation (learning) and dynamical
response to perturbation (operation), and with a focus on understanding and modeling known pathologies
such as lymphocytosis and autoimmunity. Collaboration with Universita di Roma La Sapienza (Barra).
(d) Theory of epigenetic cell reprogramming
Development of mathematical theory describing the reprogramming of gene regulation networks, by
intervention at the proteomic level, in order to achieve targeted cell phenotype modification. Based on
simplified but explicit description of the proteome-transcription-production-proteome cycle, and on theoretical
methods developed earlier (including programming protocols) for the manipulation of neural networks.
2. Computational biomedicine Structural Bioinformatics (PI Fraternali)
current projects:
(a) Role of flexibility in molecular recognition. The present project is exclusively computational: we hope to
characterize in silico these dynamical properties of protein folds and to suggest new experimental work to
confirm our findings. Funded by Leverhulme (FF PI).
(b) Transcriptional programs in melanoma metastasis. We will focus on identifying gene patterns driven by
Rho GTPases and controlling the two types of movement. Microarray data analysis coupled to
protein-protein interaction analysis will generate sub-networks characteristic of the underlying phenotype.
Mechanistic studies with RNAi and over-expression approaches and in vivo studies to understand the
roles of these genes in metastasis will be carried out. Funded by Cancer Research UK (CRUK) (FF
Co-applicant), in collaboration with Vicky Moreno, Anne Ridley and Tony Ng.
(c) Validation of the APOBEC3G-Vif interaction as a drug target. Funded by Wellcome Trust (FF
Co-applicant), in collaboration with Hendrick Huthoff and Michael Malim.
(d) Novel tools to map allosteric networks in proteins. We aim to parameterise our method using a test-set of
allosteric protein structures, to apply the parameterised method to the Abl kinase of the oncogenic
Bcr-Abl protein and to develop the software into a coherent package. Funded by BBSRC (FF PI).
(e) Protein-protein interaction networks and DNA methylation. A long-term objective of the presented
research could be the prediction of methylation sites and the effects of environmental factors on these
occurrences in healthy and diseased individuals. The partnership with the Nestlé Research Centre in
Lausanne is particularly interesting because one of their main focuses is on ‘Good Food, Good Life’, with
specialised research on nutrition and health. Funded by BBSRC (FF PI), in partnership with Nestlé,
Lausanne.
3. Stochastic methods in medicine
current projects:
(a) Bayesian analysis of FLIM/FRET imaging data
Development of mathematical and computational methods for the Bayes-optimal extraction of molecular
information from fluorescence imaging data in the low photon number regime, to improve significantly the
timescales over which proteomic events can be detected in vivo, and to handle fluorescence lifetime
heterogeneity. Collaboration with the University of Oxford (Vojnovic). Funded by EPSRC.
(b) Survival analysis for heterogeneous cohorts with competing risks
Development of a Bayesian generalisation of survival analysis that allows for cohort heterogeneity and
informative population-level competing risks. The method leads to formulae for extracting regression
parameters and identifying latent classes, tools for quantifying heterogeneity and risk correlations, and
formulae for overall and individual survival curves that are decontaminated for the effects of heterogeneity
and informative competing risks. Collaboration with the University of Uppsala (Holmberg, Garmo). Funded
by Prostate Action.
(c) Dimension reduction and biomarker integration
Development of Bayesian models for dealing with fundamental problems in predicting medical outcome from
gene expression data, viz. dimension mismatch (high dimensional signals with modest cohorts, leading to
overfitting and nonreproducibility of signatures) and integration with other biomarkers (clinical, imaging or
molecular). Dimension reduction is based either on using PPIN/GRN data as constraints, or on unsupervised
latent variable (‘metagene’) analysis. Collaboration with KCL and international partners (FP7 network
IMAGINT) and Imperial College (Guo). Funded by EU and EPSRC.
Key contacts
Franca Fraternali ([email protected]), Ton Coolen ([email protected])
Group: Statistical Genetics Unit, Division of Genetics and Molecular Medicine, Guy's & IoP
Led by Cathryn Lewis and Mike Weale:
The Statistical Genetics Unit (SGDP, IoP, and Genetics & Molecular Medicine, SoM) has substantial capacity
in analysis of genetic data (genotype and expression), integrating with clinical data, to identify and
characterise the genetic contributions to complex disease and traits. Four academics, 6 postdocs/fellows, 4
students. Capacity to provide expertise in study design, analysis and interpretation. We have no core
funding for consultancy-level statistical analysis, but encourage longer-term collaborative research, joint
research projects and PhD students.
http://www.kcl.ac.uk/medicine/research/divisions/gmm/sections/clusters/statisticalgenetics.aspx
The four academics are Cathryn Lewis, Mike Weale, Tom Price and Fruhling Rijsdijk.
Cathryn Lewis: Research includes genetic risk estimation - statistical research programme with allied
software to determine an individual's risk of developing a disease using genotype data, combined with
relevant epidemiological risk factors and clinical covariates. Capacity to perform population-wide analyses to
determine risk profiles (e.g. what proportion of the population will be at >5-fold increased risk of disease),
and individual-level risk estimation. Risk categorisation based on confidence with which information is
known. Current projects in schizophrenia, rheumatoid arthritis, Crohn's disease, with three clinical research
fellows.
Mike Weale: Research component focuses, in part, on the integration of different types of data in order to
make better joint inferences about, for example, whether or how a particular gene is causally affecting a
particular medical trait. Particular projects include: (1) the statistical integration of genomic annotations with
genome-wide association signals; (2) the joint analysis of genotype with gene expression data in different
brain tissues; (3) methods for detecting gene-gene interaction signals in large genetic datasets; (4) methods
for inferring the origin of unknown forensic DNA samples; (5) study design and power calculations for Next
Generation Sequencing experiments.
Facilities:
In terms of facilities and equipment, we regularly use, but do not own, two computer clusters based at Guy's
Campus (Athena and Hera) and one based at SGDP (Mumak, belonging to Leo Schalkwyk).
Key contacts
Cathryn Lewis ([email protected]), Mike Weale
([email protected])
Group: Rebecca Oakey, Division of Genetics and Molecular Medicine, Guy's campus
Skills, areas/interests: Group focuses on the epigenome and understanding the methylome, histone
modifications and the role of DNA binding proteins in health and disease using next generation sequencing
and bioinformatics.
Facilities/equipment: Equipment is housed and operated by the BRC and KCL Genomics core facilities
located on the 7th floor of the Tower Wing on the Guy's campus and comprises: three Illumina Genome
Analyser IIx (GAIIx) sequencers, one HiSeq 2000, two cBots, an Applied Biosystems 7900HT Real-time PCR
system and Covaris Adaptive Focused Acoustics. The computing facilities are also owned and managed by the
BRC:
HPC Cluster (to which Mike Weale and Cathryn Lewis of the SGU refer) comprises some 33 servers for
computation, provided by 16x IBM iDataPlex 2U chassis containing 32x IBM iDataPlex nodes, each fitted with:
● 2x 2.66GHz Intel Nehalem processors (8 cores per node and a total of 256 processor cores for the
cluster, excluding head nodes and the large node)
● 48GB memory
● 250GB hard drive
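The core count quoted for the iDataPlex nodes can be checked with a little arithmetic. The sketch below is only a sanity check, not a description of how the cluster is managed: the class `NodeSpec` is hypothetical, the figure of 4 cores per processor is inferred from "2 processors, 8 cores per node", and the aggregate memory figure is derived here rather than stated in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical record of one homogeneous group of compute nodes."""
    count: int             # number of identical nodes
    sockets: int           # processors per node
    cores_per_socket: int  # assumption: inferred, not stated in the paper
    memory_gb: int         # memory per node

    @property
    def cores_per_node(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def total_cores(self) -> int:
        return self.count * self.cores_per_node

    @property
    def total_memory_gb(self) -> int:
        return self.count * self.memory_gb


# 32 iDataPlex nodes, 2 processors each, 48GB memory per node.
idataplex = NodeSpec(count=32, sockets=2, cores_per_socket=4, memory_gb=48)

print(f"cores per node:   {idataplex.cores_per_node}")
print(f"total cores:      {idataplex.total_cores}")
print(f"aggregate memory: {idataplex.total_memory_gb} GB")
```

With these inputs the total of 32 x 8 cores agrees with the 256 processor cores quoted for the cluster.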
A larger compute node, an IBM x3755 server, fitted with:
● 2x 2.6GHz 6-core processors (a total of 12 cores)
● 128GB memory
● 2x 146GB hard drives
A Small HPC Cluster comprises some 9 servers for computation with:
● 4x HP G6 x86_64 compute nodes (2.6GHz, 28GB RAM) running
4 Tesla S1070 Nvidia GPUs for novel GPU processing applications
● 3x Supermicro compute nodes (2.6GHz, 32GB RAM)
Size and details of the group:
The research group has five basic scientist researchers. We collaborate with RCUK fellow Reiner Schulz and his
PhD student, both bioinformaticians, and work together with the sBRC Systems administrator and two
bioinformaticians.
Key contacts
Rebecca Oakey ([email protected])
Group: Nicolas Smith, Head of Biomedical Engineering, Imaging Sciences & Biomedical Engineering
Division, St Thomas’ Hospital
Current interests are developing multi-scale models from gene regulation to whole organ function which
integrate experimental data at each level to provide a mechanistic framework for analysing physiological
function. The majority of activities are in the cardiovascular systems although we also work in cancer and in
both cases there is a significant focus on the clinical translation of these models.
Skills are in computational simulation of physiological systems, development of efficient numerical
techniques, analysis of sub-cellular and cellular regulation and the integration of cellular and whole organ
responses.
Facilities: Central to these facilities are the 500 processor shared memory High Performance computer, which
is about to be acquired, and a range of imaging scanners. The department of biomedical engineering comprises 15
academics, 7 of whom are specifically focused on computational modelling.
Key contacts
Nicolas Smith ([email protected])
Group: Richard Dobson, Lecturer, Bioinformatics lead, BRC-MH, South London and The Maudsley &
IoP
The assembled team of 6 postdocs has developed expertise in the analysis, integration and modeling of
complex large molecular datasets such as expression and SNP arrays, next generation sequencing,
approaches for network (gene regulatory, co-expression and protein interaction) and pathway studies
providing novel insight into disease mechanisms and biomarker discovery.
We are performing a range of studies through the combined analysis of clinical, imaging, proteomics,
transcriptomics and genomic datasets, and have multiple active academic and industrial collaborations.
We are developing infrastructure to enable sharing of datasets supported by funding for a pilot project from a
joint KHP/UCSF initiative in collaboration with the NIH funded Clinical and Translational Science Initiative
(CTSI).
Through grant funding we have expanded the high performance computing Linux cluster
(http://compbio.brc.iop.kcl.ac.uk/cluster/index.php), purchased with NIHR capital funding (August 2009) to sit
within the SLaM firewall. This now enables us to text mine all SLaM patient records (>170k) in under 1 day
and align a lane of next generation sequencing against a human reference genome in hours rather than
days. This has significantly enhanced capacity to support the rapid analysis and interpretation of large
variable imaging and 'omics datasets. Utilising this infrastructure the imaging theme has introduced
automated quantitative neuroimaging into the routine assessment of MRI scans for people with dementia.
Key contacts
Richard Dobson ([email protected])
Group: Matt Arno, Waterloo campus
Service led model. Includes: design of project, generation of data, data analysis with researcher, pathway
analysis
Key contacts
Matt Arno ([email protected])
Group: Andrew Pickles, Department of Biostatistics & BRC-MH, IoP
Primarily a group of statisticians several of whom have expertise in bioinformatics methods and applications,
primarily in the area of expression and SNP array data and in neurophysiology, ERP in particular. We also
have expertise in formal statistical modelling e.g. of biomarker mediation models that account for
measurement error, particularly in software such as Mplus and gllamm (a powerful and very general
modelling software that originated from this Department, see www.gllamm.org).
Key Contacts
Andrew Pickles ([email protected])
Group: Thomas Schlitt, Lecturer in Bioinformatics, King's College London, Dept of Medical and Molecular
Genetics, 8th floor Tower Wing, Guy's Hospital
Skills, areas/interests: The main research focus of the group is in the analysis of gene and protein networks.
We develop novel approaches for de-novo pathway discovery to support the discovery of genes and
pathways underlying complex diseases independent of pathway annotation. We have experience in the
integrated analysis of GWAS/gene expression/NGS data and gene and protein networks. We are involved in
several next-generation sequence analysis projects, such as analysis of exome data to find disease genes
in Crohn’s disease. In collaboration with Prof Ahlers (Hannover, Germany) we develop our network analysis
and visualisation tool BioGranat. In collaboration with Dr Brazma (EBI) we work on dynamic models for
the simulation of gene regulatory networks.
Facilities/equipment: The group owns two dedicated computer servers, one mainly used as MySQL database
server, the other is mainly used for computational tasks. We also make intensive use of the cBRC Athena
HPC cluster at the Department of Medical and Molecular Genetics. We maintain our own database of
protein-protein interactions which we integrated from six public databases.
Size and details of the group: Dr Thomas Schlitt (group leader), Nick Dand (PhD student 2011-2014), Russel
Sutherland (Illumina CASE PhD student, 2011-2015)
Key Contacts
Thomas Schlitt ([email protected])
Group: Emanuele de Rinaldis, Senior Research Fellow, Breakthrough Breast Cancer Unit, King’s College
London, UK
Skills, areas/interests: Integration of -omics data (SNP, CGH, miRNA, ExonArray, Methylation) and
clinical/pathological data for the study of the molecular mechanisms underpinning breast cancer
Discovery of novel biomarkers, prognostic factors and therapeutic targets in breast cancer, using genomics
technologies and bioinformatic/biostatistical data analysis.
Clinical-Genomics data integration and analysis: development of databases, tools and algorithms
Interest in NGS (although very new in the field)
Facilities/equipment: as part of the Breakthrough Breast Cancer Research Unit we have access to lab
facilities and work in close link with lab scientists for follow-up and validation experiments
- 3 personal workstations, 1 64-bit CentOS, quad core, 16GB DDR2 Linux Server, 1 High Performance
server: HP DL G5380 Quad core X 2 2.83 GHz, 1 Storage server: HP MSA 2312, 6TB, 2 NAS Storage
Devices 4TB
Size and details of the group: (1) Emanuele de Rinaldis (Biology/Bioinformatics) - Team Leader, (2) Anita
Grigoriadis (Biology/Bioinformatics) - Research Fellow, (3) Brian Burford (Computer Scientist) - Research
Associate, (4) Akram Shalaby (Mathematician) - PhD student (co-supervisor: Ton Coolen)
How can existing capability be better coordinated and utilised and/or enhanced through investment and
development?
- We think it would be good to have a King's school of bioinformatics with attached training initiatives, PhD
programs, lectures, workshops etc. Maybe it exists already; if so, we are not aware of it (in this case we
probably have a communication problem to invest in). A nominated head of such a school would help
coordinate the school's activities.
- More generally, there should be more investment in training and exchange opportunities for young
bioinformaticians
- Facilities for data storage, backup and administration
- Facilities for controlled data sharing across different bioinformatics King's groups
- The Dana-Farber Cancer Institute in Boston is a good example of a place where bioinformatics has a very strong
impact on cancer research
Key Contacts
Emanuele de Rinaldis ([email protected])
Group: Michael Simpson, Lecturer in Medical and Molecular Medicine, King's College London, Dept of
Medical and Molecular Genetics, 8th floor Tower Wing, Guy's Hospital
The main focus of the group is the application of contemporary genomic technologies to the elucidation of
the genetic basis of human diseases and traits. An integral component of our research programme is centred
on genome informatics. We have established data analysis pipelines for the interpretation of next-generation
sequencing data and continue to evaluate and develop novel methodologies for the analysis and integration
of large-scale genomic datasets to build a comprehensive understanding of the role of the genome and its
variation in health and disease.
Our data analysis is principally undertaken on the GSTT BRC HPC Computational Cluster. We work closely
with the BRC Genomics and Bioinformatics Core Facilities and collaborate with both clinical researchers and
computer scientists within KHP, the UK and beyond.
Key Contacts:
Michael Simpson ([email protected])
Group: Prof Michael Luck, Dept of Computer Science, KCL
Members of the Group worked on various topics in medically oriented research, e.g. arrhythmia classification
on mobile phones, herpes virus identification. In particular they developed algorithms for mapping high
throughput sequencing technologies as well as weighted and degenerate sequences. The team also
developed the first transcriptome map of mouse isochores from three distinct mouse tissues (muscle, liver,
and brain).
The team developed NGS prototype software
– REAL (Read Aligner): an efficient, sensitive, and accurate alignment programme that can match or even
outperform most well-known tools, e.g. SOAP2, Bowtie, BWA
– cREAL (circular REAL): an extension of REAL specifically designed for small genomes with circular
structure, e.g. bacterial chromosomes
– DynMap: yet another alignment programme specifically designed for aligning a set of reads against many
closely related genomes, e.g. individuals of the same species
Members of the group worked on Haplotype Classification Algorithms, on disease gene identification related
to haematopoiesis, Mammographic image analysis as well as Combinatorial Algorithms for Protein Folding
Simulations and lattice protein folding simulations.
Furthermore team members worked on analysis of biological network properties, analysis of the effect of gut
bacteria in rats, machine learning methods for data classification in biomedical data, analysis of expression
in psoriasis skin disease data, as well as network analysis in cardiovascular medicine.
Fundamental research topics of the group include:
● Combinatorics on words and graphs
● Probabilistic analysis
● Constraint programming
● Combinatorial optimisation
● Automata theory
● Sequence analysis (alignment, re-alignment, etc.)
● Comparative genomics (evolution, function prediction)
● Classification of biomedical data
● Music analysis (rhythm, melody detection)
● Structure and search of large networks – (www, p2p, wireless, etc.)
● Text compression
● Data mining
Key Contact:
Michael Luck ([email protected])
This paper will be updated on request. Further information relating to health informatics in KHP should be
forwarded to: [email protected] or [email protected]