Briefing paper: Bioinformatics projects and activities in King’s Health Partners

KHP has considerable expertise in Bioinformatics and Computational Biomedicine, spread over different campuses. The bioinformaticians located at the Computer Science Department, Strand Campus, are mostly focused on the development of algorithms, with strengths in Search algorithms, String Theory applied to pattern recognition, Data Mining and Classification. At the IoP there is a strong focus on omics analysis, including next generation sequencing, systems biology approaches, biomarker studies, data mining and classification. On the Guy's Campus there are different groups focused on a) Genome Informatics, mostly in the Genetics Department and the cBRC; b) Cancer Bioinformatics at the Division of Cancer Studies, focused on microarrays, miRNA studies and biomarkers; c) the Institute for Mathematical and Molecular Biomedicine, focused on statistical and mathematical analyses, systems biomedicine and structural bioinformatics.
This paper aims to (1) provide a vision for the future of bioinformatics at KHP and (2) describe current capabilities that will form one part of a wider survey of informatics capability across KHP. Inevitably it will remain a work in progress, and we recognise that many other KHP researchers are critically involved in the projects described briefly below. The paper was initiated by Richard Dobson and Franca Fraternali under instruction from Mike Denis, lead for the KHP Informatics Grand Challenge. We welcome corrections and additions, and encourage wide dissemination of the paper within King's Health Partners.

Section 1. Vision for the future: Richard Dobson, Franca Fraternali, Michael Simpson, Thomas Schlitt
King’s Health Partners is impressive in terms of its current bioinformatics capabilities; however, it suffers in the organisation and coordination of initiatives, due to its fragmented spread of resources and people.
We have a number of small groups, each of which displays considerable expertise and academic excellence. The vision for future development is to unify and consolidate this expertise and wider health informatics initiatives across KHP by creating a single presence. This presence will be steered by a coordinating executive committee, but will require centralised activities and resources. We aim to establish this presence in a number of areas and coordinated research focuses. We currently fall short in service-led bioinformatics, training provision and a central physical presence. Below we lay out a number of key components of our vision that we feel will provide bioinformatics at KHP with a global presence. This strategic plan is in part inspired by models implemented by leading institutes in the field of translational bioinformatics, including Vanderbilt University and the Dana-Farber Cancer Institute. These institutes have developed models that include a core of bioinformaticians, clinical informaticians, biomedical informaticians and biostatisticians who provide analytical service support, administration and software development whilst integrated with academic research groups. Specifically, our focus for improvement is in four areas:
1-Physical Hub: Moving forward, a key component for bioinformatics within KHP is a unified physical presence, or hub. We propose an appropriately badged site that will house training facilities, service support and research space. We envisage a modern, fit-for-purpose site that would provide a world-leading environment for training and research and would aid in the recruitment of high-calibre staff and students. The hub will house a core of non-academic, service-led bioinformaticians and developers and also provide the opportunity to house existing research groups. Such co-location would create a critical mass of expertise through which collaborations would be formed, academic excellence maintained and service delivered.
2-Computing infrastructure: Ideally the proposed hub will house current computing infrastructure, which is presently managed separately in different locations and administered by isolated systems administrators. Consolidating this infrastructure into a single data centre would bring multiple benefits: fewer systems administrators with greater coverage, better use of storage and distribution of resource, and greater purchasing power. There may be logistical challenges, including requirements for certain infrastructure to be housed at specific locations, to access patient records for example. In such situations we would recommend fast dedicated connections between infrastructure nodes and the physical hub where administration will be delivered. This resource will be delivered as autonomously as possible.
3-Training: With the widely accepted challenge of the data deluge bearing down upon us, we must ensure that all staff and students passing through or resident within KHP have the core informatics skills that are increasingly essential for contemporary biomedical research. We would extend our current offerings and develop targeted training programmes to fit within the current graduate and postgraduate educational frameworks. These programmes would provide basic science and clinical students with the essential skills required to design and perform statistically and computationally sound analyses independently. In addition, a dedicated PhD training programme specifically in bioinformatics, with co-supervision provided by clinical researchers, would provide a huge opportunity for fostering new collaborations and would be an ideal platform for establishing training and research excellence in critical areas. Such studentships located within the proposed hub would be highly prized and would provide world-leading training in a world-leading environment.
Draft 1 Date 11/12/11 Page 1 of 9
4-Leadership: Within KHP there are a number of emerging leaders in informatics research; these individuals are not only excelling in niche areas but are opinion formers within the informatics field in general. We therefore do not feel it necessary to recruit a single figurehead, but rather that current staff can implement these proposals jointly and autonomously. Researchers will be far more willing to contribute if they feel empowered, and we therefore feel the shared vision of the group would encourage participation from all involved, more so than having to subscribe to a vision provided by a single figurehead. The most effective way forward would be the creation of an executive committee that would oversee the consolidation of bioinformatics within KHP (a unified hub), organise specialised teams in critical areas, recruit specialised dedicated technical posts and lead the development of in-house software that would be dedicated to the needs of KHP and would allow for flexibility and adaptability. We feel that successful realisation of this vision would make KHP very attractive to academics at all levels.

Section 2. Current capabilities

Group: Kathleen Steinhöfel, affiliation Computer Science Department, Strand Campus, KCL
CT Image Classification: computer assisted radiology, and in particular, development of CT image classification algorithms to support the diagnosis of focal lesions in liver tissue.
EPSRC project on Stochastic Local Search Algorithms for Structural Proteomics (2006 - 2009) to investigate prediction algorithms for protein folding and, in particular, local minima of folded structures and suboptimal foldings of proteins with applications to drug design.
Analysis of Mammographic Images: development of algorithms for the analysis of mammographic images. This research was supported by the Royal Society and Dr Sonia Tangaro visited me at King's for 4 weeks.
Personalised Medicine and microRNA target Prediction: more recent research aims at personalised medicine, focusing on disease pathways induced by microRNAs, a new class of RNA that controls many cellular functions.
Key Contacts Kathleen Steinhöfel ([email protected])

Group: Sophia Tsoka, affiliation Computer Science Department, Strand Campus, KCL
Analysis of biological network properties: Biological entities organise in clusters/modules according to functional and evolutionary constraints. This work aims to develop algorithms to partition a network into modules through optimisation methods. Algorithms have been developed for binary networks, weighted graphs, as well as overlapping communities. Currently, extension of these methods is underway for networks that may change through time (dynamic networks).
Analysis of the effect of gut bacteria in rats: Analysis of transcriptomics data to look into the effects of gut bacteria. Microarray data from several tissues (e.g. liver, colon, ileum etc) are being analysed to find differences between conventional and germ-free rats. Analyses will also be complemented by metabolomics measurements.
Machine learning methods for data classification in biomedical data: Given a microarray experiment or other types of high-throughput biochemical characterisation of patient samples, we develop strategies to classify samples into appropriate phenotypes (e.g. disease type) according to observable features (e.g. gene expression intensities). This not only helps to characterise unknown samples, but can also serve to distinguish which genes may be more important to a particular phenotype. Methods are based on decision trees or hyperbox classification principles. These methods are applied to skin disease data (psoriasis) and breast cancer.
Analysis of expression in psoriasis skin disease data: Expression data from a mouse model of psoriasis are analysed to reveal which pathways control the disease and its treatment with relevant cytokines.
Network analysis in cardiovascular medicine: Relevant networks from physiological and pathological cardiac hypertrophy have been constructed from biological experiments and analysed to derive their topological properties. Comparative analyses showed significant differences that were used to guide proteomics experiments.
Key Contacts Sophia Tsoka ([email protected])

Group: Schalkwyk, Leonard, SGDP, IoP
Research interests: genomics, epigenetics, genetics, especially mouse.
Computing facilities: 132 processor Linux cluster with aggregate 376 GB RAM and 29 TB storage
Group: Senior postdoc, system manager, PhD student in informatics part of program.
We offer a yearly course in the R statistical computing environment, this year as part of the SGDP summer school: http://www.kcl.ac.uk/schools/summerschool/si/sgdp/
Key Contacts Leo Schalkwyk ([email protected])

Group: Eric Blanc, BMS MRC Centre for Developmental Neurobiology, Guy's Campus
Research interests: statistical methods to process and extract signal from large data sets, especially expression microarrays. In particular, clustering methods and co-expression network reconstruction are among the principal interests. Eric is also involved in the automation of the recording of climbing assays in flies. Experienced in microarray data processing, programming (including R), and Bayesian statistics.
Suggested developments for bioinformatics across KHP: a) for meeting present bioinformatics needs in the college: central computing facilities with staff (technical for system administration and scientific for bioinformatics support); b) for improving research: a PhD studentship programme.
Key Contacts Eric Blanc ([email protected])

Group: Institute for Mathematical and Molecular Biomedicine (IMMB), Guy's Campus
Led by: ACC Coolen (Maths), F Fraternali (Randall Division)
The IMMB spearheads the College’s research and teaching activities at the interface between biology, medicine, mathematics and computation. It aims to become a leading research centre devoted to the development of quantitative tools for biomedical problems, and an efficient source of mathematical and computational expertise for biomedical researchers. It will contribute to training a new generation of biomedical researchers with a strong theoretical background, and organise workshops, conferences, and short courses. The IMMB’s research is centred around mathematical and computational aspects of the following themes:
1. Signalling and cooperation in complex intra-cellular and inter-cellular networks (PIs Fraternali and Coolen)
Current projects:
(a) Information-theoretic analysis of cellular signalling networks. Development of rigorous mathematical and computational tools for quantifying the structure and complexity of protein-protein interaction networks (PPIN) and gene regulation networks (GRN), with applications in (i) comparative interactomics, (ii) the generation of unbiased null model networks for hypothesis testing, (iii) decontamination of PPIN and GRN data for method-specific experimental bias, and (iv) Bayes-optimal detection of modularity. Collaboration with NIMR (Kleinjung) and Università di Roma La Sapienza (De Martino). Funded by EPSRC and BBSRC.
(b) Proteomic reaction equations. Mathematical and numerical analysis and parameter exploration of the EGFR signalling system. Development of nonequilibrium statistical mechanical methods for the analysis of complex formation and dissociation dynamics in large protein interaction networks, based on generating functional analysis.
Collaboration with various partners at KCL, RAL, and the University of Oxford. Funded by BBSRC (LOLA).
(c) Analysis of inter-cellular communication networks. Mathematical modelling of neural information processing systems (based on synaptic communication) and of immune networks (based on communication via cytokines), including adaptation (learning) and dynamical response to perturbation (operation), with a focus on understanding and modelling known pathologies such as lymphocytosis and autoimmunity. Collaboration with Università di Roma La Sapienza (Barra).
(d) Theory of epigenetic cell reprogramming. Development of a mathematical theory describing the reprogramming of gene regulation networks, by intervention at the proteomic level, in order to achieve targeted cell phenotype modification. Based on a simplified but explicit description of the proteome-transcription-production-proteome cycle, and on theoretical methods developed earlier (including programming protocols) for the manipulation of neural networks.
2. Computational biomedicine: Structural Bioinformatics (PI Fraternali)
Current projects:
(a) Role of flexibility in molecular recognition. The present project is exclusively computational: we hope to characterise in silico these dynamical properties of protein folds and to suggest new experimental work to confirm our findings. Funded by Leverhulme (FF PI).
(b) Transcriptional programs in melanoma metastasis. We will focus on identifying gene patterns driven by Rho GTPases and controlling the two types of movement. Microarray data analysis coupled to protein-protein interaction analysis will generate sub-networks characteristic of the underlying phenotype. Mechanistic studies with RNAi and over-expression approaches, and in vivo studies to understand the roles of these genes in metastasis, will be carried out.
Funded by Cancer Research UK (CRUK) (FF co-applicant); collaboration with Vicky Moreno, Anne Ridley and Tony Ng.
(c) Validation of the APOBEC3G-Vif interaction as a drug target. Funded by Wellcome Trust (FF co-applicant); collaboration with Hendrick Huthoff and Michael Malim.
(d) Novel tools to map allosteric networks in proteins. We aim to parameterise our method using a test set of allosteric protein structures, to apply the parameterised method to the Abl kinase of the oncogenic Bcr-Abl protein, and to develop the software into a coherent package. Funded by BBSRC (FF PI).
(e) Protein-protein interaction networks and DNA methylation. A long-term objective of the presented research could be the prediction of methylation sites and of the effects of environmental factors on these occurrences in healthy and diseased individuals. The partnership with the Nestlé research centre in Lausanne is particularly interesting because one of its main focuses is ‘Good Food, Good Life’, with specialised research on nutrition and health. Funded by BBSRC (FF PI), in partnership with Nestlé, Lausanne.
3. Stochastic methods in medicine
Current projects:
(a) Bayesian analysis of FLIM/FRET imaging data. Development of mathematical and computational methods for the Bayes-optimal extraction of molecular information from fluorescence imaging data in the low photon number regime, to improve significantly the timescales over which proteomic events can be detected in vivo, and to handle fluorescence lifetime heterogeneity. Collaboration with the University of Oxford (Vojnovic). Funded by EPSRC.
(b) Survival analysis for heterogeneous cohorts with competing risks. Development of a Bayesian generalisation of survival analysis that allows for cohort heterogeneity and informative population-level competing risks.
The method leads to formulae for extracting regression parameters and identifying latent classes, tools for quantifying heterogeneity and risk correlations, and formulae for overall and individual survival curves that are decontaminated for the effects of heterogeneity and informative competing risks. Collaboration with the University of Uppsala (Holmberg, Garmo). Funded by Prostate Action.
(c) Dimension reduction and biomarker integration. Development of Bayesian models for dealing with fundamental problems in predicting medical outcome from gene expression data, viz. dimension mismatch (high dimensional signals with modest cohorts, leading to overfitting and nonreproducibility of signatures) and integration with other biomarkers (clinical, imaging or molecular). Dimension reduction is based either on using PPIN/GRN data as constraints, or on unsupervised latent variable (‘metagene’) analysis. Collaboration with KCL and international partners (FP7 network IMAGINT) and Imperial College (Guo). Funded by EU and EPSRC.
Key contacts Franca Fraternali ([email protected]), Ton Coolen ([email protected])

Group: Statistical Genetics Unit, Division of Genetics and Molecular Medicine, Guy's & IoP
Led by Cathryn Lewis and Mike Weale: The Statistical Genetics Unit (SGDP, IoP, and Genetics & Molecular Medicine, SoM) has substantial capacity in the analysis of genetic data (genotype and expression), integrating with clinical data, to identify and characterise the genetic contributions to complex disease and traits. Four academics, 6 postdocs/fellows, 4 students. Capacity to provide expertise in study design, analysis and interpretation. We have no core funding for consultancy-level statistical analysis, but encourage longer-term collaborative research, joint research projects and PhD students.
http://www.kcl.ac.uk/medicine/research/divisions/gmm/sections/clusters/statisticalgenetics.aspx
The four academics are Cathryn Lewis, Mike Weale, Tom Price and Fruhling Rijsdijk.
Cathryn Lewis: Research includes genetic risk estimation, a statistical research programme with allied software to determine an individual's risk of developing a disease using genotype data, combined with relevant epidemiological risk factors and clinical covariates. Capacity to perform population-wide analyses to determine risk profiles (e.g. what proportion of the population will be at >5-fold increased risk of disease), and individual-level risk estimation. Risk categorisation is based on the confidence with which information is known. Current projects in schizophrenia, rheumatoid arthritis and Crohn's disease, with three clinical research fellows.
Mike Weale: Research focuses, in part, on the integration of different types of data in order to make better joint inferences about, for example, whether or how a particular gene is causally affecting a particular medical trait. Particular projects include: (1) the statistical integration of genomic annotations with genome-wide association signals; (2) the joint analysis of genotype with gene expression data in different brain tissues; (3) methods for detecting gene-gene interaction signals in large genetic datasets; (4) methods for inferring the origin of unknown forensic DNA samples; (5) study design and power calculations for Next Generation Sequencing experiments.
Facilities: In terms of facilities and equipment, we regularly use but do not own two computer clusters based at Guy's Campus (Athena and Hera) and one based at SGDP (Mumak, belonging to Leo Schalkwyk).
Key contacts Cathryn Lewis ([email protected]), Mike Weale ([email protected])

Group: Rebecca Oakey, Division of Genetics and Molecular Medicine, Guy's campus
Skills, areas/interests: The group focuses on the epigenome and on understanding the methylome, histone modifications and the role of DNA binding proteins in health and disease, using next generation sequencing and bioinformatics.
Facilities/equipment: Equipment is housed and operated by the BRC and KCL Genomics core facilities located on the 7th floor of the Tower Wing on the Guy's campus and comprises: three Illumina Genome Analyser IIx (GAIIx) sequencers, one HiSeq 2000, two cBots, an Applied Biosystems 7900HT Real-time PCR system, and Covaris Adaptive Focused Acoustics.
The computing facilities are also owned and managed by the BRC:
The HPC Cluster (to which Mike Weale and Cathryn Lewis of the SGU refer) comprises some 33 servers for computation, provided by 16x IBM iDataPlex 2U chassis containing 32x IBM iDataPlex nodes, each fitted with 2x 2.66 GHz Intel Nehalem processors (8 cores per node and a total of 256 processor cores for the cluster, excluding head nodes and the large node), 48 GB memory and a 250 GB hard drive.
A larger compute node, an IBM x3755 server, is fitted with 2x 2.6 GHz 6-core processors (a total of 12 cores), 128 GB memory and 2x 146 GB hard drives.
A small HPC cluster comprises some 9 servers for computation, including 4x HP G6 x86_64 compute nodes (2.6 GHz, 28 GB RAM) running 4 Tesla S1070 Nvidia GPUs for novel GPU processing applications, and 3x Supermicro compute nodes (2.6 GHz, 32 GB RAM).
Size and details of the group: The research group has five basic scientist researchers. We collaborate with RCUK fellow Reiner Schulz and his PhD student, both bioinformaticians, and work together with the sBRC systems administrator and two bioinformaticians.
Key contacts Rebecca Oakey ([email protected])

Group: Nicolas Smith, Head of Biomedical Engineering, Imaging Sciences & Biomedical Engineering Division, St Thomas’ Hospital
Current interests are in developing multi-scale models, from gene regulation to whole organ function, which integrate experimental data at each level to provide a mechanistic framework for analysing physiological function. The majority of activities are in the cardiovascular system, although we also work in cancer, and in both cases there is a significant focus on the clinical translation of these models. Skills are in computational simulation of physiological systems, development of efficient numerical techniques, analysis of sub-cellular and cellular regulation, and the integration of cellular and whole organ responses.
Facilities: Central to these facilities are the 500 processor shared memory High Performance computer, which is about to be acquired, and a range of imaging scanners. The department of biomedical engineering comprises 15 academics, 7 of whom are specifically focused on computational modelling.
Key contacts Nicolas Smith ([email protected])

Group: Richard Dobson, Lecturer, Bioinformatics lead, BRC-MH, South London and The Maudsley & IoP
The assembled team of 6 postdocs has developed expertise in the analysis, integration and modelling of complex large molecular datasets such as expression and SNP arrays and next generation sequencing, and in approaches for network (gene regulatory, co-expression and protein interaction) and pathway studies, providing novel insight into disease mechanisms and biomarker discovery. We are performing a range of studies through the combined analysis of clinical, imaging, proteomics, transcriptomics and genomic datasets, and have multiple active academic and industrial collaborations.
We are developing infrastructure to enable sharing of datasets, supported by funding for a pilot project from a joint KHP/UCSF initiative in collaboration with the NIH funded Clinical and Translational Science Initiative (CTSI). Through grant funding we have expanded the high performance computing Linux cluster (http://compbio.brc.iop.kcl.ac.uk/cluster/index.php), purchased with NIHR capital funding (August 2009) to sit within the SLaM firewall. This now enables us to text mine all SLaM patient records (>170k) in under 1 day and to align a lane of next generation sequencing against a human reference genome in hours rather than days. This has significantly enhanced capacity to support the rapid analysis and interpretation of large variable imaging and 'omics datasets. Utilising this infrastructure, the imaging theme has introduced automated quantitative neuroimaging into the routine assessment of MRI scans for people with dementia.
Key contacts Richard Dobson ([email protected])

Group: Matt Arno, Waterloo campus
Service-led model. Includes: design of project, generation of data, data analysis with researcher, pathway analysis
Key contacts Matt Arno ([email protected])

Group: Andrew Pickles, Department of Biostatistics & BRC-MH, IoP
Primarily a group of statisticians, several of whom have expertise in bioinformatics methods and applications, mainly in the area of expression and SNP array data and in neurophysiology, ERP in particular. We also have expertise in formal statistical modelling, e.g. of biomarker mediation models that account for measurement error, particularly in software such as Mplus and gllamm (a powerful and very general modelling software that originated from this Department, see www.gllamm.org).
Key Contacts Andrew Pickles ([email protected])

Group: Thomas Schlitt, Lecturer in Bioinformatics, King's College London, Dept of Medical and Molecular Genetics, 8th floor Tower Wing, Guy's Hospital
Skills, areas/interests: The main research focus of the group is the analysis of gene and protein networks. We develop novel approaches for de novo pathway discovery to support the discovery of genes and pathways underlying complex diseases independent of pathway annotation. We have experience in the integrated analysis of GWAS/gene expression/NGS data and gene and protein networks. We are involved in several next-generation sequence analysis projects, such as the analysis of exome data to find disease genes for Crohn’s disease. In collaboration with Prof Ahlers (Hannover, Germany) we develop our network analysis and visualisation tool BioGranat. In collaboration with Dr Brazma (EBI) we work on dynamic models for the simulation of gene regulatory networks.
Facilities/equipment: The group owns two dedicated computer servers, one mainly used as a MySQL database server, the other mainly used for computational tasks. We also make intensive use of the cBRC Athena HPC cluster at the Department of Medical and Molecular Genetics. We maintain our own database of protein-protein interactions, which we integrated from six public databases.
Size and details of the group: Dr Thomas Schlitt (group leader), Nick Dand (PhD student 2011-2014), Russel Sutherland (Illumina CASE PhD student, 2011-2015)
Key Contacts Thomas Schlitt ([email protected])

Group: Emanuele de Rinaldis, Senior Research Fellow, Breakthrough Breast Cancer Unit, King’s College London, UK
Skills, areas/interests:
Integration of -omics data (SNP, CGH, miRNA, ExonArray, Methylation) and clinical/pathological data for the study of the molecular mechanisms underpinning breast cancer
Discovery of novel biomarkers, prognostic factors and therapeutic targets in breast cancer, using genomics technologies and bioinformatic/biostatistical data analysis
Clinical-Genomics data integration and analysis: development of databases, tools and algorithms
Interest in NGS (although very new in the field)
Facilities/equipment: As part of the Breakthrough Breast Cancer Research Unit we have access to lab facilities and work in close link with lab scientists for follow-up and validation experiments. We have 3 personal workstations; 1 64-bit CentOS, quad core, 16GB DDR2 Linux server; 1 high performance server (HP DL G5380, Quad core X 2, 2.83 GHz); 1 storage server (HP MSA 2312, 6TB); and 2 NAS storage devices (4TB).
Size and details of the group: (1) Emanuele de Rinaldis (Biology/Bioinformatics) - Team Leader, (2) Anita Grigoriadis (Biology/Bioinformatics) - Research Fellow, (3) Brian Burford (Computer Scientist) - Research Associate, (4) Akram Shalaby (Mathematician) - PhD student (co-supervisor: Ton Coolen)
How can existing capability be better coordinated and utilised and/or enhanced through investment and development:
- We think it would be good to have a King's school of bioinformatics with attached training initiatives, PhD programmes, lectures, workshops etc. Maybe it exists already; if so, we are not aware of it (in this case we probably have a communication problem to invest in).
A nominated head of such a school would help coordinate the school's activities.
- More generally, there should be more investment in training and exchange opportunities for young bioinformaticians
- Facilities for data storage, backup and administration
- Facilities for controlled data sharing across different bioinformatics groups at King's
- The Dana-Farber Cancer Institute in Boston is a good example of a place where bioinformatics has a very strong impact on cancer research
Key Contacts Emanuele de Rinaldis ([email protected])

Group: Michael Simpson, Lecturer in Medical and Molecular Medicine, King's College London, Dept of Medical and Molecular Genetics, 8th floor Tower Wing, Guy's Hospital
The main focus of the group is the application of contemporary genomic technologies to the elucidation of the genetic basis of human diseases and traits. An integral component of our research programme is centred on genome informatics. We have established data analysis pipelines for the interpretation of next-generation sequencing data and continue to evaluate and develop novel methodologies for the analysis and integration of large-scale genomic datasets, to build a comprehensive understanding of the role of the genome and its variation in health and disease. Our data analysis is principally undertaken on the GSTT BRC HPC Computational Cluster. We work closely with the BRC Genomics and Bioinformatics Core Facilities and collaborate with both clinical researchers and computer scientists within KHP, the UK and beyond.
Key Contacts: Michael Simpson ([email protected])

Group: Prof Michael Luck, Dept of Computer Science, KCL
Members of the group have worked on various topics in medically oriented research, e.g. arrhythmia classification on mobile phones and herpes virus identification. In particular they developed algorithms for mapping high throughput sequencing technologies as well as weighted and degenerate sequences.
The team also developed the first transcriptome map of mouse isochores from three distinct mouse tissues (muscle, liver, and brain). The team developed NGS prototype software:
– REAL (Read Aligner): an efficient, sensitive, and accurate alignment programme that can match or even outperform most well-known tools, e.g. SOAP2, Bowtie, BWA
– cREAL (circular REAL): an extension of REAL specifically designed for small genomes with circular structure, e.g. bacterial chromosomes
– DynMap: yet another alignment programme, specifically designed for aligning a set of reads against many closely related genomes, e.g. individuals of the same species
Members of the group worked on Haplotype Classification Algorithms, on disease gene identification related to haematopoiesis, on mammographic image analysis, and on Combinatorial Algorithms for Protein Folding Simulations and lattice protein folding simulations. Furthermore, team members worked on analysis of biological network properties, analysis of the effect of gut bacteria in rats, machine learning methods for data classification in biomedical data, analysis of expression in psoriasis skin disease data, and network analysis in cardiovascular medicine.
Fundamental research topics of the group include:
● Combinatorics on words and graphs
● Probabilistic analysis
● Constraint programming
● Combinatorial optimisation
● Automata theory
● Sequence analysis (alignment, re-alignment, etc.)
● Comparative genomics (evolution, function prediction)
● Classification of biomedical data
● Music analysis (rhythm, melody detection)
● Structure and search of large networks (www, p2p, wireless, etc.)
● Text compression
● Data mining
Key Contact: Michael Luck ([email protected])

This paper will be updated on request. Further information relating to health informatics in KHP should be forwarded to: [email protected] or [email protected]