Martin John Bishop
UK HGMP Resource Centre
Hinxton, Cambridge CB10 1SB
[email protected]
http://www.hgmp.mrc.ac.uk

Bioinformatics scope
Genome sequences - DNA
Transcripts - RNA
Proteins
Protein interactions
Macromolecular assemblies
Development and cellular function
Genetic linkage analysis

Molecular biology needs bioinformatics
Biological data (molecules): Sequences, Structures, Gene expression, Proteomes, Pathways, Evolution
Computer methods (analysis): Comparison, Modelling, Co-regulation, Mass spectrometry, Knowledge bases, Phylogenetics

Molecular biology is about information
Central dogma
Genome repository <-> RNA world -> Protein sequence -> Protein structure -> Protein function -> Phenotype <- Fed back to genome
DNA <-> RNA -> protein -> phenotype <- DNA
Molecules
Processes
Central paradigm
Information processing

The activities of HGMP-RC
HGMP-RC
Bioinformatics Services
MHC
Research
Fugu
Mouse sequencing
Biology Services
Technology development
Biological materials
Biological services by mail order including hotel facilities
Contract R&D
On-line service

On-line service
Services
Mail
Network News
Files/Backup
Information
Unrestricted
Data
Links
Analytical tools
Registered users
Public Data
Private Data

HGMP-RC SERVICE
Web menu
X (or VNC)
Java
Telnet
Telnet menu / Unix login

GENOME WEB
Up to date
Relevant
Fully searchable
Fully verified
Extensive

INTEGRATED ANALYSIS
BLAST
NIX
PIX
GLUE
PIE
MAGI
PINT

COMMON OPTIONS
EMBOSS
GCG
PINE
CLUSTAL
STADEN
PASSWORD

GENOMICS APPLICATIONS
Linkage Analysis
Radiation Hybrid Mapping
Sequence Ready Clone Maps
Genome Databases
Polymorphisms
Sequence Analysis
Gene Prediction
Expression Profiling
Phylogenetic Analysis
Integrated Tools - GLUE, RHYME, NIX, PIE

PROTEOMICS APPLICATIONS
Protein Sequence Analysis
Protein Structure Analysis
Protein Structural Modelling
Proteome Databases
Tools for Peptide Sequence Determination
Protein Cellular Localisation
Protein Functional Studies
Pathways and Protein Interactions
Integrated tools and databases - PIX

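The information flow on the "Molecular biology is about information" slide (DNA -> RNA -> protein) can be illustrated with a minimal Python sketch. It is not taken from any HGMP-RC tool: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand, and the standard genetic code is assumed throughout.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """DNA -> RNA: assumes `dna` is the coding strand, so T simply becomes U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """RNA -> protein: read codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    rna = transcribe("ATGGCCTAA")
    print(rna, "->", translate(rna))
```

The sketch covers only the forward direction of the chain; the feedback arrows to the genome and the steps from protein sequence to structure, function and phenotype are outside what a few lines of code can show.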
NETWORK / JANET SERVICE
LONDON
Currently: 34 Mbps main link
Future: keep 34 Mbps link for backup
CAMBRIDGE
Currently: 8 Mbps redundant link
Future: Gigabit Ethernet

SERVERS
More than 80 servers
1, 4 and 8 cpu SMP
Sparc and Intel
Solaris and Linux
Databases doubling every 14 months

LOADS
Load is the percentage of processes trying to run
Interactive load 50%
Job queues load 100%
Jobs waiting can be 6-10 times the work being processed

PROCESSES AND QUEUES
Menu service (hot swop)
General analysis (overloaded)
Sun BLAST and NIX queue
Dell BLAST queue
BLAST data file server
Interactive Linkage queue
Heavy Linkage queue

USERS' REAL WORLD PROBLEMS
Comparative method
Extrapolate from known to similar
Hints to reduce the amount of experimental work that needs to be done

SOFTWARE SYSTEMS
A variety of technical solutions are used
BLAST
NCBI Entrez
SRS
GeneCards
NIX
ENSEMBL

HELPING THE USER
Information discovery - completeness
Communication - multiple sites
Ontology - uniformity?
Software integration - ease of use
Reasoning about results
Monitoring - repeat queries

MAJOR CHALLENGES
User interface
Back end processing
Cost recovery

NEW TECHNOLOGIES?
Web services
GRID (EMBnet)
Object-orientated computing
Multi-agent systems

TREASURE
Web service with top level container
Customise for the user
User selects a service and opens it as an application
An alternative view can be built around user data as the fundamental objects

IMPLEMENTATION
EMBREO library written in Java handles web service layer (also CORBA, XML-RPC, JDBC and other connectivity)
Also handles file access and transfer and display of results (including use of VNC)
Simple Object Access Protocol (SOAP)
Browser channel uses XML format

USER ACCOUNTING AND CUSTOMIZATION
Currently: very complex
HED
NIS+
Filesystem configuration files
Future: a single database
Lightweight Directory Access Protocol (LDAP)

CREDITS
Gary Williams - Menu systems and Genome Web
Geoff Gibbs - Network and systems
Peter Tribble - Web servers, Queues, Treasure
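The SERVERS slide states that databases double every 14 months. As a rough planning aid, the arithmetic behind that figure can be sketched in Python. The function names are hypothetical, and the sketch assumes smooth exponential growth at exactly that rate, which the slides do not claim.

```python
import math

DOUBLING_MONTHS = 14  # figure quoted on the SERVERS slide


def growth_factor(months: float, doubling: float = DOUBLING_MONTHS) -> float:
    """How many times larger the databases are after `months`."""
    return 2 ** (months / doubling)


def months_to_grow(factor: float, doubling: float = DOUBLING_MONTHS) -> float:
    """How many months until the databases are `factor` times larger."""
    return doubling * math.log2(factor)


if __name__ == "__main__":
    print(f"Growth over one year: x{growth_factor(12):.2f}")
    print(f"Months until tenfold growth: {months_to_grow(10):.1f}")
```

Under this assumption the data volume roughly doubles every budget cycle or so, which is one way to read the slides' concern with back end processing and cost recovery.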