European Molecular Biology Laboratory, European Bioinformatics Institute

EMBL (http://www.embl.de/): from 1974
EBI (http://www.ebi.ac.uk/): from 1996
The European Molecular Biology Laboratory (EMBL) is supported by sixteen countries. It consists of the main Laboratory in Heidelberg (Germany), Outstations in Hamburg (Germany), Grenoble (France) and Hinxton (U.K.), and an external Research Programme in Monterotondo (Italy).
The EBI Mission
• To provide bioinformatics facilities for the scientific community
• To become a flagship laboratory for research in bioinformatics
• To provide bioinformatics training
• To help disseminate standards and technologies
Role of Bioinformatics
• To support experimental biology
• To collect and archive data
• To provide framework and integration
• To give easy access to data
• To make new discoveries through data analysis
• To predict through modelling
• To facilitate application and exploitation of academic research in medicine, agriculture, health and environment
Dramatic Changes in Biology over the Last 5 Years
• Data explosion and new types of data
• Move towards high-throughput biology
• Move towards systems biology
• Much larger community, often naïve users
• Growth of applied biology: molecular medicine, agriculture, food, environmental sciences
[Figure: diagram relating Bioinformatics to genomes, literature, expression profiling, proteome data, metabolic data, comparative genomics, biochemistry, mutant/RNAi data, and hypotheses and in silico models]
Molecules to Cells to Organisms
[Figure labels: Protein; E. coli Genome; Genomes]
Systems Biology
[Figure: signalling pathway diagram from Input to Output; labels: Adaptor, CheA, CheW, CheY, CheZ, CheB, CheR, Methyl, ATP, ADP, Pi, Flim C]
Molecular Basis of Disease
• p53 tumour suppressor core domain: cancers of many types
• Cu-Zn superoxide dismutase: autosomal dominant amyotrophic lateral sclerosis
From Structure to Functional Annotation
• Linking to domain data, eFamily
• Sequence mapping, SIFTS
• MSDchem ligand data
• Electron density visualisation
• AstexViewer, MSDPro, MSDlite
• MSDsite active sites
• SSM fold matching
• PQS biological assemblies
• Surface matching
From Structure to Biochemical Function
Gene → Protein → 3D Structure → Function
Given a protein structure:
• Where is the functional site?
• What is the multimeric state of the protein?
• Which ligands bind to the protein?
• What is the biochemical function?
High Throughput
• A new sequence every 4 seconds
• 600 000 web requests a day
• 100 000 users
• 5-10 core databases
• 20 000 000 cross-references
• About 160 other databases
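As a rough back-of-the-envelope check, the rates quoted above can be converted to common units. The sketch below assumes the figures are steady averages over a 24-hour day, which the slide does not state; the variable names are illustrative only.

```python
# Convert the quoted throughput figures to per-day and per-second rates.
# Assumption: both figures are steady averages over a 24-hour day.
SECONDS_PER_DAY = 24 * 60 * 60

seconds_per_new_sequence = 4      # "a new sequence every 4 seconds"
web_requests_per_day = 600_000    # "600 000 web requests a day"

sequences_per_day = SECONDS_PER_DAY // seconds_per_new_sequence
requests_per_second = web_requests_per_day / SECONDS_PER_DAY

print(f"New sequences per day: {sequences_per_day:,}")
print(f"Web requests per second: {requests_per_second:.1f}")
```

Under that assumption, one sequence every 4 seconds corresponds to roughly twenty thousand new sequences a day, and 600 000 daily requests to about seven requests per second.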
Data Growth
[Figure: web requests per day (excluding Ensembl), Dec-00 to Dec-02; vertical axis from 0 to 500 000]
ftp
Year    Million files    Terabytes
2001    4.5              11914
2002    5.6              11809
2003    13.5             43860
2004    17.3             60508
2005    26.3             85396
Web Server Requests
Year    Requests (millions)    Requests (exact)
2002    118                    118631650
2003    255                    255399724
2004    354                    354235704
2005    482                    482076196
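The web server request totals above imply a year-on-year growth rate that is easy to work out. The following is a minimal sketch using only the four yearly totals from the slide; the dictionary and variable names are illustrative, not part of the original presentation.

```python
# Yearly web server request totals, copied from the slide.
requests_by_year = {
    2002: 118_631_650,
    2003: 255_399_724,
    2004: 354_235_704,
    2005: 482_076_196,
}

# Percentage growth of each year relative to the previous one.
growth_percent = {}
years = sorted(requests_by_year)
for previous, current in zip(years, years[1:]):
    ratio = requests_by_year[current] / requests_by_year[previous]
    growth_percent[current] = (ratio - 1) * 100

# Overall multiplication factor from the first to the last year.
overall_factor = requests_by_year[years[-1]] / requests_by_year[years[0]]

for year, pct in growth_percent.items():
    print(f"{year}: {pct:.1f}% more requests than {year - 1}")
print(f"{years[0]} to {years[-1]}: x{overall_factor:.2f}")
```

The traffic more than doubled from 2002 to 2003 and kept growing by over a third in each of the following two years, for roughly a fourfold increase across the period.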
Year    Distinct hosts served    Number of users (millions)
2002    1586883                  1.5
2003    2784974                  2.7
2004    3656109                  3.6
2005    3919564                  3.9
Dynamic pages: domains (2005)
Rank    Domain                               Share
1.      .uk (United Kingdom)                 21.14%
2.      .com (Commercial)                    17.16%
3.      [unknown domain]                     13.37%
4.      [unresolved numerical addresses]     11.05%
5.      .edu (USA Higher Education)          5.29%
6.      .net (Networks)                      5.27%
7.      .fr (France)                         4.76%
8.      .it (Italy)                          4.68%
9.      .de (Germany)                        2.81%
10.     .nl (Netherlands)                    2.00%
The Services of the EBI
• Nucleotide sequences
• Genes
• Transcription information
• Protein sequences
• Protein families
• Macromolecular structures
• Molecular interactions
• Pathways
• Metabolic information
• Scientific literature
Structure of EBI: Services
[Figure: organisation chart. Database Integration and External Services: Lopez. Other names shown: Apweiler, Stoesser, Stoehr, Zhu, Henrick, Brazma, Birney]
Structure of EBI: Research
[Figure: organisation chart. Labels shown: Text Mining (Schuhmann); Structural Proteomics; Computational Genomics (Ouzounis); Thornton; ?; Neuroinformatics (Le Novere); Phylogeny & Evolution (Goldman)]
EBI Databases
• EMBL-Bank: DNA sequences
• SWISS-PROT + TrEMBL: protein sequences
• EMSD: macromolecular structure data
• ArrayExpress: microarray expression data
• EnsEMBL: human genome gene annotation
• IntAct: protein interactions
• GKB: pathways
Integration
Integrative science demands integrative resources

EBI databases have a backbone of integrative links
• 20 000 000 cross-references support trans-database navigation
• Is this good enough?
  • sparse and coarse-grained
  • not straightforward to use
Major efforts involved in integration
• InterPro: database of protein families, domains and functional sites.
• Integr8: data integration project co-ordinated by the EBI, to provide an integrated layer for the exploitation of genomic and proteomic data.
• GRID technologies
European Patent Office
• Support the inclusion of sequence data in the public databases
• Development of tools to capture sequence data
• Run their searches at the EBI
• (Similar arrangements in the USA and Japan ensure exchange)
• Analogous systems being developed for structure information
Industry Support
• Current successful Industry programme for Pharma
  • Quarterly meetings
  • R&D training workshops
  • Industry Forum
  • Funded by subscriptions
• New SME programme under development
New Data
• Expression data
• ChIP-on-chip
• Proteomic data
• Metabolome data
• Human variation
• Atlases
• Disease links
• Electron tomographs
• ??
http://www.ebi.ac.uk/2can/
The Magic Search Box