Genomics
Genomics is the study of an organism's genome and the function of the genes.
More than 200 microbial genomes completely sequenced.
Key question: How to use this rich source of information?
DNA code: A, G, C, T
[Figure: a gene delimited by start and stop; labels D, W, C, I]
Functional genomics
Single genes   Aspect (method)                                   All genes
DNA            Organisation (HT-sequencing)                      GENOME
RNA            Expression (DNA-arrays)                           TRANSCRIPTOME
PROTEIN        Synthesis/Structure (2D gels, MS, NMR, X-ray)     PROTEOME
METABOLISM     Flux (NMR, kinetics, model)                       METABOLOME
FUNCTION
Reading the genome map
Steps
1. Determine complete DNA sequence
2. Predict genes
3. Translate genes to proteins
4. Predict functions of proteins
5. Reconstruct metabolic pathways
6. Predict regulatory elements
7. Reconstruct regulatory networks
Next: experimental confirmation!
transcriptomics, proteomics, metabolomics
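Steps 2 and 3 of the list (predict genes, translate genes to proteins) can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that a gene is any stretch from ATG to the next in-frame stop codon on the forward strand, and it uses the standard genetic code. Real gene prediction also considers the reverse strand, alternative start codons, and ribosome binding sites. The function names `find_orfs` and `translate` and the toy sequence are made up for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop stretch, forward strand only.

    min_codons is the minimum number of codons before the stop codon
    (an assumption of this sketch, not a value from the slides).
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def translate(dna):
    """Translate a coding sequence codon by codon; drop the trailing stop."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons).rstrip("*")


# Hypothetical toy sequence, not taken from a real genome.
toy = "CCATGAAAGGGTTTTAACC"
for start, end, frame in find_orfs(toy):
    print(start, end, frame, translate(toy[start:end]))
```

The later steps (function prediction, pathway reconstruction) rely on comparison against databases and are not covered by this sketch.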
Genomics: from sequence to predicted function
Raw sequence data:
Bacterial sequence of 2,000,000 to 5,000,000 nucleotides
AAACACTTAGACAATCAATATAAAGATGAAGTGAA
CGCTCTTAAAGAGAAGTTGGAAAACTTGCAGGAAC
AAATCAAAGATCAAAAAAGGATAGAAGAACAAGAA
AAACCACAAACACTTAGACAATCAATATAAAGATG
AAGTGAACGCTCTTAAAGAGAAGTTGGAAAACTTG
CAGGAACAAATCAAAGATCAAAAAAGGATAGAAGA
ACAAGAAAAACCACAAACACTTAGACAATCAATAT
AAAGATGAAGTGAACGCTCTTAAAGAGAAGTTGGA
AAACTTGCAGGAACAAATCAAAGATCAAAAAAGGA
TAGAAGAACAAGAAAAACCACAAACACTTAGACAA
TCAATATAAAGATGAAGTGAACGCTCTTAAAGAGA
AGTTGGAAAACTTGCAGGAACAAATCAAAGATCAA
AAAAGGATAGAAGAACAAGAAAAACCACAAACACT
TAGACAATCAATATAAAGATGAAGTGAACGCTCTT
AAAGAGAAGTTGGAAAACTTGCAGGAACAAATCAA
AGATCAAAAAAGGATAGAAGAACAAGAAAAACCAC
AAACACTTAGACAATCAATATAAAGATGAAGTGAA
CGCTCTTAAAGAGAAGTTGGAAAACTTGCAGGAAC
AAATCAAAGATCAAAAAAGGATAGAAGAACAAGAA
AAACCAC
A virtual cell: overview of predicted pathways
What do we want to learn?
Overview of
• complete repertoire of genes and proteins
• complete metabolic network
• complete regulatory network
• diversity and evolution
Systems biology: understand how a whole cell works
Genome content
Organism    Size (Mb)   Total genes
bacteria    2           2,000
yeast       12          6,300
worm        97          19,000
fly         137         14,000
man         3,500       30,000 ?
[Figure: % genes versus "junk ?" for each genome]
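The "junk ?" question becomes clearer when the sizes and gene counts listed above are turned into a gene density. The sketch below is simple arithmetic on the slide's own numbers; it assumes the dots in the slide are thousands separators (3.500 Mb = 3,500 Mb), and the human gene count carries the slide's own question mark. The names `GENOMES` and `genes_per_mb` are chosen for this example.

```python
# (size in Mb, total genes) as listed on the slide.
GENOMES = {
    "bacteria": (2, 2000),
    "yeast": (12, 6300),
    "worm": (97, 19000),
    "fly": (137, 14000),
    "man": (3500, 30000),  # gene count marked "?" on the slide
}


def genes_per_mb(size_mb, genes):
    """Average number of genes per megabase of genome."""
    return genes / size_mb


for organism, (size_mb, genes) in GENOMES.items():
    print(f"{organism:<9}{genes_per_mb(size_mb, genes):8.1f} genes per Mb")
```

The density drops by roughly two orders of magnitude from bacteria to man, which is the pattern the slide points at.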
Microbial genomes
Microbial genome sequencing
1995-2000: Mainly pathogenic bacteria
2000-present: Genomes of many food-relevant micro-organisms
- Lactic Acid Bacteria
- Food Spoilage Bacteria
Genome Sequencing Projects
2005:
• 250 complete genomes
• 600 million bases
• 600 thousand proteins
[Figure: time axis marked 1997, 2000, 2003]
Microbial genomes
                     Archaea     Bacteria
sequenced genomes    23          236
size range (Mb)      0.5-5.8     0.6-9.1
genes                540-4500    470-8300
% GC                 31-68       22-72
Coding density is ~ 85-90%
Average of ~ 1 gene per 1 kb
Status Sept. 2004
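Both the % GC row and the "1 gene per 1 kb" rule of thumb are easy to compute. The following is a minimal sketch: `gc_content` counts G and C among the A, C, G, T bases of a sequence, and `estimated_genes` applies the average of about one gene per kb to a genome size in Mb. Both function names are chosen for this example, and the fragment is the first line of the raw sequence shown earlier in the slides.

```python
def gc_content(dna):
    """Percentage of G and C among the A, C, G, T bases of a sequence."""
    dna = dna.upper()
    counted = sum(dna.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (dna.count("G") + dna.count("C")) / counted


def estimated_genes(size_mb, genes_per_kb=1.0):
    """Rough gene count from genome size, assuming ~1 gene per kb."""
    return round(size_mb * 1000 * genes_per_kb)


fragment = "AAACACTTAGACAATCAATATAAAGATGAAGTGAA"
print(f"GC content of fragment: {gc_content(fragment):.1f}%")
print("Estimated genes in a 2 Mb genome:", estimated_genes(2))
```

A 35-base fragment is far too short to characterise a genome; the reported ranges refer to complete genomes.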
Bacterial genomes
Chromosomes: 1, circular, 0.6 - 9 Mb
Plasmids: 0-10, circular, 1 - 250 kb
Exceptions
Linear chromosomes
• Borrelia burgdorferi: 0.91 Mb
• Rickettsia typhi: 1.11 Mb
• Desulfotalea psychrophila: 3.52 Mb
• Streptomyces coelicolor: 8.67 Mb
Two chromosomes
• Ralstonia solanacearum: 3.72 and 2.09 Mb
• Agrobacterium tumefaciens: 2.84 and 2.07 Mb
• Vibrio cholerae: 2.96 and 1.07 Mb
• Brucella melitensis: 2.12 and 1.18 Mb
• Deinococcus radiodurans: 2.65 and 0.41 Mb
Biological Databases
Database types:
• sequence: EMBL, GenBank
• annotation: SwissProt
• enzyme: Enzyme, Brenda
• genome: Entrez, EBI-Genome Reviews
• structure: PDB, SCOP
• pathway: KEGG, EcoCyc
• organism: FlyBase, WormBase
• organizational: Pfam, COG
Summarized each year in Nucleic Acids Res., January issue
Genome Databases
Main databases
• NCBI Entrez
www.ncbi.nlm.nih.gov/genomes/lproks.cgi
• EBI Genome Reviews
www.ebi.ac.uk/genomes/bacteria.html
• TIGR Comprehensive Microbial Resource (CMR)
www.tigr.org/tdb
• Integrated Genomics GOLD
www.genomesonline.org
• CBS Genome Atlas
www.cbs.dtu.dk/services/GenomeAtlas
Genome Databases
Specialized databases
• Sanger Institute (UK): own genomes, many pathogenic bacteria
www.sanger.ac.uk/projects
• Pasteur Institute (France): own genomes, many pathogenic bacteria
www.pasteur.fr/english.html
• MIPS (Germany): PEDANT, all genomes
http://pedant.gsf.de/
• DOE-JGI (USA): own genomes, many microbial and environmental
http://genome.jgi-psf.org/microbial/
Genome Databases
Overviews of databases
• ABIM (France): organism databases
www.up.univ-mrs.fr/~wabim/english/genome.html
Complete Genomes
• COGENT (COmplete GENome Tracking: a flexible data environment for computational genomics) EBI (UK)
• Complete genomes NCBI (Haemophilus influenzae, E. coli, Mycoplasma genitalium)
• Completed Genomes at the EBI EBI (UK)
• Completed microbial genomes InfoBioGen (France)
• Completed microbial genomes TIGR
• Completely sequenced genomes Rockefeller (USA)
• EMGLib (completely sequenced bacterial genomes and the yeast genome) PBIL (France)
• Fully Sequenced Genomes Present In The Public DataBases GOLD (USA)
• Integr8 (integrated views of complete genomes and proteomes) EBI (UK)
• PEDANT (Protein Extraction, Description, and Analysis Tool) MIPS (more 200
Genome Databases
Comparative genomics databases
• ERGO (USA): Comparative genomics analysis
http://ergo.integratedgenomics.com/ERGO
Genome Databases
Comparative genomics databases
• STRING (Germany): Search Tool for the Retrieval of Interacting Genes/Proteins
http://www.bork.embl-heidelberg.de/STRING
Genome Databases
Metabolic pathway-genome databases (PGDB)
• KEGG (Japan): Kyoto Encyclopedia of Genes and Genomes
http://www.genome.jp/kegg/kegg2.html
• EcoCyc (USA): E. coli metabolic pathways (highly curated)
http://www.ecocyc.org
• BioCyc: collection of PGDBs
http://www.biocyc.org
Modeling metabolic networks: what are the questions?
• modeling the components and their wiring (roadmap)
• modeling regulatory interactions (traffic lights)
• modeling fluxes and dynamics (traffic)
• predictive modeling: rational design (solve traffic jams)
• “genomics modeling”: provide biological interpretation of omics data
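The first question in the list, the "roadmap" of components and their wiring, can be illustrated with a toy network. The sketch below stores reactions as substrate-to-product edges and asks which metabolites are reachable from a starting compound, optionally with reactions removed. The reaction set is a heavily simplified, hypothetical fragment of sugar fermentation chosen for illustration; the reaction names and the `reachable` function are assumptions of this example, and it says nothing about regulation or fluxes.

```python
from collections import deque

# Toy network: reaction name -> (substrate, product). Illustrative only.
REACTIONS = {
    "hexokinase": ("glucose", "G6P"),
    "isomerase": ("G6P", "F6P"),
    "lower_glycolysis": ("F6P", "pyruvate"),  # several steps lumped together
    "lactate_dehydrogenase": ("pyruvate", "lactate"),
    "pyruvate_dehydrogenase": ("pyruvate", "acetyl-CoA"),
}


def reachable(source, reactions, knocked_out=()):
    """Breadth-first search over the wiring; returns all reachable metabolites."""
    graph = {}
    for name, (substrate, product) in reactions.items():
        if name not in knocked_out:
            graph.setdefault(substrate, []).append(product)
    seen = {source}
    queue = deque([source])
    while queue:
        metabolite = queue.popleft()
        for nxt in graph.get(metabolite, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable("glucose", REACTIONS)))
print(sorted(reachable("glucose", REACTIONS, knocked_out={"lactate_dehydrogenase"})))
```

Removing a reaction and checking what is still reachable is only a qualitative stand-in for rational design; answering the "traffic" questions requires kinetic or flux models.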
Genome sequence annotation