Welcome to
Integrated Bioinformatics
Friday, 8 September 2006
• Comparison of genomes – Scenario
• Installing and running Blast
• Weekend/Monday – How to find differences
• Nature of research articles
E. coli: What makes it kill?
Escherichia coli . . .
. . . very small lab rats
Courtesy of Kent State University Microbiology
E. coli: What makes it kill?
Escherichia coli . . .
haemorrhagic colitis
E. coli: What makes it kill?
E. coli K12
TCTACTTATA
AAGAGTCTGT
TTCTGTCTGC
TGGATTTCGG
GAACCTTAGT
CTCCGTAAAC
TGAATAAACT
AAGAGTTTAA
AAACCTGTAT
TTATATATTT
CCCCAGCTGT
GACAGCACTG
GCTGAAATTC
CCCTGCACCA
ATGAATGACT
TTCAATCCAC
TGAATGAACA
TCTGACCTCT
AACTCTAGCC
GACTTCTGCT
CTCTAACATG
TTGTTAAAGG
AGTTAAAAAC
GGTTACATGA
TAAGAAATTA
CATTAAAAAG
ACCCTCAAGA
CGCTGAGAGC
GGTCTTTCCT
GAACGAACGA
AGGGCTACAC
CATACATGGT
GGCAGCTTTC
TGCCCCACTC
ATACCAAAGT
ATGTCAGCAA
TACAAATGAA
GAATTGCAGT
ACTGCCTAAA
ATTGCAATTA
AGGCAAATAC
AGGCACCGGC
AGAGTGGTAC
GTGGGCACTG
TTGAATGAAA
Gene finder
E. coli O157:H7
TCTACTTATA
AAGAGTCTGT
TTCTGTCTGC
TGGATTTCGG
GAACCTTAGT
CTCCGTAAAC
TGAATAAACT
AAGAGTTTAA
AAACCTGTAT
TTATATATTT
CCCCAGCTGT
GACAGCACTG
GCTGAAATTC
CCCTGCACCA
ATGAATGACT
TTCAATCCAC
TGAATGAACA
TCTGACCTCT
AACTCTAGCC
GACTTCTGCT
CTCTAACATG
TTGTTAAAGG
AGTTAAAAAC
GGTTACATGA
TAAGAAATTA
CATTAAAAAG
ACCCTCAAGA
CGCTGAGAGC
GGTCTTTCCT
GAACGAACGA
AGGGCTACAC
CATACATGGT
GGCAGCTTTC
TGCCCCACTC
ATACCAAAGT
ATGTCAGCAA
TACAAATGAA
GAATTGCAGT
ACTGCCTAAA
ATTGCAATTA
AGGCAAATAC
AGGCACCGGC
AGAGTGGTAC
GTGGGCACTG
TTGAATGAAA
Gene finder
E. coli: What makes it kill?
Killer protein
Killer functions
Membrane protein, sodium transporter
Iron responsive transcriptional regulator
Calcium-dependent protein kinase
Unknown protein
Unknown protein
Similarity finder
Unknown protein
...
ideas for new antibiotics
Welcome to
Integrated Bioinformatics
Friday, 8 September 2006
• Nature of research articles
• Comparison of genomes - Scenario
• Weekend/Monday – How to find differences
– Parsing programs
– Regular expressions
Welcome to
Integrated Bioinformatics
Friday, 8 September 2006
• Nature of problem sets
• Nature of research articles
• Comparison of genomes - Scenario
• Weekend/Monday – How to find differences
• Today – Why differences
How do differences arise between genomes?
Addition/deletion of DNA
Where do they come from?
How to distinguish foreign from native genes?
– GC-content
How do differences arise between genomes?
Addition/deletion of DNA
Point mutation
organism 1 TTT TCT GAA TCC GTA GAC GTT
organism 2 TTT TCT GAA TCA GCA GAC GTG
What kind of mutations arise?
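A minimal sketch of how the two sequences above could be compared codon by codon. The function name `codon_differences` is made up for this illustration; only the two sequences come from the slide.

```python
# Codon-by-codon comparison of the two sequences shown on the slide.
org1 = "TTT TCT GAA TCC GTA GAC GTT".split()
org2 = "TTT TCT GAA TCA GCA GAC GTG".split()

def codon_differences(codons_a, codons_b):
    """Return (codon number, codon_a, codon_b, position in codon) per mismatch."""
    diffs = []
    for number, (a, b) in enumerate(zip(codons_a, codons_b), start=1):
        for pos, (x, y) in enumerate(zip(a, b), start=1):
            if x != y:
                diffs.append((number, a, b, pos))
    return diffs

differences = codon_differences(org1, org2)
for number, a, b, pos in differences:
    print(f"codon {number}: {a} -> {b} (position {pos} of the codon)")
```

The sketch assumes both sequences are already aligned and of equal length, as they are on the slide; it does not handle insertions or deletions.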
How do differences arise between genomes?
Addition/deletion of DNA
Point mutation
Keeping track of gene variants
– Concepts of ortholog / paralog
How do differences arise between genomes?
Infection
[Figure: phage infecting a bacterium. Labels: phage, phage genome, bacterial chromosome, lytic pathway, lysogenic pathway, death, general transduction]
How do differences arise between genomes?
Infection
[Figure: phage infecting a bacterium. Labels: phage, phage genome, bacterial chromosome, lytic pathway, lysogenic pathway, Life!, specialized transduction]
The gene encoding diphtheria toxin (tox)
is carried on corynephage β
β
tox− C.d.
tox+ C.d.
Lysogenic conversion by corynephage β confers toxigenicity!
How to distinguish foreign from native genes?
GC-content = ([G] + [C]) / [total nucleotides]
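The formula can be sketched in a few lines of Python. The function name `gc_content` and the choice to ignore any character other than A, C, G and T are assumptions made for this illustration.

```python
def gc_content(sequence):
    """Fraction of G and C among the A, C, G, T nucleotides of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(base in "GC" for base in counted)
    return gc / len(counted)

# A short test sequence; whole genomes or single genes work the same way.
example = "TTTTCTGAATCCGTAGACGTT"
print(f"GC content: {gc_content(example):.1%}")
```

Applied gene by gene, the same function gives the per-gene values that can then be compared with the genome-wide average.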
SQ2: List the two triplets that code for Lys. What
proportion of each is used in Borrelia burgdorferi
compared to Mycobacterium tuberculosis? Is this
finding surprising? Why or why not?
Codon  Amino acid  Borrelia burgdorferi (29% GC content)  Mycobacterium tuberculosis (65% GC content)
AAU    Asn         0.80                                   0.21
AAC    Asn         0.20                                   0.79
AAA    Lys         0.80                                   0.26
AAG    Lys         0.20                                   0.74
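Proportions like those above could be tallied from a coding sequence roughly as follows. This is a sketch only: `synonymous_usage` is a hypothetical helper, and `toy_gene` is a made-up sequence, not data from either organism.

```python
from collections import Counter

LYS_CODONS = ("AAA", "AAG")  # the two lysine codons, written in the DNA alphabet

def synonymous_usage(coding_sequence, codons):
    """Proportion of each listed codon among all occurrences of those codons."""
    seq = coding_sequence.upper().replace("U", "T")  # accept RNA or DNA input
    usable = len(seq) - len(seq) % 3                 # ignore a trailing partial codon
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(t for t in triplets if t in codons)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("none of the requested codons occur in the sequence")
    return {codon: counts[codon] / total for codon in codons}

# Made-up reading frame for illustration: four AAA codons and one AAG codon.
toy_gene = "".join(["ATG", "AAA", "AAG", "AAA", "GCT", "AAA", "AAA", "TAA"])
usage = synonymous_usage(toy_gene, LYS_CODONS)
print(usage)
```

A real table would pool counts over every annotated gene in the genome rather than a single short reading frame.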
How to distinguish foreign from native genes?
SQ4: The GC content of Bacillus anthracis is
33.97%. By analysis of codon use, would it likely be
easier to detect a foreign gene originating from
Borrelia burgdorferi or from Mycobacterium
tuberculosis?
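One rough way to approach SQ4 is to ask how far each candidate donor's GC content lies from the host's. The sketch below uses only the three percentages given on the slides and treats the GC gap as a crude stand-in for the difference in codon use, which is an assumption of this illustration.

```python
host_gc = 33.97  # Bacillus anthracis, from the question
donor_gc = {
    "Borrelia burgdorferi": 29.0,
    "Mycobacterium tuberculosis": 65.0,
}

# Absolute gap, in percentage points, between each donor and the host.
gc_gap = {name: abs(gc - host_gc) for name, gc in donor_gc.items()}
most_distinct = max(gc_gap, key=gc_gap.get)

for name, gap in gc_gap.items():
    print(f"{name}: {gap:.2f} percentage points from the host")
print("Largest gap:", most_distinct)
```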
DNA mutation has multiple causes
• Errors during DNA replication
• base mis-incorporation
• polymerase slippage / repeat amplification
• Errors during recombination or cell division
• chromosome loss or rearrangement
• large insertions or deletions
• Environmental factors – mutagens:
• radiation – UV or ionizing radiation
• chemical – many mechanisms of action
• Spontaneous events:
• tautomerisation
• depurination
• deamination
• Viral infection or transposons
How do differences arise between genomes?
Addition/deletion of DNA
Point mutation
organism 1 TTT TCT GAA TCC GTA GAC GTT
organism 2 TTT TCT GAA TCA GCA GAC GTG
Silent mutation
Codon  Amino acid    Codon  Amino acid
GUU    Val           GCU    Ala
GUC    Val           GCC    Ala
GUA    Val           GCA    Ala
GUG    Val           GCG    Ala
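Whether a point mutation is silent can be checked by translating both codons. The sketch below builds the standard genetic code from its conventional compact listing (bases in the order T, C, A, G) and applies it to the three codon pairs that differ between organism 1 and organism 2. The names `CODON_TABLE` and `classify` are chosen for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify(codon_a, codon_b):
    """Label a codon change as 'silent' or as an amino acid change."""
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    if aa_a == aa_b:
        return f"silent ({aa_a})"
    return f"amino acid change ({aa_a} -> {aa_b})"

# The three codon pairs that differ between organism 1 and organism 2.
for a, b in [("TCC", "TCA"), ("GTA", "GCA"), ("GTT", "GTG")]:
    print(f"{a} -> {b}: {classify(a, b)}")
```

The table uses T where the slide's codons use U, since the sequences are written as DNA.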
Single base mutations
Transitions
Purine for purine
or
pyrimidine for pyrimidine
Transversions
Purine for pyrimidine
or
pyrimidine for purine
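The two definitions translate directly into a small check. This is a sketch; the function name `substitution_type` is an assumption of this illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(old_base, new_base):
    """Classify a single-base substitution as a transition or a transversion."""
    old_base, new_base = old_base.upper(), new_base.upper()
    for base in (old_base, new_base):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if old_base == new_base:
        raise ValueError("the two bases are identical, so nothing was substituted")
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"

# Base changes seen between organism 1 and organism 2 (TCC/TCA, GTA/GCA, GTT/GTG).
for old, new in [("C", "A"), ("T", "C"), ("T", "G")]:
    print(f"{old} -> {new}: {substitution_type(old, new)}")
```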
How do differences arise between genomes?
Addition/deletion of DNA
Point mutation
organism 1 TTT TCT GAA TCC GTA GAC GTT
organism 2 TTT TCT GAA TCA GCA GAC GTG
Transition: purine ↔ purine or pyrimidine ↔ pyrimidine
Transversion: purine ↔ pyrimidine
Tautomerization of bases
Normal pairing: C with G, T with A
Rare tautomers: C* with A, T* with G
DNA replication can “lock in” a mutation
Mutations can arise as a consequence of misincorporation during replication
How to distinguish foreign from native genes?
SQ7: There are two codons each for 9 of the amino
acids. Choose any one of these 18 codons.
• Create a transition mutation in the third position of
the codon. What is the result?
• Create a transversion mutation in the third position.
What is the result?
• In the third position, are transition mutations or
transversion mutations more likely to result in a
change in the amino acid encoded?
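The exercise in SQ7 can also be run exhaustively rather than for a single codon. The sketch below assumes the standard genetic code, finds every amino acid with exactly two codons, mutates the third position of each codon in all three possible ways, and tallies how often the amino acid changes. All names in it are chosen for this illustration.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TTT, TTC, TTA, TTG, TCT, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}

def is_transition(old_base, new_base):
    """True when both bases are purines or both are pyrimidines."""
    return (old_base in PURINES) == (new_base in PURINES)

# Group codons by amino acid and keep the amino acids with exactly two codons.
codons_by_aa = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_by_aa[aa].append(codon)
two_codon_sets = [codons for codons in codons_by_aa.values() if len(codons) == 2]

# Tally (mutation class, amino acid changed?) over every third-position change.
tally = Counter()
for codons in two_codon_sets:
    for codon in codons:
        for new_base in BASES:
            if new_base == codon[2]:
                continue
            mutant = codon[:2] + new_base
            kind = "transition" if is_transition(codon[2], new_base) else "transversion"
            changed = CODON_TABLE[mutant] != CODON_TABLE[codon]
            tally[(kind, changed)] += 1

for (kind, changed), count in sorted(tally.items()):
    print(f"{kind}, amino acid changed = {changed}: {count}")
```

Stop codons are counted as a change of product here, so a mutation that creates a stop codon is tallied as changing the amino acid.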
How do differences arise between genomes?
Addition/deletion of DNA
Point mutation
Keeping track of gene variants
– Concepts of ortholog / paralog
Orthologs, Paralogs, and Xenologs
Speciation event
leading to orthologs
Horizontal transfer
leads to xenologs
Gene duplication
gives rise to paralogs
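The definitions above can be applied to a gene tree by asking which event sits at the last common ancestor of two genes. The sketch below uses a made-up tree with hypothetical genes X1, X2, Y1 and Y2; it is not the tree from the slides, and it covers only speciation and duplication. A horizontal transfer, which would make a pair xenologs, is not modelled.

```python
# A gene tree as nested tuples: (event, left subtree, right subtree); leaves are gene names.
# Hypothetical example: a duplication followed by a speciation in each copy.
toy_tree = (
    "duplication",
    ("speciation", "X1", "Y1"),
    ("speciation", "X2", "Y2"),
)

RELATION = {"speciation": "orthologs", "duplication": "paralogs"}

def leaves(node):
    """Set of gene names below a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)

def relationship(tree, gene_a, gene_b):
    """Name the relationship implied by the event at the genes' last common ancestor."""
    if gene_a == gene_b or not {gene_a, gene_b} <= leaves(tree):
        raise ValueError("need two different genes that are both in the tree")
    node = tree
    while True:
        event, left, right = node
        for child in (left, right):
            if {gene_a, gene_b} <= leaves(child):
                node = child  # both genes sit deeper in this subtree
                break
        else:
            return RELATION[event]  # the genes split at this node

print(relationship(toy_tree, "X1", "Y1"))
print(relationship(toy_tree, "X1", "X2"))
print(relationship(toy_tree, "X1", "Y2"))
```

This approach needs a tree whose internal nodes are already labelled with events, which is exactly the information that is often missing in practice.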
Orthologs vs Paralogs
SQ5: Are genes B1 and C2 orthologs or paralogs?
How to predict orthology with imperfect information?
[Figure: gene tree with labels A1, AB1, B1, B2, C1, C2, C3, Species A, Species B, Species C]