Genetics I (prokaryotes)
IT Carlow Bioinformatics
September 2006
Biochemistry
• How biology works
• Mechanisms
Genetics
• How things are inherited
• Why you are like your parents but also
different
• Where genes, pathways, wings, flippers
come from
• How things develop from zygote
• But it’s all molecular biology nowadays
Genetics
• The interesting stuff
• “Nothing in biology makes sense except in
the light of evolution” Dobzhansky
• “Nothing in bioinformatics makes sense except in
the light of evolution” Higgs & Attwood
• Evolution = change in gene frequency over time
• What is a gene? What is frequency? What is
change? What is time? What is life?
Genome size and differences
Species    Genome size (bp)    Genes
Human      3,000,000,000       25,000
Yeast      16,000,000          6,500
E. coli    4,000,000           4,000
• Gene ≈ 1000 bp ≈ 300 AA
• All descended from LUCA (last universal common ancestor)
• How?
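The rule of thumb above (one gene is roughly 1000 bp) gives a quick estimate of how much of each genome is taken up by genes. A minimal sketch using the slide's round numbers; the names `GENOMES` and `coding_fraction` are made up for illustration.

```python
# Round figures from the slide: (genome size in bp, number of genes).
GENOMES = {
    "Human": (3_000_000_000, 25_000),
    "Yeast": (16_000_000, 6_500),
    "E. coli": (4_000_000, 4_000),
}
AVG_GENE_BP = 1000  # the slide's rule of thumb, not a measured value


def coding_fraction(genome_bp, n_genes, gene_bp=AVG_GENE_BP):
    """Fraction of the genome covered by genes of average length gene_bp."""
    return n_genes * gene_bp / genome_bp


for species, (size, genes) in GENOMES.items():
    print(f"{species}: {coding_fraction(size, genes):.1%}")
```

On these numbers E. coli comes out close to fully packed with genes while human is under 1%. Real gene lengths vary, so treat this as an order-of-magnitude estimate only.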
DNA
• Double helix
– 10 Å radius (1 nm, or better 1.2 nm)
– 34 Å for a single turn
– 3.4 Å for a single base (0.34 nm)
– 10 bp per turn
• E. coli 4 Mb: how many Å³/nm³ of DNA?
• Size of E. coli? About 1 × 2 µm
• Thinking exercise: what % of E. coli volume is DNA?
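One way to work through the thinking exercise is to treat both the DNA and the cell as cylinders. This is a rough sketch, assuming a 1 nm helix radius, 0.34 nm per base pair, and a cell 1 µm in diameter and 2 µm long (one reading of "1 × 2").

```python
import math

BP = 4_000_000      # E. coli genome size from the slide
RISE_NM = 0.34      # length per base pair
RADIUS_NM = 1.0     # helix radius (10 Å)

# Treat the DNA as one long, thin cylinder.
dna_length_nm = BP * RISE_NM
dna_volume_nm3 = math.pi * RADIUS_NM**2 * dna_length_nm

# Assumption: the cell is a cylinder, radius 500 nm, length 2000 nm.
cell_volume_nm3 = math.pi * 500.0**2 * 2000.0

fraction = dna_volume_nm3 / cell_volume_nm3
print(f"DNA length: {dna_length_nm / 1e6:.2f} mm")
print(f"DNA volume: {dna_volume_nm3:.3g} nm^3")
print(f"Share of cell volume: {fraction:.2%}")
```

Under these assumptions the DNA is over a millimetre long yet fills well under 1% of the cell, which is why it has to be so tightly folded.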
Mutation
• DNA damage from UV light, coal tar
• Replication errors (DNA polymerase is good, but...)
• Humans 2.5 × 10^-8 /bp/cell division
• E. coli 1 × 10^-7 /bp/division
• Humans/chimps 1% different, but that is 35 million differences
• You have 10^14 cells now, from a start of 1
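The numbers on this slide are easier to appreciate once multiplied out. A small sketch of the arithmetic, using the slide's rates and the round genome sizes quoted earlier in the lecture; the function names are illustrative only.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000
ECOLI_GENOME_BP = 4_000_000
HUMAN_RATE = 2.5e-8   # per bp per cell division (slide value)
ECOLI_RATE = 1e-7     # per bp per division (slide value)


def mutations_per_division(rate_per_bp, genome_bp):
    """Expected number of new mutations each time the genome is copied."""
    return rate_per_bp * genome_bp


def doublings_to_reach(n_cells):
    """Rounds of doubling needed to grow from one cell to n_cells."""
    return math.log2(n_cells)


print(mutations_per_division(HUMAN_RATE, HUMAN_GENOME_BP))  # about 75
print(mutations_per_division(ECOLI_RATE, ECOLI_GENOME_BP))  # about 0.4
print(doublings_to_reach(1e14))                             # about 46.5
print(0.01 * HUMAN_GENOME_BP)                               # 1% of the genome, in bp
```

A 1% difference across 3 × 10^9 bp is about 30 million positions, the same order as the 35 million quoted on the slide.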
Bases
• Purines (R): A, G. Mnemonic: "Purines Are biG"
• Pyrimidines (Y): C, T (U in RNA). Mnemonic: "CUT tinY"
Base pairs
• A–T: weak (2 hydrogen bonds)
• G–C: strong (3 hydrogen bonds)
• Tm!
Mutation 2
• E.coli has mutations
• Humans have somatic and germline mutations
• Point mutation
– missense
• transition R – R, Y – Y, C – T, G – A
• transversion R – Y, A – T, C – G, C – A
– nonsense → TGA, TAG, TAA
– Non-coding
• Splice, 5’, 3’, Intron
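The transition/transversion rule (same class of base versus different class) can be written as a tiny classifier. A minimal sketch; `classify_substitution` is a made-up name.

```python
PURINES = {"A", "G"}        # R
PYRIMIDINES = {"C", "T"}    # Y


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base DNA change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are the same base")
    # Transition: purine to purine, or pyrimidine to pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for ref, alt in [("C", "T"), ("G", "A"), ("A", "T"), ("C", "G"), ("C", "A")]:
    print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```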
Mutation 3
• Insertions and deletions
– One bp is sometimes called “point”
– Frameshift
     ATGCCCTGCAATGAC
     ATGCCCCTGCAATGAC
– Ooops
– Methylation of C
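Translating the two sequences from the frameshift example shows what the "Ooops" is about. This sketch builds the standard genetic code from a compact string (DNA alphabet, `*` for stop); `translate` is an illustrative helper, not part of any library.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard code in TCAG order for each codon position; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base, stopping after the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


original = "ATGCCCTGCAATGAC"
inserted = "ATGCCCCTGCAATGAC"   # one extra C after the second codon

print(translate(original))  # MPCND
print(translate(inserted))  # MPLQ*
```

The single inserted base changes every codon downstream of it and, in this case, brings a TGA stop into frame.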
Mutations 4
• Chromosomal rearrangement
– Inversion
– Translocation
• Chromosome copy
– Aneuploidy (Down’s)
– Polyploidy (tetraploid)
– Whole genome duplication WGD
• Mutational hotspots
• Repeats GCGCGCGCGC slip = microsatellites
Genetic code
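The “Universal” Genetic Code (first base down the side, second base across; ter = stop)

     U          C          A          G
U    UUU Phe    UCU Ser    UAU Tyr    UGU Cys
     UUC Phe    UCC Ser    UAC Tyr    UGC Cys
     UUA Leu    UCA Ser    UAA ter    UGA ter
     UUG Leu    UCG Ser    UAG ter    UGG Trp
C    CUU Leu    CCU Pro    CAU His    CGU Arg
     CUC Leu    CCC Pro    CAC His    CGC Arg
     CUA Leu    CCA Pro    CAA Gln    CGA Arg
     CUG Leu    CCG Pro    CAG Gln    CGG Arg
A    AUU Ile    ACU Thr    AAU Asn    AGU Ser
     AUC Ile    ACC Thr    AAC Asn    AGC Ser
     AUA Ile    ACA Thr    AAA Lys    AGA Arg
     AUG Met    ACG Thr    AAG Lys    AGG Arg
G    GUU Val    GCU Ala    GAU Asp    GGU Gly
     GUC Val    GCC Ala    GAC Asp    GGC Gly
     GUA Val    GCA Ala    GAA Glu    GGA Gly
     GUG Val    GCG Ala    GAG Glu    GGG Gly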
Willie Taylor’s AAs
Mutations 5
• Synonymous usually 3rd base
• Non-synonymous
– Conservative AAA – AGA Lys - Arg
– Radical AAA – UAU Lys - Tyr
• CpG methylation → mutational hotspot
• CpG islands at the 5’ end of mammalian housekeeping genes
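The claim that synonymous changes are usually at the third base can be checked by brute force over the standard code. A minimal sketch: it tries every single-base change in all 64 codons and counts those that leave the meaning unchanged (stop-to-stop changes are counted as unchanged here, which is a choice of this example).

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard code in TCAG order for each codon position; '*' marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_by_position():
    """Count single-base changes that keep the same meaning, per codon position."""
    counts = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    counts[pos] += 1
    return counts


counts = synonymous_by_position()
for pos, n in enumerate(counts, start=1):
    # 64 codons x 3 alternative bases = 192 possible changes per position
    print(f"position {pos}: {n} of 192 changes are synonymous")

# The slide's conservative example: AAA (Lys) to AGA (Arg) is non-synonymous.
print(CODON_TABLE["AAA"], CODON_TABLE["AGA"])
```

Third-position changes dominate the synonymous count, with a handful at the first position (Leu and Arg codons).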
Mutations Quiz
[Figure: gene diagram running 5’ to 3’ with exons and an intron; mutation sites labelled T, A, U, C, G]
Which mutations AUTCG are most likely to be baaaad?
Mutations & evolution
• Most bacteria have a characteristic mutational
bias.
• This gives a species-specific G+C content
– E. coli 50%
– B. subtilis 40%
– Extremes: Mycoplasma (low), Micrococcus (high)
• Many bacteria have strand bias because the
enzymes that copy the lagging strand (Okazaki fragments) have a different bias
• High-GC and low-GC Gram-positive bacteria.
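G+C content and strand bias are both simple to compute from a sequence. A minimal sketch; GC skew, (G - C)/(G + C), is one common way to quantify strand bias and is used here as an illustration rather than as the lecture's own method. The example sequence is made up.

```python
def gc_content(seq):
    """Fraction of A/C/G/T bases that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if gc + at else 0.0


def gc_skew(seq):
    """(G - C) / (G + C) for the strand as given; 0.0 if there is no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


example = "ATGCGCGGGTTAACGC"  # made-up sequence for illustration
print(f"G+C content: {gc_content(example):.1%}")
print(f"GC skew: {gc_skew(example):+.2f}")
```

On a real genome these would normally be computed in sliding windows, so that changes along the chromosome become visible.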
Quiz “answers”
Location        Rate (subst/site/year × 10^-9)
5’              2.36
Synon           4.65
NonSyn          0.88
Intron          3.7
3’              4.46
(Pseudogene)    4.85
• Synon: not neutral?
Substitution
• A mutation that’s been sieved by selection
• Selection is a population/probability term
• Probability that a mutation will
a) survive?
b) become polymorphism?
c) replace existing?
• Depends on population size
Bacterial genes/genomes
• E.coli about 4000 genes, 4 Mbases
• Tightly packed, usually no overlap
– Viruses ++ tightly packed, overlapping genes
• Origin of replication
– Usually near dnaA
• DNA polymerase
– Binds and copies
– Needs gyrase, helicase etc.
– 5’-3’ strand = read through
– 3’-5’ strand read in chunks: Okazaki fragments
Operons
• Jacob and Monod (and Lwoff)
• Lac operon
i - p - o - z - y - a (gene order: lacI, promoter, operator, lacZ, lacY, lacA)
• lacZ lacY lacA induced and transcribed
together
• lacI adjacent but separate transcript
• MolBiol? Measure mRNA levels, β-gal (β-galactosidase activity)
• Evol? Co-transcription for better control
Odd operons
• Easy explanation when only E. coli and B.
subtilis available
• But M. jannaschii (first archaeon sequenced)
– Linked, cotranscribed but biochemically mad
• Fallout from genome sequencing
• tRNA complement informs expression
Bioinformatic consequences
• RNA polymerase needs binding site
• Promoter site upstream from transcrip start
• -35                    -10
  TTGACANNNNNNNNNNNNNNNNNTATAAT
• Site-directed mutagenesis can parse the info
• Remember lacZ, Y, A are cotranscribed
• Then 3’ trailer after last stop codon
• Try to think of the 3-D picture
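A promoter pattern like the one above maps naturally onto a regular expression. A rough sketch, assuming the textbook consensus TTGACA for the -35 box and TATAAT for the -10 box, and allowing the spacer to vary from 15 to 19 bases (that tolerance is an assumption of this example). The toy sequence is made up.

```python
import re

# -35 box, spacer of 15 to 19 bases, -10 box.
PROMOTER = re.compile(r"TTGACA[ACGT]{15,19}TATAAT")


def find_promoters(dna):
    """Return (start, matched_text) for each exact consensus match on this strand."""
    return [(m.start(), m.group()) for m in PROMOTER.finditer(dna.upper())]


spacer = "ACGTACGTACGTACGTA"  # 17 arbitrary bases
toy = "GGCC" + "TTGACA" + spacer + "TATAAT" + "GGATGC"
print(find_promoters(toy))
```

Real promoters rarely match the consensus exactly, so an exact-match search like this will miss most of them; scoring schemes that tolerate mismatches are the usual next step.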
Gene structure
• Upstream control regions
• Start codon
• Open Reading Frame (ORF)
• Stop codon UGA UAG UAA
• 3’ downstream
• So gene prediction is “easy”
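The start, ORF, stop pattern suggests the simplest possible gene finder: scan each reading frame for an ATG followed in frame by a stop codon. A minimal sketch for the forward strand only; `find_orfs` and its `min_codons` cutoff are illustrative choices, and the test sequence is made up.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for ATG...stop runs on the forward strand.

    end is exclusive and includes the stop codon; min_codons counts the
    codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None                     # look for the next ATG
    return orfs


print(find_orfs("CCATGAAATTTTAGCC"))  # [(2, 14, 2)]
```

The quotation marks around “easy” are deserved: a practical gene finder would also scan the reverse strand, handle alternative start codons, and use signals such as the RBS and codon usage to weed out chance ORFs.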
Consequences
• This view of how the process works
– Colours our view of sequences
• Central dogma:
– DNA makes RNA makes PROTEIN makes
everything else
• RNA makes DNA means inheritance of
acquired characteristics (Lamarck).
• Leads to a particular definition of “gene”
Translation
• Transcription gives you mRNA
• Translation gives you protein
• In bacteria, transcription and translation are simultaneous
• Ribosome: complex (cottage loaf) of two subunits, 50S and 30S = 70S
• 30S: 21 proteins (rpsX) and 16S RNA
• 50S: 34 proteins (rplX) and 23S + 5S RNA
• Needs tRNA, mRNA
• Ribosome binding site (RBS) upstream from ATG
Summary
• What we know about the genetics can help
us identify genes bioinformatically
– DNA signatures (RBS, Promoter)
– Start - ORF - stop pattern
– Consistent codon usage
• Have we predicted a real gene?
– Is it present as mRNA?
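Checking for consistent codon usage starts with counting codons in a candidate ORF. A minimal sketch, assuming the input is already in frame; `codon_usage` is a made-up helper and the example ORF is invented.

```python
from collections import Counter


def codon_usage(orf):
    """Relative frequency of each codon in an in-frame coding sequence."""
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


usage = codon_usage("ATGCTGCTGCTTAAA")  # made-up ORF
for codon, freq in sorted(usage.items()):
    print(codon, f"{freq:.2f}")
```

Comparing such a profile with the profile of known genes from the same species is one way to judge whether a predicted ORF looks like a real gene; the final check suggested above, whether it is present as mRNA, still needs experimental data.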