Bioinformatics Practical for Biochemists
Andrei Lupas, Birte Höcker, Steffen Schmidt
SS 2012
01. History of DNA
Description
• Lectures about general topics in bioinformatics & history
• Tutorials will provide you with a toolbox of bioinformatics programs to analyze data
• Open Sessions will give you the opportunity to use these tools
Course Outline
• Mon – DNA & Genomics
• Tue – Introduction to Proteins
• Wed – Annotation of Sequence Features
• Thu – Evolution & Design
• Fri – Protein Classification
Course Outline
• 13:00-14:00 Presentation
• 14:15-17:30 Tutorial (2 x 30 min) & hands-on practical
• You will need to keep an electronic lab notebook
• Fri afternoon: Test Exercises
Software Requirements
• Browser (e.g. Firefox)
• “Advanced” Word Processor
• PyMOL (www.pymol.org – free for teaching)
History of DNA
1953 Model of DNA (F. Crick)
What is the “genetic material”?
• 1865 Gregor Mendel
  • basic rules of heredity
• 1869 Friedrich Miescher
  • discovery of ‘nuclein’ (DNA), published only in 1871 since Hoppe-Seyler first wanted to verify the results
• 1881 Edward Zacharias
  • chromosomes are composed of nuclein
• 1889 Richard Altmann
  • renaming nuclein to nucleic acid
wikipedia.org
DNA is the “transforming material”
• 1928 Frederick Griffith
  • “transforming principle”: Str. pneumoniae experiment
• 1944 Avery, MacLeod & McCarty
  • Griffith’s “transforming principle” is DNA
  • isolation of DNA/RNA
DNA is the genetic material
• 1950 Erwin Chargaff
  • A = T and C = G in amount; the base composition is the same in different tissues
• 1952 Hershey & Chase
  • DNA is the genetic material, shown with the 32P/35S phage/E. coli experiment
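Chargaff's observation follows directly from base pairing, which the slides reach a little later. The sketch below is only an illustration: the example strand is hypothetical and the helper names are made up for this note. It counts bases over both strands of a duplex and shows that A matches T and G matches C even when a single strand is unbalanced.

```python
from collections import Counter

# Watson-Crick pairing partners
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the double helix."""
    return Counter(strand) + Counter(reverse_complement(strand))


# Hypothetical single strand, chosen only for illustration
example_strand = "ATGCGGATTTACCGA"
single = Counter(example_strand)
duplex = duplex_base_counts(example_strand)

print("single strand:", dict(single))
print("duplex:       ", dict(duplex))
```

In the single strand G and C differ in count, but over the duplex the A/T and G/C counts are equal, which is the pattern Chargaff measured.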
Solving the DNA structure
• 1952/53 Linus Pauling
  • beat Cavendish lab in discovery of α-helix
  • Cavendish (Cambridge) allows Watson & Crick to work full-time on DNA
  • manuscript shared with Cavendish lab before publication
http://osulibrary.oregonstate.edu/specialcollections/coll/pauling/dna/notes/1952a.22-ms-01.html
Solving the DNA structure
• 1952 Franklin & Wilkins
  • X-ray of B-DNA; Wilkins showed results to Watson & Crick
  • periodicity, phosphates are outside
• 1953 Crick & Watson
  • model of B-DNA
[Figure source: Nature, Vol. 421, 23 January 2003, p. 397, www.nature.com/nature, © 2003 Nature Publishing Group]
DNA structure
Getting the “code”
• 1953 George E. Palade
  • “RNA organelles” (ribosomes)
• 1957 Crick et al.
  • suggest non-overlapping triplets
  • only 20 out of 64 triplets code for an amino acid
  • “comma-free code”
[Figure: scanned excerpt from the paper by Crick, Barnett, Brenner & Watts-Tobin on the reading of the code. Legible points: (d) the code is probably ‘degenerate’, that is, in general, one particular amino-acid can be coded by one of several triplets of bases. In an overlapping triplet code, an alteration to one base will in general change three adjacent amino-acids in the polypeptide chain, whereas mutants of tobacco mosaic virus produced by nitrous acid usually show only one amino-acid changed at a time. Fig. 1: the difference between an overlapping code and a non-overlapping code; the short vertical lines represent the bases of the nucleic acid, and the case illustrated is for a triplet code.]
• 1961 Nirenberg & Matthaei
  • polyU mRNA produces polyF
  • complete genetic code
• 1961 Sydney Brenner
  • concept of mRNA
  • no overlapping codes (Crick, Brenner, Barnett, Watts-Tobin)
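The arguments on this slide can be checked with a few lines of code. The sketch below is a toy illustration, not a reconstruction of any of the original experiments: the RNA sequence is hypothetical, the function names are invented for this note, and the codon table contains only the single assignment used here (UUU for phenylalanine, F). It counts the 64 possible triplets, compares how many triplets a single base change affects under overlapping and non-overlapping reading, and reads polyU as polyF.

```python
from itertools import product

BASES = "UCAG"


def all_triplets():
    """Every possible triplet over four bases (4**3 of them)."""
    return ["".join(p) for p in product(BASES, repeat=3)]


def read_non_overlapping(seq):
    """Read consecutive triplets: positions 0-2, 3-5, 6-8, ..."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def read_overlapping(seq):
    """Read a triplet starting at every base: positions 0-2, 1-3, 2-4, ..."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def changed_triplets(reader, original, mutant):
    """Number of triplets that differ between two equally long sequences."""
    return sum(a != b for a, b in zip(reader(original), reader(mutant)))


# Hypothetical RNA and a single-base substitution at index 5 (U -> A)
original = "AUGGCUCAAGGU"
mutant = original[:5] + "A" + original[6:]

print(len(all_triplets()))                                       # number of triplets
print(changed_triplets(read_non_overlapping, original, mutant))  # non-overlapping
print(changed_triplets(read_overlapping, original, mutant))      # overlapping

# Deliberately minimal codon table: only the polyU case from the slide
CODON_TO_AA = {"UUU": "F"}
poly_u = "U" * 12
print("".join(CODON_TO_AA[c] for c in read_non_overlapping(poly_u)))
```

One substitution changes one triplet under non-overlapping reading but three under overlapping reading, which is why single amino-acid changes in mutants argue against overlapping codes.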
Gene Structure
• 1977 Sharp & Roberts
  • pre-mRNA is processed
• 1982 Cech
  • ribo(nucleic en)zymes
• 1980 Joan A. Steitz
  • role of snRNPs in splicing
Genomic era
• 1977 Frederick Sanger
  • dideoxy sequencing
• 1986 Human Genome Initiative
Genomes
Year  Organism         Genome size  Genes
1995  H. influenzae    1.8 Mb       1.7k
1997  E. coli          4.6 Mb       4.3k
1996  S. cerevisiae    12.5 Mb      5.7k
1998  C. elegans       100 Mb       21.7k
2000  D. melanogaster  121 Mb       17k
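The genome sizes and gene counts listed on this slide can be turned into a rough gene density. The sketch below uses only the slide's own rounded numbers; the variable and function names are chosen for illustration.

```python
# (organism, genome size in Mb, approximate gene count), as listed on the slide
GENOMES = [
    ("H. influenzae", 1.8, 1700),
    ("E. coli", 4.6, 4300),
    ("S. cerevisiae", 12.5, 5700),
    ("C. elegans", 100.0, 21700),
    ("D. melanogaster", 121.0, 17000),
]


def genes_per_mb(size_mb: float, genes: int) -> float:
    """Average number of genes per megabase of genome."""
    return genes / size_mb


density = {name: genes_per_mb(size, genes) for name, size, genes in GENOMES}

for name, value in density.items():
    print(f"{name:16s} {value:7.1f} genes/Mb")
```

With these numbers the two bacteria come out at roughly one gene per kilobase, while the animal genomes are several times less dense.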
The human genome
• 2001 Draft H. sapiens: 2.9 Gb, 20-30k genes
Science (2001), Nature (2001)
Gene content
Excursion: Packing of DNA
• human:
  • 2 x 3e9 base pairs
  • packed in a nucleus of 6 µm ∅
[Figure labels: histone tails, histones, chromosome. Qui, Nature 2006]
• E. coli
  • 6 Mbp
  • 1 by 2 µm cell size
Kavanoff, Nature Education: Supercoiled chromosome of E. coli.
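A quick calculation shows why packing is needed. The sketch below assumes the standard rise of about 0.34 nm per base pair for B-DNA, a value not stated on the slide, and takes the base-pair counts and container sizes from the slide; the function names are illustrative.

```python
# Assumption (not on the slide): B-DNA rises about 0.34 nm per base pair
RISE_PER_BP_M = 0.34e-9


def contour_length_m(base_pairs: float) -> float:
    """Length of the fully stretched double helix in metres."""
    return base_pairs * RISE_PER_BP_M


def compaction_ratio(base_pairs: float, container_m: float) -> float:
    """How many times longer the DNA is than the longest container dimension."""
    return contour_length_m(base_pairs) / container_m


human_length = contour_length_m(2 * 3e9)   # diploid human genome
ecoli_length = contour_length_m(6e6)       # E. coli value used on the slide

print(f"human:   {human_length:.2f} m of DNA in a 6 µm nucleus, "
      f"ratio {compaction_ratio(2 * 3e9, 6e-6):,.0f}")
print(f"E. coli: {ecoli_length * 1e3:.2f} mm of DNA in a 2 µm cell, "
      f"ratio {compaction_ratio(6e6, 2e-6):,.0f}")
```

Under these assumptions a human nucleus holds about 2 m of DNA and an E. coli cell about 2 mm, which is the scale of the problem that histones and supercoiling address.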
Eukaryote vs. Prokaryote
Size
• Eukaryote: large (10 Mb – 100,000 Mb); there is not generally a relationship between organism complexity and genome size (many plants have larger genomes than human!)
• Prokaryote: generally small (<10 Mb; most <5 Mb); complexity (as measured by # of genes and metabolism) generally proportional to genome size
Content
• Eukaryote: most DNA is non-coding
• Prokaryote: DNA is “coding gene dense”
Telomeres / Centromeres
• Eukaryote: present (linear DNA)
• Prokaryote: circular DNA, doesn't need telomeres; don't have mitosis, hence no centromeres
Number of chromosomes
• Eukaryote: more than one, (often) including those discriminating sexual identity
• Prokaryote: often one, sometimes more; but plasmids are not true chromosomes
Chromatin
• Eukaryote: histone bound (which serves as a genome regulation point)
• Prokaryote: no histones; uses supercoiling to pack genome
Gene Structure – Prokaryotic Operons
lac Operon
1: Regulatory gene
3: β-galactosidase
4: β-gal permease
8: β-gal transacetylase
Promoter region
Griswold, A. (2008) Nature Education 1(1)
Understanding Bioinformatics, Zvelebil & Baum, 2007
Gene structure
Miller, O. L. et al. Visualization of bacterial genes in action. Science 169, 392–395
Gene Structure – Eukaryotic Gene
[Figure: UCSC Genome Browser view (hg19) of SMG5 on chr1:156,225,000-156,250,000, 10 kb scale bar. Tracks: UCSC Genes (RefSeq, UniProt, CCDS, Rfam, tRNAs & Comparative Genomics); Placental Mammal Basewise Conservation by PhyloP; Simple Nucleotide Polymorphisms (dbSNP 135) Found in >= 1% of Samples; Repeating Elements by RepeatMasker]
Griswold, A. (2008) Nature Education 1(1)
Understanding Bioinformatics, Zvelebil & Baum, 2007
Eukaryote vs. Prokaryote
Genes
• Eukaryote: often have introns; intraspecific gene order and number generally relatively stable; many non-coding (RNA) genes; there is NOT generally a relationship between organism complexity and gene number
• Prokaryote: no introns; gene order and number may vary between strains of a species
Gene regulation
• Eukaryote: promoters, often with distal long-range enhancers/silencers, MARs, transcriptional domains; generally mono-cistronic
• Prokaryote: promoters; enhancers/silencers rare; genes often regulated as polycistronic operons
Repetitive sequences
• Eukaryote: generally highly repetitive, with genome-wide families from transposable element propagation
• Prokaryote: generally few repeated sequences; relatively few transposons
Organelle (subgenomes)
• Eukaryote: mitochondrial (all); chloroplasts (in plants)
• Prokaryote: absent
Gene content
Human Genome Content
Component                        Share
LTR retrotransposons             8.3%
DNA transposons                  2.9%
Simple sequence repeats          3%
Segmental duplications           5%
Miscellaneous heterochromatin    8%
SINEs                            13.1%
LINEs                            20.4%
Miscellaneous unique sequences   11.6%
Protein-coding genes             1.5%
Introns                          25.9%
Gregory (2005), Nature
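The chart's categories can be grouped to see how much of the genome consists of mobile elements. In the sketch below, the pairing of each label with its percentage is reconstructed from the flattened chart and should be read as an assumption; the grouping into retrotransposons and all transposable elements follows the classes named later in the lecture.

```python
# Percentages of the human genome; label/value pairing reconstructed
# from the chart residue (assumption)
GENOME_CONTENT = {
    "LINEs": 20.4,
    "SINEs": 13.1,
    "LTR retrotransposons": 8.3,
    "DNA transposons": 2.9,
    "Simple sequence repeats": 3.0,
    "Segmental duplications": 5.0,
    "Miscellaneous heterochromatin": 8.0,
    "Miscellaneous unique sequences": 11.6,
    "Introns": 25.9,
    "Protein-coding genes": 1.5,
}

RETROTRANSPOSONS = ["LINEs", "SINEs", "LTR retrotransposons"]
TRANSPOSABLE_ELEMENTS = RETROTRANSPOSONS + ["DNA transposons"]


def share(categories):
    """Summed genome share (in percent) of the given categories."""
    return sum(GENOME_CONTENT[name] for name in categories)


retro_share = share(RETROTRANSPOSONS)
te_share = share(TRANSPOSABLE_ELEMENTS)
total_share = share(GENOME_CONTENT)

print(f"retrotransposons:      {retro_share:.1f}%")
print(f"transposable elements: {te_share:.1f}%")
print(f"all categories:        {total_share:.1f}%")
```

With this reading of the chart, transposable elements together account for just under 45% of the genome, about thirty times the share of protein-coding sequence.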
Transposable Element - Mobile Elements /
Jumping genes
• Barbara McClintock (1902 - 1992)
  • studies in the 40’s & 50’s of spotted kernels in maize
  • discovery of “controlling elements”
  • Nobel prize in 1983
• initially thought to be unique to maize but later also found in other eukaryotes, bacteria, viruses, phages & plasmids
wikipedia.org
Transposable Element - Mobile Elements /
Jumping genes
• DNA Transposons
  • transposase cuts out transposon & inserts it at the target site
  • “cut-and-paste” mechanism
  • prokaryotes & eukaryotes
• Retrotransposons
  • transposon DNA transcribed to RNA
  • insertion to genome by reverse transcription
  • LTR, LINEs, SINEs
  • eukaryotes only
wikipedia.org
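The difference between the two mechanisms can be made concrete with a toy model on strings. This is a deliberately simplified sketch: the genome and element sequences are hypothetical, the function names are invented for this note, positions are plain string indices, and transcription is written as a T to U substitution on one strand rather than modelled enzymatically.

```python
def cut_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """DNA transposon: excise genome[start:end] and reinsert it at index
    `target` of the remaining sequence. Copy number stays the same."""
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    return remainder[:target] + element + remainder[target:]


def copy_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Retrotransposon: the element stays in place, is transcribed to RNA,
    reverse transcribed to DNA, and the new copy is inserted at `target`."""
    element = genome[start:end]
    rna = element.replace("T", "U")   # transcription (one-strand shorthand)
    cdna = rna.replace("U", "T")      # reverse transcription back to DNA
    return genome[:target] + cdna + genome[target:]


# Hypothetical genome with a 6-base element at positions 4..9
ELEMENT = "ACGTAC"
genome = "GGGG" + ELEMENT + "CCCCCCCC"

moved = cut_and_paste(genome, 4, 10, 8)
copied = copy_and_paste(genome, 4, 10, 14)

print(moved, len(moved), moved.count(ELEMENT))
print(copied, len(copied), copied.count(ELEMENT))
```

The cut-and-paste genome keeps its length and a single copy of the element, while the copy-and-paste genome grows by one element per event, which is how retrotransposons can come to make up a large share of a genome.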
Transposable Elements
In both cases, a dsDNA intermediate is integrated into the target site in DNA to complete movement.
DNA Transposon
• encode only genes for mobilization and insertion
• 768 bp - 5 kb
• all contain inverted terminal repeats (ITRs)
[Figure labels: inverted repeats, genes for transposition, structural genes, inverted IS]
➡ disruption of genes (insertion / deletion / regulation)
➡ crossing-over
wikipedia.org
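Inverted terminal repeats have a simple sequence signature: the end of the element is the reverse complement of its start. The sketch below measures the length of a perfect ITR on a made-up element; the sequences and the function name are hypothetical, and real ITRs are often imperfect, which this minimal check does not handle.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def terminal_inverted_repeat_length(element: str) -> int:
    """Length of the perfect inverted repeat shared by both ends.

    Walks inward from both termini while the base at the right end
    is the complement of the corresponding base at the left end."""
    n = 0
    while n < len(element) // 2 and element[-(n + 1)] == COMPLEMENT[element[n]]:
        n += 1
    return n


# Hypothetical element: left repeat + internal sequence + inverted right repeat
left_repeat = "GGCCAGTC"
internal = "ATGAAATTTGGGTAA"
element = left_repeat + internal + reverse_complement(left_repeat)

print(element)
print(terminal_inverted_repeat_length(element))
```

A scan of this kind along a genome is one simple way to flag candidate DNA transposons, although practical tools also allow mismatches and look for target-site duplications.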
Retrotransposons
• ~ 45 % of human genome
• random insertion
• many ‘dead’ copies - only few are active
Whitley, genome.welcome.ac.uk