4: Genome evolution
Gene Duplication
Gene Duplication - History
1936: The first observation of a duplicated gene was in
the Bar gene of Drosophila.
1950: Alpha and beta chains of hemoglobin are
recognized to have been derived from gene duplication.
1970: Ohno developed a theoretical framework of
gene duplication.
1995: Gene duplications are studied in fully sequenced
genomes.
Types of Genomic Duplications
•Part of an exon or the entire exon is duplicated
•Complete gene duplication
•Partial chromosome duplication
•Complete chromosome duplication
•Polyploidy: full genome duplication
Mechanism of Gene Duplication
Genes are duplicated mainly due to unequal
crossing over
Mechanism of Gene Duplication
If these regions are complementary, the chance of
unequal crossing over increases. For example,
both regions may be the same repeated
sequence (microsatellite, transposon, etc.).
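How misaligned repeats lead to a duplication can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the chromosome layout, the segment labels, and the function name `unequal_crossover` are hypothetical and do not come from the slides.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Recombine two chromatids that break at (possibly different) positions.

    Each chromatid is a list of labeled segments. Each recombinant joins
    the left part of one chromatid to the right part of the other.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


# Hypothetical layout: one gene flanked by two copies of the same repeat "R".
chromosome = ["left", "R", "gene", "R", "right"]

# Equal crossing over: both chromatids break after the first repeat.
same_1, same_2 = unequal_crossover(chromosome, chromosome, 2, 2)

# Unequal crossing over: the second repeat of one chromatid pairs with
# the first repeat of the other, so the break points differ.
duplicated, deleted = unequal_crossover(chromosome, chromosome, 4, 2)

print(same_1)      # unchanged layout
print(duplicated)  # carries two copies of the gene
print(deleted)     # carries no copy of the gene
```

In this toy model the misaligned exchange yields one product with two gene copies and one product with none, which is why a duplication arising this way is accompanied by a reciprocal deletion.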
After a Gene is Duplicated
Alternative fates:
1. It can die and become a pseudogene.
2. It can retain its original function, thus allowing
the organism to produce double the amount of
the derived protein.
3. The two copies can diverge and each one will
specialize in a different function.
[Figure: fates of a duplicated gene. Labels: divergence; one copy dies; identical copies.]
Invariant repeats
If the duplicated genes are identical or nearly
identical, they are called invariant repeats. Often
the effect is an increase in the quantity of the
derived protein, which is why these duplications
are also called “dose repetitions”.
Classical examples are the genes
encoding the rRNAs and tRNAs
needed for translation.
Variant repeats
Some classic examples:
Trypsin, the digestive enzyme, and
thrombin (which cleaves fibrinogen during
blood clotting) were derived from a
complete gene duplication.
Lactalbumin, connected with lactose
synthesis, and lysozyme, which
degrades the bacterial cell wall, are also the
result of an ancient gene duplication.
4: Genome evolution
Dose Repetition
Gene duplication in mosquito
as a response to insecticides
Kingdom = Metazoa (humans are also Metazoa)
Phylum = Arthropoda (humans are Chordata)
Class = Insecta (humans are Mammalia)
Order = Diptera (humans are Primates)
Genus = Culex (humans are Homo)
Species = pipiens (sapiens)
Organophosphorous insecticides
Organophosphorous insecticides (e.g., parathion
and malathion) interact with many enzymes; in
particular, they inhibit acetylcholinesterase
(AChE) activity in the central nervous system,
inducing lethal conditions.
Organophosphorous insecticides
Acetylcholine is a neurotransmitter
that, upon release from neurons, stimulates the
opening of Na+ and K+ channels.
These channels regulate the function of the
brain as well as the heart, lungs, and skeletal
muscles.
Acetylcholinesterase catalyzes the
hydrolysis of acetylcholine to form inactive
acetate and choline.
Acetylcholinesterase
[Figure: cholinergic synapse. Labels: acetyl-CoA + choline, cholinergic neuron, acetylcholine, acetylcholinesterase, postsynaptic tissue.]
Acetylcholinesterase
[Figure: the same synapse with the insecticide added.]
Esterases
Esterases are detoxifying carboxylester hydrolases
that are responsible for resistance to
organophosphorous insecticides.
These enzymes are non-specific.
Detoxifying esterases
[Figure: cholinergic synapse with insecticide and esterase. Labels: acetyl-CoA + choline, cholinergic neuron, acetylcholine, insecticide, esterase, postsynaptic tissue.]
Esterases
Culex pipiens typically has 2 genes encoding
esterases: Est-3 and Est-2. These genes are
separated by an intergenic DNA fragment varying
between 2–6 kb.
[Figure: the Est-3 and Est-2 genes.]
[Figure: alignment of the predicted estα2 and estβ2 amino acid sequences of Culex quinquefasciatus; ~47% similarity between the two sequences. Biochem. J. (1997) 325, 359-365]
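A figure such as "~47% similarity" is read off a pairwise alignment. The sketch below shows the simplest version of that calculation, percent identity over aligned, non-gap positions. It rests on assumptions: the slide says "similarity", which may also count conservative substitutions, and the two toy sequences are invented for illustration and are not the esterase sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # columns with a gap are not compared
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical).
toy_a = "MKT-LLA"
toy_b = "MRTALIA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

Whether gap columns are excluded, as here, or counted as mismatches is a convention; the two choices give different percentages for the same alignment.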
Esterases
Resistance alleles correspond to esterase overproduction (the esterase binds or metabolizes the
insecticide) relative to the basal esterase production
of susceptibility alleles. Several resistance alleles
have been described.
[Figure: esterase starch gel.]
Different alleles show 85-90% similarity.
Esterases
For most alleles, the over-production of esterase is
the result of gene duplication. This concerns either
one locus or both.
[Figure: the Est-3 (A) and Est-2 (B) loci. Labels: 47% similarity; ~100% similarity.]
Nomenclature for the various resistance genes
and their products at the Ester resistance locus
Genetica 112–113: 287–296, 2001
Esterases
The duplication of the two esterase loci explains
the tight statistical association of some
electromorphs, like A2 and B2. Although A4, A2
and A1 are coded by alleles at the Est-3 locus, and
B2 and B4 by alleles at the Est-2 locus, A1, A4-B4
and A2-B2 are considered alleles of a single
superlocus (named Ester).
Independent amplifications have occurred only a
few times.
Esterases
The level of gene duplication varies between the
different alleles:
EsterB1 can easily reach 100 copies in the field.
Ester4 has never been found above a few copies.
It also varies within and among populations for a
given amplified allele.
Why the various amplified alleles have distinct
limits of amplification is unknown.
Frequency of resistance allele
in Montpellier (France)
[Figure: frequency of the A1 allele plotted against distance (1 to 41 km), with the treatment area marked.]
Esterases
The resistance allele has a cost for the mosquito. In
the absence of insecticide in the environment,
non-resistant mosquitoes have the best fitness.
Geographic distribution of
resistance allele
Genetica 112–113: 287–296, 2001
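The trade-off between resistance and its cost can be illustrated with a textbook one-locus selection model. This is a minimal sketch, not the model used in the cited study: it treats the locus as haploid for simplicity, and the parameter values `kill` and `cost` are made-up assumptions chosen only to show the direction of change.

```python
def next_frequency(p, w_resistant, w_susceptible):
    """One generation of selection on a haploid locus with two alleles."""
    mean_fitness = p * w_resistant + (1.0 - p) * w_susceptible
    return p * w_resistant / mean_fitness


def simulate(p0, generations, insecticide, kill=0.5, cost=0.1):
    """Track the resistance-allele frequency over time.

    kill: fitness loss of susceptible mosquitoes under insecticide (assumed).
    cost: fitness loss of resistant mosquitoes without insecticide (assumed).
    """
    if insecticide:
        w_resistant, w_susceptible = 1.0, 1.0 - kill
    else:
        w_resistant, w_susceptible = 1.0 - cost, 1.0
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(
            next_frequency(trajectory[-1], w_resistant, w_susceptible)
        )
    return trajectory


treated = simulate(0.1, 20, insecticide=True)
untreated = simulate(0.9, 20, insecticide=False)
print(f"treated area:   {treated[0]:.2f} -> {treated[-1]:.2f}")
print(f"untreated area: {untreated[0]:.2f} -> {untreated[-1]:.2f}")
```

Under these assumptions the resistance allele rises toward fixation where insecticide is applied and declines where it is not, which is the qualitative pattern expected when resistance carries a cost.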
Gene duplication in aphids as a response to
insecticides.
The same story as in mosquitoes.
A Few Words About Aphids
Kingdom=Metazoa (humans are also metazoa)
Phylum=Arthropoda (humans are Chordata)
Class=Insecta (humans are Mammalia)
Order=Hemiptera (humans are Primates)
Genus=Myzus (humans are Homo)
Species=persicae (sapiens)
Around 4,000 species, ~250 are pests.
A Few Words About Aphids
Myzus persicae likes… lettuce.
In fact, it is the most important aphid pest on lettuce.
E4 & FE4
Myzus persicae has 2 genes encoding esterases, E4
and FE4, which are responsible for the resistance
to organophosphorous insecticides.
These genes show 99% identity in nucleotide
sequence, and both have exactly the same exon-intron
structure (same sizes and same positions).
Many copies of E4 and FE4
Resistant strains of the aphid were found to
contain multiple copies of E4 and FE4. The
sequences of all copies are 100% identical.
It is believed that this duplication occurred within
the last 50 years, with the introduction of the
selective agent.
Take home message I:
Increase in gene number can occur quite
rapidly under selection pressure.
Take home message II:
Gene-duplication mutations are not the
limiting step (in evolution). It is selection that
counts most.
4: Genome evolution
Duplications of RNA-specifying genes
Ribosome
The ribosome is a complex of proteins and RNA (called rRNA) on
which proteins are built, based on the information in the mRNA.
Ribosomes are always composed of two subunits, large and small.
Ribosome
In prokaryotes the entire ribosome is 70S, and is composed of a
50S large subunit, and a 30S small subunit.
In eukaryotes the entire ribosome is 80S, and is composed of a
60S large subunit and a 40S small subunit.
Each subunit contains different rRNAs.
The S value is the sedimentation coefficient in an ultracentrifuge.
rRNA
There are also ribosomal genes coded by the mitochondrial genome.
In fact, the mitochondrial ribosome is coded by both nuclear and
mitochondrial genes.
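Comparison of ribosome structure in
Bacteria, Eukaryotes, and Mitochondria
__________________________________________________________________________
                     Bacterial (70S)   Eukaryotic (80S)   Mitochondrial (55S)
__________________________________________________________________________
Large subunit        50S               60S                39S
  rRNAs (1 of each)  23S (2904 nts)    28S (4700 nts)     16S (1560 nts)
                     5S (120 nts)      5S (120 nts)
                                       5.8S (160 nts)
  Proteins           33                ~49                48
Small subunit        30S               40S                28S
  rRNA               16S (1542 nts)    18S (1900 nts)     12S (950 nts)
  Proteins           20                ~33                29
__________________________________________________________________________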
The 16S and 18S rRNA genes are the most commonly used genes
in phylogenetic analysis.
Eukaryotic rRNA genes
• 28S, 5.8S, and 18S rRNAs are encoded
by a single transcription unit (45S),
separated by 2 internal transcribed
spacers (ITS) and bounded by external
transcribed spacers (ETS).
[Figure: the 45S transcription unit: ETS, 18S, ITS 1, 5.8S, ITS 2, 28S, ETS.]
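The layout of the 45S transcription unit can be written down as a small data structure. The sketch below is only illustrative: the names `TRANSCRIPTION_UNIT_45S` and `split_unit` are hypothetical, and no segment lengths are given because the slide does not state spacer sizes.

```python
# Segments of the 45S transcription unit in 5' to 3' order.
# Each entry is (name, kind); "spacer" segments are removed during
# processing, "rRNA" segments are the mature products.
TRANSCRIPTION_UNIT_45S = [
    ("5' ETS", "spacer"),
    ("18S", "rRNA"),
    ("ITS 1", "spacer"),
    ("5.8S", "rRNA"),
    ("ITS 2", "spacer"),
    ("28S", "rRNA"),
    ("3' ETS", "spacer"),
]


def split_unit(unit):
    """Separate the mature rRNAs from the transcribed spacers."""
    mature = [name for name, kind in unit if kind == "rRNA"]
    spacers = [name for name, kind in unit if kind == "spacer"]
    return mature, spacers


mature_rrnas, spacers = split_unit(TRANSCRIPTION_UNIT_45S)
print("mature rRNAs:", mature_rrnas)
print("spacers:", spacers)
```

Listing the spacers separately is useful later in the argument, because the conservation of the ITS regions is what is compared against the conservation of the functional rRNA sequences.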
Human rRNA genes
• In humans the 45S rDNA is organized into
5 clusters (each with 30-40 repeats).
• These clusters are located on
chromosomes 13, 14, 15, 21, and 22.
• These clusters are transcribed by RNA
polymerase I.
[Figure: tandem repeats of the 18S and 28S genes.]
Human rRNA genes
• 5S rRNA genes occur in tandem arrays;
there are about 200-300 true 5S
genes and many dispersed pseudogenes.
• In humans there are two gene clusters on
chromosome 1 (in dogs there is a single
gene cluster).
• 5S rRNA is transcribed by RNA
polymerase III.
Correlation between the number of rRNA
genes and the genome size
Numbers of rRNA and tRNA genes per haploid genome in various organisms
__________________________________________________________________________
Genome source                    Number of    Number of      Approximate
                                 rRNA sets    tRNA genes     genome size (bp)
__________________________________________________________________________
Human mitochondrion              1            22             2 × 10^4
Nicotiana tabacum chloroplast    2            37             2 × 10^5
Escherichia coli                 7            ~100           4 × 10^6
Neurospora crassa                ~100         ~2,600         2 × 10^7
Saccharomyces cerevisiae         ~140         ~360           5 × 10^7
Caenorhabditis elegans           ~55          ~300           8 × 10^7
Tetrahymena thermophila          1            ~800           2 × 10^8
Drosophila melanogaster          120-240      590-900        2 × 10^8
Physarum polycephalum            80-280       ~1,050         5 × 10^8
Euglena gracilis                 800-1,000    ~740           2 × 10^9
Human                            ~300         ~1,300         3 × 10^9
Rattus norvegicus                150-170      ~6,500         3 × 10^9
Xenopus laevis                   500-760      6,500-7,800    8 × 10^9
__________________________________________________________________________
Correlation between number of rRNA genes
and genome size: an exception
The general pattern: bigger genomes → more genes to
transcribe → more rRNA needed.
Numbers of rRNA genes per haploid genome in various organisms
__________________________________________________________________________
Genome source                    Number of    Approximate
                                 rRNA sets    genome size (bp)
__________________________________________________________________________
Human mitochondrion              1            2 × 10^4
Nicotiana tabacum chloroplast    2            2 × 10^5
Escherichia coli                 7            4 × 10^6
Neurospora crassa                ~100         2 × 10^7
Saccharomyces cerevisiae         ~140         5 × 10^7
Caenorhabditis elegans           ~55          8 × 10^7
Tetrahymena thermophila          1            2 × 10^8
Drosophila melanogaster          120-240      2 × 10^8
Physarum polycephalum            80-280       5 × 10^8
Euglena gracilis                 800-1,000    2 × 10^9
Human                            ~300         3 × 10^9
Rattus norvegicus                150-170      3 × 10^9
Xenopus laevis                   500-760      8 × 10^9
__________________________________________________________________________
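The trend in the table can be checked with a quick calculation. The sketch below computes a Pearson correlation on log10-transformed values. Two choices here are assumptions rather than anything stated on the slides: ranges are replaced by their midpoints, and the log-log scale is used because both quantities span several orders of magnitude. The slide does not name the exception; Tetrahymena thermophila is singled out only because it is the row that departs most visibly from the trend in this table.

```python
import math

# (genome source, rRNA sets, approximate genome size in bp), from the table.
# Ranges such as 120-240 are replaced by their midpoints (an assumption).
DATA = [
    ("Human mitochondrion", 1, 2e4),
    ("Nicotiana tabacum chloroplast", 2, 2e5),
    ("Escherichia coli", 7, 4e6),
    ("Neurospora crassa", 100, 2e7),
    ("Saccharomyces cerevisiae", 140, 5e7),
    ("Caenorhabditis elegans", 55, 8e7),
    ("Tetrahymena thermophila", 1, 2e8),
    ("Drosophila melanogaster", 180, 2e8),
    ("Physarum polycephalum", 180, 5e8),
    ("Euglena gracilis", 900, 2e9),
    ("Human", 300, 3e9),
    ("Rattus norvegicus", 160, 3e9),
    ("Xenopus laevis", 630, 8e9),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_log_correlation(rows):
    """Correlation between log10(genome size) and log10(rRNA sets)."""
    sizes = [math.log10(size) for _, _, size in rows]
    sets = [math.log10(count) for _, count, _ in rows]
    return pearson(sizes, sets)


r_all = log_log_correlation(DATA)
without_tetrahymena = [row for row in DATA if row[0] != "Tetrahymena thermophila"]
r_without = log_log_correlation(without_tetrahymena)
print(f"all genomes:         r = {r_all:.2f}")
print(f"without Tetrahymena: r = {r_without:.2f}")
```

With these inputs the correlation is positive, and it becomes stronger once the Tetrahymena row is left out, which is consistent with reading that row as the exception to the general pattern.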
4: Genome evolution
Concerted Evolution
[Figure: 18S rRNA tree. Labeled clades: Bilateria, Cnidaria, Calcarea, Hexactinellida, Demospongiae. Scale bar: 0.1.]
Evolution of rRNA genes
• Although there are many copies of the same
gene in the genome, and the duplication is
an ancient phenomenon (since all
organisms have many copies), all copies
present in one genome are almost
identical.
Divergent (classical) evolution
[Figure: timeline with duplication, mutation, and speciation events.]
Divergent (classical) evolution
vs.
concerted evolution
[Figure: two panels, "Divergent evolution" and "Concerted evolution".]
Concerted evolution
[Figure: timeline with duplication, mutation, and speciation events.]
Question
• How is it possible that all the
ribosomal copies remain
identical?
(a) Stringent selection.
(b) Recent multiplication.
(c) Concerted evolution.
(a) Stringent selection.
Refuted by the fact that the
ITS regions are as
conserved as the functional
rRNA sequences.
(b) Recent multiplication.
Refuted by the fact that the
intraspecific homogeneity
does not decrease with
evolutionary time.
(c) Concerted evolution.
CONCERTED EVOLUTION
A member of a gene family does not evolve
independently of the other members of the
family.
It exchanges sequence information with
other members reciprocally or nonreciprocally.
Through genetic interactions among its
members, a multigene family evolves in
concert as a unit.
CONCERTED EVOLUTION
Concerted evolution
results in a homogenized
set of nonallelic
homologous sequences.
CONCERTED EVOLUTION REQUIRES:
(1) the horizontal transfer of
mutations among the family
members (homogenization).
(2) the spread of mutations in the
population (fixation).
Mechanisms of concerted
evolution
1. Unequal crossing-over
2. Gene conversion
3. Duplicative transposition.
Mechanisms of concerted evolution
1. Unequal crossing-over
[Figure: unequal crossing-over, panels 1 to 4.]
Gene conversion
Gene conversion
(one possible origin)
(a) Heteroduplexes formed by the resolution
of Holliday structure or by other mechanisms.
Gene conversion
(one possible origin)
(b) The blue DNA uses the invaded segment (e') as
template to "correct" the mismatch, resulting in gene
conversion.
Gene conversion
(one possible origin)
(c) Both DNA molecules use their original sequences as
template to correct the mismatch. Gene conversion does
not occur.
Gene conversion has been
found in all species and at all
loci that were examined in
detail.
The rate of gene conversion
varies with genomic location.
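How repeated gene conversion homogenizes a gene family can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model from the lecture: every copy is equally likely to act as donor or recipient, a conversion event overwrites the whole copy, and the function name `homogenize`, the family size, and the random seed are arbitrary choices.

```python
import random


def homogenize(family, rng, max_events=100_000):
    """Apply random gene-conversion events until all copies are identical.

    family: list of sequence variants, one entry per gene copy.
    Each event copies a randomly chosen donor onto a different,
    randomly chosen recipient (a nonreciprocal transfer).
    """
    family = list(family)
    events = 0
    while len(set(family)) > 1 and events < max_events:
        donor, recipient = rng.sample(range(len(family)), 2)
        family[recipient] = family[donor]
        events += 1
    return family, events


# Ten copies of a gene; one copy carries a new mutation "a".
start = ["A"] * 9 + ["a"]
rng = random.Random(0)  # fixed seed so that the run is repeatable
final, n_events = homogenize(start, rng)
print(f"after {n_events} events: {final}")
```

In this model the family always ends up uniform, either by losing the new variant or by spreading it to every copy, and the number of copies stays the same throughout because a conversion event replaces sequence without adding or removing repeats.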
concerted evolution:
Advantages of Gene Conversion over
Unequal Crossing-Over
1. Unequal crossing-over changes
the number of repeats, and may
cause a dosage imbalance. Gene
conversion does not change repeat
number.
concerted evolution:
Advantages of Gene Conversion over
Unequal Crossing-Over
2. Gene conversion can act on
dispersed repeats. Unequal crossing-over
is severely restricted when
repeats are dispersed.
[Figure: outcomes labeled "deletion" and "duplication".]
concerted evolution:
Advantages of Unequal Crossing-Over over
Gene Conversion
1. Unequal crossing-over is faster
and more efficient in bringing about
concerted evolution.
At the mutation level, unequal crossing-over (UCO) occurs more
frequently than gene conversion (GC).
concerted evolution:
Advantages of Unequal Crossing-Over
over Gene Conversion
2. In a gene-conversion event,
only a small region is involved.
In yeast, an unequal crossing-over
event involves on average ~20,000
bp. A gene-conversion tract cannot
exceed 1,500 bp.