Figure 21.1
Chapter 21
Genomes and their
Evolution
Copyright © 2005 Pearson Education, Inc. publishing as Benjamin Cummings
Overview: Reading the Leaves from the Tree of
Life
• Genome sequences exist for human, chimpanzee, E.
coli, brewer’s yeast, corn, fruit fly, house mouse,
rhesus macaque, and many other organisms
• These sequences provide information about the
evolutionary history of genes and taxonomic groups
Copyright © 2011 Pearson Education, Inc.
• Genomics is the study of whole sets of genes
and their interactions
• Bioinformatics is the application of
computational methods to the storage and
analysis of biological data
Genome Sequencing
• The Human Genome Project began in 1990 and was
largely completed by 2003
• It proceeded in three stages:
– Genetic (or linkage) mapping
– Physical mapping
– DNA sequencing
• Linkage mapping: maps the location of several
thousand genetic markers on each chromosome
• Genetic marker: a gene or other identifiable DNA
sequence
• Recombination frequencies are used to determine
the order of, and relative distances between, genetic
markers
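The ordering step can be sketched in Python. The marker names and recombination frequencies below are made-up illustrative values, not data from the chapter. The sketch assumes three markers, treats 1% recombination as one map unit, and assumes the pair with the largest frequency lies at the two ends.

```python
from itertools import combinations

# Hypothetical pairwise recombination frequencies (percent) for three markers.
recomb = {
    frozenset(("A", "B")): 9.0,
    frozenset(("B", "C")): 8.0,
    frozenset(("A", "C")): 17.0,
}

def order_three_markers(freqs):
    """Return the marker order implied by three pairwise frequencies.

    The pair with the largest recombination frequency is placed at the
    two ends; the remaining marker sits between them.
    """
    markers = sorted(set().union(*freqs))
    ends = max(combinations(markers, 2), key=lambda p: freqs[frozenset(p)])
    middle = next(m for m in markers if m not in ends)
    return [ends[0], middle, ends[1]]

def map_positions(order, freqs):
    """Place markers on a line, treating 1% recombination as 1 map unit."""
    positions = {order[0]: 0.0}
    for left, right in zip(order, order[1:]):
        positions[right] = positions[left] + freqs[frozenset((left, right))]
    return positions

order = order_three_markers(recomb)
positions = map_positions(order, recomb)
print(order, positions)
```

Real linkage maps use thousands of markers and statistical mapping functions; this only illustrates how relative distances imply an order.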
Figure 21.2: Cytogenetic map (chromosome bands; genes located by FISH), followed by 1 linkage mapping (genetic markers), 2 physical mapping (overlapping fragments), and 3 DNA sequencing.
• Physical map: expresses the distance between genetic
markers as a number of base pairs (bp)
• Constructed by cutting a DNA molecule into short
fragments and arranging them in order by
identifying overlaps
• Sequencing: determines the complete
nucleotide sequence of each chromosome
• Human genome = 3.2 billion bp
Whole-Genome Shotgun Approach to Genome
Sequencing
• Developed by J. Craig Venter (1992)
• Skips genetic and physical mapping and
sequences random DNA fragments directly
• Powerful computer programs are used to order
fragments
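A minimal Python sketch of how software can order overlapping fragments follows. It uses a simple greedy strategy, which is an assumption made for illustration and not the actual assembler used in any genome project. The function names and the three short reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left; the remaining contigs stay separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags

# Hypothetical reads sampled from one short sequence.
reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
contigs = greedy_assemble(reads)
print(contigs)
```

Greedy merging can misassemble repetitive regions, which is one reason real assemblers are far more elaborate.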
Figure 21.3
1 Cut the DNA into overlapping fragments short enough for sequencing.
2 Clone the fragments in plasmid or phage vectors.
3 Sequence each fragment.
4 Order the sequences into one overall sequence with computer software.
• Both the three-stage process and the shotgun approach were used for the Human
Genome Project and for genome sequencing of
other organisms
• Newer sequencing techniques have brought massive
increases in speed and decreases in cost:
from about $3,000,000,000 to about $1,000 to sequence a human genome
• Metagenomics: DNA from an entire environmental sample is
sequenced
• Eliminates the need to culture each species in the lab
Using bioinformatics to analyze genomes and their
functions
• The Human Genome Project has accelerated
progress in DNA sequence analysis
Centralized Resources for Analyzing Genome
Sequences
– National Library of Medicine and the National
Institutes of Health (NIH) created the National
Center for Biotechnology Information (NCBI)
– BGI in Shenzhen, China
– European Molecular Biology Laboratory
– DNA Data Bank of Japan
• GenBank, the NCBI database of sequences,
doubles its data approximately every 18 months
• Software is available that allows online visitors to
search GenBank for matches to
– A specific DNA sequence
– A predicted protein sequence
– Common stretches of amino acids in a protein
• The NCBI website also provides 3-D views of all
protein structures that have been determined
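To get a feel for what doubling every 18 months implies, here is a small Python sketch. The function name and the constant-doubling assumption are illustrative; actual database growth need not follow this curve exactly.

```python
def projected_size(initial_size, months, doubling_months=18):
    """Size after `months`, assuming steady doubling every `doubling_months`."""
    return initial_size * 2 ** (months / doubling_months)

# Growth relative to a starting size of 1.0
growth_after_3_years = projected_size(1.0, 36)   # two doublings
growth_after_6_years = projected_size(1.0, 72)   # four doublings
print(growth_after_3_years, growth_after_6_years)
```

Under this assumption the database is 4 times larger after 3 years and 16 times larger after 6 years.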
Figure 21.4
• Identification of protein-coding genes within DNA
sequences in a database is called gene
annotation
• Comparison of unknown genes to known genes in
other species provides clues about function
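One simple ingredient of gene annotation is scanning for open reading frames (ORFs). The Python sketch below is a deliberately simplified illustration: it checks only the three forward reading frames, ignores introns and the reverse strand, and uses a hypothetical `find_orfs` helper and a made-up example sequence. Real annotation pipelines combine many more lines of evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for ATG...stop runs in the forward frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs

example = "CCATGGCTTGGTAACC"  # made-up sequence
orfs = find_orfs(example)
print(orfs)
```

A candidate ORF found this way could then be compared with known genes in other species for clues about function.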
• Proteomics: the systematic study of all proteins
encoded by a genome
Figure 21.5: The systems biology approach defines gene circuits and protein interaction networks. Functional groups labeled in the network map: translation and ribosomal functions; mitochondrial functions; RNA processing; peroxisomal functions; transcription and chromatin-related functions; metabolism and amino acid biosynthesis; nuclear-cytoplasmic transport; secretion and vesicle transport; nuclear migration and protein degradation; mitosis; DNA replication and repair; cell polarity and morphogenesis; protein folding, glycosylation, and cell wall biosynthesis. Detail panel (Figure 21.5b): glutamate biosynthesis; vesicle fusion; serine-related biosynthesis; amino acid permease pathway.
Systems Biology in Medicine
– The Cancer Genome Atlas project is currently
seeking all the common mutations in three types
of cancer by comparing gene sequences and
expression in cancer versus normal cells
– Silicon and glass “chips” have been produced
that hold a microarray of most known human
genes
Figure 21.6
Genomes
• By early 2010, 1,200 genomes were completely
sequenced, including 1,000 bacteria, 80 archaea,
and 124 eukaryotes
• Sequencing of over 5,500 genomes and over 200
metagenomes is currently in progress
Genome Size
• Bacteria and archaea: 1 to 6 million base pairs
(Mb)
• Plants and animals: greater than 100 Mb; humans:
3,000 Mb
• Within each domain there is no systematic
relationship between genome size and phenotype
Table 21.1
Number of Genes
• Bacteria and archaea have 1,500 to 7,500 genes
• Eukaryotes have from about 5,000 to 40,000 genes
• Number of genes is not correlated to genome size
• Vertebrate genomes can produce more than one
polypeptide per gene because of alternative
splicing of RNA transcripts
Multicellular eukaryotes have much noncoding
DNA and many multigene families
• Noncoding DNA, previously called “junk DNA,” plays important roles
in the cell
• Sequencing of the human genome reveals that
98.5% does not code for proteins, rRNAs, or
tRNAs
• About 25% of the human genome consists of introns (about 20%) and
gene-related regulatory sequences (about 5%)
• Intergenic DNA is noncoding DNA found between
genes
– Pseudogenes are former genes that have
accumulated mutations and are nonfunctional
– Repetitive DNA is present in multiple copies in
the genome
• About three-fourths of repetitive DNA is made up
of transposable elements
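The "three-fourths" figure can be checked against the percentages given in Figure 21.7. The short Python calculation below uses only the two repetitive-DNA values from that figure; the variable names are illustrative.

```python
# Percent of the human genome, as labeled in Figure 21.7
transposable_related = 44.0   # repetitive DNA including transposable elements
other_repetitive = 14.0       # repetitive DNA unrelated to transposable elements

repetitive_total = transposable_related + other_repetitive
transposable_share = transposable_related / repetitive_total
print(f"{repetitive_total:.0f}% of the genome is repetitive; "
      f"{transposable_share:.0%} of that is transposable-element related")
```

The share comes out at roughly 76%, consistent with "about three-fourths."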
Figure 21.7: Types of DNA sequences in the human genome. Exons (1.5%); regulatory sequences (5%); introns (20%); repetitive DNA that includes transposable elements and related sequences (44%), including L1 sequences (17%) and Alu elements (10%); repetitive DNA unrelated to transposable elements (14%), including simple sequence DNA (3%) and large-segment duplications (5–6%); unique noncoding DNA (15%).
Transposable Elements
• First evidence came from geneticist Barbara
McClintock’s breeding experiments with Indian
corn
• She identified changes in the color of kernels that
made sense only if mobile genetic elements existed
• Present in both prokaryotes and eukaryotes
Figure 21.8
Figure 21.9: Transposon movement. A transposon in the DNA of the genome is copied; the mobile transposon is inserted at another site, giving a new copy of the transposon.
Figure 21.10: Retrotransposon movement. Formation of a single-stranded RNA intermediate from the retrotransposon; reverse transcriptase acts on the RNA; insertion gives a new copy of the retrotransposon.
Sequences Related to Transposable Elements
• In primates, a large portion of the transposable
element-related DNA belongs to a family called
Alu elements
• Their function, if any, is unknown
Other Repetitive DNA, Including Simple Sequence
DNA
• Many copies of tandemly repeated short
sequences
• Series of repeating units of 2 to 5 nucleotides is
called a short tandem repeat (STR)
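A short tandem repeat can be located with a regular expression. The Python sketch below is illustrative only: `find_strs`, the minimum of three repeats, and the sample sequence are assumptions, not part of the chapter.

```python
import re

def find_strs(dna, min_repeats=3):
    """Find units of 2 to 5 nucleotides repeated in tandem.

    Returns (start index, repeat unit, number of copies) for each run.
    """
    # ([ACGT]{2,5}?) captures the shortest unit that works;
    # \1{n,} requires at least n further back-to-back copies of it.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(dna.upper())
    ]

sample = "TTGACAGATAGATAGATAGATACCGT"  # made-up sequence with an AGAT repeat
strs = find_strs(sample)
print(strs)
```

Because the number of copies at an STR site varies among individuals, counting them at several sites is the basis of STR-based genetic profiles.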
Genes and Multigene Families
• Collections of identical or very similar genes
Figure 21.11: Gene families. (a) Part of the ribosomal RNA gene family: DNA with transcription units separated by nontranscribed spacers; the RNA transcripts yield 18S, 5.8S, and 28S rRNA. (b) The human α-globin and β-globin gene families: the α-globin gene family on chromosome 16 and the β-globin gene family on chromosome 11, with genes expressed in the embryo, the fetus, and the adult. Other labels: α-globin, β-globin, heme.
Duplication, rearrangement, and mutation of DNA
contribute to genome evolution
• The earliest forms of life likely had a minimal number of genes
(only those necessary for survival and
reproduction)
• The size of genomes has increased over evolutionary
time (extra genetic material provides raw material for
gene diversification)
Alterations of Chromosome Structure
• Humans have 23 pairs of chromosomes, while
chimpanzees have 24 pairs
• 2 ancestral chromosomes fused in the human
line
Figure 21.12: (a) Human and chimpanzee chromosomes: human chromosome 2 compared with chimpanzee chromosomes 12 and 13, with telomere sequences, centromere sequences, telomere-like sequences, and centromere-like sequences marked. (b) Human and mouse chromosomes: human chromosome 16 compared with mouse chromosomes 7, 8, 16, and 17.
Evolution of Genes with Related Functions: The
Human Globin Genes
• Globin genes evolved from a common ancestral
globin gene, which duplicated and diverged about
450–500 mya
• Differences arose from accumulation of mutations
Figure 21.14: Evolution of the human globin gene families over evolutionary time. Ancestral globin gene; duplication of ancestral gene; mutation in both copies; transposition to different chromosomes; further duplications and mutations; result: α-globin gene family on chromosome 16 and β-globin gene family on chromosome 11.
Evolution of Genes with Novel Functions
• Some duplicated genes have diverged so much
that the functions of encoded proteins are now
very different
• e.g. lysozyme gene was duplicated and evolved
into the gene that encodes α-lactalbumin in
mammals (milk production role)
Rearrangements of Parts of Genes: Exon
Duplication and Exon Shuffling
• Has contributed to genome evolution
• Mixing and matching of exons
Figure 21.15: Exon duplication and exon shuffling. Portions of ancestral genes: epidermal growth factor gene with multiple EGF exons; fibronectin gene with multiple “finger” (F) exons; plasminogen gene with a “kringle” (K) exon. Exon shuffling and exon duplication lead to the TPA gene as it exists today.
How Transposable Elements Contribute to
Genome Evolution
• Multiple copies of similar transposable elements
may facilitate recombination, or crossing over,
between different chromosomes
• Insertion of transposable elements within a
protein-coding sequence may block protein
production
• Insertion of transposable elements within a
regulatory sequence may increase or decrease
protein production
• Transposable elements may carry a gene or
groups of genes to a new position
• Transposable elements may also create new
sites for alternative splicing in an RNA transcript
• In all cases, changes are usually detrimental but
may on occasion prove advantageous
Comparing genome sequences provides clues to
evolution and development
• Genome comparisons of closely related species
help us understand recent evolutionary events
• Genome comparisons of distantly related species
help us understand ancient evolutionary events
• Relationships among species can be represented
by a tree-shaped diagram
Figure 21.16: Tree-shaped diagrams of evolutionary relationships. Upper tree: Bacteria, Eukarya, and Archaea branching from the most recent common ancestor of all living things (axis: billions of years ago, 4 to 0). Lower tree: chimpanzee, human, and mouse (axis: millions of years ago, 70 to 0).
Comparing Distantly Related Species
• Highly conserved genes have changed very little
over time
• Clarify relationships among species
• Bacteria, archaea, and eukaryotes diverged from
each other between 2 and 4 billion years ago
• Results from model organisms applied to other
organisms
Comparing Closely Related Species
• Human and chimpanzee genomes differ by 1.2%
at single base pairs and by 2.7% because of
insertions and deletions
• Several genes are evolving faster in humans than
chimpanzees
• These include genes involved in defense against
malaria and tuberculosis, regulation of brain size,
and genes that code for transcription factors
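The two kinds of difference quoted above (single-base substitutions versus insertions and deletions) can be counted separately from an alignment. The Python sketch below is a toy illustration: the function names and the two short aligned sequences are made up, and "-" is assumed to mark an alignment gap.

```python
def percent_substitutions(seq_a, seq_b, gap="-"):
    """Percent of aligned, ungapped positions where the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gap columns are counted as indels, not substitutions
        compared += 1
        mismatches += a != b
    return 100.0 * mismatches / compared if compared else 0.0

def percent_indel_columns(seq_a, seq_b, gap="-"):
    """Percent of alignment columns in which either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    gaps = sum(a == gap or b == gap for a, b in zip(seq_a, seq_b))
    return 100.0 * gaps / len(seq_a) if seq_a else 0.0

# Made-up aligned sequences: one substitution and a two-base indel.
seq_1 = "ACGTACGTAC--GTACGTAC"
seq_2 = "ACGTACGAACTTGTACGTAC"
subs = percent_substitutions(seq_1, seq_2)
indels = percent_indel_columns(seq_1, seq_2)
print(round(subs, 2), indels)
```

On whole genomes the same bookkeeping is done over billions of aligned positions, which is where figures such as 1.2% and 2.7% come from.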
• Humans and chimpanzees differ in the expression
of the FOXP2 gene, which is involved in vocalization
• This difference may explain why humans but not chimpanzees
communicate by speech
Comparing Genomes Within a Species
• The human species is only about 200,000 years old, so
within-species genetic variation is low
• Variation due to single nucleotide
polymorphisms, inversions, deletions, and
duplications
• Variations useful for studying human evolution
and human health
Comparing Developmental Processes
• Evolutionary developmental biology, or evo-devo,
is the study of the evolution of developmental
processes in multicellular organisms
• Minor differences in gene sequence or regulation
can result in striking differences in form
Widespread Conservation of Developmental Genes
Among Animals
• Molecular analysis of the homeotic genes in
Drosophila has shown that they all include a
sequence called a homeobox
• An identical or very similar nucleotide sequence
has been discovered in the homeotic genes of
both vertebrates and invertebrates
• Homeobox genes code for a domain that allows a
protein to bind to DNA and to function as a
transcription regulator
• Homeotic genes in animals are called Hox genes
Figure 21.18: (a) Fly chromosome, fruit fly embryo (10 hours), and adult fruit fly. (b) Mouse chromosomes, mouse embryo (12 days), and adult mouse.
• Related homeobox sequences have been found
in regulatory genes of yeasts, plants, and even
prokaryotes
• In addition to homeotic genes, many other
developmental genes are highly conserved from
species to species
• Sometimes small changes in regulatory
sequences of certain genes lead to major
changes in body form
• For example, variation in Hox gene expression
controls variation in leg-bearing segments of
crustaceans and insects
• In other cases, genes with conserved sequences
play different roles in different species
Figure 21.19: Body regions labeled in the two animals compared: thorax, genital segments, abdomen.
Comparison of Animal and Plant Development
• In both plants and animals, development relies on
a cascade of transcriptional regulators turning
genes on or off in a finely tuned series
• Molecular evidence supports the separate
evolution of developmental programs in plants
and animals
• MADS-box genes in plants are the regulatory
equivalent of Hox genes in animals
Figure 21.UN01 (summary table)
Genome size. Bacteria and Archaea: most are 1–6 Mb. Eukarya: most are 10–4,000 Mb, but a few are much larger.
Number of genes. Bacteria and Archaea: 1,500–7,500. Eukarya: 5,000–40,000.
Gene density. Bacteria and Archaea: higher than in eukaryotes. Eukarya: lower than in prokaryotes (within eukaryotes, lower density is correlated with larger genomes).
Introns. Bacteria: none in protein-coding genes. Archaea: present in some genes. Eukarya: in unicellular eukaryotes, present but prevalent only in some species; in multicellular eukaryotes, present in most genes.
Other noncoding DNA. Bacteria and Archaea: very little. Eukarya: can be large amounts; generally more repetitive noncoding DNA in multicellular eukaryotes.
Figure 21.UN02: Human genome. Protein-coding, rRNA, and tRNA genes (1.5%); introns and regulatory sequences (26%); repetitive DNA (green and teal).
Figure 21.UN03: The α-globin gene family on chromosome 16 and the β-globin gene family on chromosome 11.
Figure 21.UN04
Figure 21.UN05: Crossover point
Figure 21.UN06