3c
Gene Geography
Dan Graur
Department of Biology and Biochemistry
1
Gene density (genes/Kb)
Mycoplasma genitalium       0.8
Escherichia coli            0.6
Saccharomyces cerevisiae    0.5
Caenorhabditis elegans      0.2
Arabidopsis thaliana        0.2
Homo sapiens                0.03
Alu in Homo sapiens         1.1
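A minimal sketch of the arithmetic behind a genes/Kb figure, assuming density is simply the gene count divided by the genome length in kilobases. The function name and the input numbers are hypothetical and do not come from the slide.

```python
def gene_density(gene_count, genome_size_bp):
    """Return genes per kilobase, taking 1 Kb = 1,000 bp."""
    return gene_count / (genome_size_bp / 1000)

# Hypothetical genome: 500 genes in 1,000,000 bp (illustration only).
density = gene_density(500, 1_000_000)
print(density)
```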
2
Genes are distributed evenly among the 16
chromosomes of Saccharomyces cerevisiae.
3
Periodicity in gene density along
chromosome XI of Saccharomyces
cerevisiae.
4
In large plant genomes, most protein-coding
genes are clustered in long DNA segments
(gene space, urban aggregations) that
represent a small fraction (12-24%) of the
nuclear genome and that are separated from
one another by vast expanses of gene-empty
regions (deserts).
5
Only ~1/3 of genes in eukaryotes are essential
for viability. The proportion does not vary
much between organisms (25-35%).
• Organisms with a large number of genes
(e.g., humans, fish).
• Organisms with an intermediate number of
genes (nematodes, Drosophila).
• Organisms with a low gene number (e.g.,
yeast).
6
Genetic material
[Diagram: genetic material divided into chromosomes and extrachromosomal material; further labels: plasmids, episomes, cryptic, giant, (linear), (circular)]
7
Chromosomes contain genes
that are unconditionally
essential.
Extrachromosomal elements
contain genetic information
that is not necessary under all
conditions.
8
plasmid
episome
9
Brucella
[Figure: chromosome maps, chromosomes labeled 1 and 2]
10
Even in Bacteria
chromosome number
does not correlate
with DNA content.
11
Classification of eukaryotic chromosomes
by centromere position.
12
Gene loss
13
Gene addition
14
Gene rearrangement
15
Exchanges of genetic information between two
nonhomologous chromosomes.
16
Mouse-human synteny. Human chromosomes
can be cut into a relatively small number of pieces,
then shuffled into a reasonable approximation of
the mouse genome.
17
Regions of conserved synteny between human
chromosome 22 and the mouse genome.
18
Chromosome-number reduction
Chinese water deer (Hydropotes inermis)         2N = 70
Brown brocket deer (Mazama gouazoubira)         2N = 70
Chinese muntjac (Muntiacus reevesi)             2N = 46
Black muntjac (Muntiacus crinifrons)            2N = 8
Indian muntjac (Muntiacus muntjak vaginalis)    2N = 6
19
Muntiacus reevesi
20
2N = 44 + (XX or XY)
2N = 6 + (XX or XY)
21
Inferring the number of
gene-order-rearrangement
events
[Figure: gene-order diagram, genes numbered 1-5]
22
The alignment-reduction method
by
David Sankoff
deletion distance (D) = the minimal number
of deletions or insertions necessary to turn
genome content A into genome content B.
rearrangement distance (R) = the minimal
number of inversions and transpositions
necessary to convert the gene order of A into
the gene order of B.
23
evolutionary edit distance (E):
E=D+R
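The two definitions can be sketched in a few lines, assuming each gene occurs once per genome and each deletion or insertion affects a single gene, so that D is the number of genes found in only one of the two genomes. R is taken as given here, since finding the minimal rearrangement count is the hard part. The function names and gene labels are hypothetical.

```python
def deletion_distance(genes_a, genes_b):
    """D: number of genes present in exactly one of the two genomes."""
    return len(set(genes_a) ^ set(genes_b))

def edit_distance(genes_a, genes_b, rearrangement_distance):
    """E = D + R, with R supplied by the caller."""
    return deletion_distance(genes_a, genes_b) + rearrangement_distance

# Hypothetical gene contents (illustration only): g2 is missing from B
# and g6 is missing from A.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g3", "g4", "g5", "g6"]

d = deletion_distance(genome_a, genome_b)
e = edit_distance(genome_a, genome_b, rearrangement_distance=3)
print(d, e)  # prints: 2 5
```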
24
To estimate E, we employ three geometrical procedures:
deletion, bundling, and inversion.
D=2
Bundling carries no cost.
25
[Figure: gene-order diagram, genes numbered 1-5]
R=3
26
Tsuzumi drum
Tsuzumi graph
27
The conserved S10 region. The three arrows represent operons in
E. coli. A dot (•) indicates the existence of a gene at a site; a
minus sign (–) indicates that the gene has been translocated
elsewhere in the genome; [?] indicates that the gene was not found
in the genome.
L and S = large and small ribosomal proteins; prlA =
preprotein-translocation secY subunit; adk = adenylate kinase; map =
methionine aminopeptidase; infA = initiation-factor 1; rpoA =
DNA-directed RNA-polymerase α chain.
28
Evolutionary-edit distance between pairs of animal
mitochondria. Rearrangement distances and deletion distances
are above and below the diagonal, respectively.
OTUs a   Hs   Gg   Sp   Ap   Po   Dy   As
Hs        -    1   18   16   19   13   25
Gg        0    -   19   17   17   12   26
Sp        0    0    -    2    1   26   27
Ap        4    4    4    -    1   22   25
Po        1    1    1    5    -   23   24
Dy        0    0    0    4    1    -   28
As        1    1    1    5    2    1    -
a Hs = Homo sapiens; Gg = Gallus gallus; Sp = Strongylocentrotus purpuratus (sea
urchin); Ap = Asterina pectinifera (starfish); Po = Pisaster ochraceus (starfish); Dy =
Drosophila yakuba; As = Ascaris suum (pig roundworm).
29
Sorting by reversals
Nicotiana
Lobelia
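A minimal sketch of sorting by reversals, assuming unsigned gene orders (strand is ignored) and gene orders short enough for exhaustive search. It is a brute-force breadth-first search, not an efficient published algorithm, and the function name is hypothetical.

```python
from collections import deque

def reversal_distance(start, target):
    """Minimal number of segment reversals that turn `start` into `target`.

    Exhaustive breadth-first search over gene orders; practical only for
    short orders because the search space grows factorially.
    """
    start, target = tuple(start), tuple(target)
    if sorted(start) != sorted(target):
        raise ValueError("both orders must contain the same genes")
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        order, steps = queue.popleft()
        if order == target:
            return steps
        n = len(order)
        # Try reversing every segment order[i:j] of length >= 2.
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                nxt = order[:i] + order[i:j][::-1] + order[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + 1))
    raise AssertionError("unreachable: reversals generate every order")

# One reversal of the segment 2-3-4 separates these two orders.
print(reversal_distance([1, 2, 3, 4, 5], [1, 4, 3, 2, 5]))  # prints: 1
# No single reversal sorts 3-1-2; two are needed.
print(reversal_distance([3, 1, 2], [1, 2, 3]))  # prints: 2
```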
30
Synteny = occurrence of two or more genes on the
same chromosome.
Conserved synteny = synteny of two or more
homologous genes in two species.
Conserved linkage = conservation of both synteny
and gene order of homologous genes between
species.
Disrupted synteny = a pair of genes are syntenic
in one species but their orthologs are located on
different chromosomes in the second species.
Disrupted linkage = a difference in gene order
between the species.
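The synteny definitions above can be sketched as a classification of gene pairs, assuming each gene is assigned to one chromosome per species and that genes with the same name are orthologs. The function name, gene labels, and chromosome labels are hypothetical; linkage is not covered because it also requires gene order.

```python
from itertools import combinations

def classify_pairs(chrom_a, chrom_b):
    """Classify every pair of shared genes by synteny status.

    chrom_a, chrom_b: dicts mapping gene name -> chromosome in species
    A and B. Matching names stand in for orthology (an assumption).
    """
    shared = sorted(set(chrom_a) & set(chrom_b))
    result = {}
    for g, h in combinations(shared, 2):
        same_a = chrom_a[g] == chrom_a[h]
        same_b = chrom_b[g] == chrom_b[h]
        if same_a and same_b:
            result[(g, h)] = "conserved synteny"
        elif same_a or same_b:
            result[(g, h)] = "disrupted synteny"
        else:
            result[(g, h)] = "not syntenic"
    return result

# Hypothetical assignments (illustration only).
species_a = {"g1": "1", "g2": "1", "g3": "2"}
species_b = {"g1": "4", "g2": "4", "g3": "4"}
pairs = classify_pairs(species_a, species_b)
for pair, status in pairs.items():
    print(pair, status)
```

In this example g1 and g2 share a chromosome in both species, while g3 is syntenic with them only in the second species.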
31
32
Empirical variables:
(1) number of conserved syntenies
(2) distribution of number of genes
among conserved syntenies
(3) number of conserved linkages
(4) distribution of number of genes
among conserved linkages.
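Variables (1) and (2) can be sketched as follows, assuming a conserved synteny is a pair of chromosomes, one from each species, that share at least two homologous genes. Variables (3) and (4) would also need gene order and are not covered. The function name and all data are hypothetical.

```python
from collections import Counter

def conserved_syntenies(chrom_a, chrom_b):
    """Return {(chromosome in A, chromosome in B): shared gene count}
    for every chromosome pair that shares at least two genes."""
    counts = Counter(
        (chrom_a[g], chrom_b[g]) for g in chrom_a if g in chrom_b
    )
    return {pair: n for pair, n in counts.items() if n >= 2}

# Hypothetical gene-to-chromosome maps (illustration only).
species_a = {"g1": "A1", "g2": "A1", "g3": "A1",
             "g4": "A2", "g5": "A2", "g6": "A2"}
species_b = {"g1": "B1", "g2": "B1", "g3": "B2",
             "g4": "B2", "g5": "B2", "g6": "B3"}

syntenies = conserved_syntenies(species_a, species_b)
number_of_syntenies = len(syntenies)            # variable (1)
size_distribution = Counter(syntenies.values()) # variable (2)
print(number_of_syntenies, dict(size_distribution))
```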
33
Assumption:
A uniform distribution of genes over
the genome
Estimate:
Number of genomic disruptions
required to explain the differences
between two genomes.
34
Conclusions:
(1) gene-order
rearrangements occur at high
rates.
35
Conclusions:
(2) rates of synteny disruption
vary widely among mammalian
lineages.
The mouse lineage has a rate of
synteny disruptions that is 25 times
higher than that of the cat lineage.
36
Conclusions:
(3) interchromosomal
rearrangements occur
approximately four times
more frequently than
intrachromosomal ones.
37