The Plant Journal (2008) 53, 124–132
doi: 10.1111/j.1365-313X.2007.03329.x
Low X/Y divergence in four pairs of papaya sex-linked genes
Qingyi Yu1, Shaobin Hou2, F. Alex Feltus3, Meghan R. Jones1, Jan E. Murray1, Olivia Veatch1,4, Cornelia Lemke3,
Jimmy H. Saw2, Richard C. Moore5, Jyothi Thimmapuram6, Lei Liu6, Paul H. Moore7, Maqsudul Alam2, Jiming Jiang8,
Andrew H. Paterson3 and Ray Ming1,9,*
1 Hawaii Agriculture Research Center, Aiea, HI 96701, USA,
2 Center for Advanced Studies in Genomics, Proteomics and Bioinformatics, University of Hawaii, Honolulu, HI 96822, USA,
3 Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA,
4 Department of Molecular Bioscience and Bioengineering, University of Hawaii, Honolulu, HI 96822, USA,
5 Department of Botany, Miami University, Oxford, OH 45056, USA,
6 W.M. Keck Center for Comparative and Functional Genomics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA,
7 USDA-ARS, Pacific Basin Agricultural Research Center, Hilo, HI 96720, USA,
8 Department of Horticulture, University of Wisconsin, Madison, WI 53706, USA, and
9 Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Received 21 July 2007; revised 15 August 2007; accepted 11 September 2007.
* For correspondence (fax +1 217 244 1336; e-mail [email protected]).
Summary
Sex chromosomes in flowering plants, in contrast to those in animals, evolved relatively recently and only a
few are heteromorphic. The homomorphic sex chromosomes of papaya show features of incipient sex
chromosome evolution. We investigated the features of paired X- and Y-specific bacterial artificial
chromosomes (BACs), and estimated the time of divergence in four pairs of sex-linked genes. We report the
results of a comparative analysis of long contiguous genomic DNA sequences between the X and
hermaphrodite Y (Yh) chromosomes. Numerous chromosomal rearrangements were detected in the male-specific region of the Y chromosome (MSY), including inversions, deletions, insertions, duplications and
translocations, showing the dynamic evolutionary process on the MSY after recombination ceased. DNA
sequence expansion was documented in the two regions of the MSY, demonstrating that the cytologically
homomorphic sex chromosomes are heteromorphic at the molecular level. Analysis of sequence divergence
between four X and Yh gene pairs resulted in an estimated age of divergence of between 0.5 and 2.2 million
years, supporting a recent origin of the papaya sex chromosomes. Our findings indicate that sex chromosomes
did not evolve at the family level in Caricaceae, and reinforce the theory that sex chromosomes evolve at the
species level in some lineages.
Keywords: Carica papaya, chromosomal rearrangements, molecular evolution, male-specific region of the Y
chromosome (MSY), sex chromosomes.
Introduction
Papaya (Carica papaya) belongs to the small family Caricaceae, which comprises 35 species, including 32 dioecious,
two trioecious and one monoecious species. Papaya is the
only species in the genus Carica. It is in the order Brassicales,
which includes Brassica and Arabidopsis in the family
Brassicaceae that diverged from a common ancestor with
Caricaceae approximately 72 million years ago (MYA;
Wikström et al., 2001). The closest family to Caricaceae is
Moringaceae; these families diverged approximately
60 MYA (Wikström et al., 2001). The time of divergence
between papaya and its close relative Vasconcellea is
unknown.
Papaya is a widely cultivated, fast-growing, semi-woody
tropical plant that produces nutritious fruits rich in vitamins
A and C (Chandrika et al., 2003; Yamamoto, 1964). It is
trioecious, with three basic sex forms: female, male and
© 2007 The Authors
Journal compilation © 2007 Blackwell Publishing Ltd
hermaphrodite. Hermaphrodite papaya trees produce pear-shaped fruits, which are preferred in the
market. However, seeds from female and hermaphrodite
trees segregate into hermaphrodites and females.
Because papaya has all three sex types and environmental
factors elicit frequent sex reversals, the sex determination
system in papaya is particularly intriguing (Hofmeyr, 1938,
1939). It was hypothesized originally, based on segregation
ratios from crosses among the three sex types, that sex
determination in papaya is controlled by a single gene with
three alleles, M, Mh and m (Hofmeyr, 1938; Storey, 1938).
Male individuals (Mm) and hermaphrodite individuals
(Mhm) are heterozygous, whereas female individuals (mm)
are homozygous recessive. The dominant combinations
MM, MhMh and MMh are lethal, resulting in a segregation
ratio of 2:1 hermaphrodite to female from self-pollinated
hermaphrodite trees, and a 1:1 segregation of male to
female, or hermaphrodite to female, from cross-pollinated
female trees (Hofmeyr, 1938; Storey, 1938).
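The segregation ratios expected under this single-gene, three-allele model can be checked by enumerating the gametes of each parent and discarding the lethal genotypes. The following Python sketch is only an illustration of that bookkeeping; the names `PHENOTYPE` and `cross` are hypothetical and do not come from the original studies.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Phenotype of each viable genotype under the single-gene, three-allele model.
# Genotypes are stored as sorted tuples of alleles; MM, MhMh and MMh are
# absent from the table because they are lethal.
PHENOTYPE = {
    ("M", "m"): "male",
    ("Mh", "m"): "hermaphrodite",
    ("m", "m"): "female",
}

def cross(parent1, parent2):
    """Return phenotype frequencies among the viable offspring of a cross."""
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        genotype = tuple(sorted((allele1, allele2)))
        phenotype = PHENOTYPE.get(genotype)  # None for lethal combinations
        if phenotype is not None:
            counts[phenotype] += 1
    total = sum(counts.values())
    return {p: Fraction(n, total) for p, n in counts.items()}

# Self-pollinated hermaphrodite (Mhm x Mhm) and female x male (mm x Mm).
selfed_hermaphrodite = cross(("Mh", "m"), ("Mh", "m"))
female_by_male = cross(("m", "m"), ("M", "m"))
```

Under these assumptions the selfed hermaphrodite yields two hermaphrodites for every female, and the female by male cross yields equal numbers of males and females, matching the ratios described above.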
Four additional hypotheses to explain the genetic basis
of sex determination in papaya include a group of closely
linked genes (Storey, 1953), a genic balance of sex chromosomes over autosomes (Hofmeyr, 1967), classical XY
chromosomes (Horovitz and Jiménez, 1967), and regulatory
elements of the flower development pathway (Sondur
et al., 1996). More recently, high-density genetic mapping
of the papaya genome has revealed severe suppression of
recombination around the sex determination locus (Ma
et al., 2004). Physical mapping and sample sequencing of
bacterial artificial chromosomes (BACs) in the sex determination region concluded that papaya sex determination is
controlled by a pair of primitive sex chromosomes with a
small male-specific region on the Y chromosome (MSY)
(Liu et al., 2004). Morphological and molecular data indicate
that papaya has two slightly different Y chromosomes. The
male Y and hermaphrodite Y chromosomes are nearly
identical based on their MSY sequences (Liu et al., 2004).
Any combinations of the two Y chromosomes are lethal, as
demonstrated by self-pollination of hermaphrodite and
male flowers (occasionally a few male flowers may have
carpels on a male tree) and cross-pollination between
hermaphrodite and male flowers (Hofmeyr, 1938; Storey,
1938). The divergence of these two Y chromosomes could
be as recent as a few thousand years ago, resulting from
human selection for hermaphrodites, as 32 of the 35
species in the family Caricaceae are dioecious (Storey,
1976). To distinguish the two Y chromosomes, the symbol
Y was kept for the male Y chromosome, and the hermaphrodite Y chromosome was designated Yh (Ming et al.,
2007).
We recently reported in situ mapping of the papaya MSY
close to the centromere, and the sequence features of five
MSY BACs without any functional genes (Yu et al., 2007).
The objectives of the study presented here were to analyze
contiguous genomic sequences of homologous X and Y
BACs, and to estimate the time of divergence of homologous
genes on the X and Y chromosomes. The X-specific BAC and
functional gene pairs on the X and Y chromosomes were not
available previously. The results reported here provide
detailed information on local chromosomal arrangements
of the MSY and insights into the origin of sex chromosomes
in the family Caricaceae.
Results
Physical mapping of the papaya MSY region
We fingerprinted all 39 168 clones of the papaya hermaphrodite BAC library (Ming et al., 2001) using the high-information-content fingerprinting method to produce
high-resolution fingerprints (Luo et al., 2003). Previously
identified MSY BACs were confirmed by fluorescent in situ
hybridization (FISH). The positive BAC clones were used as
bait to detect contigs from the genome-wide physical map.
Chromosome walking was carried out to extend the contigs.
The relative positions of a set of MSY BACs were verified by
fiber FISH and pachytene FISH mapping. To date, a total of
8.0 Mbp of the MSY region has been mapped in four ordered
contigs. The physical size of papaya’s currently mapped
MSY is greater than the original estimate of 4–5 Mbp (Liu
et al., 2004). Our revised estimate of the size of the papaya
MSY is 8–9 Mbp based on size estimates for the remaining
three gaps (Q.Y., P.H.M., J.J., A.H.P. and R.M., unpublished
data).
Isolation of X chromosome BACs corresponding
to the MSY region
Because an X chromosome is present in all three sex forms,
male (XY), female (XX) and hermaphrodite (XYh), isolating X
chromosome-specific BACs from a hermaphrodite BAC
library is particularly challenging. We used the genomic
resources of a recently developed papaya unigene database
and whole-genome shotgun (WGS) sequences for this purpose. Papaya’s MSY DNA sequences were used to search
the unigene set of 8571 genes derived from five cDNA
libraries of male, female and hermaphrodite flower buds at
pre-meiosis (all three sexes) and post-meiosis (female and
hermaphrodite only) developmental stages (Q.Y., P.H.M.
and R.M., unpublished data). After excluding unigenes that
matched more than five MSY sequences on different BACs,
considered as members of either gene families or transcribed retro-elements, six unigene sequences were identified as good matches with MSY BACs 85B24 and 95B12.
These matches had a nucleotide sequence identity greater
than 95%, but none were 100%. All six unigenes were from
female cDNA libraries. These six unigene contigs and EST
sequences were used to identify X-specific matching
sequences from 1.3 million reads of whole-genome shotgun
sequences of SunUp females (S.H. and M.A., unpublished
data). One unigene contig matched 4460 WGS contigs and
39 468 reads. This unigene contig was excluded from further
analyses as it was considered likely to be a highly abundant
retro-element. The remaining five unigenes matched either
single reads or fewer than three WGS contigs. The physical
positions of the X and Yh BACs reported here are shown in
Figure S1.
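The screening logic described above (discard unigenes that hit many MSY BACs or that are highly abundant in the female shotgun data) can be sketched as a simple filter. Everything in this Python example is an assumption made for illustration: the record type, the field names, the example records, and the use of "fewer than three WGS contigs" as a hard cut-off. It does not reproduce the authors' actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class UnigeneHit:
    """Match summary for one unigene (hypothetical field names)."""
    name: str
    msy_bacs_matched: int     # distinct MSY BACs with a match
    wgs_contigs_matched: int  # female WGS contigs with a match

# Thresholds taken from the text; treating them as hard cut-offs is an assumption.
MAX_MSY_BACS = 5      # more than five suggests a gene family or retro-element
MAX_WGS_CONTIGS = 2   # retained unigenes matched fewer than three WGS contigs

def is_candidate(hit):
    """Keep unigenes that look low-copy in both the MSY and the female WGS data."""
    return (hit.msy_bacs_matched <= MAX_MSY_BACS
            and hit.wgs_contigs_matched <= MAX_WGS_CONTIGS)

# Invented example records, not data from the study.
hits = [
    UnigeneHit("unigene_A", msy_bacs_matched=1, wgs_contigs_matched=1),
    UnigeneHit("unigene_B", msy_bacs_matched=12, wgs_contigs_matched=2),
    UnigeneHit("unigene_C", msy_bacs_matched=2, wgs_contigs_matched=4460),
]
candidates = [hit.name for hit in hits if is_candidate(hit)]
```

In this toy input only `unigene_A` survives; `unigene_C` mimics the abundant retro-element-like unigene that was excluded.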
Five primer pairs were designed from the EST
sequences for amplifying X-specific probes from female
genomic DNA (without Y or Yh chromosomes), and these
probes were used to screen the papaya hermaphrodite
BAC library. This screening produced 38 positive BACs,
which were confirmed by PCR amplification, and 14 X-specific BACs, which were confirmed by sequencing the
PCR products matching the X-specific sequences obtained
from female plants with XX chromosomes using the same
primers. These 14 X-specific BACs formed two contigs,
one showing correspondence of four anchor loci to MSY
BAC 95B12 (150 kb) and the other showing correspondence to MSY BAC 85B24 (294 kb) with one anchor locus.
Two X-specific seed BACs, 53E18 (252 kb) and 61H02
(168 kb), from these two X-specific contigs were
sequenced by the shotgun approach, together with their
Yh chromosome counterparts.
Sequence comparison of two pairs of X and Yh BACs
The alignments of two X/Yh BAC pairs are shown in Figure 1.
The less likely alternative alignments are shown in
Figure S2. Aligning the X BAC 61H02 (168 kb) and the Yh
BAC 95B12 (150 kb) in Figure S2a creates a small unaligned
region of 95B12 on the left, which is not a problem as the
aligned region of 61H02 reaches the end, and a large
unaligned region (>40 kb) on the right ends of both 61H02
and 95B12, which makes this reversed orientation less likely
than the one shown in Figure 1(a). The alternative alignment
of X BAC 53E18 and Yh BAC 85B24 shows the aligned region
of the X and Yh as a single inversion (Figure S2b), even
though the degree of Yh sequence conservation differs
dramatically in two distinctive blocks of sequences as
indicated by the different density of lines connecting the
homologous sequences (Figure S2b).
Figure 1. DNA sequence comparison between X and Yh BAC pairs.
(a) X BAC 61H02 is shown as the blue line and Yh BAC 95B12 (inverted) is shown as the green line.
(b) X BAC 53E18 is shown as the blue line and Yh BAC 85B24 (inverted) is shown as the green line.
Arrows next to the bar indicate stretches of highly homologous sequences. Each crossing line represents a homologous block ranging from 50 to 5000 bp, with a
minimum sequence identity of 66%.
(Graphic not reproduced; the panels mark genes 4–7 along 61H02/95B12 and genes 1–3 along 53E18/85B24, with positions in kb.)
Direct alignment of these two pairs of X and Yh BACs
revealed chromosomal rearrangements. One translocation
event, involving the unaligned part of the X BAC 61H02
(168 kb) and the aligned part of Yh BAC 95B12 (150 kb),
appeared to have occurred (Figure 1a). One inversion,
involving the entire aligned portion of the X and Yh BAC
pair 61H02 and 95B12, probably occurred on the Yh chromosome (Figure 1a).
Three inversions may have occurred in the aligned parts
of the X and Yh BAC pairs 53E18 and 85B24 (Figure 1b). The
first inversion involving the X BAC 53E18, from approximately 80 to 120 kb, and the corresponding Yh BAC 85B24,
from 0 to 40 kb, showed the highest level of sequence
conservation among the aligned sequences of this X and Yh
BAC pair, based on the density of crossing lines representing
homologous sequences. The second inversion, which
included the first inversion, involving the X BAC sequence
from approximately 44 to 150 kb, and the corresponding Yh
BAC sequence, from 0 to 110 kb, showed higher levels of
sequence conservation than did the remaining parts of this X
and Yh BAC pair. The postulated third inversion, involving
the X BAC sequence from approximately 120 to 235 kb and
the corresponding Yh BAC sequence from 50 to 290 kb,
probably occurred much earlier, as it shows extensive
sequence divergence and greater than two-fold DNA
sequence expansion of the Yh chromosome in this region.
Other types of chromosomal rearrangements on the MSY
(deletions, insertions and duplications) were also observed
from the alignments of these two X and Yh BAC pairs.
Deletions on the MSY were shown by the gaps on the X BACs
within the aligned regions, such as the regions on X BAC
61H02 from 84 to 89 kb and from 93 to 98 kb and on X BAC
53E18 from 192 to 212 kb. Insertions on the MSY were shown
by the gaps on the Y BACs within the aligned regions, such as
the regions on Yh BAC 95B12 from 88 to 100 kb and on Yh BAC
85B24 from 32 to 44 kb. Duplications were shown by crossing
lines from one location on the X BACs to two or more
locations on the Yh BACs, such as the region on X BAC 53E18
from 174 to 192 kb, which is aligned to two locations
approximately 140 kb apart on the Yh BAC 85B24 (Figure 1b).
Further analysis of the aligned sequences of the two X and
Yh BAC pairs provided quantitative assessment of DNA
sequence expansion and chromosomal rearrangement of
the Yh chromosome. Based on the beginning and end of the
aligned sequences of each paired X and Yh BACs, a total of
102 069 bp of the X BAC 61H02 aligned with 111 888 bp of Yh
BAC 95B12. This alignment suggests a 9792 bp (9.6%)
expansion on the Yh chromosome. For the second pair of
X and Yh BACs, 214 541 bp of 53E18 aligned with 290 085 bp
of MSY BAC 85B24, suggesting a 75 544 bp (35.2%) expansion on the Yh chromosome. After excluding the repetitive
sequences and the insertions and deletions (Indels) using
BLASTZ (Table 1), a stringent gapless alignment resulted in
an average of 83.6% DNA sequence identity between 53E18
(X) and 85B24 (Yh) and 87.4% identity between 61H02 (X) and
95B12 (Yh). The percentages of aligned gapless nucleotides
were 41.5% in X BAC 53E18 and 30.7% in Yh BAC 85B24. The
percentages of aligned gapless nucleotides for the second
pair were 81.6% in X BAC 61H02 and 74.5% in Yh BAC 95B12.
Based on BLASTN results, the total number of Indel nucleotides on the aligned part of the X BAC 61H02 was 3212,
compared with 13 004 on the aligned region of Yh BAC
95B12. The extensive sequence divergence on Yh BAC 85B24
and DNA sequence expansion through insertions and
duplications make it problematic to obtain an accurate
estimate of the number of Indel nucleotides between 53E18
(X) and 85B24 (Yh).
Table 1 Summary of gapless X and Yh BAC sequence comparison by BLASTZ
                                    53E18(X)–85B24(Yh)   61H02(X)–95B12(Yh)
Length of X BAC (bp)                251 868              168 440
Length of Yh BAC (bp)               294 333              146 496
Aligned sequence (bp)               89 075               83 294
Span of aligned X BAC (bp)          214 541              102 069
Span of aligned Yh BAC (bp)         290 085              111 888
Percentage of aligned X covered     41.5                 81.6
Percentage of aligned Yh covered    30.7                 74.5
Total contigs                       432                  341
Average percentage identity         83.6                 87.4
Average contig length (bp)          206.2                244.3
Longest contig (bp)                 4205                 2370
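The expansion and coverage percentages quoted above are simple ratios of the aligned spans and gapless aligned lengths in Table 1. The Python sketch below shows the arithmetic for the 53E18/85B24 pair; the function names are illustrative only, and the inputs are the published values from Table 1.

```python
def expansion(x_span_bp, yh_span_bp):
    """Return (extra bp on Yh, percent expansion relative to the aligned X span)."""
    extra_bp = yh_span_bp - x_span_bp
    return extra_bp, 100.0 * extra_bp / x_span_bp

def coverage(aligned_bp, span_bp):
    """Percentage of an aligned span covered by gapless aligned sequence."""
    return 100.0 * aligned_bp / span_bp

# 53E18 (X) versus 85B24 (Yh), values from Table 1.
extra_bp, percent_expansion = expansion(214_541, 290_085)
x_covered = coverage(89_075, 214_541)
yh_covered = coverage(89_075, 290_085)

print(f"Yh expansion: {extra_bp} bp ({percent_expansion:.1f}%)")
print(f"Gapless coverage: X {x_covered:.1f}%, Yh {yh_covered:.1f}%")
```

Applying `expansion(102_069, 111_888)` to the 61H02/95B12 pair gives about 9.6%, in line with the figure reported for that pair.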
Identification of four X and Yh gene pairs
Despite intensive efforts to identify functional genes in the
papaya MSY sequence, no expressed gene was uncovered
until we had sequenced the above two pairs of X and Yh
BACs. Gene expression analyses of the sequenced X and Yh
BAC pairs began with a BLASTN search of the papaya unigene
set. Five perfect matches, genes 1, 2, 4, 5 and 6, were found
in the two X BACs 53E18 and 61H02 (Table 2), while no
perfect matches were found for the Yh BACs. RT-PCR analysis of the predicted genes using GenScan (see Experimental procedures) identified gene 3 on X BAC 53E18 and
gene 7 on X BAC 61H02. All seven genes were present on X
BACs, whereas only four genes, genes 2, 4, 5 and 6, were
present on their MSY counterparts. Genes 1 and 7 were
located in regions of the X BACs that were beyond the two
MSY BACs 85B24 and 95B12. Gene 3, whose ortholog in
Arabidopsis encodes ASYMMETRIC LEAVES 2, may have
been deleted or translocated from the corresponding region
of MSY BAC 85B24. Sequencing the RT-PCR fragments of
genes 2, 4, 5 and 6, amplified from the hermaphrodite
SunUp, showed two mixed peaks for those nucleotides that
had diverged between the X and Yh alleles (Figure S3), verifying the expression of these four Yh alleles. Transcripts of
all seven genes were present in root, leaf and flower tissues.
Quantitative real-time PCR analyses of all seven genes did
not detect sex-specific expression (no significant difference
among male, female and hermaphrodite samples) or dosage
effect (no significant difference between males and females
of the AU9 variety or between hermaphrodite and female
samples of the SunUp variety) (data not shown).
Table 2 Confirmed functional genes using papaya EST database
Gene ID  BAC ID  Location on BACs   Result of BLAST in GenBank                                 GenBank accession ID  Amino acid identity(a)  Amino acid similarities(a)
1        53E18   19 577–30 372      Protein kinase-like protein                                NP_196976             483/680 (71)            562/680 (82)
2        53E18   119 229–86 982C    Phosphoglucosamine mutase (Medicago truncatula)            ABE93069              398/557 (71)            469/557 (84)
3        53E18   201 739–202 356    AS2 (ASYMMETRIC LEAVES 2) (Arabidopsis thaliana)           NP_176739             146/211 (69)            159/211 (75)
4        61H02   6956–33 328        SEC6 (Arabidopsis thaliana)                                AAL87122              497/540 (92)            522/540 (96)
5        61H02   69 940–75 548      Somatic embryogenesis receptor kinase 1 (Citrus unshiu)    BAD32780              539/551 (97)            546/551 (99)
6        61H02   79 843–77 370C     Hypothetical protein (Solanum tuberosum)                   BAE46414              312/539 (57)            390/539 (72)
7        61H02   107 397–118 515    Nucleotide binding (Arabidopsis thaliana)                  NP_177329             263/407 (64)            311/407 (76)
Values are expressed as n (%).
(a) The values for amino acid identity and similarities are between papaya genes and the matched genes in the GenBank specified by the accession ID.
C: complementary strand.
The structure of genes 2, 4, 5 and 6 based on the EST
sequence database and homologous genes in other species
is shown in Table S1. The number of nucleotides in each
exon and intron of these four gene pairs was, with the
exception of introns 5 and 7 in gene 2, nearly the same
between the X and Yh homologs. Nevertheless, the nucleotide sequences of these conserved exons and introns on
the Yh chromosome shared sequence identity with the X
chromosome ranging from 94.8% to 100% in the exons and
82.0–99.2% in the introns. The weighted-average sequence
identity for the 44 exons, totaling 7350 bp, was 97.9%
between X and Yh. The weighted-average for the 40 introns,
totaling 59 354 bp, was 80.7%.
Sequence divergence of the four X and Yh gene pairs
We analyzed the coding sequences of the four X–Yh gene
pairs (genes 2, 4, 5 and 6) that are supported by EST
sequence data to determine the degree of synonymous (Ks)
and non-synonymous (Ka) divergence between them
(Table 3). It is possible to infer the degree of functional
constraint acting on the X and Yh gene pairs by assessing the
ratio of non-synonymous to synonymous divergence (Ka/Ks)
between the X and Yh alleles. Ka/Ks ratios less than 1 indicate
that sequence divergence is selectively constrained; a ratio
equal to 1 indicates that the sequences are neutrally
diverging, and ratios greater than 1 suggest that selection is
driving functional divergence. The total number of synonymous and non-synonymous sites, mutations and degree of
divergence between each gene pair are summarized in
Table 3. All four X–Yh gene pairs have Ka/Ks ratios that are
considerably less than 1, ranging from 0.04 to 0.5, suggesting their divergence has been functionally constrained. It
will be necessary to compare lineage-specific divergence
rates with those of an out-group in order to determine
whether this constraint is restricted to the X copies as would
be expected if the Y is actively degenerating.
We also assessed the degree of silent site nucleotide
divergence (Ksil) between the X and Yh gene pairs. Sequence
divergence at silent sites (e.g. synonymous and non-coding
sites) is reflective of the underlying neutral mutation rate,
and can be used, assuming a molecular clock, to estimate
the time of divergence between two sequences. The degree
of silent site divergence between all X–Yh gene pairs was
similar and relatively low, ranging between 0.016 and 0.066
(Table 4). Assuming a synonymous substitution rate of
1.5 × 10⁻⁸ synonymous substitutions per site per year for
dicot nuclear genes (Koch et al., 2000), we estimate that the
time of divergence for X–Yh gene pairs ranges from 0.5 to
2.2 MYA (Table 4).
Table 3 Estimates of synonymous and non-synonymous nucleotide divergence of Carica papaya X- and Yh-linked gene sequences
         No. sites (bp)                                         No. mutations               Sequence divergence
Gene ID  Total sites  Total coding sites  Syn. sites  Non-syn. sites  Syn. mutations  Non-syn. mutations  Syn. sites (Ks)  Non-syn. sites (Ka)  Ka/Ks
2        23653        1479                358.5       1120.5          15              18                  0.043            0.016                0.37
4        11391        864                 193.1       670.9           13              6                   0.071            0.009                0.13
5        5571         1872                452.0       1420.0          24              3                   0.055            0.002                0.04
6        5162         2538                594.6       1943.4          23              37                  0.039            0.020                0.50
syn., synonymous; non-syn., non-synonymous.
Table 4 Estimated age of divergence of Carica papaya X- and Yh-linked genes
Gene ID  Silent sites  Silent mutations  Silent site divergence (Ksil)  Estimated age (MYA)(a)
2        22 532.47     1414              0.066                          2.18
4        10 712.1      568               0.055                          1.83
5        4151.0        128               0.032                          1.05
6        3218.5        51                0.016                          0.53
(a) Based on a synonymous substitution rate of 1.5 × 10⁻⁸ substitutions per synonymous site per year (Koch et al., 2000).
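The ages in Table 4 are consistent with the standard molecular-clock relation T = Ksil / (2r), where r is the substitution rate per site per year and the factor of 2 reflects divergence accumulating along both the X and Yh lineages. The Python sketch below applies that relation to the rounded Ksil values; the function name is illustrative, and the assumption that this is exactly how the published ages were computed is an inference from the numbers rather than a stated method.

```python
SUBSTITUTION_RATE = 1.5e-8  # substitutions per synonymous site per year (Koch et al., 2000)

def divergence_time_mya(k_sil, rate=SUBSTITUTION_RATE):
    """Estimated time since divergence in millions of years.

    Substitutions accumulate independently on both lineages, so the
    pairwise divergence is divided by twice the per-lineage rate.
    """
    years = k_sil / (2.0 * rate)
    return years / 1e6

# Rounded silent-site divergence values from Table 4.
KSIL = {"gene 2": 0.066, "gene 4": 0.055, "gene 5": 0.032, "gene 6": 0.016}
ages = {gene: divergence_time_mya(k) for gene, k in KSIL.items()}

for gene, age in ages.items():
    print(f"{gene}: {age:.2f} MYA")
```

Because the inputs here are rounded to three decimals, the results can differ from the tabulated ages in the second decimal place.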
Discussion
Because 32 of the 35 species of Caricaceae are dioecious, it
was thought that sex chromosomes could be ancestral at the
family level. Our estimate of 0.5–2.2 million years of divergence in four pairs of X/Y genes distributed across more
than half the MSY indicated that they probably evolved at
species level (or genus level as there is only one species in
Carica), long after the divergence of Caricaceae from its
closest related family Moringaceae at approximately
60 MYA. Genomic investigation on the X and Y gene pairs in
male and female plants of multiple species in the closest
related genus Vasconcellea is under way and will further
clarify this issue.
Their recent origin would make papaya sex chromosomes
younger than any other known sex chromosomes of animals
or plants. The three other well-studied young sex chromosome systems of Silene latifolia, medaka fish and stickleback
fish have been dated recently, with estimated times of origin
of 10–20, 10 and 10 MYA, respectively (Bergero et al., 2007;
Kondo et al., 2004; Peichel et al., 2004). The neo-Y chromosome of Drosophila miranda was formed approximately
1.0 MYA from the fusion of an ordinary autosome with the
ancestral Y chromosome. This neo-Y chromosome shares
features with incipient Y chromosomes such as degeneration of Y-linked genes and accumulation of transposable
elements (Bachtrog, 2005; Bachtrog and Charlesworth,
2002).
Sequencing two pairs of papaya X and Yh BACs allowed
direct comparison of approximately 400 kb of contiguous
DNA sequences. These syntenic X and Yh sequences provide
clear evidence of numerous chromosomal rearrangements
and extensive sequence divergence between the X and Yh
chromosomes after recombination ceased in the MSY.
Some chromosomal rearrangements could have contributed to the suppression of recombination in the MSY
region, whereas others may have occurred after the recombination suppression, as has been reported for the inversion
detected in Silene (Zluvova et al., 2005). Local insertions and
duplications may have contributed to the DNA sequence
expansion on the MSY.
The sex chromosomes of papaya, medaka fish and
stickleback fish are homomorphic, and are considered to
be among the most recently evolved sex chromosomes. On
the other hand, the sex chromosomes in Silene latifolia are
heteromorphic, with the Y chromosome the largest in the
genome. Complete sequencing of the MSY in medaka
revealed a small, 258 kb, non-recombining region after
10 MYA of evolution. This small MSY appeared to be
delimited by a pair of duplicated genes, OlaflnkL and
OlaflnkR, on the left and right borders, respectively, which
may recombine with the X counterpart OlaflnkX (Kondo
et al., 2006). In contrast, the papaya sex chromosomes have
an MSY of approximately 8–9 Mb after possibly only 2–3 MYA
of divergence (Q.Y., P.H.M., J.J., A.H.P. and R.M., unpublished data). The relatively rapid extension of the MSY in
papaya could be due to a combination of its gene-poor
nature (Yu et al., 2007) and the lack of confining sequences
similar to the duplicated Olaflnk genes at the borders of the
MSY in medaka. The MSY in Silene latifolia has spread over
90% of the Y chromosome, or approximately 513 Mbp (Liu
et al., 2004). The various chromosomal origins and the
variable pace of MSY extension in these young sex chromosome systems demonstrate that diverse mechanisms are
involved in sex chromosome evolution. However, the general pattern and selection forces appear to be conserved
even across the plant and animal kingdoms (Charlesworth,
1991).
Although the primitive X and Yh chromosomes in papaya
are cytologically homomorphic, molecular evidence revealed 9.6% and 35.2% DNA sequence expansion on the Yh
chromosome in the regions corresponding to the X BACs
61H02 and 53E18, respectively. These results suggest that
the X and Yh chromosomes in papaya are becoming
heteromorphic at the molecular level, but that this level of
sequence expansion is not yet sufficient to be observed
cytologically. A similar phenomenon was also reported for
the small MSY of the homomorphic sex chromosomes in
medaka and stickleback fish (Kondo et al., 2006; Peichel
et al., 2004). The autosomal region of linkage group 9 in
medaka, where the Y-specific region originated, is 42.9 kb,
but the corresponding MSY sequence is 72.1 kb, showing
68.1% DNA sequence expansion (Kondo et al., 2006). The
stickleback Y-specific BAC also showed 37% sequence
expansion compared with its X-specific BAC, mainly due to
local duplications and accumulation of transposable
elements (Peichel et al., 2004).
To isolate functional genes from plant X and Y
chromosomes, researchers must overcome the obstacles
of low gene density on the Y chromosome and the presence
of X chromosome genes in both males and females.
Internationally collaborative efforts among five renowned
research groups in four countries have resulted in identification of eight pairs of X and Y genes from the sex
chromosome model species Silene latifolia in the past
decade (Atanassov et al., 2001; Bergero et al., 2007; Delichère et al., 1999; Filatov, 2005; Moore et al., 2003; Nicolas
et al., 2005). Much of our knowledge on plant sex chromosome evolution is gained from detailed analyses of these
eight X/Y gene pairs. The addition of four X/Yh gene pairs in
papaya has broadened the scope of plant sex chromosome
research to a trioecious species in a different order.
Expression analysis of all seven genes revealed neither
differential expression among sex types nor dosage effects.
Although gene 3 appeared to be lost from the Yh chromosome by deletion, the expression level of gene 3 was the
same in male, female and hermaphrodite leaf, root, and
flower bud tissues. Gene 3 may have developed a dosage
compensation mechanism (i.e. an equalized expression
level of X-linked genes from one copy in the XY genotype
and two copies in the XX genotype), or it may have been
translocated to a new location in the MSY. Independent
evidence suggests that these seven genes are not likely to be
directly related to sex determination. Specifically, we
detected a large segmental deletion in a male-to-female
sex reversal mutant that was generated by γ-ray irradiation
of male pollen. This deleted region is located near the left
border of the MSY shown in Figure S1, approximately 1 Mb
from genes on X and Yh BACs 61H02 and 95B12 and 5 Mb
from genes on X and Yh BACs 53E18 and 85B24 (Q.Y., P.H.M.
and R.M., unpublished data).
Analysis of sequence divergence (Ka/Ks) between X and Yh
gene pairs indicated that all X–Yh gene pairs investigated
exhibit functional constraints. This is a characteristic of
young sex chromosomes, such as those of the dioecious
plant Silene latifolia, which contain a number of functional
X- and Y-linked alleles for widely expressed ‘housekeeping
genes’ (Atanassov et al., 2001; Delichère et al., 1999; Moore
et al., 2003). In contrast, Y chromosomes of evolutionarily
ancient sex chromosomes, such as the human Y chromosome, have lost these types of genes. It is also possible that
only the X alleles are functionally constrained and that the Y
chromosome has an accelerated non-synonymous mutation
rate relative to X. This pattern is seen in some S. latifolia
sex-linked genes, and supports degeneration of the Y
chromosome in that system (Filatov and Charlesworth,
2002; Filatov, 2005). We are currently obtaining out-group
sequence data in order to test for differential rates of
evolution on the X and Y chromosomes.
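The Ka/Ks reasoning above can be illustrated with a minimal sketch. It assumes the final step of the Nei and Gojobori (1986) method, in which the observed proportions of synonymous and non-synonymous differences are corrected for multiple hits with the Jukes-Cantor formula. The site and difference counts in the usage note are hypothetical and are not taken from the papaya gene pairs; the function names are illustrative only.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected substitutions per site.

    p is the observed proportion of differing sites between two sequences.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must be in the interval [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Ka, Ks, Ka/Ks) from counts of differences and sites.

    The counts would normally come from a codon-by-codon comparison of the
    aligned X and Yh coding sequences; here they are supplied directly.
    """
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    ratio = ka / ks if ks > 0 else float("nan")
    return ka, ks, ratio


if __name__ == "__main__":
    # Hypothetical counts for one X/Yh gene pair (illustration only).
    ka, ks, ratio = ka_ks(nonsyn_diffs=2, nonsyn_sites=700,
                          syn_diffs=6, syn_sites=300)
    print(f"Ka = {ka:.5f}, Ks = {ks:.5f}, Ka/Ks = {ratio:.3f}")
```

A ratio well below 1 is the usual signature of functional constraint (purifying selection), whereas a ratio close to 1 would be expected if one copy were evolving without constraint.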
Both the conservation of X/Yh gene pairs and analysis of
silent site divergence add support to our hypothesis of a
recent origin of the papaya sex chromosomes. Our estimates of Ksil in particular allow us to calculate an estimated
time of recombination suppression for the MSY regions
containing these gene pairs as between 0.5 and 2.2 MYA.
However, it is possible that the divergence of these gene
pairs does not reflect that of the entire MSY region. We could
be sampling a region of the MSY that has only recently
become recombinationally isolated from the X chromosome. However, physical mapping and sequencing of the
papaya MSY indicate that the Yh BAC 85B24, which contains
gene 2, and the Yh BAC 95B12, which contains genes 4–6, are
found on opposite sides of the MSY approximately 5 Mb
apart (Figure S1). Both BACs have genes (gene 2 and gene 4)
with similar Ksil values despite the physical distance
between them, suggesting that they ceased recombination
at about the same time. While there is considerable variation
between the estimated divergence times for genes 4–6 (0.5–
1.8 MYA), these genes are physically close to each other on
MSY BAC 95B12; thus, this difference may be due to
sampling variance.
Experimental procedures
Plant material
Leaf tissue from female and hermaphrodite plants of the Hawaiian
gynodioecious papaya cultivars Kapoho and SunUp, plus male and
female plants of the Australian dioecious variety AU9, were used for
PCR. The root, leaf and young flower buds from female and
hermaphrodite plants of SunUp and male plants of AU9 were used
for RT-PCR.
Sequencing X and Yh BACs and sequence assembly
The papaya BAC clones were sequenced using the shotgun
approach with at least 10-fold coverage. BAC DNAs were randomly
sheared using Hydroshear (Genomic Solutions; http://www.genomicsolutions.com)
to generate insert fragments of approximately 3 kb. The sheared fragments were size-selected on an agarose gel, purified, end-repaired and ligated to the pUC118 vector
(Takara; http://www.takarabiousa.com). DNAs from the 3 kb
libraries were cycle-sequenced using ABI BigDye Terminator version 3.1 and analyzed on a 3730XL DNA analyzer (Applied Biosystems, http://www.appliedbiosystems.com/).
The Phred/Phrap/Consed (http://bozeman.mbt.washington.edu/phredphrapconsed.html)
and CAP3 (http://genome.cs.mtu.edu/cap/cap3.html) packages were used for sequence assembly. Gaps
in assembly and regions of low quality were resolved by
re-sequencing sub-clones identified by Autofinish
(http://www.bozeman.mbt.washington.edu/phredphrapconsed.html), sequencing PCR products, and/or additional random sub-clone
sequencing. All BAC clones were manually examined for signs
of mis-assembly. Suspect regions were clarified by
ambiguous read removal, PCR amplification and sequencing,
and/or alignment with a neighboring BAC. A BAC was not
considered complete until all inconsistent read pairs had been
resolved and Consed reported an error rate of less than 1 per
10 000 bases. The GenBank accession numbers for the four BACs
are: EF661025 (MSY BAC 95B12), EF661024 (MSY BAC 85B24),
EF661026 (X-specific BAC 53E18) and EF661023 (X-specific BAC
61H02). The papaya EST sequences are being submitted to
GenBank under GenomeProject ID 20267.
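The completion threshold of fewer than 1 error per 10 000 bases can be related to the Phred quality scale, where Q = -10 log10(P) for an error probability P. The sketch below is only an illustration of that conversion; it assumes the reported error rate can be read as an average per-base error probability, and the function names are hypothetical.

```python
import math


def phred_from_error(p_error):
    """Phred-scaled quality for a per-base error probability."""
    if not 0 < p_error <= 1:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10 * math.log10(p_error)


def error_from_phred(q):
    """Per-base error probability for a Phred-scaled quality value."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    threshold = 1 / 10_000  # 1 error per 10 000 bases, as stated above
    print(f"Equivalent Phred quality: Q{phred_from_error(threshold):.0f}")
```

Under this reading, the threshold corresponds to an average consensus quality of about Q40.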
Identification of transcription units and
comparative sequence analyses
The potential transcripts were identified using GENSCAN
(http://genes.mit.edu/GENSCAN.html). The predicted transcripts were
tested by RT-PCR, and the BAC sequences were aligned using a
genome comparison browser (http://sun1.softberry.com).
RT-PCR and real-time quantitative RT-PCR assays
Primers for RT-PCR experiments were designed to span at least one
intron to control for genomic DNA contamination. Total RNA
was extracted from young flower bud (0.4–0.7 cm, before meiosis),
root and leaf tissues of three sex types (male, female and
hermaphrodite). Approximately 2 µg of total RNA was treated
with RNase-free DNase I (Promega, http://www.promega.com/) and
reverse-transcribed using a RETROscript kit (Ambion;
http://www.ambion.com). The synthesized cDNAs served as templates for
RT-PCR. cDNA products were diluted 5.5-fold for use in real-time
RT-PCR amplification. Platinum quantitative PCR SuperMix-UDG
(Invitrogen, http://www.invitrogen.com/) was used for real-time
RT-PCR amplification. Reactions were run on a DNA Engine
OPTICON 2 real-time PCR detection system (BIO-RAD, formerly MJ
Research; http://www.bio-rad.com). Thermocycler conditions were
2 min at 50°C, followed by 10 sec at 95°C and 40 cycles of 15 sec at
95°C, 20 sec at 55°C and 30 sec at 72°C. An RNA sample from each
tissue without reverse transcriptase was included as a control for
genomic DNA contamination. All PCR data presented were generated from a minimum of three independent reactions for each biological replicate. The Actin gene was used as an internal control
gene for normalization to account for variation in the total amount
of RNA in the sample reactions.
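The text does not state how Actin-normalized expression values were calculated. The sketch below shows one common approach, the comparative Ct (2^-ΔΔCt) calculation, purely as an assumption. It supposes roughly 100% amplification efficiency for both genes, and the Ct values in the usage note are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the comparative Ct (2^-ddCt) calculation.

    Each argument is a list of Ct values from replicate reactions.
    The reference gene plays the role of Actin; the calibrator is the
    sample that all others are compared against (for example, female leaf).
    """
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = (mean(ct_target_calibrator)
                           - mean(ct_reference_calibrator))
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values from three replicate reactions per sample.
    fold = relative_expression(
        ct_target=[24.1, 24.0, 23.9],
        ct_reference=[18.0, 18.1, 17.9],
        ct_target_calibrator=[24.0, 24.1, 23.9],
        ct_reference_calibrator=[18.0, 17.9, 18.1],
    )
    print(f"Fold change relative to calibrator: {fold:.2f}")
```

A fold change near 1 across male, female and hermaphrodite samples would be consistent with the absence of differential expression reported for these genes.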
Sequence divergence of X and Yh gene pairs
Exon and intron regions of the X and Yh gene pairs that corresponded to EST-supported sequences were manually aligned using
BioEdit (Hall, 1999). The numbers of synonymous substitutions per
synonymous site (Ks), non-synonymous substitutions per non-synonymous site (Ka), and synonymous and non-coding (silent)
substitutions per silent site (Ksil) were estimated according to the
method described by Nei and Gojobori (1986), as implemented in
DnaSP 4.0 (Rozas et al., 2003). Divergence times for the paired X
and Yh alleles were determined using Ksil and the methods
described by Li (1997), using the synonymous substitution rate for
dicot nuclear genes estimated by Koch et al. (2000).
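The divergence-time calculation can be sketched with the standard molecular-clock relation T = K / (2r), where K is the silent-site divergence between the two alleles and r is the substitution rate per site per year. The rate of 1.5 × 10^-8 used below is an assumption, given as the value commonly attributed to Koch et al. (2000); the text above does not state the number. The Ksil values in the usage note are hypothetical.

```python
# Assumed synonymous substitution rate (substitutions per site per year).
# Commonly attributed to Koch et al. (2000); not stated in the text above.
ASSUMED_RATE = 1.5e-8


def divergence_time_years(k_sil, rate=ASSUMED_RATE):
    """Time since two alleles stopped recombining, in years.

    Uses T = K / (2r): divergence accumulates along both lineages,
    so the pairwise distance grows at twice the per-lineage rate.
    """
    if k_sil < 0 or rate <= 0:
        raise ValueError("k_sil must be >= 0 and rate must be > 0")
    return k_sil / (2 * rate)


if __name__ == "__main__":
    # Hypothetical silent-site divergence values (illustration only).
    for k_sil in (0.015, 0.030, 0.066):
        mya = divergence_time_years(k_sil) / 1e6
        print(f"Ksil = {k_sil:.3f} -> about {mya:.1f} MYA")
```

Under the assumed rate, silent-site divergences of roughly 0.015 to 0.066 would span the 0.5 to 2.2 MYA range discussed in the text.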
Acknowledgements
We thank Jianping Wang, Jong-Kuk Na, Zdenek Kubat, Wenli
Zhang, Andrea Gschwend and Virginie Lachaise for technical
assistance, and Henrik Albert and Stephanie Whalen for reviewing
the manuscript. This work was supported by a grant from the National
Science Foundation to R.M., Q.Y., P.H.M., J.J. and A.H.P. (award
number DBI-0553417), and a US Department of Agriculture,
Agricultural Research Service Cooperative Agreement (CA 583020-8-134) with the Hawaii Agriculture Research Center.
Supplementary Material
The following supplementary material is available for this article
online:
Figure S1. Physical positions of two MSY BACs and two X BACs.
Figure S2. DNA sequence comparison between X and Yh BAC pairs
in opposite orientations.
Figure S3. Sequencing of mixed X and Yh alleles of gene 5 from the
hermaphrodite genotype.
Table S1. Gene structure and sequence comparison of confirmed
functional genes between X- and MSY-BACs.
This material is available as part of the online article from
http://www.blackwell-synergy.com.
Please note: Blackwell Publishing are not responsible for the content
or functionality of any supplementary materials supplied by the
authors. Any queries (other than missing material) should be
directed to the corresponding author for the article.
References
Atanassov, I., Delichère, C., Filatov, D.A., Charlesworth, D.,
Negrutiu, I. and Monéger, F. (2001) Analysis and evolution of two
functional Y-linked loci in a plant sex chromosome system. Mol.
Biol. Evol. 18, 2162–2168.
Bachtrog, D. (2005) Sex chromosome evolution: molecular aspects
of Y-chromosome degeneration in Drosophila. Genome Res. 15,
1393–1401.
Bachtrog, D. and Charlesworth, B. (2002) Reduced adaptation
on a non-recombining neo-Y chromosome. Nature, 416, 323–
326.
Bergero, R., Forrest, A., Kamau, E. and Charlesworth, D. (2007)
Evolutionary strata on the X chromosomes of the dioecious plant
Silene latifolia: evidence from new sex-linked genes. Genetics,
175, 1945–1954.
Chandrika, U.G., Jansz, E.R., Wickramasinghe, S.M.D.N. and
Warnasuriya, N.D. (2003) Carotenoids in yellow- and red-fleshed papaya (Carica papaya L.). J. Sci. Food Agric. 83,
1279–1282.
Charlesworth, B. (1991) The evolution of sex chromosomes.
Science, 251, 1030–1033.
Delichère, C., Veuskens, J., Hernould, M., Barbacar, N., Mouras, A.,
Negrutiu, I. and Monéger, F. (1999) SlY1, the first active gene
cloned from a plant Y chromosome, encodes a WD-repeat
protein. EMBO J. 18, 4169–4179.
Filatov, D.A. (2005) Substitution rates in a new Silene latifolia
sex-linked gene, SlssX/Y. Mol. Biol. Evol. 22, 402–408.
Filatov, D.A. and Charlesworth, D. (2002) Substitution rates in the X- and Y-linked genes of the plants, Silene latifolia and S. dioica.
Mol. Biol. Evol. 19, 898–907.
Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic
Acids Symp. Ser. 41, 95–98.
Hofmeyr, J.D.J. (1938) Genetic studies of Carica papaya L. S. Afr. J.
Sci. 35, 300–304.
Hofmeyr, J.D.J. (1939) Sex reversal in Carica papaya L. S. Afr. J. Sci.
26, 286–287.
Hofmeyr, J.D.J. (1967) Some genetic breeding aspects of Carica
papaya L. Agron. Trop. 17, 345–351.
Horovitz, S. and Jiménez, H. (1967) Cruzamientos interespecificos e
intergenericos en Caricaceas y sus implicaciones fitotecnias.
Agron. Trop. 17, 353–359.
Koch, M.A., Haubold, B. and Mitchell-Olds, T. (2000) Comparative
evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera
(Brassicaceae). Mol. Biol. Evol. 17, 1483–1498.
Kondo, M., Nanda, I., Hornung, U., Schmid, M. and Schartl, M.
(2004) Evolutionary origin of the medaka Y chromosome. Curr.
Biol. 14, 1664–1669.
Kondo, M., Hornung, U., Nanda, I. et al. (2006) Genomic organization of the sex-determining and adjacent regions of the sex
chromosomes of medaka. Genome Res. 16, 815–826.
Li, W.-H. (1997) Molecular Evolution. Sunderland, MA: Sinauer
Associates, Inc.
Liu, Z., Moore, P.H., Ma, H. et al. (2004) A primitive Y chromosome
in papaya marks incipient sex chromosome evolution. Nature,
427, 348–352.
Luo, M.C., Thomas, C., You, F.M., Hsiao, J., Ouyang, S., Buell, C.R.,
Malandro, M., McGuire, P.E., Anderson, O.D. and Dvorak, J.
(2003) High-throughput fingerprinting of bacterial artificial
chromosomes using the SNaPshot labeling kit and sizing of
restriction fragments by capillary electrophoresis. Genomics, 82,
378–389.
Ma, H., Moore, P.H., Liu, Z., Kim, M.S., Yu, Q., Fitch, M.M.M.,
Sekioka, T., Paterson, A.H. and Ming, R. (2004) High-density
linkage mapping revealed suppression of recombination at the
sex determination locus in papaya. Genetics, 166, 419–436.
Ming, R., Moore, P.H., Zee, F., Abbey, C.A., Ma, H. and Paterson,
A.H. (2001) Construction and characterization of a papaya BAC
library as a foundation for molecular dissection of a tree-fruit
genome. Theor. Appl. Genet. 102, 892–899.
Ming, R., Yu, Q. and Moore, P.H. (2007) Sex determination in
papaya. Semin. Cell. Dev. Biol. 18, 401–408.
Moore, R.C., Kozyreva, O., Lebel-Hardenack, S., Siroky, J., Hobza,
R., Vyskot, B. and Grant, S.R. (2003) Genetic and functional
analysis of DD44, a sex-linked gene from the dioecious plant
Silene latifolia provides clues to early events in sex chromosome
evolution. Genetics, 163, 321–334.
Nei, M. and Gojobori, T. (1986) Simple methods for estimating the
numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426.
Nicolas, M., Marais, G., Hykelova, V., Janousek, B., Laporte, V.,
Vyskot, B., Mouchiroud, D., Negrutiu, I., Charlesworth, D. and
Monéger, F. (2005) A gradual process of recombination restriction
in the evolutionary history of the sex chromosomes in dioecious
plants. PLoS Biol. 3, 47–56.
Peichel, C.L., Ross, J.A., Matson, C.K., Dickson, M., Grimwood, J.,
Schmutz, J., Myers, R.M., Mori, S., Schluter, D. and Kingsley, D.M.
(2004) The master sex-determination locus in threespine sticklebacks is on a nascent Y chromosome. Curr. Biol. 14, 1416–1424.
Rozas, J., Sánchez-DelBarrio, J.C., Messeguer, X. and Rozas, R.
(2003) DnaSP, DNA polymorphism analyses by the coalescent
and other methods. Bioinformatics, 19, 2496–2497.
Sondur, S.N., Manshardt, R.M. and Stiles, J.I. (1996) A genetic
linkage map of papaya based on randomly amplified polymorphic DNA markers. Theor. Appl. Genet. 93, 547–553.
Storey, W.B. (1938) Segregations of sex types in Solo papaya and
their application to the selection of seed. Proc. Am. Soc. Hort. Sci.
35, 83–85.
Storey, W.B. (1953) Genetics of papaya. J. Hered. 44, 70–78.
Storey, W.B. (1976) Papaya. In Evolution of Crop Plants (Simmonds,
N.W., ed). London: Longman, pp. 21–24.
Wikström, N., Savolainen, V. and Chase, M.W. (2001) Evolution of
the angiosperms: calibrating the family tree. Proc. R. Soc. Lond.
(Biol.) 268, 2211–2220.
Yamamoto, H. (1964) Comparison of the carotenoids in yellow- and
red-fleshed carica papayas. Nature, 201, 1049–1050.
Yu, Q., Hou, S., Hobza, R. et al. (2007) Chromosomal location and
gene paucity of the male specific region on papaya Y chromosome. Mol. Genet. Genomics, 278, 177–185.
Zluvova, J., Janousek, B., Negrutiu, I. and Vyskot, B. (2005)
Comparison of the X and Y chromosome organization in Silene
latifolia. Genetics, 170, 1431–1434.