Articles in PresS. Physiol Genomics (May 6, 2003). 10.1152/physiolgenomics.00060.2003
Evolutionary relationships of the Tas2r receptor gene families in mouse and man
Caroline Conte¹, Martin Ebeling², Anne Marcuz¹, Patrick Nef¹, and Pedro J. Andres-Barquin¹
¹Neuroscience and ²Bioinformatics, Pharma Research, F. Hoffmann-La Roche,
Basel 4070, Switzerland
Running head: Evolution of the mouse and human TAS2Rs
Corresponding author:
Pedro J. Andres-Barquin, Pharma Research Basel Discovery - Neuroscience, Bldg. 93/356, F.
Hoffmann-La Roche Ltd., CH 4070 Basel, Switzerland. Phone: 41-61-688 73 29. Fax: 41-61-688
14 48. E-mail: [email protected]
Copyright (c) 2003 by the American Physiological Society.
Abstract
The early molecular events in the perception of bitter taste start with the binding of specific
water-soluble molecules to G protein-coupled receptors (GPCRs) encoded by the Tas2r family of
taste receptor genes. The identification of the complete TAS2R receptor family repertoire in
mouse and a comparative study of the Tas2r gene families in mouse and man might help to better
understand bitter taste perception. We have identified, cloned and characterized 13 new mouse
Tas2r sequences, 9 of which encode putative functional bitter taste receptors. The encoded
proteins are between 293 and 333 amino acids long and share between 18 and 54 percent
sequence identity with other mouse TAS2R proteins. Including the 13 sequences identified, the
mouse Tas2r family contains approximately 30% more genes and 60% fewer pseudogenes than
the human TAS2R family. Sequence and phylogenetic analyses of the proteins encoded by all
mouse and human Tas2r genes indicate that TAS2R proteins present a lower degree of sequence
conservation in mouse than in human and suggest a classification in five groups that may reflect
a specialization in their functional activity to detect bitter compounds. Tas2r genes are organized
in clusters in both mouse and human genomes and an analysis of these clusters and phylogenetic
analyses indicate that the five TAS2R protein groups were present prior to the divergence of the
primate and rodent lineages. However, differences in subsequent evolutionary processes
including local duplications, interchromosomal duplications, divergence and deletions, gave rise
to species-specific sequences and shaped the diversity of the current TAS2R receptor families
during mouse and human evolution.
Sequence data reported in this paper have been submitted to GenBank and assigned the
accession numbers AF532785-AF532793 and AY145467 - AY145470.
Key words: bitter taste receptors, GPCR, primates, rodents.
Introduction
The systems underlying chemical sensation are essential for animal survival. Even single-cell organisms possess receptors that allow them to respond to chemical signals. In most
organisms, the chemical senses play a pivotal role in locating food, discriminating foods from
those that are toxic, motivating food intake, and regulating the aspects of social behaviour that
are necessary for reproduction. The sense of taste bestows the organism with the ability to detect
nutritionally important compounds, including sugars, salts and amino acids, and potentially
harmful substances, including alkaloids and acids. The initial step in taste perception is the
interaction of water-soluble molecules with receptors expressed at the surface of taste receptor
cells. On binding taste molecules, the receptors trigger transduction cascades that activate
synapses and thus cause activation of the nerve fibres (for review, see 13). Mammals taste many
compounds but are believed to distinguish between only five primary tastes: sweet, bitter, sour,
salty and umami (the taste of monosodium glutamate). The tastes sweet, bitter and umami are
believed to be detected by G protein-coupled receptor (GPCR) signalling pathways (6, 32).
A number of taste GPCRs called TAS1Rs, TAS2Rs and Taste-mGluR4 have been
identified and characterized in mammals. The taste cell-derived variant of mGluR4 receptor has
been proposed to function as an umami taste receptor (5). The TAS1R family of taste GPCRs is
thought to function in the perception of sweetness (17, 19, 23, 29), as amino acid sensors (22),
and as umami taste receptors (12). The TAS2R family of taste GPCRs is implicated in the
perception of bitterness (1, 4, 16). Tas2r genes are expressed in taste receptor cells and map to
regions of human and mouse chromosomes that have been related to the ability to sense a variety
of bitter compounds (3, 14, 25). Few TAS2R receptors have been shown to respond to bitter
substances, because Tas2r genes are difficult to express functionally in heterologous cell lines (2,
4). Studies in experimentally tractable model organisms, such as mouse, and the identification of
orthologous relationships between human and mouse Tas2r genes can help to determine the
ligand-binding properties of TAS2R receptors and to translate data from mouse studies into an
understanding of human taste. A few pairs of mouse and human Tas2r orthologs have been
reported to date (1).
The identification and characterization of the entire repertoire of TAS2R receptors are the
basis for studies of receptor-ligand interactions and the understanding of the early molecular
events in bitter taste perception. We previously searched in the virtually completed human
genome sequence and, together with others, identified 28 genes encoding putative functional
bitter taste receptors of the TAS2R family (2, 7). In mouse, the Tas2r gene family is less well
characterized than the human TAS2R family. A total of 28 Tas2r genes were identified prior to
the publication of the draft of the mouse genome sequence (1, 16). To identify the complete
Tas2r family repertoire in mouse and make a comparative analysis of human and mouse Tas2r
gene families, we have carried out homology-based searches in the recently published draft of
the mouse genome sequence (20) and also in available unannotated raw sequences. We have
identified and characterized 9 new full-length sequences encoding putative mouse taste receptors
of the TAS2R family and also 4 sequences that may represent non-functional pseudogenes. Our
almost complete repertoire of Tas2r genes allows us to describe the evolution of these genes and
report new insights into the evolutionary processes that shaped the current Tas2r gene families in
mouse and man.
Materials and Methods
Sequence database mining
A publicly available draft of the mouse genome was downloaded from the Mouse
Genome Sequencing Consortium (MGSC; http://www.ensembl.org/Mus_musculus/Download/).
The draft is based on BAC clones sequenced in the public domain and contains sequences up to
those that were available in February, 2002. The mouse genome is roughly sevenfold covered by
the available sequence, and the assembled sequence is estimated to cover 96% of mouse
euchromatic DNA (http://www.ensembl.org/Mus_musculus/). Databases were prepared from the
raw sequence data. These databases were searched with a collection of known mouse, human,
and other vertebrate, sequences from the public domain to obtain matching segments (HSPs,
high-scoring segment pairs), which often roughly correspond to exons of genes on the
chromosomes. The database build-up and the sequence similarity searches were performed using
the BLAST2 suite of programs available from the National Center for Biotechnology Information
(NCBI) (http://www.ncbi.nlm.nih.gov/) on their publicly accessible ftp site
(ftp://ftp.ncbi.nlm.nih.gov/blast). HSPs found by the same query sequence, and lying in
proximity to each other on a single chromosome, were assembled into complete genes using the
publicly available software GeneWise from the Wise2 package by Ewan Birney at the Sanger
Centre, Hinxton, UK (http://www.sanger.ac.uk:80/Software/Wise2). This software aligns protein
sequences to genomic DNA sequences and reconstructs splice sites at the exon-intron boundaries
that comply with the well-known "GT-AG" rule for the splicing of eukaryotic genes. The
resulting, predicted genes, and their corresponding protein translations, were assembled into two
databases, respectively. Using tools from the publicly available software package HMMER,
version 2.1.1, by Sean Eddy at Washington University in St. Louis, USA
(http://hmmer.wustl.edu/), an alignment of known, publicly available taste receptors was used to
produce a Hidden Markov Model (HMM) characteristic of this family (programs HMMBUILD
and HMMCALIBRATE from the HMMER package). This HMM was then used to search the
database of protein translations of predicted mouse genes mentioned above, using the
HMMSEARCH program from the HMMER package. The top-scoring matches were identified
as potential taste receptors and were further analyzed. TAS2R candidates identified were
compared to known TAS2R sequences in the public databases and also in a database of patented
sequences (GeneSeq database obtained from Derwent Information, London, UK) using the
BLAST2 suite of programs available from the NCBI (http://www.ncbi.nlm.nih.gov/).
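The grouping step that precedes gene assembly (HSPs from the same query lying close together on one chromosome) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `HSP` record, the function name `group_hsps`, and the `max_gap` distance threshold are hypothetical, since the text does not state what distance counted as "proximity". In the study, the grouped segments were then passed to GeneWise for gene reconstruction.

```python
from collections import defaultdict
from typing import NamedTuple


class HSP(NamedTuple):
    """One high-scoring segment pair (hypothetical minimal record)."""
    query: str   # identifier of the query sequence
    chrom: str   # chromosome carrying the hit
    start: int   # hit start on the chromosome
    end: int     # hit end on the chromosome (inclusive)


def group_hsps(hsps, max_gap=10_000):
    """Group HSPs from the same query that lie close together on one chromosome.

    max_gap is an assumed threshold; the paper does not give a value.
    """
    by_key = defaultdict(list)
    for h in hsps:
        by_key[(h.query, h.chrom)].append(h)

    groups = []
    for key in sorted(by_key):
        current = []
        for h in sorted(by_key[key], key=lambda x: x.start):
            # Start a new group when the gap to the previous HSPs is too large.
            if current and h.start - max(x.end for x in current) > max_gap:
                groups.append(current)
                current = []
            current.append(h)
        groups.append(current)
    return groups
```

Each returned group is a candidate locus; a query with hits on two chromosomes, or with two distant hits on one chromosome, yields separate groups.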
Sequence alignments, phylogenetic analysis and sequence logos
Multiple sequence alignments were generated with ClustalW (31). The alignments were
slightly modified to adjust the gap positions by visual inspection. The resulting alignment was
graphically displayed using the PrettyBox program from the Wisconsin Package Version 10.2,
Genetics Computer Group (GCG), Madison, WI. Protein phylogenetic trees are based on the
alignment of 36 mouse and 28 human TAS2R protein sequences. Phylogenetic analyses were
performed using the PHYLIP package (Felsenstein 1993 PHYLIP [Phylogeny Inference
Package] version 3.6a2. Department of Genetics, University of Washington, Seattle). Protein
trees are based on sequence distances derived from the Jones-Taylor-Thornton substitution
matrix (program PROTDIST from PHYLIP). From the original alignment, 1,000 alignments
were obtained using the bootstrap procedure (program SEQBOOT). Distance matrices were
calculated for each of them, and phylogenetic trees were obtained using the Neighbor-Joining
algorithm as implemented in the program NEIGHBOR. The resulting trees were used to derive
bootstrap support values for each of the branch points in the tree. Sequence logos are based on
the alignment of 36 mouse TAS2R protein sequences. The sequence logos were generated using
a web-based program developed by J. Gorodkin
(http://www.cbs.dtu.dk/gorodkin/appl/plogo.html) (10, 30).
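The bootstrap procedure described above resamples alignment columns with replacement to produce replicate alignments of the same length. The following Python sketch illustrates that idea only; it is not the SEQBOOT implementation, and the function names, the dictionary representation of the alignment, and the `seed` parameter are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one replicate: alignment columns resampled with replacement.

    alignment maps sequence names to aligned strings of equal length.
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled column indices are applied to every sequence,
    # so each replicate column is an intact column of the original.
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n=1000, seed=0):
    """Generate n replicate alignments (1,000 were used in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]
```

A distance matrix and a Neighbor-Joining tree would then be computed for each replicate, and the fraction of replicate trees containing a given branch point gives its bootstrap support.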
Genomic PCR and cloning
Tas2r DNA sequences were amplified from mouse genomic DNA by polymerase chain
reaction (PCR) using Taq DNA polymerase (Roche Molecular Biochemicals, Basel,
Switzerland) and oligonucleotide primers designed to amplify the full Tas2r open reading frames
(ORF). PCR amplifications were performed with the primer pairs described in table 1.
PCRs were performed in a total volume of 50 µl containing 100 ng of DNA, 200 nM of
each primer in 50 mM KCl, 10 mM Tris (pH 8.3), 1.5 mM MgCl2, 0.5 mM of each dNTP, and 2
U Taq DNA polymerase. An initial denaturation step at 94°C for 2 min was followed by 30
cycles of denaturation at 94°C for 30 s, annealing at 55°C for 45 s, and extension at 72°C for 1
min on a Tpersonal machine (Biometra, Goettingen, Germany). The last extension step was 10
min. The PCR products obtained were analysed on 1% agarose gels stained with ethidium
bromide. The sequence of the Tas2r genes and pseudogenes was determined by subcloning the
PCR products in pCRII-TOPO (Invitrogen, Paisley, UK). Oligonucleotides used to prime
sequencing were Sp6 f5'-ATTTAGGTGACACTATAG-3' and T7 r5'-TAATACGACTCACTATAGGG-3'.
Double-stranded templates were sequenced by
dideoxynucleotide chain termination by dRhodamine Terminator Cycle Sequencing Ready
Reaction (Perkin Elmer, USA) with an initial denaturation step at 96°C for 2 min followed by 25
cycles of denaturation at 96°C for 30 s, annealing at 50°C for 15 s, and extension at 60°C for 4
min. The samples were loaded on an ABI310 sequence analyser (ABI).
Results
The mouse genome contains 36 Tas2r genes
The TAS2R family of receptors has been shown to play a role in the perception of
bitterness (1, 2, 4, 16). A first search in a partial mouse genome sequence led to the identification
of 26 full-length gene members of the Tas2r family in mouse (1). To better understand the early
events in bitter taste perception and complete the identification of the Tas2r family repertory, we
sought to identify new putative mouse taste receptors belonging to this family. For this purpose,
we have undertaken a bioinformatics homology-based screen of the mouse genome draft for
sequences related to the TAS2R family of taste receptors. In a first step we collected all TAS2R
sequence information available in public databases and in the GeneSeq database of patented
sequences. We found 32 entries of mouse Tas2r related sequences. Three of those entries were
partial sequences, two contained a frameshift each and 27 appeared to be full-length Tas2r genes.
We aligned all known, publicly available, TAS2R receptor sequences and developed an HMM
model characteristic of the TAS2R family to search in a database of protein translations of
predicted mouse genes and in ORFs predicted in unannotated high-throughput genomic
sequences (HTGS) (see Materials and Methods). ORF-encoding protein products that were
shorter than 250 amino acids were not considered as full-length, uninterrupted ORFs. Sequences
sharing more than 98% nucleotide or amino acid identity were considered to be identical,
because they may represent sequencing errors or genetic polymorphism. Sequences containing
one or more disruptions in a full-length ORF were considered as pseudogenes.
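The three screening criteria stated above (a 250-amino-acid minimum length, a 98% identity cutoff for treating sequences as identical, and disruptions marking pseudogenes) can be written as a short Python sketch. The function and constant names are hypothetical, and the order in which the rules are applied is an assumption; the text does not say how a short, disrupted sequence was handled.

```python
MIN_FULL_LENGTH_AA = 250     # shorter products were not considered full-length
IDENTICAL_ABOVE_PCT = 98.0   # above this identity, sequences count as identical


def classify_candidate(protein_length, has_disruption):
    """Classify a candidate Tas2r ORF by the criteria given in the text.

    has_disruption: True if the ORF contains a frameshift or premature stop.
    Checking disruptions before length is an assumption of this sketch.
    """
    if has_disruption:
        return "pseudogene"
    if protein_length < MIN_FULL_LENGTH_AA:
        return "not full-length"
    return "full-length"


def is_same_sequence(identity_pct):
    """True if two sequences share more than 98% identity."""
    return identity_pct > IDENTICAL_ABOVE_PCT
```

For example, under these rules a 310-residue product with an intact reading frame is classified as full-length, while the same sequence with a premature stop codon is classified as a pseudogene.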
We identified in mouse 13 new Tas2r gene sequences that we named Tas2r34 to Tas2r46.
Four of those sequences, Tas2r41, 42, 45 and 46, are pseudogenes as they contain frameshifts
and/or premature stop codons. To experimentally validate these findings, we performed PCR
amplification of mouse genomic DNA. The reactions were primed with oligonucleotides
designed to amplify each full-length gene sequence and pseudogene sequence (Table 1).
Sequencing of the reaction products confirmed the correct identification of the 9 new Tas2r
genes and the four new pseudogenes in the mouse genome (data not shown). Figure 1 presents a
comparison of the predicted amino acid sequences of the identified full-length proteins with
mouse TAS2R5 protein. The predicted proteins have between 293 and 333 amino acids and
share 22 to 33% sequence identity with mouse TAS2R5 and 18 to 54% sequence identity with
other mouse TAS2R proteins (Fig. 1 and data not shown). Sequence analysis of the new proteins
also indicated the presence of seven transmembrane domains and large sequence conservation in
the transmembrane domains and in the intracellular domains.
The TAS2R proteins of mouse present a lower degree of sequence conservation than their human
counterparts
A comparison of the proteins encoded by the 27 intact mouse Tas2r genes found in
databases and by the 9 full-length Tas2r genes we identified reveals a number of conserved
features. Mouse TAS2R proteins are between 293 and 334 amino acids long and contain seven
transmembrane domains and short amino- (1-29 amino acids) and carboxy-termini (8-54 amino
acids). As the mouse TAS2R family of proteins exhibits a high variability in primary structure,
16–77% amino acid identity between its members, we sought to visualize the conservation of
residues between all mouse TAS2R proteins. We visualized the alignment of the full-length
amino acid sequences in a sequence logo (10, 30). As shown in figure 2A, a total of 11 residues
were present in more than 90% of all TAS2Rs. Most of these residues were located in the
transmembrane domains (TM), which have a higher degree of sequence conservation than the
intra-cytoplasmic (IC) and extra-cellular (EC) loops (fig. 2A). The intra-cytoplasmic loops also
shared a considerable degree of sequence conservation between the TAS2R members in
comparison to the extracellular loops, which are very variable. The intra-cytoplasmic loops and
their adjacent transmembrane segments are the predicted sites of G protein interaction and the
distinctive extracellular regions are the predicted regions of ligand binding (1, 15). The overall
high degree of variability between TAS2R proteins precluded the identification of a conserved
consensus sequence.
To compare the degree of sequence conservation between mouse and human TAS2R
proteins, we aligned the 36 mouse protein sequences and 28 human TAS2R protein sequences.
We identified 36 amino acid positions as the most conserved in both mouse and human
sequences (Fig. 2B). These amino acids are located in the intra-cytoplasmic loops and in the
transmembrane segments. Only one position (leucine 198), located in the TM5 adjacent to IC3,
is fully conserved in all mouse and human TAS2R proteins suggesting that this residue is
essential for the function of TAS2R receptors. A single position (leucine 202) is fully conserved
only in the mouse proteins and 3 positions (leucine 51, tryptophan 98 and serine 201) are fully
conserved only in the human proteins. Many of the other positions show a lower degree of
sequence conservation in the mouse proteins than in human (Fig. 2B). Sequence conservation
can also be measured using information theory, where the “information content” of each position
in the sequence is scored on the basis of the distribution of amino acids present, with conserved
positions scoring more than variable positions (30). The total information contents of the mouse
and human proteins are 539.2 and 681.7 bits, respectively, confirming that TAS2R proteins
present a lower degree of sequence conservation in mouse than in human.
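The information-content measure can be illustrated with a simplified Python sketch in which each alignment column scores log2(20) minus the Shannon entropy of its residue distribution, so that a fully conserved column scores about 4.32 bits. The function names are hypothetical, and ignoring gaps and omitting any small-sample correction are assumptions of this sketch; the method of reference 30 may differ in those details, so the sketch is not expected to reproduce the reported totals.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column.

    Gap characters ("-") are ignored, which is a simplifying assumption.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def total_information(aligned_seqs):
    """Sum of per-column information over an alignment (list of equal-length strings)."""
    return sum(column_information(col) for col in zip(*aligned_seqs))
```

A column that is split evenly between two residues scores exactly one bit less than a fully conserved column, which is why a more variable family accumulates a lower total.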
Mouse Tas2r genes are located in chromosome regions that exhibit synteny with human TAS2R
loci
To determine the exact location of all mouse Tas2r family member sequences in the
chromosomes, we mapped all Tas2r sequences longer than 500 bp to the mouse genome
databases using BLAST2. With the exception of Tas2r34 and Tas2r19, located on chromosomes
2 and 15, respectively, all mouse Tas2r genes and pseudogenes are located on chromosome 6
(fig. 3). They are organized in two clusters: a cluster of ten Tas2r sequences spans approximately
10.4 Mb; a second cluster of 29 Tas2r sequences spans approximately 1.2 Mb. Chromosome
localization analysis of mouse and human Tas2r sequences indicated that the distribution in
clusters is very similar in both species and that the clusters 1 and 2 of mouse chromosome 6 are
located in regions of the chromosome that exhibit synteny with TAS2R-rich regions of human
chromosomes 7 and 12, respectively (Fig. 3)
(http://www.ncbi.nlm.nih.gov/Homology/index.html). A number of mouse and human Tas2r
genes located in these regions seem to be orthologs. Most of the Tas2r genes in cluster 1 of
mouse chromosome 6 have putative orthologous genes in human chromosome 7 and many of the
genes in cluster 2 of mouse chromosome 6 have putative orthologous genes in human
chromosome 12 (Fig. 3). Thus, both Tas2r clusters were present when the primate and rodent
lineages diverged and still exist now. As indicated by blue lines in Figure 3, cluster 2 underwent
three expansions in mouse and two expansions in human. These species-specific groups of
sequences could have originated from gene duplications or conversions since the divergence of
those species, or from loss of the orthologous gene(s) (loss from the genome or because the
datasets are incomplete). Also, chromosome localization analysis of mouse and human Tas2r
genes indicated that, unlike on human chromosome 12, Tas2r genes in cluster 2 of mouse
chromosome 6 are distributed in two sub-clusters separated by a region of 700 kb.
To study the evolutionary relationship between all mouse and human TAS2Rs, we used
our alignment of human and mouse TAS2R proteins to generate a phylogenetic tree as described
in Materials and Methods. The alignment also included three human TAS2R sequences,
TAS2R30, 33 and 36, which do not appear in the current draft of the human genome but are
available in the databases. As shown in Figure 4, most of the major clades in the phylogenetic
tree contain both mouse and human sequences suggesting that most TAS2R groups were present
in the common ancestor. A number of mouse proteins appear in the tree close to their putative
human orthologs confirming the results obtained in the analyses of nucleotide sequence identity
and chromosome localization. Analysis of this phylogenetic tree also allowed us to classify
TAS2R proteins into five groups according to two different criteria, the phylogenetic cluster and
the protein identity (Fig. 4). A phylogenetic tree generated from an alignment of all nucleotide
sequences confirmed these results (data not shown). The classification of TAS2R receptors in
groups may reflect a specialization in their functional activity to detect bitter compounds.
The Tas2r gene family expands by local tandem duplications
An alignment of all human and mouse Tas2r genes shows that the Tas2r genes that are
located close to each other in the genome are often very similar in sequence, indicating that
tandem events (duplications and/or gene conversions) are the major evolutionary forces shaping
the diversity of this gene family (Fig. 5). Thirty-one (86.1%) of the 36 full-length mouse Tas2r
genes have their closest mouse relatives in the same cluster. In the human genome, 88% of the
TAS2R genes with available chromosomal localization data have their closest human relatives in
the same cluster. We found some differences between both species regarding the temporal and
spatial pattern of gene duplication. In human, an analysis of the percentage of amino acid
identity between the sequences within each possible pair of TAS2R sequences shows that there
are 10 pairs above the level of 80% (Fig. 5B and D). These turn out to be the ten possible pairs
within a group of five sequences (TAS2R 44, 50, 52, 53 and 54) that map very closely to each
other in one cluster (Fig. 3), indicating that these genes were generated by duplications that arose
intrachromosomally by local tandem events. As shown in Fig. 5B, a second group of 18 points
represents pairs of sequences with identities between 60 and 80%. These pairs of sequences are
all the possible pairs between 3 additional sequences (TAS2R51, 55 and 56), which are also
located in the same cluster (Fig. 3), and the 5 mentioned sequences. These data indicate that a
precursor gene generated a group of four genes, including TAS2R51, 55, 56 and a fourth gene
that is one of the genes from the group of 5 mentioned above. Later during evolution, this fourth
sequence gave rise to four additional sequences, generating the group of 5 most similar genes,
TAS2R44, 50, 52, 53 and 54. From the available data, it is not possible to infer which of those
five genes seeded the cluster. In mouse, all the pairs of TAS2R sequences share less than 80%
identity and 99% of the pairs are below the level of 60%, suggesting that the most recent
duplications arose earlier than those in human (Fig. 5A and C). Also, two TAS2Rs sharing 58%
identity, TAS2R34 and TAS2R43, are encoded by genes located on two different mouse
chromosomes, chromosome 2 and 6 respectively, indicating that an interchromosomal event
occurred in this species.
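The pairwise comparison underlying this analysis can be sketched in Python as follows. The function names and the convention of scoring identity only over columns where neither sequence has a gap are assumptions, since the text does not specify how identity was computed. The counts at the end check the arithmetic quoted above; the decomposition of the 18 pairs into 3 × 5 cross-group pairs plus the 3 pairs among TAS2R51, 55 and 56 is one reading that reproduces the stated number, not something the text spells out.

```python
from itertools import combinations
from math import comb


def percent_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def pairs_in_range(aligned, low, high=100.0):
    """Name pairs whose identity satisfies low < identity <= high.

    aligned maps sequence names to aligned strings of equal length.
    """
    return [
        (n1, n2)
        for (n1, s1), (n2, s2) in combinations(aligned.items(), 2)
        if low < percent_identity(s1, s2) <= high
    ]


# Arithmetic behind the pair counts and percentage quoted in the text.
pairs_within_five = comb(5, 2)                    # 10 pairs above 80% identity
pairs_second_group = 3 * 5 + comb(3, 2)           # 18 pairs (assumed decomposition)
same_cluster_pct = round(100 * 31 / 36, 1)        # 86.1% of mouse genes
```

Calling `pairs_in_range(aligned, 80)` on an alignment of the human proteins would list the candidate recent duplicates, and `pairs_in_range(aligned, 60, 80)` the older ones.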
Discussion
In this study we describe the identification and characterization of 13 new bitter taste
receptor gene sequences in mouse, nine of which encode full-length putative receptors and four
of which are pseudogenes. Including these sequences, the mouse Tas2r family is composed of at
least 36 full-length genes and 6 pseudogenes. Because almost 96% of the mouse genome is
sequenced, our survey should almost complete the Tas2r family repertoire in mouse. Comparison
of all known mouse and human TAS2R receptors reveals that the family of TAS2R receptors is
more divergent in mouse than in human. Also, the TAS2R repertoire is about 30% larger in
mouse than in human (36 genes in mouse and 28 genes in human), which suggests that mouse is
able to detect bitter molecules with very diverse chemical structure. Positive selection to provide
a diverse repertoire of bitter tastant binding receptors in mouse and/or lower selective constraints
on protein sequence in the mouse TAS2R family may account for an increased diversity in the
TAS2R family during mouse evolution. Bitter taste has evolved as a central warning signal
against the ingestion of potentially toxic substances, and rapid evolution of the TAS2R receptors
may be necessary to detect new harmful substances appearing in the environment. As bitter
molecules are very numerous and greatly differ in their chemical structure, it is likely that a large
number of divergent receptors is required to detect them and that selective pressure has
favored evolutionary mechanisms that allow Tas2r genes to evolve into a more diverse repertoire
in mouse than in human. It is also possible that deletions of TAS2R genes in the human lineage
have exacerbated the differences between the two species. The presence in the mouse genome of
Tas2r genes, for example Tas2r22, that are distantly related to all other mouse genes and
lack an ortholog in human (Fig. 4) is consistent with this interpretation.
Comparison of the number of Tas2r pseudogenes in mouse and human reveals that a
smaller number of pseudogenes exist in mouse. Approximately 17% of the mouse sequences
were classified as pseudogenes while in human, the pseudogenes represent approximately 40%
of the TAS2R sequences (7). Again, this marked difference suggests a steady selective pressure
in mouse to maintain a functional Tas2r repertoire, but may also be due in some degree to a
faster elimination of pseudogenes from the mouse genome than from the human genome (11).
The lower comparative number of TAS2R genes and the higher comparative number of
pseudogenes observed in human with respect to mouse may represent a decrease in selective
advantage to detect and respond to gustatory stimuli during human evolution. This decrease in
selective advantage could be explained by the fact that in man, the gustatory function is not as
essential for survival as in mouse. Other families of seven transmembrane chemosensory
receptors having parallelism with the gustatory system from an evolutionary point of view,
including the olfactory receptor (OR) and the vomeronasal receptor (V1R) families, also contain
a high proportion of pseudogenes (8, 18, 27, 34). In the OR and V1R families, a larger number of
pseudogenes and a lower number of genes exist in human in comparison with mouse (8, 9, 24,
26, 27, 33, 34). These observations are consistent with our results showing a higher comparative
number of TAS2R pseudogenes and a lower comparative number of TAS2R genes in human
with respect to mouse. Interestingly, in primates, a study of the OR repertoire also suggests a
parallelism between the increase of the pseudogene rate in this family and a decrease in the
olfactory sensory function during evolution (28).
All mouse Tas2r genes and pseudogenes are located on chromosomes 2, 6, and 15. The
distribution along the chromosomes is not uniform because most of the sequences are organized
in 2 clusters located in chromosome 6. This chromosomal distribution is similar to the
distribution described in human chromosomes, where most TAS2R genes are organized in two
individual clusters located in chromosomes 7 and 12 (1, 7, 16). Comparison of the chromosomal
localization of Tas2r genes in mouse and human chromosomes reveals that cluster 1 of
mouse chromosome 6 is located in a region of the chromosome that exhibits synteny with the
region of human chromosome 7 containing a cluster of TAS2R genes
(http://www.ncbi.nlm.nih.gov/Homology/index.html) (7). Similarly, cluster 2 of mouse
chromosome 6 is located in a region of the chromosome that exhibits synteny with the region of
human chromosome 12 containing the largest cluster of human TAS2R genes. This indicates that
the general arrangement of these gene clusters was established before the divergence of the
primate and rodent lineages. However, a number of species-specific groups of Tas2r genes have
very likely originated from local gene duplication or conversion events since the primate and
rodent lineages diverged. An analysis of the phylogenetic tree of all mouse and human TAS2R
proteins and of the Tas2r gene clusters in mouse and human chromosomes supports this view.
Over half of all Tas2r genes in both species match another Tas2r gene within the same genome
better than one in the genome of the other species. This suggests that the diversity of the Tas2r
gene family has been largely shaped by a “birth-and-death” model, which proposes that new
genes arise by gene duplication, followed by divergence and maintenance of some duplicate
genes, and deletion or accumulation of mutations in other genes (21). The birth-and-death model
has also shaped a variety of gene families with significant sequence diversity including the OR
gene family (33) and the MHC and immunoglobulin gene families (21).
In conclusion, we have identified, cloned and characterized 13 new mouse Tas2r
sequences, 9 of which encode putative functional bitter taste receptors. The finding of these
sequences significantly advances the identification of the complete Tas2r family repertory in
mouse, which includes at present 36 genes. Comparison of the Tas2r gene families in mouse and
man gives insights into the pressures and processes that shaped the diversity of the current
TAS2R families during mouse and human evolution. These findings provide a focus for
continuing studies of the TAS2R family of receptors that should contribute to a better
understanding of the early molecular events in bitter taste.
Acknowledgments
This work was supported by F. Hoffmann-La Roche Ltd. and Givaudan Flavors
Corporation. We thank Clemens Broger, Jay P. Slack, Ping Zhong, and Gonzalo Acuña for
helpful discussions.
Disclosure Statement
We have no conflicts of interest.
References
1. Adler E, Hoon MA, Mueller KL, Chandrashekar J, Ryba NJ, and Zuker CS. A novel
family of mammalian taste receptors. Cell 100: 693-702, 2000.
2. Bufe B, Hofmann T, Krautwurst D, Raguse JD, and Meyerhof W. The human TAS2R16
receptor mediates bitter taste in response to beta-glucopyranosides. Nat Genet 32: 397-401, 2002.
3. Capeless CG, Whitney G, and Azen EA. Chromosome mapping of Soa, a gene influencing
gustatory sensitivity to sucrose octaacetate in mice. Behav Genet 22: 655-663, 1992.
4. Chandrashekar J, Mueller KL, Hoon MA, Adler E, Feng L, Guo W, Zuker CS, and
Ryba NJ. T2Rs function as bitter taste receptors. Cell 100: 703-711, 2000.
5. Chaudhari N, Landin AM, and Roper SD. A metabotropic glutamate receptor variant
functions as a taste receptor. Nat Neurosci 3: 113-119, 2000.
6. Chaudhari N and Roper SD. Molecular and physiological evidence for glutamate (umami)
taste transduction via a G protein-coupled receptor. Ann N Y Acad Sci 855: 398-406,
1998.
7. Conte C, Ebeling M, Marcuz A, Nef P, and Andres-Barquin PJ. Identification and
characterization of human taste receptor genes belonging to the TAS2R family.
Cytogenet Genome Res 98: 45-53, 2002.
8. Giorgi D, Friedman C, Trask BJ, and Rouquier S. Characterization of nonfunctional V1R-like
pheromone receptor sequences in human. Genome Res 10: 1979-1985, 2000.
9. Glusman G, Yanai I, Rubin I, and Lancet D. The complete human olfactory subgenome.
Genome Res 11: 685-702, 2001.
10. Gorodkin J, Heyer LJ, Brunak S, and Stormo GD. Displaying the information contents of
structural RNA alignments: the structure logos. Comput Appl Biosci 13: 583-586, 1997.
11. Graur D, Shuali Y, and Li WH. Deletions in processed pseudogenes accumulate faster in
rodents than in humans. J Mol Evol 28: 279-285, 1989.
12. Li X, Staszewski L, Xu H, Durick K, Zoller M, and Adler E. Human receptors for sweet
and umami taste. Proc Natl Acad Sci U S A 99: 4692-4696, 2002.
13. Lindemann B. Receptors and transduction in taste. Nature 413: 219-225, 2001.
14. Lush IE, Hornigold N, King P, and Stoye JP. The genetics of tasting in mice. VII. Glycine
revisited, and the chromosomal location of Sac and Soa. Genet Res 66: 167-174, 1995.
15. Margolskee RF. Molecular mechanisms of bitter and sweet taste transduction. J Biol Chem
277: 1-4, 2002.
16. Matsunami H, Montmayeur JP, and Buck LB. A family of candidate taste receptors in
human and mouse. Nature 404: 601-604, 2000.
17. Max M, Shanker YG, Huang L, Rong M, Liu Z, Campagne F, Weinstein H, Damak S,
and Margolskee RF. Tas1r3, encoding a new candidate taste receptor, is allelic to the
sweet responsiveness locus Sac. Nat Genet 28: 58-63, 2001.
18. Mombaerts P. The human repertoire of odorant receptor genes and pseudogenes. Annu Rev
Genomics Hum Genet 2: 493-510, 2001.
19. Montmayeur JP, Liberles SD, Matsunami H, and Buck LB. A candidate taste receptor
gene near a sweet taste locus. Nat Neurosci 4: 492-498, 2001.
20. Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of
the mouse genome. Nature 420: 520-562, 2002.
21. Nei M, Gu X, and Sitnikova T. Evolution by the birth-and-death process in multigene
families of the vertebrate immune system. Proc Natl Acad Sci U S A 94: 7799-7806,
1997.
22. Nelson G, Chandrashekar J, Hoon MA, Feng L, Zhao G, Ryba NJ, and Zuker CS. An
amino-acid taste receptor. Nature 416: 199-202, 2002.
23. Nelson G, Hoon MA, Chandrashekar J, Zhang Y, Ryba NJ, and Zuker CS.
Mammalian sweet taste receptors. Cell 106: 381-390, 2001.
24. Pantages E and Dulac C. A novel family of pheromone candidate receptors in mammals.
Neuron 28: 835-845, 2000.
25. Reed DR, Nanthakumar E, North M, Bell C, Bartoshuk LM, and Price RA. Localization
of a gene for bitter-taste perception to human chromosome 5p15. Am J Hum Genet 64:
1478-1480, 1999.
26. Rodriguez I, Greer CA, Mok MY, and Mombaerts P. A putative pheromone receptor gene
expressed in human olfactory mucosa. Nat Genet 26: 18-19, 2000.
27. Rodriguez I, Del Punta K, Rothman A, Ishii T, and Mombaerts P. Multiple new and isolated
families within the mouse superfamily of V1r vomeronasal receptors. Nat Neurosci 5:
134-140, 2002.
28. Rouquier S, Blancher A, and Giorgi D. The olfactory receptor gene repertoire in primates and
mouse: evidence for reduction of the functional fraction in primates. Proc Natl Acad Sci
U S A 97: 2870-2874, 2000.
29. Sainz E, Korley JN, Battey JF, and Sullivan SL. Identification of a novel member of the
T1R family of putative taste receptors. J Neurochem 77: 896-903, 2001.
30. Schneider TD and Stephens RM. Sequence logos: a new way to display consensus
sequences. Nucleic Acids Res 18: 6097-6100, 1990.
31. Thompson JD, Higgins DG, and Gibson TJ. CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific
gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680, 1994.
32. Wong GT, Gannon KS, and Margolskee RF. Transduction of bitter and sweet taste by
gustducin. Nature 381: 796-800, 1996.
33. Young JM, Friedman C, Williams EM, Ross JA, Tonnes-Priddy L, and Trask BJ.
Different evolutionary processes shaped the mouse and human olfactory receptor gene
families. Hum Mol Genet 11: 535-546, 2002.
34. Zhang X and Firestein S. The olfactory receptor gene superfamily of the mouse. Nat Neurosci
5: 124-133, 2002.
TABLE AND FIGURE LEGENDS
Table 1
Oligonucleotide primers used to amplify Tas2r34-Tas2r46 DNA sequences by PCR.
Fig. 1. Identification of 9 new TAS2Rs in mouse. Alignment of the predicted sequences of the 9
mouse TAS2R proteins identified and mouse TAS2R5 protein (1), generated by ClustalW.
Diagram is based on output obtained by the PrettyBox program from the Wisconsin Package
Version 10.2, Genetics Computer Group (GCG), Madison, WI. Horizontal lines indicate the
amino acid sequences corresponding to the predicted transmembrane domains TM1-TM7.
Amino acids are shown in single-letter code and are numbered according to the complete
predicted amino-acid sequences. Black and grey boxes indicate amino acids that are identical
and similar, respectively, to the consensus sequence.
Fig. 2. (A) Sequence conservation between the mouse TAS2R proteins: sequence logos for the
open reading frames of TAS2Rs. The N- and C-terminal sequence stretches are removed to avoid
length heterogeneity; no considerable sequence conservation was found in these regions. The
height of each amino acid symbol is proportional to its frequency of occurrence. The amino acid
sequences corresponding to the predicted transmembrane domains (TM1 to TM7), intracytoplasmic loops (IC1 to IC3) and extracellular loops (EC1 to EC3) are indicated. Asterisks
indicate the highly conserved residues. Green, black, red and blue colors represent, respectively,
uncharged polar (except for glycine and cysteine), non-polar, acidic and basic residues.
Arrowheads indicate the most conserved amino acids in both mouse and human sequences. (B)
Human TAS2Rs are more conserved than mouse TAS2Rs. Frequency of the most common
amino acids in the mouse and human TAS2R proteins. Amino acids are shown in single-letter
code and are numbered according to the mouse TAS2R5 sequence shown in Figure 1.
Frequencies in the mouse protein family are plotted as open circles and frequencies in the human
protein family as closed circles. A total of 36 mouse and 28 human TAS2Rs were evaluated. The
asterisk indicates the only residue that is fully conserved in all mouse and human sequences. The
arrow indicates the only residue that is fully conserved only in the mouse sequences. Arrowheads
indicate the residues that are fully conserved only in the human sequences.
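The per-position frequency of the most common amino acid plotted in Fig. 2B can be sketched as follows. This is a minimal illustration that assumes an already aligned set of sequences with "-" as the gap character; the toy alignment and the function name are hypothetical, and whether gapped sequences were counted in the denominator in the original analysis is not stated, so counting them here is an assumption.

```python
from collections import Counter


def most_common_frequency(aligned_seqs):
    """Return (residue, frequency) for the most common residue in each column.

    Gaps ('-') are never reported as the most common residue, but gapped
    sequences still count in the denominator (an assumption of this sketch).
    Columns that contain only gaps yield (None, 0.0).
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(c for c in column if c != "-")
        if not counts:
            profile.append((None, 0.0))
            continue
        residue, n = counts.most_common(1)[0]
        profile.append((residue, n / len(column)))
    return profile


# Hypothetical toy alignment (not TAS2R data).
toy = ["MLSA", "MLTA", "MISA", "M-SG"]
profile = most_common_frequency(toy)
# Positions (0-based) where one residue is present in every sequence.
conserved = [i for i, (_, freq) in enumerate(profile) if freq == 1.0]
print(profile, conserved)
```

Running the same function separately on the mouse and human alignments would give the two series that the legend describes as open and closed circles.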
Fig. 3. Orthologous relationship between mouse and human Tas2r clusters. The two mouse Tas2r
clusters are located on chromosome 6. Black horizontal lines represent expansions of the mouse
Tas2r gene clusters and green horizontal lines expansions of the human gene clusters on
chromosomes 7 and 12. Mouse genes are ordered within the clusters according to their genomic
positions in the draft produced by the Mouse Genome Sequencing Consortium and human genes
are ordered according to the human genome databases (NCBI build 30). A vertical line indicates
the location of each Tas2r in the expansions. Each Tas2r is named by its corresponding number.
Arrowheads indicate the transcription polarities. Pseudogenes are shown in italics and the newly
identified Tas2r genes and pseudogenes are shown in bold type. Intergenic distances are drawn
to scale as indicated. Gaps in the horizontal lines indicate breaks between genes or groups of
genes. Blue lines indicate Tas2r genes sharing ≥ 65% nucleotide identity between both species.
Red lines indicate putative orthologous genes.
Fig. 4. A phylogenetic tree showing the evolutionary relationship between all full-length mouse
and human TAS2R proteins. The mouse sequences are shown in red and human sequences in
blue. Amino acid sequences were aligned using ClustalW. Further details are described in the
text. Numbers above branches are bootstrap support values derived from 1,000 bootstrap
replicates, with only those above 50% shown. Circles divide the tree into 5 subtrees or groups of
TAS2Rs.
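The resampling step behind bootstrap support values can be sketched in Python as below. This shows only the standard column resampling with replacement and the conversion of a clade count to a percentage; the tree-building step applied to each replicate is omitted, and the function names, toy alignment, and the count of 620 are hypothetical. It is not a description of the software actually used for Fig. 4.

```python
import random


def bootstrap_alignment(aligned_seqs, rng):
    """Build one bootstrap pseudo-alignment by sampling columns with replacement."""
    n_cols = len(aligned_seqs[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in aligned_seqs]


def support_percent(n_replicates_with_clade, n_replicates):
    """Percentage of bootstrap replicates whose tree contains a given clade."""
    return 100.0 * n_replicates_with_clade / n_replicates


# Hypothetical toy alignment (not TAS2R data).
toy = ["MLSAT", "MLTAV", "MISAV"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicate = bootstrap_alignment(toy, rng)

# If a clade appeared in 620 of 1,000 replicate trees (hypothetical count),
# its support would be reported as 62%.
support = support_percent(620, 1000)
print(replicate, support)
```

Under the legend's convention, a branch with support of 50% or less would simply not be labelled.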
Fig. 5. The Tas2r gene family expands by local tandem duplications. Scatterplots of percent
amino acid identity (y-axis) against physical distance (x-axis) for pairs of intact mouse (A) and
human (B) TAS2R sequences in the same cluster. Comparisons in A and B comprise the sequences
predicted from the genes located in cluster 2 of mouse chromosome 6 and in the cluster on
human chromosome 12, respectively. Two subsets of data appear along the x-axis in A because
of the physical distance separating two discrete sub-clusters of genes in cluster 2 of mouse
chromosome 6. Histograms show the distribution
of amino acid identity between the sequences predicted from each of the mouse (C) and human
(D) full-length Tas2r genes and its most similar gene in the same cluster (white bars) or its most
similar gene in another cluster (black bars). The average identity of the best match in the same
cluster is 44.8% in mouse and 55.4% in human, and the average identity of the best match in a
different cluster is 29.2% in mouse and 33.7% in human.
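The quantities plotted in the scatterplots (pairwise amino acid identity against physical distance) can be sketched as follows. The gene names, positions, and sequences are hypothetical, and the identity definition used here (identical residues divided by aligned columns without gaps) is an assumption, since the legend does not specify the denominator.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical residues over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return 100.0 * sum(x == y for x, y in compared) / len(compared)


def identity_vs_distance(genes):
    """genes: dict name -> (start position in bp, aligned protein sequence).

    Returns one (pair, distance in bp, percent identity) tuple per gene pair.
    """
    points = []
    for (n1, (p1, s1)), (n2, (p2, s2)) in combinations(sorted(genes.items()), 2):
        points.append(((n1, n2), abs(p1 - p2), percent_identity(s1, s2)))
    return points


# Hypothetical genes within one cluster (positions and sequences are illustrative).
genes = {
    "geneA": (1000, "MLSAT"),
    "geneB": (6000, "MLSAV"),
    "geneC": (90000, "MIT-V"),
}
points = identity_vs_distance(genes)
for pair, distance, identity in points:
    print(pair, distance, identity)
```

In this toy example the closest pair is also the most similar, which is the pattern expected if new genes arise by local tandem duplication.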
Tas2r    Primer forward                 Primer reverse                  Product size (bp)
Tas2r34  5'-ATGTCTTTCTCACATTCATTC-3'    5'-TTATGATCTGGGAATACAAAG-3'     897
Tas2r35  5'-ATGGGACCCATCATGTCC-3'       5'-TCAGCAGCAGCCCCTCT-3'         966
Tas2r36  5'-ATGAAATCACAGCCAGTGACA-3'    5'-TCAAGGTTTCTTTTCTTTCAGC-3'    984
Tas2r37  5'-ATGAGATTTATGAACAGAACAAG-3'  5'-TTATGAAGCAGAGGGTCCCT-3'      1002
Tas2r38  5'-ATGCTGAGTCTGACTCCTGT-3'     5'-TCAGAGTGTCCTGGGAGGA-3'       996
Tas2r39  5'-ATGGCTCAACCCAGCAAC-3'       5'-TCAGAATCTATTTTGTAAGTAC-3'    960
Tas2r40  5'-ATGAATGCTACTGTGAAGTG-3'     5'-CTAAGGACCTGGGAGTTC-3'        939
Tas2r41  5'-ACATCATGGACTAGGAGAAGA-3'    5'-TTAGGAATCTGAGGATTCTGC-3'     866
Tas2r42  5'-GTAACAGACTGTGGTATTCTC-3'    5'-TTAGGAACCAGAGAATCTTACA-3'    965
Tas2r43  5'-ATGCCCTCCACACCCACA-3'       5'-CTAAAACCTCATCTTCAGGG-3'      882
Tas2r44  5'-ATGGCAATAATTACCACAAATTC-3'  5'-CTACCTTTTAAGGTAAAGATGAA-3'   960
Tas2r45  5'-ATGCAGCATCTTTTAAAGATAAT-3'  5'-TTAGAGACCCAAAGTTTCTAG-3'     965
Tas2r46  5'-GGGTGCTGCTATCCTAGTTA-3'     5'-TTAAAATTGTACAAAAGTATCCTC-3'  970
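A quick consistency check on primer sequences such as those listed in Table 1 can be sketched in Python. It tests whether a forward primer begins at an ATG start codon and whether the coding-strand sequence matched by a reverse primer ends in a stop codon. The function names are introduced here for illustration only; the three primer sequences in the example are taken from the Tas2r34 and Tas2r41 entries.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def starts_at_start_codon(forward_primer):
    """True if the forward primer begins with the ATG start codon."""
    return forward_primer.startswith("ATG")


def ends_at_stop_codon(reverse_primer):
    """True if the coding-strand sequence matched by the reverse primer ends with a stop codon."""
    return reverse_complement(reverse_primer)[-3:] in STOP_CODONS


# Tas2r34 primer pair from the table (without the 5'- and -3' markers).
forward_34 = "ATGTCTTTCTCACATTCATTC"
reverse_34 = "TTATGATCTGGGAATACAAAG"
# Tas2r41 forward primer, which does not begin at an ATG.
forward_41 = "ACATCATGGACTAGGAGAAGA"

print(reverse_complement(reverse_34))
print(starts_at_start_codon(forward_34), ends_at_stop_codon(reverse_34))
print(starts_at_start_codon(forward_41))
```

A forward primer that fails the ATG check is not necessarily an error; it may simply have been designed outside the open reading frame.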