Graduate Theses and Dissertations
Graduate College
2012
Gene mapping of monogenic disorders and
complex diseases via genome wide association
studies
Xia Zhao
Iowa State University
Follow this and additional works at: http://lib.dr.iastate.edu/etd
Part of the Animal Diseases Commons, and the Molecular Biology Commons
Recommended Citation
Zhao, Xia, "Gene mapping of monogenic disorders and complex diseases via genome wide association studies" (2012). Graduate
Theses and Dissertations. Paper 12878.
This Dissertation is brought to you for free and open access by the Graduate College at Digital Repository @ Iowa State University. It has been accepted
for inclusion in Graduate Theses and Dissertations by an authorized administrator of Digital Repository @ Iowa State University. For more
information, please contact [email protected].
Gene mapping of monogenic disorders and complex diseases via genome wide
association studies
by
Xia Zhao
A dissertation submitted to the graduate faculty
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Major: Genetics
Program of Study Committee:
Max Rothschild, Co-major Professor
Dorian Garrick, Co-major Professor
Susan Lamont
N. Matthew Ellinwood
Peng Liu
Iowa State University
Ames, Iowa
2012
Copyright © Xia Zhao, 2012. All rights reserved.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER 1. GENERAL INTRODUCTION
    INTRODUCTION
    RESEARCH OBJECTIVES
    DISSERTATION ORGANIZATION
    LITERATURE REVIEW
    REFERENCES
CHAPTER 2. A NOVEL NONSENSE MUTATION IN THE DMP1 GENE IDENTIFIED
BY A GENOME-WIDE ASSOCIATION STUDY IS RESPONSIBLE FOR INHERITED
RICKETS IN CORRIEDALE SHEEP
    ABSTRACT
    INTRODUCTION
    RESULTS
    DISCUSSION
    MATERIALS AND METHODS
    REFERENCES
CHAPTER 3. A MISSENSE MUTATION IN AGTPBP1 WAS IDENTIFIED IN SHEEP
WITH A LOWER MOTOR NEURON DISEASE
    ABSTRACT
    INTRODUCTION
    METHODS
    RESULTS
    DISCUSSION
    ACKNOWLEDGEMENTS
    REFERENCES
CHAPTER 4. IN A SHAKE OF A LAMB’S TAIL: USING GENOMICS TO UNRAVEL A
CAUSE OF CHONDRODYSPLASIA IN TEXEL SHEEP
    SUMMARY
    INTRODUCTION
    MATERIALS AND METHODS
    RESULTS
    DISCUSSION
    REFERENCES
CHAPTER 5. BAYESIAN INFERENCE IDENTIFIES A CANDIDATE REGION
ASSOCIATED WITH CANINE CRYPTORCHIDISM THAT INCLUDES THE
AMHR2 GENE
    ABSTRACT
    INTRODUCTION
    METHODS
    RESULTS
    DISCUSSION
    ACKNOWLEDGEMENTS
    REFERENCES
CHAPTER 6. GENERAL CONCLUSIONS
    FUTURE RESEARCH
    CONCLUSIONS
ACKNOWLEDGEMENTS
LIST OF FIGURES
CHAPTER 1. GENERAL INTRODUCTION
    Figure 1. Structures involved in testicular descent
CHAPTER 2. A NOVEL NONSENSE MUTATION IN THE DMP1 GENE IDENTIFIED
BY A GENOME-WIDE ASSOCIATION STUDY IS RESPONSIBLE FOR INHERITED
RICKETS IN CORRIEDALE SHEEP
    Figure 1. Corriedale sheep with rickets
    Figure 2. Work flow chart for discovering the mutation locus of inherited rickets in Corriedale sheep
    Figure 3. Mutation analyses
    Figure S1. Complete predicted DMP1 protein sequence alignments among rickets affected sheep, wild type sheep and cow
CHAPTER 3. A MISSENSE MUTATION IN AGTPBP1 WAS IDENTIFIED IN SHEEP
WITH A LOWER MOTOR NEURON DISEASE
    Figure 1. The distribution of identical homozygous fragments in affected lambs
    Figure 2. Work flow chart for discovering the mutation locus of lower motor neuron disease in Romney sheep
    Figure 3. The conserved domain structures of partial protein sequence of AGTPBP1
    Figure S1. Conserved amino acids predicted by “CD-search”
    Figure S2. Characteristics of protein AGTPBP1
    Figure S3. Secondary structure modeling of the AGTPBP1 protein indicates critical amino acid residues
CHAPTER 4. IN A SHAKE OF A LAMB’S TAIL: USING GENOMICS TO UNRAVEL A
CAUSE OF CHONDRODYSPLASIA IN TEXEL SHEEP
    Figure 1. Chondrodysplasia of Texel sheep
    Figure 2. Work flow chart for discovering the mutation locus of chondrodysplasia in Texel sheep
    Figure 3. The topology prediction of transmembrane domains (TMDs) of ovine SLC13A1 protein
    Figure 4. The direct sequencing results of the causative mutation (g.25513delT)
    Figure S1. The pedigree information for the 23 sheep genotyped by ovine SNP chip
    Figure S2. The distribution of homozygous fragments common in affected lambs
    Figure S3. The CNV region display
    Figure S4. The comparison of the SLC13A1 protein sequences among 7 species
CHAPTER 5. BAYESIAN INFERENCE IDENTIFIES A CANDIDATE REGION
ASSOCIATED WITH CANINE CRYPTORCHIDISM THAT INCLUDES THE
AMHR2 GENE
    Figure 1. Re-grouping dogs based on the value of cluster membership estimated from STRUCTURE
    Figure 2. Whole genome association analyses for subgroup 1 comprising 129 mostly USA sourced (60 cases and 69 controls) Siberian Huskies
    Figure S1. Whole genome association analyses for subgroup 2 comprising 41 mixed sourced (26 cases and 15 controls) Siberian Huskies
    Figure S2. Whole genome association analyses for subgroup 3 comprising 35 mostly UK sourced (20 cases and 15 controls) Siberian Huskies
    Figure S3. Whole genome association analyses for a pooled sample of 204 Siberian Huskies in which 105 were diagnosed with cryptorchidism
LIST OF TABLES
CHAPTER 2. A NOVEL NONSENSE MUTATION IN THE DMP1 GENE IDENTIFIED
BY A GENOME-WIDE ASSOCIATION STUDY IS RESPONSIBLE FOR INHERITED
RICKETS IN CORRIEDALE SHEEP
    Table 1. The genotyping results of the causative mutation (R145X)
    Table S1. Primer sequence, annealing temperature, PCR amplicon information and genetic variants identified by sequencing
    Table S2. The list of ovine positional candidate genes based on the bovine reference gene sequences
CHAPTER 3. A MISSENSE MUTATION IN AGTPBP1 WAS IDENTIFIED IN SHEEP
WITH A LOWER MOTOR NEURON DISEASE
    Table 1. The genotyping results of three exonic SNPs
    Table S1. Primer sequences, annealing temperatures, PCR amplicon information and targeted encoding regions of gene AGTPBP1
    Table S2. The list of ovine positional candidate genes based on the bovine reference gene sequences
CHAPTER 4. IN A SHAKE OF A LAMB’S TAIL: USING GENOMICS TO UNRAVEL
A CAUSE OF CHONDRODYSPLASIA IN TEXEL SHEEP
    Table 1. The genotyping results of the causative mutation (g.25513del)
    Table S1. A list of candidate gene symbol, primer sequence, annealing temperature, PCR amplicon information, genetic variants identified by sequencing and accessed template sequence
CHAPTER 5. BAYESIAN INFERENCE IDENTIFIES A CANDIDATE REGION
ASSOCIATED WITH CANINE CRYPTORCHIDISM THAT INCLUDES
THE AMHR2 GENE
    Table 1. The top 1 Mb windows with the high percentage of genetic variance obtained from BayesB analyses in different Siberian Husky subgroups
    Table 2. The results of BayesB analyses on 17 previously reported genes being associated with cryptorchidism
    Table S1. Re-grouping dogs based on the value of cluster membership (k=3)
    Table S2. A list of genes in the putative regions for subgroup 1
    Table S3. A list of genes in the putative regions for subgroup 2 and subgroup 3
    Table S4. A list of genes in the putative regions for all Siberian Huskies (including subgroups 1, 2 and 3)
ABSTRACT
Inherited rickets, lower motor neuron disease and chondrodysplasia are recently reported
genetic disorders that have been identified on sheep farms in New Zealand. Animals with
rickets showed softened and weakened bones, and were characterized by decreased growth
rate, thoracic lordosis and angular limb deformities. Sheep diagnosed with lower motor
neuron disease were characterized by poor muscle tone and progressive weakness from about
one week of age, leading to severe tetraparesis, recumbency and muscle atrophy.
Chondrodysplasia in Texel sheep was characterized by dwarfism and angular deformities of
the forelimbs. Backcross breeding trials indicated that each of these three disorders is likely
inherited as a simple autosomal recessive trait. One complex disease, cryptorchidism, was
found at high prevalence in Siberian Husky dogs. This disease is a congenital disorder
characterized by failure of one or both testicles to descend into the scrotum. To map the gene
loci and thus the mutations involved, genome wide association studies were conducted using the
Illumina OvineSNP50 BeadChip for the three monogenic disorders and the Illumina
CanineHD BeadChip for cryptorchidism. Several mapping strategies were used, including
homozygosity mapping, Bayesian inference and case-control analysis, and these were followed
by fine mapping with a candidate gene approach.
A nonsense mutation, 250C > T, in exon 6 of the dentin matrix protein 1 gene
(DMP1) was revealed as the putative cause of inherited rickets in Corriedale sheep. The
missense mutation c.2909G > C in exon 21 of the ATP/GTP-binding protein 1 gene
(AGTPBP1) was identified as likely responsible for lower motor neuron disease in
Romney sheep. A 1-bp deletion (g.25513delT) in exon 3 of the solute carrier family 13,
member 1 gene (SLC13A1) was discovered as the putative cause of chondrodysplasia in Texel
sheep. In the sheep populations examined, genotypes at these mutations were 100% concordant
with the recessive pattern of inheritance of the three disorders, which indicates a causative
relationship between these mutations and their related monogenic disorders. Gene knockout
mice and studies of similar diseases in humans provide further evidence to support this
conclusion. In the cryptorchidism study, a BayesB approach applied to a subgroup of Siberian
Husky dogs identified a 1-Mb window on dog chromosome CFA27 that is associated with
cryptorchidism and contains the anti-Müllerian hormone type II receptor gene (AMHR2).
Previous human studies support AMHR2 as a very promising functional candidate gene for the
development of cryptorchidism. In conclusion, we have successfully identified putative
causative mutations responsible for the three monogenic disorders in sheep, and also found a
promising candidate gene in a putative region associated with cryptorchidism in dogs. Our
findings could shed light on further investigations of the roles of the genes involved, and the
affected animals could be used as animal models for human diseases with similar
mechanisms.
CHAPTER 1. GENERAL INTRODUCTION
INTRODUCTION
Genetic disorders in humans and animals include illnesses caused by abnormalities in genes
or chromosomes; these abnormalities are typically present at birth. Disorders can be
inherited from the parents or caused by de novo mutations. Disorders
caused by a single mutated gene are called single-gene or monogenic defects. Over 4,000 human
diseases are of this kind (Wikipedia, http://en.wikipedia.org/wiki/Genetic_disorder).
Single-gene disorders are relatively easy to study, especially when a supportive pedigree
exists. In this dissertation, we describe the detection of causative mutations involved in three
sporadically occurring heritable disorders in sheep: inherited rickets, lower motor
neuron disease and chondrodysplasia. These defects all show an inheritance pattern that
reflects a single-gene autosomal recessive basis. Homozygosity mapping was applied to
locate the genomic regions, and then the positional candidate gene approach was used
to search for the causative mutation responsible for each of these disorders.
Genetic disorders are not always monogenic; they can also be complex, multifactorial or
polygenic. Multiple genes and environmental factors
contribute to the development of complex diseases. Compared to monogenic disorders, the
genetic basis underlying complex disorders is not easily deciphered, and different analytical
strategies are needed for diseases of this kind. Here we performed a
whole genome scan for a complex disorder, cryptorchidism, in Siberian Husky dogs.
Cryptorchidism is an abnormality whereby one or both testes fail to descend into the scrotum.
Undescended testes can reduce fertility and increase the risk of testicular tumors. For
certain purebred breeds of dogs, the incidence of cryptorchidism is as high as 14% of the
population. Dogs with cryptorchidism are not accepted by their breed organizations for
either breeding purposes or most dog competitions or dog shows in the United States.
It is essential to exclude affected dogs from breeding and prevent carriers from contributing
to the next generation of a dog breed. In this dissertation, we integrate information from a
genome-wide association study with biological functions and comparative genomics to predict
possible genomic regions and candidate genes that might be the cause of cryptorchidism.
RESEARCH OBJECTIVES
The first goal of this dissertation is to explore the genetic basis of each of the disorders
above. The causative mutations can be used as genetic markers in breeding programs to
identify carriers and avoid at-risk matings, thereby improving animal welfare and decreasing
economic losses. Second, owing to their body size and relatively low cost of maintenance,
sheep with rickets, lower motor neuron disease or chondrodysplasia could be used as animal
models for studies of similar disorders in humans. Third, the results from the dog
cryptorchidism study may provide valuable information for human research on
cryptorchidism.
DISSERTATION ORGANIZATION
This dissertation begins with a literature review (Chapter 1), followed by four
manuscripts published in peer-reviewed journals or prepared for submission. Chapter 2 is a
manuscript that explores the causative mutation responsible for inherited rickets in
Corriedale sheep. Dorian Garrick designed this experiment and helped to uncover the
identical by descent regions. The fine mapping was conducted and the manuscript was
written by Xia Zhao under the direction of Dorian Garrick and Max
Rothschild. Keren Dittmer suggested good candidate genes and contributed substantially
to manuscript preparation. This work has been published in PLoS One 2011;6:e21739.
Chapter 3 is a manuscript that examines the genetic basis of a lower motor neuron
disease that occurred in an extended family of Romney lambs in New Zealand. Dorian
Garrick designed this experiment and helped to uncover the identical by descent regions.
The search for priority candidate genes in the homozygous regions, mutation discovery and
mutation analyses were performed by Xia Zhao. Suneel Onteru contributed initial
sequencing during the screening of positional candidate genes. The manuscript was written
by Xia Zhao under the supervision of Dorian Garrick and Max Rothschild. This work has
been published in Heredity 2012;109:156-62. Chapter 4 is a manuscript that unravels the
cause of chondrodysplasia in Texel sheep by a whole genome scan. The identical by descent
region was located with the help of Dorian Garrick. Fine mapping of positional
candidate genes was carried out by Xia Zhao and Suneel Onteru. The putative
causal mutation was then identified, and the manuscript was written by Xia Zhao under the
supervision of Dorian Garrick and Max Rothschild. This work has been published in a
special issue of Animal Genetics 2012;43 Suppl 1:9-18. Chapter 5 is a manuscript that
examines the associated genomic regions and candidate genes for cryptorchidism in Siberian
Huskies using a whole genome association study built on a foundation of Bayesian inference.
This research was conducted by Xia Zhao under the supervision of Max Rothschild and
Dorian Garrick. Max Rothschild helped to design this experiment. Suneel Onteru and
Mahdi Saatchi conducted the BayesB analyses. The manuscript was written by Xia Zhao
under the supervision of Max Rothschild and Dorian Garrick. Chapter 6 summarizes the
conclusions drawn from Chapters 2, 3, 4 and 5.
LITERATURE REVIEW
1. Current Trends in Characterizing The Underlying Basis of Genetic
Disorders
The completion of whole genome sequences for humans and many animal species, and the
subsequent generation of widespread markers of genetic variation, have dramatically improved
the ability of investigators to map gene loci associated with monogenic or
polygenic diseases. Monogenic disorders are rare at a population level and result from a
single mutated gene. These disorders typically follow Mendelian patterns of transmission
(autosomal dominant, autosomal recessive, X-linked dominant, X-linked recessive) or
non-Mendelian patterns of transmission (Y-linked, and mitochondrial or maternally inherited).
As a result, monogenic disorders are easily studied when supportive pedigrees with
many affected members are available. As of September 21, 2012, the database OMIM (Online
Mendelian Inheritance in Man, http://www.ncbi.nlm.nih.gov/Omim/mimstats.html) reported a total
of 3,571 human phenotypes with a known molecular basis. The database OMIA (Online
Mendelian Inheritance in Animals, http://omia.angis.org.au/home/) also lists 473
Mendelian traits with a known key mutation in animals, and 1,218 potential animal
models for human diseases. The two databases overlap where similar disease phenotypes
have been characterized, and this helps to establish the basis for comparative
genomics to facilitate further disease studies in humans or animal species.
More recently, researchers have shifted their focus from rare monogenic disorders to more
common polygenic disorders. Unlike monogenic disorders, polygenic disorders do not usually
follow a Mendelian inheritance pattern, and appear to be caused by an unknown number of genes.
In some complex diseases, such as cardiovascular defects, diabetes and several cancers,
only a small fraction of cases have been found to result from a single
mutant gene transmitted with Mendelian inheritance (Scheuner et al., 2004). The effects of
multiple genes and environmental factors might together contribute to the development of
these kinds of diseases. The genetic mechanisms of many complex diseases are still unclear,
and studying the genetic basis of complex disorders is more difficult than studying that of
monogenic diseases. Genome wide association studies (GWAS) represent a major tool for unraveling
complex traits (Stranger et al., 2011). Linkage mapping can also be used in studies of complex
multifactorial disease. Additionally, studies of monogenic diseases will contribute greatly to
knowledge of polygenic forms of human diseases (Antonarakis and Beckmann, 2006). The
following section describes the basic principles of several mapping methods and their
applications for uncovering mutations responsible for monogenic disorders and complex
diseases.
2. Gene Mapping Strategy
Understanding the genetic basis of human diseases was extremely challenging for biomedical
researchers before the 20th century. Great progress has been made with the
development of novel genetic tools and technologies, such as single nucleotide
polymorphism (SNP) chips, microarrays and next generation sequencing (NGS). Several
gene mapping strategies have been widely applied in studies of human and animal
diseases, including linkage mapping, GWAS, homozygosity mapping, and whole genome
sequencing.
2.1 Linkage Mapping
Linkage mapping was widely applied as a traditional mapping strategy for qualitative and
quantitative trait loci (QTL) mapping over the last few decades. This mapping strategy is
based on the principle that linked genes lie close together on the same chromosome and tend
to be inherited together, whereas two loci located on different chromosomes, or more than
50 centimorgans (cM) apart on the same chromosome, are co-inherited in one meiotic event with
probability 0.5. Accordingly, markers that are co-inherited with a given phenotype with a
probability much larger than 0.5 help to locate phenotype related genes or QTL regions.
Linkage mapping is usually conducted in families that can provide informative meioses for
linkage detection. A meiosis is informative for linkage when one can identify whether or not
the resulting gamete is a recombinant in a particular region.
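The comparison against a co-inheritance probability of 0.5 is commonly quantified with a LOD score, the log10 ratio of the likelihood of the observed meioses at a recombination fraction theta to their likelihood under free recombination (theta = 0.5). The LOD score is not described in this review; the Python sketch below is an illustration only. It assumes phase-known, fully informative meioses, and the function names are hypothetical.

```python
import math


def lod_score(recombinants: int, non_recombinants: int, theta: float) -> float:
    """Return log10 of L(theta) / L(0.5) for phase-known informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if theta == 0.0 and recombinants > 0:
        # Any observed recombinant is impossible under complete linkage.
        return float("-inf")

    def x_log10_y(x: int, y: float) -> float:
        # Treat 0 * log10(0) as 0 so that theta = 0 with no recombinants works.
        return 0.0 if x == 0 else x * math.log10(y)

    n = recombinants + non_recombinants
    log_l_theta = x_log10_y(recombinants, theta) + x_log10_y(non_recombinants, 1.0 - theta)
    log_l_null = n * math.log10(0.5)  # unlinked loci: co-inheritance probability 0.5
    return log_l_theta - log_l_null


def max_lod(recombinants: int, non_recombinants: int) -> tuple[float, float]:
    """Return the maximum-likelihood theta (capped at 0.5) and its LOD score."""
    n = recombinants + non_recombinants
    if n == 0:
        raise ValueError("at least one informative meiosis is required")
    theta_hat = min(recombinants / n, 0.5)
    return theta_hat, lod_score(recombinants, non_recombinants, theta_hat)


if __name__ == "__main__":
    theta_hat, lod = max_lod(recombinants=2, non_recombinants=18)
    print(f"theta = {theta_hat:.2f}, LOD = {lod:.2f}")
```

By convention, a LOD score of 3 or more is usually taken as evidence for linkage; real pedigrees with unknown phase or missing genotypes require dedicated linkage software rather than this closed-form count.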
Different types of markers have been used for linkage map construction. Restriction
fragment length polymorphisms (RFLPs) are differences between samples of homologous
DNA molecules that arise from differing locations of restriction enzyme sites. These
differences can be caused by nucleotide substitutions,
rearrangements, insertions, or deletions (Chang et al., 1988). Amplified fragment length
polymorphisms (AFLPs) are improved markers produced by ligating adaptors to
the ends of digested restriction fragments, which are then amplified by polymerase chain
reaction (PCR) (Tan et al., 2001). These two types of markers (RFLPs and AFLPs) are now used
less often for linkage mapping than simple sequence repeats (SSRs, or microsatellites) and
SNPs (Ball et al., 2010). Microsatellites are repeating sequences
of 2-6 base pairs of DNA. Their advantages for linkage mapping include being PCR-based,
highly reproducible and polymorphic, generally co-dominant, and abundant in animal and
plant genomes (Miao et al., 2005). Single nucleotide polymorphisms are single nucleotide
variations in a DNA sequence. A high-resolution linkage map built from high density SNP
markers facilitates fine mapping of QTLs and enables the identification of genomic regions
associated with high recombination rates (Groenen et al., 2009). More recently, a novel
genetic marker called restriction-site associated DNA (RAD) was developed. This kind of
marker is a short fragment of DNA adjacent to each side of a particular restriction enzyme
recognition site (Miller et al., 2007). With next-generation sequencing as the genotyping
platform, RAD markers have been rapidly adopted for linkage map construction in many
species, because they yield a high-density genetic map and, at the same time, enough
sequence information for mutation analysis (Baird et al., 2008; Wang et
al., 2012).
Linkage mapping at the genome wide scale using commercial SNP chips
has been widely applied to many traits of interest. The density of the SNP chip, haplotype
frequencies in the population and the amount of linkage disequilibrium (LD) in the genome
strongly affect the ability to detect genomic regions harboring the real causative
mutations (Teo et al., 2010). Alleles that are in non-random association are said to be in LD.
Linkage disequilibrium reflects the co-inheritance of alleles at different loci. It
can result from physical linkage of genes or from selection, mutation, migration,
crossing or genetic drift. Owing to their longer LD structure and artificial selection, livestock
are preferred study populations for linkage mapping of many economically important traits
such as reproduction traits, disease resistance, behavior traits, lactation (milk production) and
meat quality (Georges, 2007; Rosendo et al., 2012; Heifetz et al., 2009; Glenske et al., 2011;
Marques et al., 2011; Nones et al., 2012). On the other hand, mapping gene loci using
linkage data is extremely inefficient in humans, because families typically have too few
progeny to estimate map distances reliably, traits are difficult to follow for many
generations, and the distances between known genes are large compared to livestock
populations (Griffiths et al., 2000).
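The notion of non-random association between alleles can be made concrete with the standard pairwise LD measures D, D' and r². These measures are not defined in this review; the Python sketch below is an illustration that assumes phased haplotype counts at two biallelic loci (alleles A/a and B/b), and the function name is hypothetical.

```python
def ld_statistics(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> dict:
    """Compute D, D' and r^2 from counts of the four two-locus haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")

    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2
    # D is the excess of the AB haplotype over its expectation under independence.
    d = n_AB / total - p_A * p_B

    variance_term = p_A * (1 - p_A) * p_B * (1 - p_B)
    # r^2 is undefined when either locus is monomorphic.
    r_squared = d * d / variance_term if variance_term > 0 else float("nan")

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")

    return {"D": d, "D_prime": d_prime, "r2": r_squared}


if __name__ == "__main__":
    # Hypothetical haplotype counts, for illustration only.
    print(ld_statistics(n_AB=40, n_Ab=10, n_aB=10, n_ab=40))
```

Values of r² close to 1 mean that a genotyped marker is a good proxy for an ungenotyped causal variant, which is why the extent of LD governs how dense a SNP chip must be.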
2.2 Homozygosity Mapping
Homozygosity mapping was first proposed in 1987 (Lander et al., 1987) as a powerful
method of mapping genes underlying recessive traits in humans. The approach assumes
that, in an affected individual, the homozygous genotype at a recessive disease mutation is
“identical by descent (IBD)”: the mutant allele was transmitted from a common ancestor to
the affected child through both parents.
The chromosomal region surrounding the mutated locus is therefore likely to be homozygous by
descent in closely related affected offspring, and this short segment of homozygosity can be
detected in the affected group but not in carriers. This mapping strategy has been used
successfully in both human and animal recessive diseases (Lander et al., 1987; Drielsma et al.,
2012; Charlier et al., 2008). Traditional homozygosity mapping requires suitable consanguineous
families with multiple affected offspring. However, the limited availability of suitable
families has hampered the discovery of genes for many diseases. Moreover, very closely
related family members share large homozygous segments that contain too many candidate
genes, which hinders efficient identification of causative genes. Applying homozygosity
mapping to single affected individuals from outbred populations may aid in the discovery of
short candidate chromosome segments and accelerate the identification of novel disease
genes (Hildebrandt et al., 2009).
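The search for a homozygous segment shared by affected animals but absent from carriers can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the studies cited above: it assumes genotypes are coded 0, 1 or 2 (copies of one allele) at SNPs listed in map order, ignores missing calls and genotyping errors, and uses hypothetical function and parameter names.

```python
from typing import List, Sequence, Tuple


def shared_homozygous_runs(
    affected: Sequence[Sequence[int]],
    controls: Sequence[Sequence[int]] = (),
    min_snps: int = 3,
) -> List[Tuple[int, int]]:
    """Return (first, last) SNP indices of runs where every affected animal
    carries the same homozygous genotype and no control matches the whole run."""
    if not affected:
        raise ValueError("at least one affected animal is required")
    reference = affected[0]
    n_snps = len(reference)

    # A SNP is 'shared' when all affected animals are homozygous for the same allele.
    shared = [
        reference[i] in (0, 2) and all(animal[i] == reference[i] for animal in affected)
        for i in range(n_snps)
    ]

    # Collect maximal runs of consecutive shared SNPs; the sentinel closes the last run.
    runs: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(shared + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_snps:
                runs.append((start, i - 1))
            start = None

    def control_matches(run: Tuple[int, int]) -> bool:
        first, last = run
        return any(
            all(control[i] == reference[i] for i in range(first, last + 1))
            for control in controls
        )

    # Discard runs that a carrier or unaffected animal also carries in full.
    return [run for run in runs if not control_matches(run)]


if __name__ == "__main__":
    # Hypothetical genotypes for three affected lambs and one carrier parent.
    lambs = [
        [0, 1, 2, 2, 0, 2, 1, 0],
        [0, 2, 2, 2, 0, 2, 1, 0],
        [1, 2, 2, 2, 0, 2, 2, 0],
    ]
    carrier = [[0, 1, 1, 2, 0, 2, 1, 0]]
    print(shared_homozygous_runs(lambs, carrier, min_snps=3))
```

In practice the run length would be expressed in megabases rather than SNP counts, and a small number of discordant calls would be tolerated to allow for genotyping error.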
For many years, microsatellite markers were used for homozygosity mapping.
With the development of high throughput genotyping platforms such as array-based genotyping,
this mapping approach now relies mainly on data generated by commercially available chips
with hundreds of thousands of SNPs. The decreasing cost of NGS has also encouraged
researchers to attempt homozygosity mapping with sequencing data, for example
exome sequencing data from a cohort of Crohn’s disease cases. The new algorithm
implemented in that study detects IBD from exome sequencing data by finding small
slices of matching alleles between pairs of individuals and then extending them into full IBD
segments. However, the accuracy of this method in detecting the causative
regions is low (Zhuang et al., 2012).
There are several freely available online computational tools for homozygosity mapping to
facilitate the analysis of large datasets for gene discovery, for example HomozygosityMapper
(Seelow et al., 2009, http://www.homozygositymapper.org/), IBDFinder (Carr et al., 2009,
http://www.insilicase.co.uk/Desktop/Ibdfinder.aspx), and KinSNP (Amir et al., 2010,
http://bioinfo.bgu.ac.il/bsu/software/KinSNP/).
2.3 Genome Wide Association Study
Genome wide association studies have recently become a powerful tool for
investigating the genetic basis of common diseases such as Crohn’s disease (Barrett et al.,
2008), type 2 diabetes (Zeggini et al., 2008) and obesity (Wang et al., 2011). This approach
searches for associations between DNA sequence variants (SNPs) across the whole genome
and phenotypes of interest. In an attempt to pinpoint genes that contribute to the etiology of
a disease, scientists use GWAS to determine whether one variant of a SNP is statistically more
common in individuals with a given phenotype than in those without it. This mapping
strategy became feasible for human diseases after the completion of the Human Genome
Project in 2003 and the HapMap in 2005. The map of genetic variants and a range of new
technologies support the accurate analysis of samples for genetic variations contributing to the
onset of a disease. However, the associated variants themselves may not be the causative
mutations, but may just be in LD with the actual causal genetic variants. A genome wide
association analysis is therefore usually followed by fine mapping with a larger number of
recombinant individuals to narrow the region of interest, and may include targeted
sequencing of genome libraries prepared for the candidate genomic regions. Genome wide
association studies can be confounded by population stratification caused by systematic
ancestry differences between cases and controls (Price et al., 2010). Population stratification
produces spurious associations when allele frequencies differ between subpopulations
(Campbell et al., 2005). Therefore, correcting for stratification is critical in performing
GWAS.
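The basic case-control comparison at a single SNP can be illustrated with an allelic chi-square test, and one simple check for stratification is the genomic inflation factor (the median test statistic across SNPs divided by its expected value under no association). Both are standard techniques but neither is specified in this review, so the Python sketch below is an assumption-laden illustration: genotype counts are hypothetical, function names are invented, and no claim is made that the studies cited used this test.

```python
import math
import statistics
from typing import Sequence, Tuple

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def allelic_chi_square(
    case_genotypes: Tuple[int, int, int],
    control_genotypes: Tuple[int, int, int],
) -> Tuple[float, float]:
    """Allelic 2x2 chi-square test from genotype counts (n_AA, n_AB, n_BB).

    Returns the chi-square statistic (1 df) and its p-value.
    """
    def allele_counts(genotypes: Tuple[int, int, int]) -> Tuple[int, int]:
        n_aa, n_ab, n_bb = genotypes
        return 2 * n_aa + n_ab, 2 * n_bb + n_ab

    a, b = allele_counts(case_genotypes)      # A and B alleles in cases
    c, d = allele_counts(control_genotypes)   # A and B alleles in controls
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0, 1.0  # monomorphic SNP or empty group: no evidence either way

    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of chi-square with 1 df, written with the error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def genomic_inflation(chi2_values: Sequence[float]) -> float:
    """Genomic inflation factor: values well above 1 suggest stratification."""
    return statistics.median(chi2_values) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    chi2, p = allelic_chi_square((10, 20, 20), (20, 20, 10))
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because hundreds of thousands of SNPs are tested, the per-SNP p-value must be compared against a threshold corrected for multiple testing, and methods such as principal component adjustment or mixed models are typically preferred over this simple inflation check when stratification is present.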
2.4 Gene Hunting for Diseases with Whole Genome Sequencing
With the development of novel biomedical technologies and dramatically decreased prices
for sequencing, whole genome sequencing is more affordable than in the past. Gene hunting
with whole genome sequencing is a modern approach for revealing the
underlying causal gene locus for both Mendelian and complex diseases.
This approach avoids the need for routine molecular analysis in the lab, which is often labor
intensive and time consuming, and effectively provides genetic information for doctors to
counsel at-risk patients. Whole genome sequencing may also quickly allow suitable patients
the option of gene therapy. For example, whole genome sequencing could greatly advance the
understanding of disease-specific genetic alterations associated with cancers,
especially through the detection of structural variation (Mardis &
Wilson, 2009). Artificial selection in livestock will also be easier to conduct when genetic
markers are quickly identified by whole genome sequencing (Larkin et al., 2012).
Phenotypes of Mendelian diseases might not be genetically homogeneous. The type
of causal mutation, such as point mutations or segmental variations, may vary between affected
individuals with the same syndrome. Additionally, different patterns of inheritance can be
involved in the same phenotype. For instance, Charcot-Marie-Tooth disease (CMT) is an
inherited peripheral neural disease that can be autosomal dominant, autosomal
recessive, or X-linked, and both SNPs and copy number variants confer susceptibility
to CMT (Lupski et al., 2010). Different genetic variants in the same gene locus might cause
similar phenotypes. For instance, cystic fibrosis (CF) is a recessive inherited disease. Over
1,000 mutations in the cystic fibrosis transmembrane conductance regulator
gene (CFTR) have been found to be associated with CF (Jaffe and Bush, 2001). Even screening
all known mutation loci does not always reveal the causative mutation for a new CF patient.
To detect the genetic basis underlying these genetically heterogeneous
or polymorphic disorders, performing whole genome sequencing within individual families
who have well characterized pathologic records will be quick and efficient (Lupski et al.,
2010).
Most complex diseases result from the combined small effects of multiple genes.
Whole genome sequencing is a powerful tool for mapping novel genes for complex diseases
(Casals et al., 2012). It can detect alterations across the entire
genome and provide new insights into mutational spectra. For example, a whole genome
sequencing analysis of a group of hepatocellular carcinoma (HCC) patients revealed the
influence of etiological background on somatic mutation patterns, and also identified
recurrent mutations in chromatin regulators in the development of HCC (Fujimoto et al.,
2012).
3. Gene Mapping of Genetic Diseases
To illustrate the application of different mapping strategies for detecting the genetic
components underlying heritable diseases, gene mapping of three monogenic disorders
(inherited rickets, chondrodysplasia, and lower motor neuron disease) and one
polygenic complex disease (cryptorchidism) is reviewed here.
3.1 Gene Mapping of Inherited Rickets
Rickets is a metabolic bone disease due to nutritional deficiencies of vitamin D, phosphorus,
magnesium and/or calcium. The lack of those micronutrients leads to defective
mineralization of cartilage at sites of endochondral ossification. The weakened or softened
bones often fracture and become deformed (Fitch, 1943; Dittmer et al., 2009;
Ozkan, 2010). Inherited rickets is an inborn error of metabolism, resulting from mutations
that disrupt normal bone metabolism. Several different forms of inherited rickets have
been described, including X-linked hypophosphatemic rickets, autosomal dominant
hypophosphatemic rickets (ADHR), autosomal recessive hypophosphatemic rickets (ARHR),
hereditary hypophosphatemic rickets with hypercalciuria (HHRH) and two types of vitamin
D-dependent rickets (VDDR-I and VDDR-II).
X-linked hypophosphatemic rickets is commonly known as HYP or XLH; the acronym HYP is
used in this review. Based on pedigree analysis, HYP was considered a dominant disorder
and is the most common form of familial hypophosphatemic rickets. To map the causative
locus, a multilocus linkage analysis using a series of X chromosome markers was conducted
in eleven human families affected by HYP, and it localized the disorder to a region on the
X chromosome (Read et al., 1986). A laboratory mouse mutation (Hyp) had been identified
on a homologous segment of the mouse X chromosome, and the mice exhibited symptoms
similar to human HYP (Eicher et al., 1976). Using a positional cloning approach, the HYP
consortium successfully isolated a new gene named phosphate regulating gene with
homologies to endopeptidases on the X chromosome (PHEX; MIM 300550). The causative
mutations in HYP patients were then determined (HYP consortium, 1995).
Similarly, genome-wide multilocus linkage mapping was performed for ADHR in a large
family and placed the gene responsible for this disorder within an 18-cM interval on human
chromosome 12p13 (Econs et al., 1997). Combining the results from this large family with
those from another, smaller ADHR pedigree narrowed the disease locus to a 1.5-Mb region.
Subsequent direct sequencing of the coding and promoter regions of the positional
candidate genes detected missense mutations in the fibroblast growth factor 23 gene
(FGF23; MIM 605380), a member of the FGF family (ADHR consortium, 2000).
Gene mapping of ARHR was carried out using a series of mapping tools. In one study,
homozygosity mapping using a SNP array identified a 4.6 Mb candidate region on human
chromosome 4q21. Mutation screening by direct sequencing found loss-of-function
mutations in a plausible positional candidate gene called DMP1 (dentin matrix protein 1;
MIM 600980). The elevated level of the phosphaturic protein FGF23 suggested that
dysfunctional DMP1 mis-regulated the expression of FGF23, resulting in ARHR
(Lorenz-Depierux et al., 2010). Similar methods were used in another study of human ARHR
patients, but mutations were found in a different causative gene called ecto-nucleotide
pyrophosphatase/phosphodiesterase 1 (ENPP1). Biochemical experiments confirmed that the
inactivating mutations in ENPP1 reduce enzymatic activity in hypophosphatemic patients
but may not affect renal phosphate (Pi) reabsorption (Levy-Litan et al., 2010).
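The idea behind homozygosity mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the cited studies: the function name, the genotype encoding (two-letter strings such as "AA" or "AG"), the marker positions and the `min_snps` threshold are all hypothetical. The sketch scans ordered SNP genotypes from affected individuals and reports runs of consecutive markers at which every individual is homozygous for the same allele.

```python
def shared_homozygous_runs(genotypes, positions, min_snps=3):
    """Return (start_pos, end_pos, n_snps) for runs of consecutive markers at
    which all individuals carry the same homozygous genotype.

    genotypes: one list per affected individual, each holding two-letter
               genotype calls ordered by map position (hypothetical encoding).
    positions: marker positions in the same order as the genotype calls.
    """
    shared = []
    for i in range(len(positions)):
        calls = {individual[i] for individual in genotypes}
        if len(calls) == 1:
            call = next(iter(calls))
            # Homozygous when both alleles of the single shared call match.
            shared.append(call[0] == call[1])
        else:
            shared.append(False)

    runs = []
    run_start = None
    # A trailing False closes a run that extends to the last marker.
    for i, is_shared in enumerate(shared + [False]):
        if is_shared and run_start is None:
            run_start = i
        elif not is_shared and run_start is not None:
            if i - run_start >= min_snps:
                runs.append((positions[run_start], positions[i - 1], i - run_start))
            run_start = None
    return runs


# Made-up genotypes for two affected individuals at six markers.
positions = [100, 200, 300, 400, 500, 600]
affected = [
    ["AA", "AG", "CC", "TT", "GG", "AC"],
    ["AA", "GG", "CC", "TT", "GG", "CC"],
]
runs = shared_homozygous_runs(affected, positions)
```

In practice the minimum run length would be set in physical or genetic distance and would tolerate genotyping errors; the fixed marker count here is a simplification.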
Another form of autosomal recessive hypophosphatemic rickets is HHRH (Tieder et al.,
1985). The distinctive characteristic of HHRH compared to other forms of
hypophosphatemic rickets is hypercalciuria due to increased serum 1,25-dihydroxyvitamin D
levels and increased intestinal calcium absorption. A genome-wide linkage scan with
homozygosity mapping, followed by the candidate gene method, was performed to map a
single nucleotide deletion in the solute carrier family 34 (sodium phosphate), member 3 gene
(SLC34A3; MIM 241530). This gene encodes the renal sodium-phosphate cotransporter
NaPi-IIc (Bergwitz et al., 2006).
Two other recessively inherited forms of rickets in humans are due to disruptions of either
the synthesis or metabolism of vitamin D. Vitamin D-dependent rickets, type I (VDDR-I) is
caused by mutations in the CYP27B1 gene, which encodes vitamin D 1-alpha-hydroxylase
(Fu et al., 1997). The mutation in this gene was initially mapped by linkage mapping in
large pedigrees, followed by the candidate gene method, which involved sequencing for
causative genetic variants (Labuda et al., 1990; Fu et al., 1997). Vitamin D-dependent
rickets, type II (VDDR-II), is also called hereditary vitamin D-resistant rickets (HVDRR)
(MIM 277440). Shortly after the successful cloning of a full-length cDNA sequence of the
vitamin D receptor gene (VDR) in 1988, researchers discovered many mutations in VDR that
were found to be associated with HVDRR (Malloy et al., 1999).
3.2 Gene Mapping of Lower Motor Neuron Disease
Motor neuron diseases (MND) are a group of progressive neurological disorders involving
selective degeneration of upper motor neurons (UMN) and/or lower motor neurons
(LMN). Messages from neurons in the brain (UMN) are initially transmitted to neurons in
the brain stem and spinal cord (LMN) and are then transmitted from LMN to particular
muscles to produce movements such as walking, chewing or blinking. When the signals
between LMN and the muscle are disrupted, the condition is commonly called LMN disease.
The incidence of MND varies between different ethnic human populations from 0.6 to 2.4
individuals per 100,000 person-years (Lee, 2010). Common characteristics of MND are
muscle weakness and/or spastic paralysis. Symptoms for UMN degeneration include spastic
tone, hyperreflexia and upgoing plantar reflex. Signs of LMN degeneration include muscle
atrophy, fasciculations and weakness (Lee,
2010). Different types of MND have been classified based on their time of onset, type of
motor neuron involvement and detailed clinical features. These MND include amyotrophic
lateral sclerosis (ALS), hereditary spastic paraplegia (HSP), primary lateral sclerosis (PLS),
spinal bulbar muscular atrophy (SBMA), spinal muscular atrophy (SMA) and lethal
congenital contracture syndrome (LCCS). Among these diseases, both UMN and LMN are
involved in the development of ALS and LCCS (Dion et al., 2009).
Previous evidence has shown that genetic variants within more than 30 genes might be
responsible for MND (Dion et al., 2009). Amyotrophic lateral sclerosis is a fatal
degenerative disorder of motor neurons. Linkage mapping using single-strand
conformational polymorphisms (SSCP) was applied to discover markers near the
Cu/Zn-binding superoxide dismutase (SOD1) gene that were in significant linkage with ALS.
Mutations in this gene were identified by direct sequencing and explain about 20% of
patients with familial ALS (Rosen et al., 1993). Hereditary spastic paraplegia with an
autosomal dominant inheritance pattern is characterized by progressive spasticity of the
lower limbs. Pathogenic mutations in the spastin gene (SPAST) were revealed by a positional
cloning strategy involving construction of BAC (bacterial artificial chromosome) contigs for
sequencing. The spastin gene was shown to be the most common causal gene locus in
autosomal dominant HSP patients (Hazan et al., 1999). Spinal muscular atrophy is a common
fatal autosomal recessive disorder, leading to progressive paralysis with muscular atrophy.
Genomic regions linked with SMA were first determined by means of linkage analysis.
Further analysis using yeast artificial chromosome (YAC) contigs was applied to narrow
down the linked regions, and finally identified loss-of-function mutations or deletions in
SMN1 (survival of motor neuron 1, telomeric) responsible for SMA (Lefebvre et al., 1995).
Lethal congenital contracture syndrome type 2 (LCCS2) is an autosomal recessive neurogenic
form of arthrogryposis, characterized by multiple joint contractures and a markedly
distended urinary bladder. To map gene loci for this MND, linkage analyses and homozygosity
mapping using microsatellite markers were used. A single base pair substitution in exon 11
of ERBB3 (receptor tyrosine-protein kinase erbB-3) was identified as being associated with
LCCS2 (Narkis et al., 2007).
3.3 Gene Mapping of Chondrodysplasia
Chondrodysplasia is a genetically diverse group of diseases characterized by abnormal
development of cartilage and is present in humans and many animals. Defective genes
disrupt normal chondrogenesis and cartilage development, leading to abnormal shape and
structure of the skeleton. The inheritance patterns of chondrodysplasia are diverse.
Mutations in several genes related to the synthesis of collagen or proteoglycans, or to
signal-transduction mechanisms, have been reported to be involved in inherited
chondrodysplasia. Collagen II is the major collagenous component of cartilage. Traditional
genetic tools such as genomic cloning and sequencing indicated that several point mutations
in the human collagen, type II, alpha 1 gene (COL2A1) interrupt the normal formation of
type II collagen and cause autosomal dominant achondrogenesis type II (Vissing et al., 1989;
Chan et al., 1995). Partially disrupted Col2a1 has been associated with a phenotype of
chondrodysplasia in transgenic mice (Vandenberg et al., 1991). Successive genetic linkage
and positional candidate cloning studies have shown that disruption of genes encoding other
cartilage-specific collagens can cause different types of chondrodysplasia. Mutations in
families with an autosomal dominantly inherited chondrodysplasia, multiple epiphyseal
dysplasia (MED), have been reported in genes coding for the α1, α2 and α3 chains of collagen
IX. A mutation in intron 8 of COL9A1 (collagen, type IX, alpha 1) causing a splicing defect
was found in one family with MED (Czarny-Ratajczak et al., 2001). Several mutations were
described that cause the skipping of COL9A2 (collagen, type IX, alpha 2) exon 3 in families
with MED (Muragaki et al., 1996; Spayde et al., 2000; Holden et al., 1999). In separate
families with MED, mutations in the acceptor splice site of intron 2 in COL9A3 (collagen,
type IX, alpha 3) caused the skipping of exon 3 and resulted in MED phenotypes
(Paassilta et al., 1999b; Bönnemann et al., 2000). Type X collagen is a
short chain collagen expressed by hypertrophic chondrocytes during endochondral
ossification. A nonsense mutation of the collagen, type X, alpha 1 (COL10A1) gene has been
found to be responsible for Schmid type metaphyseal chondrodysplasia (MCDS) in a human
patient (Woelfle et al., 2011).
Sulfated glycosaminoglycan is very important for maintaining the elasticity and
water-binding capacity of the cartilage matrix (Jerosch, 2011). Absorption and re-absorption
of sulfate ions are critical for normal cartilage and bone development. Different forms
of chondrodysplasia in humans including multiple epiphyseal dysplasia, atelosteogenesis
type II and diastrophic dysplasia are due to mutations in the solute carrier family 26, member
2 gene (SLC26A2), which encodes a sulfate transporter protein (DTDST). These results were
supported by genetic analyses with techniques of positional cloning and direct sequencing.
The impaired sulfate uptake prevents the normal development of cartilage (Cho et al., 2010;
Hästbacka et al., 1996; Hästbacka et al., 1999).
Mutations in genes encoding a number of osteogenic growth factors/signaling
molecules have also been reported to be associated with different forms of chondrodysplasia.
A 22-bp tandem duplication resulting in a frameshift of CDMP1 (cartilage-derived
morphogenetic protein 1) was found to be associated with chondrodysplasia in
human patients (Thomas et al., 1996). A multi-breed association study in domestic dogs
demonstrated that an evolutionarily recently acquired retrogene encoding fibroblast growth
factor 4 (fgf4), an extra copy of the FGF4 gene, was strongly associated with
chondrodysplasia (Parker et al., 2009). The most common inherited chondrodysplasia of
sheep is “spider lamb syndrome” (SLS) in the Suffolk breed. By means of direct sequencing,
it was found that the mutation responsible for the disease was a non-synonymous
transversion in the highly conserved tyrosine kinase II domain of the fibroblast growth factor
receptor 3 gene (FGFR3) (Cockett et al., 1999; Beever et al., 2006).
3.4 Gene Mapping of Cryptorchidism
Cryptorchidism, also known as undescended testicle, is the absence from the scrotum of one
or both testes. It is the most common genital problem encountered in pediatrics. The
incidence of this defect in human studies varies from 0.2% to 13% (Thonneau et al., 2003).
Cryptorchidism has also been seen in domestic animals such as dogs (Cox et al., 1978), cats
(Yates et al., 2003), horses (Cox et al., 1979), pigs (Rothschild et al., 1988), goats (Singh and
Ezeasor, 1989), sheep (Smith et al., 2007) and cattle (St Jean et al., 1992), and wild animals
such as maned wolves (Burton and Ramsay, 1986), black bears (Dunbar et al., 1996) and
reindeer (Leader-Williams, 1979). An undescended testicle is considered a risk factor for
testicular cancer and male infertility (Berkowitz et al., 1993; Carizza et al., 1990).
In most mammals, the testis must descend from the abdomen to the scrotum to provide an
environment with a temperature lower than deep body temperature, which is too high for
normal spermatogenesis. The elephant is an exception since its testes are completely inside
the body (Short, 2005). Normal testicular descent is a two-phase process comprising
transabdominal and inguinoscrotal descent (Hutson, 1985; Figure 1). The key anatomical
events that release the testes from their urogenital ridge location and guide the gonad into
the scrotum are the degeneration of the cranio-suspensory ligament, the swelling, outgrowth
and regression of the gubernaculum, and the elevation of the intra-abdominal pressure
(Hutson and Hasthorpe, 2005). Insulin-like hormone 3 plays an important role in the normal
descent of the testes during the initial, transabdominal phase of testicular descent, and
androgen acts on both phases. The gubernaculum is the principal structure in testicular
descent, and both insulin-like hormone 3 and androgen play a role in controlling the changes
of the gubernaculum (Nation et al., 2009). Other factors, including hormones such as
Müllerian inhibiting substance, estrogen and gonadotropin-releasing hormone, and proteins
encoded by members of the HOX gene family, also contribute to testicular descent.
Abnormalities during the key processes of testicular descent have been recognized to relate
to cryptorchidism.
Figure 1. Structures involved in testicular descent. Two critical phases of testis descent,
transabdominal (left) and inguinoscrotal (right), are essential to move the testes
into the scrotum. The intermediate stage is shown in the middle. The process of testicular
descent depends on the degeneration of the cranio-suspensory ligament (CSL) and the
swelling, outgrowth and regression of the gubernaculum. This figure was modified
from Klonisch et al., 2004.
Cryptorchidism is a defect caused by the interaction of genetic, epigenetic and
environmental factors (Hutson and Hasthorpe, 2005). Environmental factors with estrogenic
or anti-androgenic effects play a role in the development of cryptorchidism (Fisher, 2004;
Gill et al., 1979). Various genes implicated in the regulation of testicular descent have been
investigated for cryptorchidism. Previous reports have identified approximately 20 genes
that might be associated with cryptorchidism (Klonisch et al., 2004; Amann and
Veeramachaneni, 2007). Androgens might play a role in both the transabdominal and
inguinoscrotal phases by regulating the release of calcitonin gene-related peptide by the
genitofemoral nerve. Several androgen-related genes have been investigated in studies of
cryptorchidism. Androgen receptor (AR) was found to be strongly expressed in the
gubernaculum (George and Peterson, 1988) and a case-control study showed that the GGN
trinucleotide repeat length variant in AR was associated with a risk of cryptorchidism and
penile hypospadias in an Iranian population (Radpour et al., 2007). Calcitonin gene-related
peptides are encoded by two genes including CGRP-alpha (CALCA) and CGRP-beta
(CALCB). Exogenous calcitonin gene-related peptide injection into the scrotum can change
the direction of gubernacular migration in the mutant trans-scrotal (TS) rat (Griffiths et al.,
1993; Clarnette and Hutson, 1999). However, no pathogenic sequence changes were
identified in CALCA and CALCB in 90 selected human patients with unilateral or bilateral
cryptorchidism by mutation screening performed through SSCP analyses (Zuccarello et al.,
2004). According to previous reports, a minority of affected cases have mutations either
in the insulin-like hormone 3 gene (INSL3) or in its receptor gene, relaxin/insulin-like
family peptide receptor 2 (RXFP2), also known as leucine-rich repeat-containing G
protein-coupled receptor 8 gene (LGR8). Insulin-like factor 3 is expressed in
the pre- and postnatal Leydig cells of the testis and postnatal theca cells of the ovary, and is a
member of the insulin-like hormone superfamily (Zimmermann et al., 1997). Targeted
disruption of this gene prevents gubernaculum growth and causes bilateral cryptorchidism in
mice (Nef and Parada, 1999; Zimmermann et al., 1999). The receptor gene LGR8 has been
found highly expressed in the gubernaculum and is responsible for mediating hormonal
signals that control normal testicular descent in humans (Gorlov et al., 2002). Knockout
mouse models for this gene exhibit bilateral intra-abdominal cryptorchidism (Bogatcheva et al.,
2007). Disruptions of other hormone-related genes also show associations with aberrant
testicular descent. Müllerian inhibiting substance (MIS), or anti-Müllerian hormone (AMH),
a 140-kDa glycoprotein produced by Sertoli cells, is critical for regression of the Müllerian
duct during male sexual differentiation (Bashir and Wells, 1995) and is associated with the
swelling reaction of the gubernaculum occurring during the first phase of testicular descent
(Kubota et al., 2002). In mice, the testicular feminization (Tfm) mutation with the deletion of
the Mis gene was found with a phenotype of undescended testis, suggesting a causal
relationship between Mis and cryptorchidism (Behringer et al., 1994). Male mice carrying a
mutated gonadotropin-releasing hormone receptor gene (Gnrhr) also fail to initiate testicular
descent (Pask et al., 2005). Estrogen receptor 1 (ESR1) is a ligand-activated
transcription factor that is important in the process of testicular descent in children
(Przewratil et al., 2004). Esr1 knockout male mice with retracted gonads suggest estrogens
may play a role in the scrotal positioning of the testicle (Donaldson et al., 1996). Moreover,
investigations were carried out on HOX genes for studying cryptorchidism. Homeobox A10
(Hoxa10) is highly expressed in the gubernaculum and this gene has been found to be
essential for gubernacular migration during testicular descent in mice. Disruption of this
gene implicates it in the development of cryptorchidism (Nightingale et al., 2008; Rijli
et al., 1995). A single SNP variant in HOXD13 (homeobox D13) has been identified as
significantly associated with cryptorchidism in humans via a case-control analysis (Wang et
al., 2007).
4. Animal Models for Human Diseases
Using appropriate model organisms for human disease studies will help understand the
genetic components behind disorders. Domestic animals have proven to be good research
models due to their large body size and many anatomical and physiological similarities to
humans. Numerous genetic defects have been identified in domestic animals, and inherited
disorders in these animal models can provide valuable information to assist in the
identification of human disease genes and the study of gene therapy.
4.1 The Sheep Model
Sheep (Ovis aries) were initially domesticated approximately 11,000 years ago for
agricultural purposes including the production of fleece, skins, meat and milk (Colledge et al.,
2005; Chessa et al., 2009). The top three sheep meat producing countries, in order of
quantity, are China, Australia and New Zealand based on data from the Food and
Agriculture Organization of the United Nations in 2010
(http://faostat.fao.org/site/569/DesktopDefault.aspx?PageID=569#ancor). Large animal
models like sheep have received extensive clinical attention for human genetic diseases due
to their advantages over rodent models, such as greater physiological similarity to human
patients, a longer life span and larger body size. Sheep are considered excellent models
for studying respiratory physiology (Meeusen et al., 2009), renal physiology (Ramchandra et
al., 2012), reproductive physiology (Vinet et al., 2012), cardiovascular physiology (Wylie et
al., 2012) and the endocrine system (Recabarren et al., 2005). The body size of sheep allows
for frequent blood sampling and easy insertion of monitoring devices into organs. These
advantages allow prenatal gene therapy to be performed in sheep prior to disease
onset to achieve long-term expression of the exogenous gene without modifying the germ
line of the recipient (Porada et al., 1998).
The construction of the reference sheep genome sequence was started in 2009 by the
International Sheep Genomics Consortium. The genome sequence will enable accelerated
genetic gain and improved understanding of the etiology of diseases deleterious to sheep
breeding programs (The International Sheep Genomics Consortium et al., 2010). Most
modern sheep breeds have maintained high levels of genetic diversity due to strong gene
flow between individuals of different lines and breeds (Kijas et al., 2012). The initial
analysis of LD structure in sheep was conducted across more than 3,000 sheep from 74
diverse breeds. The results suggest that LD in sheep is substantially lower than in cattle
but higher than in humans. Moreover, substantial variation in LD structure exists between
ovine breeds (Raadsma et al., 2010). The new reference sheep genome sequence will help
biomedical researchers to overcome previous challenges of limited gene sequence
availability.
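The LD comparisons above are usually expressed with a pairwise statistic such as r². As a minimal sketch, assuming phased haplotypes at two biallelic loci coded as 0/1 tuples (the function name and the toy data are hypothetical and not taken from the cited studies), r² can be computed from haplotype and allele frequencies as D² / (pA(1 - pA) pB(1 - pB)), where D = pAB - pA pB.

```python
def ld_r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic loci.

    haplotypes: list of (allele_at_locus_A, allele_at_locus_B) tuples,
                each allele coded 0 or 1 (phased data are assumed).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n        # frequency of allele 1 at locus A
    p_b = sum(h[1] for h in haplotypes) / n        # frequency of allele 1 at locus B
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom


# Toy haplotype samples: complete LD versus linkage equilibrium.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
equilibrium = [(1, 1), (1, 0), (0, 1), (0, 0)]
r2_complete = ld_r_squared(complete_ld)
r2_equilibrium = ld_r_squared(equilibrium)
```

Plotting r² against the physical distance between marker pairs gives the LD decay curves on which breed and species comparisons of this kind are based.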
4.2 The Dog Model
The domestic dog (Canis lupus familiaris) may have been the first animal to be domesticated,
approximately 15,000 years ago (Savolainen et al., 2002). Dogs were initially domesticated
for working or hunting. More recently, the dog has been used extensively as a model
organism for complex human disease research because of the similarities between canine
and human diseases, including disease manifestation and genetic predisposition, in diseases
such as cancers (Khanna et al., 2006), diabetes (Srinivasan and Ramarao, 2007), heart
disease (Kohler et al., 1984) and kidney disease (Aresu et al., 2008; Karlsson and
Lindblad-Toh, 2008). Additionally, many anatomic and physiological aspects, particularly
of the cardiovascular, urogenital, nervous and musculoskeletal systems, are similar between
dogs and humans (Khanna et al., 2006). The similar spectrum of diseases, the specific dog
population structure, and the close social relationship and similar living environment
shared by humans and dogs make the domestic dog an ideal model for mapping human
diseases. With the first draft of the dog genome sequence completed in 2005,
the dog genome was found to be relatively small (2.4 Gb compared to 3.2 Gb in
humans) and less divergent from the human genome than the mouse genome is
(Lindblad-Toh et al., 2005; International Human Genome Sequencing Consortium, 2004).
This similarity makes the dog a suitable animal model for human diseases with a
genetic component.
Additionally, the dog genome has a favorable LD structure for investigating complex
disorders. Two population bottlenecks have occurred during domestication of the dog. This
created an advantageous linkage structure which is ideal for gene mapping. First, long-range
LD, from hundreds of kilobases to several megabases, exists within a breed, unlike the tens
of kilobases observed in the human genome. Second, only short-range LD exists across the
whole dog population due to its long domestication period (Lindblad-Toh et al., 2005).
Given the patterns of limited LD across whole dog populations and extensive LD within
breeds, a two-stage mapping strategy, in which the disease locus is first identified in one
breed and the genomic region is then narrowed using multiple breeds, is favored (Karlsson
and Lindblad-Toh, 2008). An example is the mapping of white spotting in dogs.
Initially, 19 boxer dogs were used to discover an 800 kb region. A second breed, bull terriers,
was used to confirm the region and narrow it to a 100 kb region that contained a gene called
microphthalmia-associated transcription factor (MITF). Candidate regulatory mutations
associated with the spotting trait were identified (Karlsson et al., 2007).
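The narrowing step of the two-stage strategy amounts to intersecting the associated intervals found in each breed. The sketch below illustrates only that arithmetic; the function name and the chromosome coordinates are hypothetical and were chosen to give an 800 kb and a 100 kb interval like those described above, not to reproduce the published positions.

```python
def refine_interval(breed_intervals):
    """Intersect associated intervals (start_bp, end_bp) from several breeds.

    Returns the shared interval, or None if the breeds do not overlap.
    """
    start = max(s for s, _ in breed_intervals.values())
    end = min(e for _, e in breed_intervals.values())
    if start > end:
        return None
    return (start, end)


# Hypothetical coordinates on one chromosome (base pairs), for illustration only.
intervals = {
    "boxer": (21_500_000, 22_300_000),         # 800 kb region from the first breed
    "bull_terrier": (21_800_000, 21_900_000),  # 100 kb region from the second breed
}
refined = refine_interval(intervals)
refined_kb = (refined[1] - refined[0]) / 1000
```

Real analyses compare associated haplotypes rather than bare coordinates, so the intersection here should be read as a simplification of the idea.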
The most powerful and commonly used mapping approach in dogs for disease research is
GWAS. Fewer than 50 individuals are sufficient to map a simple Mendelian recessive trait
using GWAS, but many more dogs are needed to map complex diseases (Karlsson and
Lindblad-Toh, 2008). There are many successful cases of using GWAS to map Mendelian
traits in dogs. Some examples include a missense mutation in SOD1 (superoxide dismutase
1) associated with canine degenerative myelopathy (DM) (Awano et al., 2009), a 23 kb
deletion in the canine ADAM9 (a disintegrin and metalloprotease domain, family member 9)
associated with a canine cone-rod dystrophy (crd3) (Goldstein et al., 2010), and a missense
mutation in a conserved functional domain of β-glucuronidase (GUSB) confirmed to cause
mucopolysaccharidosis VII (MPS VII) in a population of Brazilian terriers (Hytönen et al.,
2012).
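To see why a small case-control sample can suffice for a fully penetrant recessive trait, consider a basic allelic chi-square test at a single SNP. The sketch below is a hedged illustration rather than the analysis used in the cited studies: the function name and the allele counts are made up, the test has one degree of freedom and no continuity correction, and a real GWAS would additionally correct for multiple testing and population structure.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). For 1 df the upper tail probability of the
    chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("each row and column of the table needs at least one allele")
    chi2 = n * (a * d - b * c) ** 2 / margins
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: 20 affected dogs, all homozygous for the alternative allele
# (40 alt, 0 ref alleles), and 30 controls carrying 12 alt and 48 ref alleles.
chi2, p_value = allelic_chi_square(40, 0, 12, 48)
```

With these illustrative counts the statistic is 800/13 (about 61.5), which shows how strongly a marker in LD with a recessive mutation can separate a few dozen cases from controls.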
Other mapping strategies including QTL mapping, across-breed mapping and selective
sweep mapping have also been applied in dogs (Karlsson and Lindblad-Toh, 2008; Sutter et
al., 2007). Selective sweep mapping identifies the selective sweeps that are
associated with a strongly selected phenotype. A selective sweep is a region where there is
a reduction or elimination of genetic variation in the DNA neighboring a mutation, due to
recent and strong positive natural or artificial selection (Palaisa et al., 2004). Artificial
selection for breeding purposes in dogs over hundreds of years has caused large
morphological differences among modern breeds and uniform physical features within
pure-bred dogs (Wayne and Ostrander, 2007). This special characteristic of dog populations
provides suitable resources for QTL mapping and selective sweep mapping to discover genes
underlying breed-specific characteristics. For example, a common haplotype in the
insulin-like growth factor 1 gene (IGF1) associated with body size in dogs was successfully
identified by QTL mapping and selective sweep mapping using small and giant dog breeds
(Sutter et al., 2007). To avoid the confounding of long haplotypes and extensive
homozygosity within a breed, across-breed selective sweep mapping is helpful for finding
loci of interest. A study that used this mapping strategy to analyze genetic variation in
46 dog breeds identified 44 chromosomal regions that are extremely variable between breeds
and are likely to control many of the morphological and behavioral traits that vary between
them (Vaysse et al., 2011). By contrast, the difficulty of obtaining large numbers of
offspring from crosses restricts the application of QTL mapping in dogs compared to
livestock animals.
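A simple way to look for the reduced variation that defines a selective sweep is to average expected heterozygosity, 2p(1 - p), in sliding windows along a chromosome. The following is a minimal sketch under stated assumptions: the function names, window size, threshold and allele frequencies are all hypothetical, and it is not the statistic used in the studies cited above.

```python
def window_heterozygosity(alt_freqs, window=3, step=1):
    """Mean expected heterozygosity, 2p(1 - p), in sliding windows of SNPs.

    alt_freqs: alternative-allele frequencies ordered by map position.
    Returns a list of (window_start_index, mean_heterozygosity).
    """
    results = []
    for start in range(0, len(alt_freqs) - window + 1, step):
        chunk = alt_freqs[start:start + window]
        mean_h = sum(2 * p * (1 - p) for p in chunk) / window
        results.append((start, mean_h))
    return results


def candidate_sweeps(alt_freqs, window=3, step=1, threshold=0.1):
    """Start indices of windows whose mean heterozygosity is below the threshold."""
    return [
        start
        for start, mean_h in window_heterozygosity(alt_freqs, window, step)
        if mean_h < threshold
    ]


# Made-up allele frequencies within one breed; markers 3 to 5 are fixed.
freqs = [0.5, 0.4, 0.5, 0.0, 1.0, 0.0, 0.5, 0.6]
sweeps = candidate_sweeps(freqs)
```

In an across-breed design the same windows would be compared between breeds that differ in the selected trait, so that regions of low variation shared by the whole species are not mistaken for breed-specific sweeps.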
Genetic diseases constitute major public health problems for humans and also cause large
economic losses in the animal industry. Finding the causative mutations will help to improve
this situation. Applying suitable gene mapping tools could make gene hunts
faster, cheaper and more practical for genetic diseases. Animal models are preferable for
experimental disease research due to their advantages reviewed above. The gene discoveries
in animal models will facilitate the study of human genetic diseases.
REFERENCES
ADHR Consortium. Autosomal dominant hypophosphataemic rickets is associated with
mutations in FGF23. Nat Genet. 2000. 26:345-8.
Amann RP, Veeramachaneni DNR. Cryptorchidism in common eutherian mammals.
Reproduction. 2007. 133:541-1.
Amir el-AD, Bartal O, Morad E, Nagar T, Sheynin J, Parvari R, et al. KinSNP software for
homozygosity mapping of disease genes using SNP microarrays. Hum Genomics. 2010.
4:394-40.
Antonarakis SE, Beckmann JS. Mendelian disorders deserve more attention. Nat Rev Genet.
2006. 7:277-82.
Aresu L, Pregel P, Bollo E, Palmerini D, Sereno A, Valenza F. Immunofluorescence staining
for the detection of immunoglobulins and complement (C3) in dogs with renal disease. Vet
Rec. 2008. 163:679-82.
Awano T, Johnson GS, Wade CM, Katz ML, Johnson GC, Taylor JF, et al. Genome-wide
association analysis reveals a SOD1 mutation in canine degenerative myelopathy that
resembles amyotrophic lateral sclerosis. Proc Natl Acad Sci U S A. 2009. 106: 2794-9.
Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, et al. Rapid SNP
discovery and genetic mapping using sequenced RAD markers. PLoS One. 2008. 3:e3376.
Ball AD, Stapley J, Dawson DA, Birkhead TR, Burke T, Slate J. A comparison of SNPs and
microsatellites as linkage mapping markers: lessons from the zebra finch (Taeniopygia
guttata). BMC Genomics. 2010. 11:218.
Barrett JC, Hansoul S, Nicolae DL, Cho JH, Duerr RH, Rioux JD, et al. Genome-wide
association defines more than 30 distinct susceptibility loci for Crohn's disease. Nat Genet.
2008. 40:955-62.
Bashir MS, Wells M. Müllerian inhibiting substance. J Pathol. 1995. 176:109-10.
Beever JE, Smit MA, Meyers SN, Hadfield TS, Bottema C, Albretsen J, et al. A single-base
change in the tyrosine kinase II domain of ovine FGFR3 causes hereditary chondrodysplasia
in sheep. Anim Genet. 2006. 37:66-71.
Behringer RR, Finegold MJ, Cate RL. Müllerian-inhibiting substance function during
mammalian sexual development. Cell. 1994. 79:415-25.
Bergwitz C, Roslin NM, Tieder M, Loredo-Osti JC, Bastepe M, Abu-Zahra H, et al.
SLC34A3 mutations in patients with hereditary hypophosphatemic rickets with
hypercalciuria predict a key role for the sodium-phosphate cotransporter NaPi-IIc in
maintaining phosphate homeostasis. Am J Hum Genet. 2006. 78: 179-92.
Berkowitz GS, Lapinski RH, Dolgin SE, Gazella JG, Bodian CA, Holzman IR. Prevalence
and natural history of cryptorchidism. Pediatrics. 1993. 92:44-9.
Bogatcheva NV, Truong A, Feng S, Engel W, Adham IM, Agoulnik AI. GREAT/LGR8 is
the only receptor for insulin-like 3 peptide. Mol Endocrinol. 2003. 17:2639-46.
Bönnemann CG, Cox GF, Shapiro F, Wu J-J, Feener CA, Thompson TG, et al. A mutation in
the α3 chain of type IX collagen causes autosomal dominant multiple epiphyseal dysplasia
with mild myopathy. Proc Natl Acad Sci USA. 2000. 97:1212-17.
Burton M, Ramsay E. Cryptorchidism in Maned Wolves. J Zoo An Med. 1986. 17:133-5.
Campbell CD, Ogburn EL, Lunetta KL, Lyon HN, Freedman ML, Groop LC, et al.
Demonstrating stratification in a European American population. Nat Genet. 2005. 37:868-72.
Carizza C, Antiba A, Palazzi J, Pistono C, Morana F, Alarcón M. Testicular maldescent and
infertility. Andrologia. 1990. 22:285-8.
Carr IM, Sheridan E, Hayward BE, Markham AF, Bonthron DT. IBDfinder and SNPsetter:
tools for pedigree-independent identification of autozygous regions in individuals with
recessive inherited disease. Hum Mutat. 2009. 30:960-7.
Casals F, Idaghdour Y, Hussin J, Awadalla P. Next-generation sequencing approaches for
genetic mapping of complex diseases. J Neuroimmunol. 2012. 248:10-22. Review.
Chan D, Cole WG, Chow CW, Mundlos S, Bateman JF. A COL2A1 mutation in
achondrogenesis type II results in the replacement of type II collagen by type I and III
collagens in cartilage. J Biol Chem. 1995. 270:1747-53.
Chang C, Bowman JL, DeJohn AW, Lander ES, Meyerowitz EM. Restriction fragment
length polymorphism linkage map for Arabidopsis thaliana. Proc Natl Acad Sci U S A. 1988.
85:6856-60.
Charlier C, Coppieters W, Rollin F, Desmecht D, Agerholm JS, Cambisano N, et al. Highly
effective SNP-based association mapping and management of recessive defects in livestock.
Nat Genet. 2008. 40:449-54.
Chessa B, Pereira F, Arnaud F, Amorim A, Goyache F, Mainland I, et al. Revealing the
History of Sheep Domestication Using Retrovirus Integrations. Science. 2009. 324: 532-6.
Cho TJ, Kim OH, Lee HR, Shin SJ, Yoo WJ, Park WY, et al. Autosomal recessive multiple
epiphyseal dysplasia in a Korean girl caused by novel compound heterozygous mutations in
the DTDST (SLC26A2) gene. J Korean Med Sci. 2010. 25:1105-8.
Clarnette TD, Hutson JM. Exogenous calcitonin gene-related peptide can change the
direction of gubernacular migration in the mutant trans-scrotal rat. J Pediatr Surg. 1999. 34:1208-12.
Cockett NE, Shay TL, Beever JE, Nielsen D, Albretsen J, Georges M, et al. Localization of
the locus causing Spider Lamb Syndrome to the distal end of ovine Chromosome 6. Mamm
Genome. 1999. 10:35-8.
Colledge S, Conolly J, Shennan S. The evolution of Neolithic farming from SW Asian
origins to NW European limits. Eur J Arch. 2005. 8:137-56.
Cox JE, Edwards GB, Neal PA. An analysis of 500 cases of equine cryptorchidism. Equine
Vet J. 1979. 11:113-6.
Cox VS, Wallace LJ, Jessen CR. An anatomic and genetic study of canine cryptorchidism.
Teratology. 1978. 18:233-40.
Dion PA, Daoud H, Rouleau GA. Genetics of motor neuron disorders: new insights into
pathogenic mechanisms. Nat Rev Genet. 2009. 10:769-82.
Dittmer KE, Thompson KG, Blair HT. Pathology of inherited rickets in Corriedale sheep. J
Comp Pathol. 2009. 141: 147-55.
Donaldson KM, Tong SY, Washburn T, Lubahn DB, Eddy EM, Hutson JM, et al.
Morphometric study of the gubernaculum in male estrogen receptor mutant mice. J Androl.
1996. 17:91-5.
Drielsma A, Jalas C, Simonis N, Désir J, Simanovsky N, Pirson I, et al. Two novel
CCDC88C mutations confirm the role of DAPLE in autosomal recessive congenital
hydrocephalus. J Med Genet. 2012 [Epub ahead of print]
Dunbar MR, Cunningham MW, Wooding JB, Roth RP. Cryptorchidism and delayed
testicular descent in Florida black bears. J Wildl Dis. 1996. 32:661-4.
Econs MJ, McEnery PT, Lennon F, Speer MC. Autosomal dominant hypophosphatemic
rickets is linked to chromosome 12p13. J Clin Invest. 1997. 100:2653-7.
Eicher EM, Southard JL, Scriver CR, Glorieux FH. Hypophosphatemia: mouse model for
human familial hypophosphatemic (vitamin D-resistant) rickets. Proc Natl Acad Sci U S A.
1976. 73:4667-71.
Fisher JS. Environmental anti-androgens and male reproductive health: focus on phthalates
and testicular dysgenesis syndrome. Reproduction. 2004. 127:305-15. Review.
Fitch LWN. Rickets in hoggets, with a note on the aetiology and definition of the disease.
Australian Vet J. 1943. 19:2.
Fu GK, Lin D, Zhang MY, Bikle DD, Shackleton CH, Miller WL, et al. Cloning of human
25-hydroxyvitamin D-1 α-hydroxylase and mutations causing vitamin D-dependent rickets
type 1. Mol Endocrinol. 1997. 11:1961-70.
Fujimoto A, Totoki Y, Abe T, Boroevich KA, Hosoda F, Nguyen HH, et al. Whole-genome
sequencing of liver cancers identifies etiological influences on mutation patterns and
recurrent mutations in chromatin regulators. Nat Genet. 2012. 44:760-4.
George FW, Peterson KG. Partial characterization of the androgen receptor of the newborn
rat gubernaculum. Biol Reprod. 1988. 39:536-9.
Georges M. Mapping, Fine Mapping, and Molecular Dissection of Quantitative Trait Loci in
Domestic Animals. Annu Rev Genomics Hum Genet. 2007. 8: 131-62.
Gill WB, Schumacher GF, Bibbo M, Straus FH 2nd, Schoenberg HW. Association of
diethylstilbestrol exposure in utero with cryptorchidism, testicular hypoplasia and semen
abnormalities. J Urol. 1979. 122:36-9.
Glenske K, Prinzenberg EM, Brandt H, Gauly M, Erhardt G. A chromosome-wide QTL
study on BTA29 affecting temperament traits in German Angus beef cattle and mapping of
DRD4. Animal. 2011. 5:195-7.
Goldstein O, Mezey JG, Boyko AR, Gao C, Wang W, Bustamante CD, et al. An ADAM9
mutation in canine cone-rod dystrophy 3 establishes homology with human cone-rod
dystrophy 9. Mol Vis. 2010. 16:1549-69.
Gorlov IP, Kamat A, Bogatcheva NV, Jones E, Lamb DJ, Truong A, et al. Mutations of the
GREAT gene cause cryptorchidism. Hum Mol Genet. 2002. 11:2309-18.
Griffiths AJF, Miller JH, Suzuki DT, Lewontin RC, Gelbart WM. An introduction to genetic
analysis. 2000. 7th edition. New York: W. H. Freeman.
Griffiths AL, Middlesworth W, Goh DW, Hutson JM. Exogenous calcitonin gene-related
peptide causes gubernacular development in neonatal (Tfm) mice with complete
androgen resistance. J Pediatr Surg. 1993. 28:1028-30.
Groenen MA, Wahlberg P, Foglio M, Cheng HH, Megens HJ, Crooijmans RP, et al. A high-density
SNP-based linkage map of the chicken genome reveals sequence features correlated
with recombination rate. Genome Res. 2009. 19:510-9.
Hästbacka J, Kerrebrock A, Mokkala K, Clines G, Lovett M, Kaitila I, et al. Identification of
the Finnish founder mutation for diastrophic dysplasia (DTD). Eur J Hum Genet. 1999.
7:664-70.
Hästbacka J, Superti-Furga A, Wilcox WR, Rimoin DL, Cohn DH, Lander ES.
Atelosteogenesis type II is caused by mutations in the diastrophic dysplasia sulfate-transporter
gene (DTDST): evidence for a phenotypic series involving three
chondrodysplasias. Am J Hum Genet. 1996. 58:255-62.
Hazan J, Fonknechten N, Mavel D, Paternotte C, Samson D, Artiguenave F, et al. Spastin, a
new AAA protein, is altered in the most frequent form of autosomal dominant spastic
paraplegia. Nat Genet. 1999. 23:296–303.
Heifetz EM, Fulton JE, O'Sullivan NP, Arthur JA, Cheng H, Wang J, et al. Mapping QTL
affecting resistance to Marek's disease in an F6 advanced intercross population of
commercial layer chickens. BMC Genomics. 2009. 14:10-20.
Hildebrandt F, Heeringa SF, Rüschendorf F, Attanasio M, Nürnberg G, Becker C, et al. A
Systematic Approach to Mapping Recessive Disease Genes in Individuals from Outbred
Populations. PLoS Genet. 2009. 5: e1000353.
Holden P, Canty EG, Mortier GR, Zabel B, Spranger J, Carr A, et al. Identification of novel
pro-α2(IX) collagen gene mutations in two families with distinctive oligo-epiphyseal forms
of multiple epiphyseal dysplasia. Am J Hum Genet. 1999. 65:31-8.
Hutson JM, Hasthorpe S. Abnormalities of testicular descent. Cell Tissue Res. 2005.
322:155-8. Review.
Hutson JM. A biphasic model for the hormonal control of testicular descent. Lancet. 1985.
24:419-21.
HYP Consortium. A gene (PEX) with homologies to endopeptidases is mutated in patients
with X-linked hypophosphatemic rickets. Nat Genet. 1995. 11: 130-6.
Hytönen MK, Arumilli M, Lappalainen AK, Kallio H, Snellman M, Sainio K, et al. A novel
GUSB mutation in Brazilian terriers with severe skeletal abnormalities defines the disease as
mucopolysaccharidosis VII. PLoS One. 2012. 7:e40281.
International Human Genome Sequencing Consortium. Finishing the euchromatic sequence
of the human genome. Nature. 2004. 431:931-45.
International Sheep Genomics Consortium, Archibald AL, Cockett NE, Dalrymple BP,
Faraut T, Kijas JW, et al. The sheep genome reference sequence: a work in progress. Anim
Genet. 2010. 41:449-53. Review.
Jaffé A, Bush A. Cystic fibrosis: review of the decade. Monaldi Arch Chest Dis. 2001.
56:240-7. Review.
Jerosch J. Effects of Glucosamine and Chondroitin Sulfate on Cartilage Metabolism in OA:
Outlook on Other Nutrient Partners Especially Omega-3 Fatty Acids. Int J Rheumatol.
2011:969012.
Karlsson EK, Baranowska I, Wade CM, Salmon Hillbertz NH, Zody MC, Anderson N, et al.
Efficient mapping of mendelian traits in dogs through genome-wide association. Nat Genet.
2007. 39:1321-8.
Karlsson EK, Lindblad-Toh K. Leader of the pack: gene mapping in dogs and other model
organisms. Nat Rev Genet. 2008. 9:713-25. Review.
Khanna C, Lindblad-Toh K, Vail D, London C, Bergman P, Barber L, et al. The dog as a
cancer model. Nat Biotechnol. 2006. 24: 1065- 6.
Kijas JW, Lenstra JA, Hayes B, Boitard S, Porto Neto LR, San Cristobal M, et al. Genome-Wide
Analysis of the World's Sheep Breeds Reveals High Levels of Historic Mixture and
Strong Recent Selection. PLoS Biol. 2012. 10: e1001258.
Klonisch T, Fowler PA, Hombach-Klonisch S. Molecular and genetic regulation of testis
descent and external genitalia development. Dev Bio. 2004. 270:1-18.
Kohler J, Silverman NA, Levitsky S, Pavel DG, Eckner FA, Fang RB. A model of cyanotic
heart disease: functional, pathological, and metabolic sequelae in the immature canine heart.
J Surg Res. 1984. 37:309-13.
Kubota Y, Temelcos C, Bathgate RA, Smith KJ, Scott D, Zhao C, et al. The role of insulin 3,
testosterone, Müllerian inhibiting substance and relaxin in rat gubernacular growth. Mol
Hum Reprod. 2002. 8:900-5.
Labuda M, Morgan K, Glorieux FH. Mapping autosomal recessive vitamin D dependency
type I to chromosomal 12q14 by linkage analysis. Am J Hum Genet. 1990. 47:28-36.
Lander ES, Botstein D. Homozygosity mapping: a way to map human recessive traits with
the DNA of inbred children. Science. 1987. 236:1567-70.
Larkin DM, Daetwyler HD, Hernandez AG, Wright CL, Hetrick LA, Boucek L, et al.
Whole-genome resequencing of two elite sires for the detection of haplotypes under selection
in dairy cattle. Proc Natl Acad Sci U S A. 2012. 109:7693-8.
Leader-Williams N. Abnormal testes in reindeer, Rangifer tarandus. J Reprod Fertil. 1979.
57:127-30.
Lee CN. Reviewing evidences on the management of patients with motor neuron disease.
Hong Kong Med J. 2012. 18:48-55.
Lefebvre S, Bürglen L, Reboullet S, Clermont O, Burlet P, Viollet L, et al. Identification and
characterization of a spinal muscular atrophy-determining gene. Cell. 1995. 80:155–65.
Levy-Litan V, Hershkovitz E, Avizov L, Leventhal N, Bercovich D, Chalifa-Caspi V, et al.
Autosomal-recessive hypophosphatemic rickets is associated with an inactivation mutation in
the ENPP1 gene. Am J Hum Genet. 2010. 86: 273-8.
Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, et al. Genome
sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005.
438:803-19.
Lorenz-Depiereux B, Schnabel D, Tiosano D, Hausler G, Strom TM. Loss-of-function
ENPP1 mutations cause both generalised arterial calcification of infancy and autosomal-recessive
hypophosphatemic rickets. Am J Hum Genet. 2010. 86: 267-72.
Lupski JR, Reid JG, Gonzaga-Jauregui C, Rio Deiros D, Chen DC, Nazareth L, et al. Whole-genome
sequencing in a patient with Charcot-Marie-Tooth neuropathy. N Engl J Med. 2010.
362:1181-91.
Malloy PJ, Pike JW, Feldman D. The vitamin D receptor and the syndrome of hereditary
1,25-dihydroxyvitamin D-resistant rickets. Endo Rev. 1999. 20: 156-88.
Mardis ER, Wilson RK. Cancer genome sequencing: a review. Hum Mol Genet.
2009. 18:R163-8.
Marques E, Grant JR, Wang Z, Kolbehdari D, Stothard P, Plastow G, et al. Identification of
candidate markers on bovine chromosome 14 (BTA14) under milk production trait
quantitative trait loci in Holstein. J Anim Breed Genet. 2011. 128:305-13.
Meeusen EN, Snibson KJ, Hirst SJ, Bischof RJ. Sheep as a model species for the study and
treatment of human asthma and other respiratory diseases. Drug Discov Today Dis Models.
2009. 5: 101-6.
Miao XX, Xub SJ, Li MH, Li MW, Huang JH, Dai FY, et al. Simple sequence repeat-based
consensus linkage map of Bombyx mori. Proc Natl Acad Sci U S A. 2005. 102:16303-8.
Miller MR, Dunham JP, Amores A, Cresko WA, Johnson EA. Rapid and cost-effective
polymorphism identification and genotyping using restriction site associated DNA (RAD)
markers. Genome Res. 2007. 17: 240-48.
Muragaki Y, Mariman ECM, van Beersum SEC, Perälä M, van Mourik JBA, Warman ML,
et al. A mutation in the gene encoding the α2 chain of the fibril-associated collagen IX,
COL9A2, causes multiple epiphyseal dysplasia (EDM2). Nat Genet. 1996. 12:103-5.
Narkis G, Ofir R, Manor E, Landau D, Elbedour K, Birk OS. Lethal congenital contractural
syndrome type 2 (LCCS2) is caused by a mutation in ERBB3 (Her3), a modulator of the
phosphatidylinositol-3-kinase/Akt pathway. Am J Hum Genet. 2007. 81:589–95.
Nation TR, Balic A, Southwell BR, Newgreen DF, Hutson JM. The hormonal control of
testicular descent. Pediatr Endocrinol Rev. 2009. 7:22-31.
Nef S, Parada LF. Cryptorchidism in mice mutant for Insl3. Nat Genet. 1999. 22:295-9.
Nightingale SS, Western P, Hutson JM. The migrating gubernaculum grows like a "limb
bud". J Pediatr Surg. 2008. 43:387-90.
Nones K, Ledur MC, Zanella EL, Klein C, Pinto LF, Moura AS, et al. Quantitative trait loci
associated with chemical composition of the chicken carcass. Anim Genet. 2012. 43:570-6.
Ozkan B. Nutritional rickets. J Clin Res Pediatr Endocrinol. 2010. 2: 137- 43.
Paassilta P, Lohiniva J, Annunen S, Bonaventure J, Le Merrer M, Pai L, et al. COL9A3: a
third locus for multiple epiphyseal dysplasia. Am J Hum Genet. 1999. 64:1036-44.
Palaisa K, Morgante M, Tingey S, Rafalski A. Long-range patterns of diversity and linkage
disequilibrium surrounding the maize Y1 gene are indicative of an asymmetric selective
sweep. Proc Natl Acad Sci U S A. 2004. 101: 9885-90.
Parker HG, VonHoldt BM, Quignon P, Margulies EH, Shao S, Mosher DS, et al. An
expressed fgf4 retrogene is associated with breed-defining chondrodysplasia in domestic
dogs. Science. 2009. 325:995-8.
Pask AJ, Kanasaki H, Kaiser UB, Conn PM, Janovick JA, Stockton DW, et al. A novel
mouse model of hypogonadotrophic hypogonadism: N-ethyl-N-nitrosourea-induced
gonadotropin-releasing hormone receptor gene mutation. Mol Endocrinol. 2005. 19:972-81.
Porada CD, Tran N, Eglitis M, Moen RC, Troutman L, Flake AW, et al. In utero gene
therapy: transfer and long-term expression of the bacterial neo(r) gene in sheep after direct
injection of retroviral vectors into preimmune fetuses. Hum Gene Ther. 1998. 9: 1571-85.
Price AL, Zaitlen NA, Reich D, Patterson N. New approaches to population stratification in
genome-wide association studies. Nat Rev Genet. 2010. 11:459-63. Review.
Przewratil P, Paduch DA, Kobos J, Niedzielski J. Expression of estrogen receptor alpha and
progesterone receptor in children with undescended testicle previously treated with human
chorionic gonadotropin. J Urol. 2004. 172:1112-6.
Raadsma HW, ISGC Int. Sheep Genomics Consortium. Linkage Disequilibrium In The
Sheep Genome: Findings From The ISGC HapMap Initiative. Abstract:W134. Plant &
Animal genomes XVIII Conference, Jan. 9-30, 2010. San Diego, CA.
Radpour R, Rezaee M, Tavasoly A, Solati S, Saleki A. Association of long polyglycine tracts
(GGN repeats) in exon 1 of the androgen receptor gene with cryptorchidism and penile
hypospadias in Iranian patients. J Androl. 2007. 28:164-9.
Ramchandra R, Hood S, Frithiof R, McKinley MJ, May CN. The Role of the Paraventricular
Nucleus of the Hypothalamus in the Regulation of Cardiac and Renal Sympathetic Nerve
Activity in Conscious Normal and Heart Failure Sheep. J Physiol. 2012. Jul 2.
Read AP, Thakker RV, Davies KE, Mountford RC, Brenton DP, Davies M, et al. Mapping of
human X-linked hypophosphataemic rickets by multilocus linkage analysis. Hum Genet.
1986. 73:267-70.
Recabarren SE, Padmanabhan V, Codner E, Lobos A, Durán C, Vidal M, et al. Postnatal
developmental consequences of altered insulin sensitivity in female sheep treated prenatally
with testosterone. Am J Physiol Endocrinol Metab. 2005. 289:E801-6.
Rijli FM, Matyas R, Pellegrini M, Dierich A, Gruss P, Dollé P, et al. Cryptorchidism and
homeotic transformations of spinal nerves and vertebrae in Hoxa-10 mutant mice. Proc Natl
Acad Sci U S A. 1995. 92:8185-9.
Rosen DR, Siddique T, Patterson D, Figlewicz DA, Sapp P, Hentati A, et al. Mutations in
Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis.
Nature. 1993. 362:59-62.
Rosendo A, Iannuccelli N, Gilbert H, Riquet J, Billon Y, Amigues Y, et al. Microsatellite
mapping of quantitative trait loci affecting female reproductive tract characteristics in
Meishan x Large White F(2) pigs. J Anim Sci. 2012. 90:37-44.
Rothschild MF, Christian LL, Blanchard W. Evidence for multigene control of
cryptorchidism in swine. J Hered. 1988. 79:313-4.
Savolainen P, Zhang YP, Luo J, Lundeberg J, Leitner T. Genetic evidence for an East Asian
origin of domestic dogs. Science. 2002. 298: 1610-3.
Scheuner MT, Yoon PW, Khoury MJ. Contribution of Mendelian disorders to common
chronic disease: opportunities for recognition, intervention, and prevention. Am J Med Genet
C Semin Med Genet. 2004. 125C:50-65. Review.
Seelow D, Schuelke M, Hildebrandt F, Nürnberg P. HomozygosityMapper--an interactive
approach to homozygosity mapping. Nucleic Acids Res. 2009. 37(Web Server issue):W593-9.
Short RW. "Why the elephant has intra-abdominal testes" (On-line). 2005. Accessed at
http://www.publish.csiro.au/?act=view_file&file_id=SRB03Ab59.pdf (pdf).
Singh A, Ezeasor D. Ultrastructure of Sertoli cells in scrotal testis of unilaterally cryptorchid
goat. Prog Clin Biol Res. 1989. 296:159-64.
Smith KC, Brown PJ, Morris J, Parkinson TJ. Cryptorchidism in North Ronaldsay sheep. Vet
Rec. 2007. 161:658-9.
Spayde EC, Joshi AP, Wilcox WR, Briggs M, Cohn DH, Olsen BR. Exon skipping mutation
in the COL9A2 gene in a family with multiple epiphyseal dysplasia. Matrix Biol. 2000.
19:121-8.
Srinivasan K, Ramarao P. Animal models in type 2 diabetes research: an overview. Indian J
Med Res. 2007. 125: 451-72.
St Jean G, Gaughan EM, Constable PD. Cryptorchidism in North American cattle: breed
predisposition and clinical findings. Theriogenology. 1992. 38:951-8.
Stranger BE, Stahl EA, Raj T. Progress and promise of genome-wide association studies for
human complex trait genetics. Genetics. 2011. 187:367-83. Review.
Sutter NB, Bustamante CD, Chase K, Gray MM, Zhao K, Zhu L, et al. A single IGF1 allele
is a major determinant of small size in dogs. Science. 2007. 316:112-5.
Tan YD, Wan C, Zhu Y, Lu C, Xiang Z, Deng HW. An amplified fragment length
polymorphism map of the silkworm. Genetics. 2001. 157:1277-84.
Teo YY, Small KS, Kwiatkowski DP. Methodological challenges of genome-wide
association analysis in Africa. Nat Rev Genet. 11:149-60.
Thomas JT, Lin K, Nandedkar M, Camargo M, Cervenka J, Luyten FP. A human
chondrodysplasia due to a mutation in a TGF-beta superfamily member. Nat Genet. 1996.
12:315-7.
Thonneau PF, Gandia P, Mieusset R. Cryptorchidism: incidence, risk factors, and potential
role of environment; an update. J Androl. 2003. 24:155-62. Review.
Tieder M, Modai D, Samuel R, Arie R, Halabe A, Bab I, et al. Hereditary hypophosphatemic
rickets with hypercalciuria. N Engl J Med. 1985. 312 : 611-7.
Vandenberg P, Khillan JS, Prockop DJ, Helminen H, Kontusaari S, Ala-Kokko L. Expression
of a partially deleted gene of human type II procollagen (COL2A1) in transgenic mice
produces a chondrodysplasia. Proc Natl Acad Sci USA. 1991. 88:7640-4.
Vaysse A, Ratnakumar A, Derrien T, Axelsson E, Rosengren Pielberg G, Sigurdsson S, et al.
Identification of genomic regions associated with phenotypic variation between dog breeds
using selection mapping. PLoS Genet. 2011. 7:e1002316.
Vinet A, Drouilhet L, Bodin L, Mulsant P, Fabre S, Phocas F. Genetic control of multiple
births in low ovulating mammalian species. Mamm Genome. 2012. Aug 8. [Epub ahead of
print].
Vissing H, D'Alessio M, Lee B, Ramirez F, Godfrey M, Hollister DW. Glycine to serine
substitution in the triple helical domain of pro-alpha 1 (II) collagen results in a lethal
perinatal form of short-limbed dwarfism. J Biol Chem. 1989. 264:18265-7.
Wang K, Li WD, Zhang CK, Wang Z, Glessner JT, Grant SF, et al. A genome-wide
association study on obesity and obesity-related traits. PLoS One. 2011. 6:e18939.
Wang N, Fang L, Xin H, Wang L, Li S. Construction of a high-density genetic map for grape
using next generation restriction-site associated DNA sequencing. BMC Plant Biol. 2012.
12:148.
Wang Y, Barthold J, Kanetsky PA, Casalunovo T, Pearson E, Manson J. Allelic variants in
HOX genes in cryptorchidism. Birth Defects Res A Clin Mol Teratol. 2007. 79:269-75.
Wayne RK, Ostrander EA. Lessons learned from the dog genome. Trends Genet. 2007.
23:557-67.
Woelfle JV, Brenner RE, Zabel B, Reichel H, Nelitz M. Schmid-type metaphyseal
chondrodysplasia as the result of a collagen type X defect due to a novel COL10A1 nonsense
mutation: A case report of a novel COL10A1 mutation. J Orthop Sci. 2011. 16:245-9.
Wylie MC, Johnson VM, Carpino E, Mullen K, Hauser K, Nedder A, et al. Respiratory,
neuromuscular, and cardiovascular effects of neosaxitoxin in isoflurane-anesthetized sheep.
Reg Anesth Pain Med. 2012. 37:152-8.
Yates D, Hayes G, Heffernan M, Beynon R. Incidence of cryptorchidism in dogs and cats.
Vet Rec. 2003. 152:502-4.
Zeggini E, Scott LJ, Saxena R, Voight BF, Marchini JL, Hu T, et al. Meta-analysis of
genome-wide association data and large-scale replication identifies additional susceptibility
loci for type 2 diabetes. Nat Genet. 2008. 40:638-45.
Zhuang Z, Gusev A, Cho J, Pe'er I. Detecting identity by descent and homozygosity mapping
in whole-exome sequencing data. PLoS One. 2012. 7:e47618.
Zimmermann S, Schöttler P, Engel W, Adham IM. Mouse Leydig insulin-like (Ley I-L) gene:
structure and expression during testis and ovary development. Mol Reprod Dev. 1997. 47:30-8.
Zimmermann S, Steding G, Emmen JM, Brinkmann AO, Nayernia K, Holstein AF, et al.
Targeted disruption of the Insl3 gene causes bilateral cryptorchidism. Mol Endocrinol. 1999.
13:681-91.
Zuccarello D, Morini E, Douzgou S, Ferlin A, Pizzuti A, Salpietro DC, et al. Preliminary
data suggest that mutations in the CgRP pathway are not involved in human sporadic
cryptorchidism. J Endocrinol Invest. 2004. 27:760-4.
CHAPTER 2. A NOVEL NONSENSE MUTATION IN THE DMP1 GENE
IDENTIFIED BY A GENOME-WIDE ASSOCIATION STUDY IS RESPONSIBLE
FOR INHERITED RICKETS IN CORRIEDALE SHEEP
A paper published in PLoS One 2011;6:e21739.
Xia Zhao1,#, Keren E Dittmer2,#, Hugh T Blair2, Keith G Thompson2, Max F Rothschild1,
Dorian J Garrick1,2,*
1 Department of Animal Science and Center for Integrated Animal Genomics, Iowa State
University, Ames, IA 50011, USA
2 Institute of Veterinary, Animal & Biomedical Sciences, Massey University, Palmerston
North 4474, New Zealand
# These two authors contributed equally to this work.
ABSTRACT
Inherited rickets of Corriedale sheep is characterized by decreased growth rate, thoracic
lordosis and angular limb deformities. Previous outcross and backcross studies indicate
that the disease is inherited as a simple autosomal recessive disorder. A genome-wide association study was
conducted using the Illumina OvineSNP50 BeadChip on 20 related sheep comprising 17
affected and 3 carriers. A homozygous region of 125 consecutive single-nucleotide
polymorphism (SNP) loci was identified in all affected sheep, covering a region of 6 Mb on
ovine chromosome 6. Among 35 candidate genes in this region, the dentin matrix protein 1
gene (DMP1) was sequenced to reveal a nonsense mutation 250C/T in exon 6. This
mutation introduced a stop codon (R145X) and could truncate C-terminal amino acids.
Genotyping by PCR-RFLP for this mutation showed all 17 affected sheep were “T T”
genotypes; the 3 carriers were “C T”; 24 phenotypically normal related sheep were either “C
T” or “C C”; and 46 unrelated normal control sheep from other breeds were all “C C”. The
other SNPs in DMP1 were not concordant with the disease and can all be ruled out as
candidates. Previous research has shown that mutations in the DMP1 gene are responsible
for autosomal recessive hypophosphatemic rickets in humans. Dmp1-knockout mice exhibit
rickets phenotypes. We believe the R145X mutation to be responsible for the inherited
rickets found in Corriedale sheep. A simple diagnostic test can be designed to identify
carriers with the defective “T” allele. Affected sheep could be used as animal models for this
form of human rickets, and for further investigation of the role of DMP1 in phosphate
homeostasis.
Keywords: nonsense mutation; DMP1 gene; inherited rickets; sheep; genetic disease
INTRODUCTION
Rickets is a metabolic bone disease in humans and animals with most cases caused by a
nutritional deficiency of either vitamin D or phosphorus. This disease leads to softening and
weakening of bone caused by defective mineralization of cartilage at sites of endochondral
ossification and potentially causes fractures and limb deformities [1], [2], [3]. Rickets may
also result from genetic mutations that disrupt genes whose functions are critical for normal
bone metabolism. Five types of rickets have been described in humans with common clinical
characteristics of renal phosphate wasting, including X-linked hypophosphatemic rickets
(XLH), autosomal dominant hypophosphatemic rickets (ADHR), autosomal recessive
hypophosphatemic rickets (ARHR) type 1 and 2, and hereditary hypophosphataemic rickets
with hypercalciuria (HHRH). A large number of different mutations comprising single base
pair deletions, nonsense and missense mutations have been identified in PHEX (phosphate
regulation gene with homologies to endopeptidases on the X chromosome) as being
responsible for XLH [4], [5], [6]. ADHR is caused by gain-of-function mutations leading to
increased activity of fibroblast growth factor 23, encoded by the FGF23 gene [7], and mutations
in the dentin matrix protein 1 gene (DMP1) and the ecto-nucleotide
pyrophosphatase/phosphodiesterase 1 gene (ENPP1) have been identified in ARHR
types 1 and 2, respectively [8], [9], [10]. The mutation for HHRH
occurs in the gene for the NaPi-IIc protein (or SLC34A3, human chromosome 9q34), a renal
Na-P co-transporter [11]. Two other recessively inherited forms of rickets in humans are due
to the disruptions of either the synthesis or metabolism of vitamin D. Vitamin D-dependent
rickets type I (VDDR-I) is caused by mutations in the CYP27B1 gene, which encodes
vitamin D 1-alpha-hydroxylase [12]. Loss of function mutations in the vitamin D receptor
gene (VDR) are the genetic basis for vitamin D-dependent rickets type II (VDDR-II), which
is also called hereditary vitamin D - resistant rickets (HVDRR) [13], [14].
Inherited forms of rickets have been reported frequently in humans but are rare in
domestic animals. Recently, an inherited form of rickets has been described in purebred
Corriedale sheep from a commercial flock in New Zealand, with an incidence of up to 20
lambs out of 1,600 over a 2-year period. Affected sheep were characterized by decreased
growth rate, thoracic lordosis and angular limb deformities (Figure 1) [15], with low serum
calcium and phosphate concentrations and normal 25-hydroxyvitamin D and 1,25-dihydroxyvitamin D3
concentrations [16]. Embryo transfer and backcross breeding trials
have determined that this disease is likely a simple autosomal recessive disorder [16].
Large animal models of human genetic diseases have received extensive clinical attention
due to their advantages over rodent models, including greater physiological similarity to
human patients, a longer life span and larger body size. Sheep have been used as an animal
model for gene therapy (GT) of human diseases, such as post-traumatic osteoarthritis [17]
and steroid-induced ocular hypertension [18]. The convenience of performing in utero GT
prior to disease onset will help to achieve long-term expression of the exogenous gene
without modifying the germ line of the recipient [19]. Reasoning that Corriedale sheep affected by
inherited rickets could serve as a model for human rickets, and that a genetic
marker could help sheep breeders avoid at-risk matings, we performed a genome-wide
association study by genotyping 54,241 evenly distributed single nucleotide polymorphisms
(SNPs) in 17 affected Corriedale sheep and 3 carriers using the Illumina OvineSNP50
BeadChip. It has been shown that the analysis of genome-wide high-density SNPs is an
effective strategy to map and identify causal mutations for recessive diseases [20].
RESULTS
Homozygosity Mapping
The SNP BeadChip genotyping data for this study were submitted to the NCBI Gene
Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under Super Series accession no.
GSE26310. In our study, exactly 3 homozygous regions of more than 10 consecutive SNPs
were identified in all 17 affected sheep whereas the 3 carriers exhibited heterozygosity in
those regions. The largest homozygous segment consisted of 125 consecutive SNP loci,
whereas the other two segments had only 19 or 11 consecutive SNPs. The largest 125-SNP
region started from SNP OAR6_109334543.1 and extended to SNP s03175.1, covering a
region of 5.95 Mb (109,334,543 to 115,285,275 bp) on the long arm of sheep chromosome 6
(OAR 6). There were 35 genes located in this region based on the bovine reference sequence
(Figures 2a, 2b and Table S2). The size of the second homozygous segment was about 0.70
Mb (OAR6: 118,884,834 to 119,587,883 bp) and had only 3.2 Mb separation from the first
homozygous segment on the same chromosome (OAR 6). The third homozygous segment
covered a 0.66 Mb region (OAR15: 1,131,166 to 1,654,496 bp) on the short arm of OAR 15.
Within the second and third homozygous regions there were 8 and 6 positional candidate
genes, respectively (Table S2). One gene in the largest 125-SNP region, DMP1, was
considered the most plausible candidate because of its roles in
mineralization and phosphate homeostasis and previous reports of its involvement in human
rickets [8], [21]. Comparison with the bovine genomic reference sequence showed
that this gene was located at 112.20 Mb on OAR6q (Table S2), spanned more than 16 kb and
contained 6 exons.
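The search for shared homozygous segments described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' script (the Discussion reports that they used awk, sort and join). The function names, the genotype coding ("AA", "AB", "BB", with "--" for a missing call) and the toy layout of the inputs are all assumptions made for the example. The default threshold of 10 SNPs follows the "at least 10 consecutive homozygous SNPs" criterion given in the Discussion.

```python
from typing import Dict, List, Tuple


def is_shared_homozygous(calls: List[str]) -> bool:
    """True when every animal has the same homozygous call at one marker."""
    first = calls[0]
    # A usable homozygous call has two identical letters; "--" (missing) is rejected.
    if len(first) != 2 or first[0] != first[1] or not first.isalpha():
        return False
    return all(call == first for call in calls)


def shared_homozygous_runs(
    snps: List[Tuple[str, int]],
    affected: Dict[str, List[str]],
    min_snps: int = 10,
) -> List[Tuple[str, int, int, int]]:
    """Find runs of consecutive SNPs homozygous for the same allele in all animals.

    snps     -- (chromosome, position in bp) per marker, sorted by map order
    affected -- animal id -> list of genotype calls, one per marker (hypothetical coding)
    Returns (chromosome, start_bp, end_bp, number_of_snps) for each qualifying run.
    """
    runs = []
    start = None  # index of the first marker in the run currently being extended
    n_markers = len(snps)
    for i in range(n_markers + 1):  # one extra step so the final run gets closed
        homozygous = i < n_markers and is_shared_homozygous(
            [calls[i] for calls in affected.values()]
        )
        same_chrom = start is None or (i < n_markers and snps[i][0] == snps[start][0])
        if homozygous and same_chrom:
            if start is None:
                start = i
            continue
        if start is not None:
            end = i - 1
            n = end - start + 1
            if n >= min_snps:
                runs.append((snps[start][0], snps[start][1], snps[end][1], n))
            start = None
        if homozygous:
            # Marker i is homozygous but lies on a new chromosome: open a new run.
            start = i
    return runs
```

Applied to the affected animals only, a routine of this kind would list candidate segments such as the 125-SNP region on OAR 6; the heterozygosity of the carriers in the same segments would then be checked as a separate step.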
Candidate Gene Sequencing
Fine mapping was carried out by individual PCR amplification and sequencing of all the
exons of the DMP1 gene to identify the mutation for this disease (Table S1). A BLAST search of
the sheep genomic reference sequence with the bovine DMP1 coding sequence
(ENSBTAT00000025468) returned several alignments for exon 6 with high identity rates
(91%-96%). As exon 6 is the last and also the largest exon covering 88% of the open reading
frame of DMP1, it was prioritized for re-sequencing. Direct sequencing from 3 carriers and
1 normal control revealed 1 published SNP (s18093.1, OAR6: 112,213,812 bp) and 8 novel
ones in this exon (Figure 2e). The complete genomic sequence of exon 6 for sheep DMP1
gene was submitted to GenBank (http://www.ncbi.nlm.nih.gov/genbank/) under accession no.
HQ600592.
Moreover, we successfully amplified exons 1, 3, 4 and 5 of the sheep DMP1 gene
using primers designed from the intronic sequences flanking each exon in the cow reference. Amplification of
exon 2 failed with several primer pairs. Two intronic SNPs (SNP1 and SNP2) and a 1-bp
insertion/deletion (indel: A) were identified in intron 1, intron 5 and intron 3, respectively,
by direct sequencing (Table S1).
Mutation Analysis
Seventeen base pairs upstream of SNP s18093.1, a SNP (C/T, OAR6:112,213,795 bp)
located at +250bp of exon 6 caused a non-synonymous amino acid change in the DMP1
protein. This C -> T transition introduced a stop codon (R145X) that led to a truncated
DMP1 protein at the 145th amino acid (Figure 3a, Figure 3b and Figure S1). Genotyping of
this mutation by PCR-RFLP (Polymerase Chain Reaction-Restriction Fragment Length
Polymorphism) (Figure 3c) and direct sequencing showed all 17 affected animals had the “T
T” genotype; the 3 carriers were “C T”; 24 phenotypically normal related sheep were either
“C T” or “C C”; and 46 unrelated control sheep of other breeds were only “C C” genotypes
(Table 1). This mutation was 100% concordant with the recessive pattern of inheritance in
affected, carrier and normal individuals. Moreover, genotyping of the other 8 SNPs in exon
6, which surround the R145X mutation in DMP1, showed that these markers were in incomplete linkage disequilibrium
with both the disease and the mutant SNP (Figure 2f). The 3 genomic
variants identified in introns 1, 3 and 5 were not located at splice donor or
acceptor sites. Therefore, those variants could be ruled out as candidate mutations.
RT-PCR/RFLP (Figure 3d) was performed for the R145X mutation and showed the same
digestion pattern as PCR-RFLP.
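As a small worked check of how a single C -> T transition can turn an arginine codon into a stop codon, the sketch below enumerates the possibilities under the standard genetic code. The text does not state the codon involved; the code only shows which arginine codon is compatible with an R-to-stop change caused by a C -> T substitution, assuming the variant is read on the coding strand. The function name is hypothetical.

```python
# Standard genetic code: arginine codons and stop codons (DNA alphabet).
ARG_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_nonsense_changes(codons=ARG_CODONS):
    """List (original codon, mutant codon, codon position) for every single
    C -> T substitution that converts one of the given codons into a stop codon."""
    hits = []
    for codon in codons:
        for pos, base in enumerate(codon):
            if base != "C":
                continue  # only C -> T transitions are considered
            mutant = codon[:pos] + "T" + codon[pos + 1:]
            if mutant in STOP_CODONS:
                hits.append((codon, mutant, pos + 1))  # 1-based position in the codon
    return hits


print(c_to_t_nonsense_changes())
```

Under these assumptions the only arginine codon that satisfies the condition is CGA, which becomes TGA when its first base changes from C to T.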
Taken together, our results strongly suggest that the R145X substitution is causative for
inherited rickets in Corriedale sheep, disrupting the function of the DMP1 protein, preventing
successful bone mineralization and phosphate homeostasis.
DISCUSSION
Homozygosity mapping is a powerful method to map simple recessive traits since the regions
immediately flanking the mutation locus will likely be homozygous by descent in closely
related offspring. The affected animals in this study resulted from consanguineous mating
between pairs of offspring descended from one common male ancestor in the embryo transfer
and backcross studies used to characterize the inheritance of the disease [15], [16]. The
region containing the mutation should in these circumstances span a large chromosome
segment that would be progressively diminished in subsequent meioses when recombination
events occur near to the mutation. This mapping strategy has been used successfully for
recessive diseases in both animal and human studies [20], [22]. The homozygosity mapping
in this study was conducted from knowledge of the SNP genotypes in affected individuals
and their genomic location, and used command line UNIX tools including awk, sort and join
to identify any regions of at least 10 consecutive homozygous SNPs, equivalent to about 0.5
cM or more. It was presumed that one such region would contain the mutation associated
with inherited rickets in Corriedale sheep. In this study, 3 homozygous regions met the
criteria of exceeding 10 consecutive homozygous SNP markers. Previously reported rickets
genes including PHEX, FGF23, ENPP1, SLC34A3, CYP27B1 and VDR did not show up in
any of these homozygous segments based on the ovine genome assembly v1.0, whereas
DMP1 was contained in the largest region. This is not surprising because PHEX or FGF23
related rickets have a distinctive inheritance mode that differs from the autosomal recessive
inheritance of the rickets studied here. The severe hypophosphatemia in affected sheep
indicates disturbed phosphate balance within the body. Therefore, the DMP1 gene was
considered to be a high priority candidate gene. The nonsense mutation we identified in the
DMP1 gene was 100% concordant with the recessive pattern of inheritance for the disease
studied.
DMP1 encodes a serine-rich acidic protein and is a member of the Small Integrin-Binding
Ligand, N-linked Glycoprotein (SIBLING) family, which mineralizes tissues such as bones
and teeth by binding strongly to hydroxyapatite using an Arg-Gly-Asp (RGD) tripeptide [23].
Dual functions of the DMP1 protein have been reported based on in vitro studies. The
nonphosphorylated cytoplasmic DMP1 protein translocates to the cell nucleus and initially
acts as a transcription factor to enhance transcription of osteoblast-specific genes
such as alkaline phosphatase and osteocalcin, and then it moves out to the extracellular
matrix during the osteoblast to osteocyte transition phase, to promote mineralization and
phosphate homeostasis [24], [25]. Loss of function mutations in the human DMP1 gene have
been shown to cause autosomal recessive hypophosphatemic rickets in different ethnic
groups including Turkish, consanguineous Spanish, Lebanese and Japanese populations.
These mutations in DMP1 include deletions in exon 6, a nucleotide substitution in the
splice acceptor sequence of intron 2, and missense mutations in exon 2 or exon 3 that
introduce a premature termination codon [8], [21], [26]. The Dmp1-null mouse model also
exhibits hypophosphatemic rickets, which indicates the important role of the DMP1 gene in
the development of normal bone formation [21]. Since the majority of the dentin matrix
protein 1 is encoded by the sixth and last exon of DMP1, typical for proteins in the SIBLING
family [27], the novel nonsense mutation (250C/T, R145X) identified here in ovine DMP1
gene is a plausible mutation leading to loss of function following nonsense-mediated mRNA
decay, which has been shown to occur in human patients. Nonsense-mediated mRNA decay is a
post-transcriptional mechanism in eukaryotic species. It likely eliminates the abnormal
transcripts produced by prematurely terminated translation [28]. The previous studies strongly support
the nonsense mutation we identified in exon 6 as causing loss of function of the DMP1
protein and being responsible for ARHR type 1 in Corriedale sheep.
Despite the wide variety of mutated genes that result in the different forms of hereditary
hypophosphatemic rickets, the clinical features are relatively similar (with the exception of
HHRH). In XLH, ADHR, ARHR type 1 and 2, the overriding common feature is increased
serum concentration and activity of FGF23. FGF23 inhibits the NPT-2a co-transporter (Na-Pi
co-transporter) in the kidneys, leading to loss of phosphate from renal tubules into the urine
(phosphaturia). FGF23 also inhibits the activity of CYP27B1, thereby inhibiting active
vitamin D (1,25-dihydroxyvitamin D3) production [29]. As a result of the increased FGF23
concentrations, patients with XLH, ADHR, ARHR type 1 and 2 present with rickets,
hypophosphatemia, phosphaturia and inappropriately normal serum 1,25-dihydroxyvitamin
D3 concentrations [9], [30]. Corriedale sheep with the mutation in DMP1 also developed
rickets, were severely hypophosphataemic and had normal serum 1,25(OH)2D3
concentrations [1], [16]. The mechanism by which mutations in PHEX, DMP1 or ENPP1
result in upregulation of FGF23 is unknown. An unusual feature of ARHR type 2 is that
mutations in ENPP1 are also associated with generalized arterial calcification of infancy;
again, the mechanism by which a mutation in ENPP1 can result in both a disease characterised
by hypomineralization and one characterised by hypermineralization is unknown [9]. In HHRH,
CYP27B1 is actually upregulated, resulting in increased 1,25(OH)2D3 concentrations, and
perhaps explaining the hypercalciuria [11].
In conclusion, this is the first report of ARHR type 1 caused by a mutation in the DMP1 gene
in a sheep population. In this study, we demonstrate the successful use of high-density
SNP arrays to localize a mutation causing a recessively inherited defect. The
results can be rapidly applied as a selection marker to identify carriers of the defective “T”
allele, so that at-risk matings can be avoided, improving animal welfare and decreasing
economic losses. At this stage, the incidence of the mutation in the Corriedale sheep population is
unknown; however, it is suspected to be widespread, and testing of New Zealand Corriedale
sheep is planned. More importantly, due to their size and relatively low cost of maintenance,
Corriedale sheep with inherited rickets could be used as an animal model for this form of
human rickets. Even though the human and sheep mutations are not exactly the same, both
disrupt the function of the DMP1 protein.
The sheep model can be used for further
investigation of the role of DMP1 in phosphate homeostasis and to explore the potential for
early medical intervention to reduce the development of the disease in homozygous
individuals.
MATERIALS and METHODS
Ethics statement
Approvals dealing with DNA or tissue collection from animals were obtained from the
Massey University Animal Ethics Committee (Approval Number 05/123). Iowa
State University Animal Care & Use Committee did not require additional approvals.
Animals
In total, 17 affected Corriedale sheep (9 males, 8 females) and 27 phenotypically normal
crossbred Corriedale sheep (6 males, 21 females) as well as 46 normal controls (Texel and
crossbred Perendale sheep) were examined. The age of examined animals ranged from a
fetus of 133 days gestation to a 3 year-old animal. Eight affected sheep were from the
original commercial property at Marlborough, New Zealand, and 9 affected sheep were their
offspring. The 27 phenotypically normal crossbred Corriedale sheep were progeny from the
mating of unrelated Romney ewes with the Corriedale carrier ram (F1 generation from
outcross), or progeny of the F1 generation daughters mated back to the carrier ram (F2
generation from backcross). SNP genotypes were obtained for the 17 affected and 3 carrier
Corriedale sheep, while fine mapping was conducted on all 44 pure or crossbred Corriedale
sheep. The 46 normal control sheep used for validation of the mutation were from unrelated
populations without evidence of rickets.
SNP genotyping
DNA was extracted from blood using a Roche MagNA Pure automated analyser. DNA
samples were randomized on genotyping plates according to disease status. The genotyping
was performed using the Ovine SNP50 BeadChip containing 54,241 SNPs (Illumina, San
Diego, CA, USA) with standard procedures at GeneSeek Inc., Lincoln, NE, USA. Retained SNPs
were required to have a call rate > 80% and a minimum GenTrain score of 0.25. All genotypes
with satisfactory call rate and GenTrain score were used regardless of minor allele frequency
(MAF).
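The retention rule above can be written as a simple filter. The following Python sketch is only an illustration: the record layout (`SnpQc`, `call_rate`, `gentrain`, `maf`) and the three example SNPs are hypothetical, while the two thresholds are the ones stated in the text. Minor allele frequency is carried along but deliberately not used for filtering.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Hypothetical per-SNP quality record; field names are illustrative."""
    name: str
    call_rate: float  # fraction of samples with a genotype call (0 to 1)
    gentrain: float   # GenTrain clustering score
    maf: float        # minor allele frequency, not used as a filter here


def passes_qc(snp, min_call_rate=0.80, min_gentrain=0.25):
    """Keep SNPs with call rate > 80% and GenTrain score of at least 0.25."""
    return snp.call_rate > min_call_rate and snp.gentrain >= min_gentrain


# Made-up example records, one passing and two failing.
snps = [
    SnpQc("snpA", call_rate=0.99, gentrain=0.80, maf=0.00),  # kept despite MAF of 0
    SnpQc("snpB", call_rate=0.75, gentrain=0.90, maf=0.30),  # fails call rate
    SnpQc("snpC", call_rate=0.95, gentrain=0.10, maf=0.45),  # fails GenTrain score
]
retained = [s.name for s in snps if passes_qc(s)]
print(retained)
```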
SNP chip data analyses
The IBD program used awk, join and sort UNIX commands to scan the whole genome of the
genotyped affected cases, to produce a list of all the strings of consensus homozygous
markers with a length of more than 10 SNPs. The same process was repeated in the carriers
to ensure the regions identified in the affected individuals were not homozygous in the
carriers. The homozygosity mapping algorithm involved the following: first, SNPs in the
Illumina ovine 50K map file were sorted by chromosome and position, and numerically
coded in contiguous order.
Second, the names of the “A/B” genotypes (provided by
GeneSeek Inc.) for each individual were joined with the renumbered map file to identify the
genotypes by their locations. The joined file was separated into three groups according to the
disease status of affected, carrier and normal. Third, the number of occurrences of genotypes
(“AA”, “AB”, “BB” and no call “--”) for each SNP was counted in the affected and carrier
groups. Fourth, the genotype frequencies were calculated for each SNP. In the frequency
calculation, the numerator was the count of genotypes being either “AA”, “AB” or “BB”, and
the denominator was the total count excluding the category of no call. Finally, if the highest
genotype frequency for a SNP was equal to 1 and the two alleles of this genotype were the
same, such as “AA” or “BB”, this SNP was considered a homozygous locus. Spans of
contiguous homozygous loci were accumulated and reported if the region included more than
10 loci. Genes in these IBD regions were examined for potential involvement using
comparative maps between the ovine and bovine genomes through the Ovine Genome
Assembly v1.0 (http://www.livestockgenomics.csiro.au/perl/gbrowse.cgi/oar1.0/#search).
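The homozygosity mapping steps described above can be sketched in a few lines of Python. This is not the awk, sort and join pipeline used in the study; it is a minimal re-expression of the same logic under stated assumptions: the function names, the in-memory data layout and the toy genotypes are all hypothetical, and "more than 10 loci" is taken as the reporting threshold.

```python
from collections import Counter

NO_CALL = "--"


def is_consensus_homozygous(calls):
    """True when all called genotypes at a SNP are the same homozygote."""
    counts = Counter(g for g in calls if g != NO_CALL)  # exclude no-calls
    total = sum(counts.values())
    if total == 0:
        return False  # no information at this SNP
    genotype, n = counts.most_common(1)[0]
    return n == total and genotype in ("AA", "BB")  # frequency equal to 1


def homozygous_runs(snp_map, genotypes, animals, min_loci=10):
    """Report runs of consecutive consensus-homozygous SNPs.

    snp_map: iterable of (snp_name, chromosome, position)
    genotypes: {animal: {snp_name: "AA" | "AB" | "BB" | "--"}}
    Returns (chromosome, start, end, n_loci) for runs longer than min_loci.
    """
    runs, current = [], []

    def flush():
        if len(current) > min_loci:
            runs.append((current[0][0], current[0][1], current[-1][1], len(current)))
        current.clear()

    # Sort by chromosome and position, mirroring the renumbered map file.
    for snp, chrom, pos in sorted(snp_map, key=lambda m: (m[1], m[2])):
        calls = [genotypes[a].get(snp, NO_CALL) for a in animals]
        homozygous = is_consensus_homozygous(calls)
        if current and (not homozygous or current[-1][0] != chrom):
            flush()  # run ends at a non-homozygous SNP or a chromosome boundary
        if homozygous:
            current.append((chrom, pos))
    flush()
    return runs


# Toy data: 15 SNPs on chromosome 6, two affected animals.
snp_map = [(f"snp{i}", 6, 1000 * (i + 1)) for i in range(15)]
affected = ["A1", "A2"]
genotypes = {a: {f"snp{i}": "AA" for i in range(12)} for a in affected}
genotypes["A1"]["snp12"], genotypes["A2"]["snp12"] = "AB", "AA"  # breaks the run
for a in affected:
    genotypes[a]["snp13"] = genotypes[a]["snp14"] = "BB"  # run of 2, too short

print(homozygous_runs(snp_map, genotypes, affected))
```

Running the same function on the carrier group would then show whether a region reported for the affected animals is also homozygous in carriers.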
DMP1 gene sequencing
Primers were designed to amplify all 6 exons of sheep DMP1 (Table S1). Cow DMP1
genomic sequence was used as a template for primer design in order to amplify the first 5
exons of the sheep DMP1 gene. PCR was performed using a 10 µl system with GoTaq DNA
polymerase (Promega, Madison, WI). The annealing temperatures were 58°C or 60°C.
PCR products were purified with ExoSAP-IT® (USB, Cleveland, OH), pooled across the
3 carriers, and sequenced commercially using a 3730xl DNA Analyzer.
Mutation analyses
Sheep EST sequence (DY509762) encompassing exon 1 to exon 5 of sheep DMP1 along
with re-sequencing results were used to predict the DMP1 protein sequence by applying the
ExPASy translate tool (http://ca.expasy.org/tools/dna.html). The predicted sheep DMP1
protein was compared to the wild type (normal) of sheep and cow (AAI49017) using the
CLUSTALW alignment tool (http://align.genome.jp/). Each SNP variant was analyzed to
check whether it could induce a missense or nonsense mutation. The identified SNPs were
genotyped on all 44 related Corriedale sheep and/or 46 normal controls by restriction enzyme
digestion (PCR-RFLP) or direct sequencing.
The genotyping results from all 9 SNPs distributed in exon 6 of the DMP1 gene were
formatted and loaded into Haploview software
(http://www.broadinstitute.org/haploview/haploview). The linkage disequilibrium (LD)
between pairs of SNPs in exon 6 was analyzed in Haploview. Under the LD tab, r2, the
squared correlation coefficient, was saved to show the LD between each pair of SNPs, and
the corresponding LD color scheme was produced.
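For readers unfamiliar with the r2 statistic, the sketch below computes it for one pair of biallelic SNPs from haplotype counts. It is a simplified illustration, not Haploview's procedure: Haploview works from genotype data, whereas this sketch assumes the four haplotype counts are already known, and the counts used are invented.

```python
def r_squared(n11, n12, n21, n22):
    """Squared correlation (r^2) between two biallelic SNPs.

    n11..n22 are counts of the four haplotypes, where the first index is the
    allele at SNP 1 and the second index is the allele at SNP 2.
    """
    n = n11 + n12 + n21 + n22
    p1 = (n11 + n12) / n          # frequency of allele 1 at SNP 1
    q1 = (n11 + n21) / n          # frequency of allele 1 at SNP 2
    d = n11 / n - p1 * q1         # disequilibrium coefficient D
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Invented counts: only two haplotypes observed (complete LD), then
# four equally frequent haplotypes (no LD).
print(r_squared(30, 0, 0, 10))
print(r_squared(10, 10, 10, 10))
```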
RNA extraction and Reverse transcriptase-polymerase chain reaction (RT-PCR)
RNA was extracted from fibroblast cells cultured in vitro from skin biopsies of affected and
control sheep to confirm that the mutation existed at the transcript level. Fibroblast cell
culture was performed as previously described [16]. Total RNA was extracted using Tri Reagent
as per the manufacturer’s instructions (Sigma-Aldrich Co., USA). RT-PCR was performed using
the Superscript® one-step RT-PCR system with Platinum® Taq polymerase (Invitrogen
Corp., USA). The RT-PCR conditions were: 55ºC for 30 min, 94ºC for 2 min, 40 cycles of
30 s at 94ºC, 30 s at 58ºC and 30 s at 72ºC, and finally a 5 min elongation step at 72ºC.
Restriction enzyme digestion
The nonsense mutation (C -> T) created a new recognition site for the restriction enzyme
NlaIII (New England Biolabs Inc., USA). DNA fragments from the PCR and RT-PCR
reactions were subjected to restriction enzyme digestion with NlaIII. The reaction mix was
incubated at 37ºC overnight, and the products were analyzed on a 2.5% (w/v) ultra-pure
agarose gel.
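The logic of the PCR-RFLP assay can be illustrated with an in-silico digest. NlaIII recognizes CATG and cuts immediately after it. The Python sketch below uses a made-up 74 bp amplicon, not the real DMP1e6_2 fragment, in which the wild-type allele carries one CATG site and the C -> T change creates a second one. The fragment sizes are therefore illustrative only, but the band counts follow the pattern reported in Figure 3c (2 bands for normal, 3 for affected, 4 for carrier).

```python
def nlaiii_fragments(seq):
    """Fragment lengths after complete NlaIII digestion of a linear molecule.

    NlaIII recognizes CATG and cuts after the G. The site is palindromic,
    so scanning one strand finds every site.
    """
    site = "CATG"
    cuts, start = [], 0
    while (i := seq.find(site, start)) != -1:
        cuts.append(i + len(site))
        start = i + 1
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical amplicon: one pre-existing CATG site, plus a CACGA motif
# that the C->T change converts to CATGA (a new NlaIII site).
wild_type = "A" * 20 + "CATG" + "T" * 30 + "CACGA" + "G" * 15
mutant = wild_type.replace("CACGA", "CATGA")

normal_bands = sorted(set(nlaiii_fragments(wild_type)))
affected_bands = sorted(set(nlaiii_fragments(mutant)))
# A heterozygous carrier shows the bands of both alleles.
carrier_bands = sorted(set(normal_bands) | set(affected_bands))

print(normal_bands, affected_bands, carrier_bands)
```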
REFERENCES
1. Dittmer KE, Thompson KG, Blair HT (2009) Pathology of inherited rickets in
Corriedale sheep. J Comp Pathol 141: 147-155.
2. Fitch LWN (1943) Rickets in hoggets, with a note on the aetiology and definition of
the disease. Australian Vet J 19:2.
3. Ozkan B (2010) Nutritional rickets. J Clin Res Pediatr Endocrinol 2: 137-143.
4. HYP Consortium (1995) A gene (HYP) with homologies to endopeptidases is
mutated in patients with X-linked hypophosphatemic rickets. Nat Genet 11: 130-136.
5. Sabbagh Y, Jones AO, Tenenhouse HS (2000) PHEXdb, a locus-specific database
for mutations causing X-linked hypophosphatemia. Hum. Mutat 16: 1-6.
6. Ichikawa S, Traxler EA, Estwick SA, Curry LR, Johnson ML, et al. (2008)
Mutational survey of the PHEX gene in patients with X-linked hypophosphatemic
rickets. Bone 43: 663-666.
7. ADHR Consortium (2000) Autosomal dominant hypophosphataemic rickets is
associated with mutations in FGF23. Nat Genet. 26:345-348.
8. Lorenz-Depiereux B, Bastepe M, Benet-Pages A, Amyere M, Wagenstaller J, et al.
(2006) DMP1 mutations in autosomal recessive hypophosphatemia implicate a bone
matrix protein in the regulation of phosphate homeostasis. Nat Genet 38: 1248-1250.
9. Lorenz-Depiereux B, Schnabel D, Tiosano D, Hausler G, Strom TM (2010)
Loss-of-function ENPP1 mutations cause both generalised arterial calcification of infancy
and autosomal-recessive hypophosphatemic rickets. Am J Hum Genet 86: 267-272.
10. Levy-Litan V, Hershkovitz E, Avizov L, Leventhal N, Bercovich D, et al. (2010)
Autosomal-recessive hypophosphatemic rickets is associated with an inactivation
mutation in the ENPP1 gene. Am J Hum Genet 86: 273-278.
11. Bergwitz C, Roslin NM, Tieder M, Loredo-Osti JC, Bastepe M, et al. (2006)
SLC34A3 mutations in patients with hereditary hypophosphatemic rickets with
hypercalciuria predict a key role for the sodium-phosphate cotransporter NaPi-IIc in
maintaining phosphate homeostasis. Am J Hum Genet 78: 179-192.
12. Alzahrani AS, Zou M, Baitei EY, Alshaikh OM, Al-Rijjal RA, et al. (2010) A novel
G102E mutation of CYP27B1 in a large family with vitamin D-dependent rickets
type 1. Clin Endocrinol Metab 95: 4176-4183.
13. Malloy PJ, Pike JW, Feldman D (1999) The vitamin D receptor and the syndrome of
hereditary 1,25-dihydroxyvitamin D-resistant rickets. Endo Rev 20: 156-188.
14. Malloy PJ, Xu R, Peng L, Peleg S, Al-Ashwal A, et al. (2004) Hereditary
1,25-dihydroxyvitamin D resistant rickets due to a mutation causing multiple defects in
vitamin D receptor function. Endocrine 145: 5106-5114.
15. Thompson KG, Dittmer KE, Blair HT, Fairley RA, Sim DF (2007) An outbreak of
rickets in Corriedale sheep: evidence for a genetic aetiology. N Z Vet J 55: 137-142.
16. Dittmer KE, Howe L, Thompson KG, Stowell KM, Blair HT, et al. (2010) Normal
vitamin D receptor function with increased expression of 25-hydroxyvitamin
D(3)-24-hydroxylase in Corriedale sheep with inherited rickets. Res Vet Sci 91: 362-369.
17. Hurtig M, Frederic D, Darilyn F, Betina L, Antonio C, et al. (2008) BMP-7 Gene
Therapy for Mitigation of Post-traumatic Osteoarthritis in Sheep. Eur Cell Mater 16: 52.
18. Gerometta R, Spiga MG, Borras T, Candia OA (2010) Treatment of sheep
steroid-induced ocular hypertension with a glucocorticoid-inducible MMP1 gene therapy
virus. Invest Ophthalmol Vis Sci 51: 3042-3048.
19. Porada CD, Tran N, Eglitis M, Moen RC, Troutman L, et al. (1998) In utero gene
therapy: transfer and long-term expression of the bacterial neo(r) gene in sheep after
direct injection of retroviral vectors into preimmune fetuses. Hum Gene Ther 9:
1571-1585.
20. Charlier C, Coppieters W, Rollin F, Desmecht D, Agerholm JS, et al. (2008) Highly
effective SNP-based association mapping and management of recessive defects in
livestock. Nat Genet 40: 449-454.
21. Feng JQ, Ward LM, Liu S, Lu Y, Xie Y, et al. (2006) Loss of DMP1 causes rickets
and osteomalacia and identifies a role for osteocytes in mineral metabolism. Nat
Genet 38: 1310-1315.
22. Lander ES, Botstein D (1987) Homozygosity mapping: a way to map human
recessive traits with the DNA of inbred children. Science 236: 1567-1570.
23. Fisher LW, Fedarko NS (2003) Six genes expressed in bones and teeth encode the
current members of the SIBLING family of proteins. Connect Tissue Res 44 Suppl 1:
33-40.
24. Narayanan K, Ramachandran A, Hao J, He G, Park KW, et al. (2003) Dual
functional roles of dentin matrix protein 1. Implications in biomineralization and
gene transcription by activation of intracellular Ca2+ store. J Biol Chem 278:
17500-17508.
25. Schiavi SC (2006) Bone talk. Nat Genet 38: 1230-1231.
26. Koshida R, Yamaguchi H, Yamasaki K, Tsuchimochi W, Yonekawa T, et al. (2010)
A novel nonsense mutation in the DMP1 gene in a Japanese family with autosomal
recessive hypophosphatemic rickets. J Bone Miner Metab 28: 585-590.
27. Fisher LW, Torchia DA, Fohr B, Young MF, Fedarko NS (2001) Flexible structures
of SIBLING proteins, bone sialoprotein, and osteopontin. Biochem Biophys Res
Commun 280: 460-465.
28. Maquat LE (2004) Nonsense-mediated mRNA decay: splicing, translation and
mRNP dynamics. Nat Rev Mol Cell Biol 5: 89-99.
29. Bergwitz C, Jüppner H (2010) Regulation of Phosphate Homeostasis by PTH,
Vitamin D, and FGF23. Annu Rev Med 61: 91-104.
30. Dittmer KE, Thompson KG (2011) Vitamin D metabolism and rickets in domestic
animals: a review. Vet Pathol 48: 389-407.
Table 1. The genotyping results of the causative mutation (R145X)
Sheep breed | Rickets status (phenotype) | # Sheep (phenotype) | Genotype | # Sheep (genotype) | Rate of concordance
Corriedale | Affected | 17 | TT | 17 | 100%
Corriedale | Known Carrier | 3 | TC | 3 | 100%
Corriedale | Phenotype Normal (a) | 24 | TC, CC | 24 | 100%
Texel, Perendale | Normal | 46 | CC | 46 | 100%
(a) These sheep are phenotypically normal offspring from F1 generation and F2 backcross
generations generated by out-crossing and back-crossing trials.
Figure 1. Corriedale sheep with rickets. (a) 1½-year-old Corriedale sheep with
inherited rickets showing angular limb deformities of the forelimbs and lordosis of the
spine in the mid-thoracic region (white arrow). (b) 1½-year-old Corriedale sheep with
inherited rickets showing combined varus and valgus or “windswept” deformity of the
forelimbs.
Figure 2. Work flow chart for discovering the mutation locus of inherited rickets in
Corriedale sheep. (a) The sheep chromosome 6 (OAR 6). (b) The IBD region
containing the causative locus is located on OAR 6 between 109 and 115 Mb. (c) The
black boxes show 35 cow reference genes encompassing the sheep IBD region based on
the comparative genomic maps. (d) & (e) The boxes represent exons of the cow and the
predicted sheep DMP1 genes. The solid parts of boxes demonstrate the coding region
for DMP1. The vertical lines on exon 6 represent the mutant SNP (R145X) and
the other 8 SNPs. (f) The number and color in each diamond show the r2 value for
each pair of SNPs. Higher numbers and darker colors indicate higher LD between
SNPs. Four SNPs in block 1 show complete mutual linkage disequilibrium (LD) but not
with the causative mutation (R145X).
Figure 3. Mutation analyses (a) The chromatogram shows the location of the mutation
(Y: C/T). (b) Protein sequence alignments of DMP1. The highlighted R145X
substitution induces a stop codon that leads to premature termination of the protein.
MUT, R145X mutant; WT, wild type. (c) PCR-RFLP genotyping results for the
affected (3 bands) and carrier (4 bands) Corriedale sheep and normal (2 bands) control
sheep. M, 1kb marker; A, affected; C, carrier; N, normal. (d) RT-PCR-RFLP results
for 2 controls (N1 &N2) and 3 affected sheep (A1, A2 &A3).
Figure S1. Complete predicted DMP1 protein sequence alignments among rickets
affected sheep, wild type sheep and cow. The amino acid (R145X) with a border is the
position where the “C -> T” transition induced a stop codon and led to a truncated
DMP1 protein at the 145th amino acid.
Table S1. Primer sequence, annealing temperature, PCR amplicon information and genetic variants identified by
sequencing.
Primer name | Primer sequence | Annealing temp. (ºC) | Size (bp) | Fragment name | Genetic variant (position in the fragment) | SNP name | SNP characteristic
DMP1e1F | 5’-CTTTTCCCATCCTTGGGTCT-3’ | 60 | 560 | DMP1e1 | C/T (+453bp) | SNP1 | intronic
DMP1e1R | 5’-GCACTGTTCTCCCCATCTTT-3’ | | | | | |
DMP1e3F | 5’-CAAAATGTTATCCCCAGACCA-3’ | 60 | 388 | DMP1e3 | no | - | no
DMP1e3R | 5’-TTCAGCATTTGCAGTGAAGC-3’ | | | | | |
DMP1e4F | 5’-GGCTAAACACAAGCCAAGGA-3’ | 60 | 659 | DMP1e4 | indel (A) (+284bp) | - | intronic
DMP1e4R | 5’-GGAGATTGGGAGGGTCATGT-3’ | | | | | |
DMP1e5F | 5’-CAGGCCATTTGGAAAGTCAT-3’ | 60 | 411 | DMP1e5 | C/G (+341bp) | SNP2 | intronic
DMP1e5R | 5’-GAGGAACATTTAGGGCCACA-3’ | | | | | |
DMP1e6_2F (a) | 5’-ATGGAAAATGGGGTGACTTG-3’ | 58 | 668 | DMP1e6_2 | C/T (+455bp) | R145X | exonic
DMP1e6_2R (a) | 5’-CATCCCTTCATCGTCGAACT-3’ | | | | C/T (+472bp) | s18093.1 | exonic
 | | | | | A/G (+493bp) | SNP3 | exonic
DMP1e6_5F | 5’-ATGAGTCCAGGGGTGACAAC-3’ | 58 | 703 | DMP1e6_5 | C/G (+101bp) | SNP4 | exonic
DMP1e6_5R | 5’-CAACAATGGGCATCTTTCCT-3’ | | | | C/T (+134bp) | SNP5 | exonic
 | | | | | C/T (+140bp) | SNP6 | exonic
 | | | | | A/G (+167bp) | SNP7 | exonic
 | | | | | A/C (+218bp) | SNP8 | exonic
 | | | | | C/T (+602bp) | SNP9 | 3’ UTR
(a) These primer sets were used for PCR-RFLP genotyping of the mutation R145X.
Table S2. The list of ovine positional candidate genes based on the bovine reference gene sequences.
Targeted region (size) | Ovine chromosome position | Bovine ref. gene symbol | Bovine ref. gene name | Bovine gene accession no.
1 (5.95 Mb) | OAR6:109103268..109364732 | WDFY3 | Similar to WD repeat and FYVE domain-containing protein 3 (Autophagy-linked FYVE protein) (Alfy) | XM_617252
1 | OAR6:110461394..110535578 | ARHGAP24 | Rho GTPase activating protein 24 | NM_001102234
1 | OAR6:111226629..111449410 | PTPN13 | Protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase) | NM_174590
1 | OAR6:111458495..111488852 | SLC10A6 | Solute carrier family 10 (sodium/bile acid cotransporter family), member 6 | NM_001081738
1 | OAR6:111512006..111515283 | UBE2D3P | Similar to Putative ubiquitin-conjugating enzyme E2 D3-like protein | NM_001075135
1 | OAR6:111512014..111515294 | LOC783264 | Similar to ubiquitin-conjugating enzyme E2D 3 | XM_001249763
1 | OAR6:111512014..111515294 | LOC783114 | Similar to ubiquitin-conjugating enzyme E2D 3 | XM_001250124
1 | OAR6:111600842..111740902 | AFF1 | Similar to AF4/FMR2 family, member 1 | XM_001249488
1 | OAR6:111756587..111811197 | KLHL8 | Similar to KIAA1378 protein, transcript variant 1 | XM_612186
1 | OAR6:111885205..111904838 | HSD17B13 | Hydroxysteroid (17-beta) dehydrogenase 13 | NM_001046616
1 | OAR6:111920170..111969527 | HSD17B11 | Hydroxysteroid (17-beta) dehydrogenase 11 | NM_001046286
1 | OAR6:112013056..112050582 | NUDT9 | Nudix (nucleoside diphosphate linked moiety X)-type motif 9 | NM_001101096
1 | OAR6:112063439..112116046 | SPARCL1 | SPARC-like 1 (hevin) | NM_001034302
1 | OAR6:112199546..112216300* | DMP1* | Dentin matrix acidic phosphoprotein 1 | NM_174038
1 | OAR6:112365289..112411784 | MAN2B2 | Similar to mannosidase, alpha, class 2B, member 2 | XM_601803
1 | OAR6:112588875..112650749 | PPP2R2C | Similar to Serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B gamma isoform | XM_001250700
1 | OAR6:112757862..112757903 | LOC617028 | Hypothetical LOC617028 | XM_882441
1 | OAR6:112759544..112781041 | LOC782334 | Hypothetical LOC782334 | XM_001250977
1 | OAR6:112780837..112910375 | JAKMIP1 | Janus kinase and microtubule interacting protein 1 | NM_001102251
1 | OAR6:113024908..113058856 | LOC615433 | Hypothetical LOC615433 | XM_867237
1 | OAR6:113144679..113208484 | CRMP1 | Similar to Dihydropyrimidinase-related protein 1 (DRP-1) (Collapsin response mediator protein 1) | XM_580336
1 | OAR6:113213443..113312073 | EVC | Ellis van Creveld syndrome | NM_174747
1 | OAR6:113335352..113473188 | EVC2 | Ellis van Creveld syndrome 2 | NM_173927
1 | OAR6:113451908..113473188 | LOC786485 | Hypothetical LOC786485 | XM_001254147
1 | OAR6:113787927..114038860 | STK32B | Similar to serine/threonine kinase 32B | XM_607574
1 | OAR6:114257470..114261815 | CYTL1 | Similar to Cytokine-like protein 1 precursor (Protein C17) | XM_581416
1 | OAR6:114391388..114395645 | MSX1 | Msh homeobox 1 | NM_174798
1 | OAR6:114700887..114900429 | STX18 | Syntaxin 18 | NM_001099719
1 | OAR6:114899716..114940430 | NSG1 | Neuron specific gene family member 1 | NM_001077957
1 | OAR6:114899716..114901567 | LOC789276 | Hypothetical LOC789276 | XM_001256071
1 | OAR6:115074361..115095979 | ZNF509 | Similar to hCG2039195, transcript variant 1 | XM_583985
1 | OAR6:115101064..115121797 | LYAR | Ly1 antibody reactive homolog (mouse) | NM_001031769
1 | OAR6:115135963..115151278 | TMEM128 | Transmembrane protein 128 | NM_001034454
1 | OAR6:115207964..115221470 | LOC525643 | Similar to otopetrin | XM_603996
1 | OAR6:115249664..115251638 | DRD5 | Dopamine receptor D5 | XM_604584
2 (0.70 Mb) | OAR6:118851890..118883051 | PDE6B | Phosphodiesterase 6B, cGMP-specific, rod, beta | NM_174418
2 | OAR6:118941179..118943474 | GPI7 | GPI7 protein | NM_001024518
2 | OAR6:118976061..119063813 | LOC614799 | Similar to solute carrier family 2 (facilitated glucose transporter), member 9 | XM_866430
2 | OAR6:119204026..119245137 | WDR1 | WD repeat domain 1 (WDR1) | NM_001046346
2 | OAR6:119417009..119417107 | LOC786062 | Similar to Zinc finger protein 518B-like | XM_001788422
2 | OAR6:119417009..119417587 | ZNF518B | Similar to Zinc finger protein 518B | XM_584181
3 (0.66 Mb) | OAR15:1220402..1223574 | KIAA1826 | Hypothetical protein LOC511226 | NM_001046078
3 | OAR15:1338164..1344393 | KBTBD3 | Kelch repeat and BTB (POZ) domain containing 3 | XM_602528
3 | OAR15:1418829..1442937 | AASDHPPT | Similar to aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase | XM_001250765
3 | OAR15:1484534..1488246 | ANKRD49 | Ankyrin repeat domain 49 | NM_001014965
3 | OAR15:1503540..1575517 | MRE11A | Double-strand break repair protein meiotic recombination 11 homolog A | XM_603439
3 | OAR15:1596294..1619791 | GPR83 | Similar to G protein-coupled receptor 83 | XM_615580
3 | OAR15:1220402..1223574 | KIAA1826 | Hypothetical protein LOC511226 | NM_001046078
CHAPTER 3. A MISSENSE MUTATION IN AGTPBP1 WAS IDENTIFIED IN
SHEEP WITH A LOWER MOTOR NEURON DISEASE
A paper published in Heredity 2012;109:156-62
Xia Zhao1, Suneel K Onteru1, Keren E Dittmer2, Kathleen Parton2, Hugh T Blair2, Max F
Rothschild1, Dorian J Garrick1,2
1 Department of Animal Science and Center for Integrated Animal Genomics, Iowa State
University, Ames, IA 50011, USA
2 Institute of Veterinary, Animal & Biomedical Sciences, Massey University, Palmerston
North 4474, New Zealand
Running title: A missense mutation in AGTPBP1 and sheep LMN disease
Word count for main text: 22,172 characters
Keywords: AGTPBP1; lower motor neuron disease; missense mutation; Purkinje cell
degeneration; sheep
ABSTRACT
A type of lower motor neuron disease inherited as an autosomal recessive trait in Romney
sheep is characterized by normal appearance at birth, followed by progressive weakness and
tetraparesis after the first week of life. Here we carried out genome-wide homozygosity
mapping using Illumina OvineSNP50 BeadChips on lambs descended from one carrier ram,
including 19 sheep diagnosed as affected and 11 of their parents that were therefore known
carriers. A homozygous region of 136 consecutive single-nucleotide polymorphism (SNP)
loci on chromosome 2 was common to all affected sheep and it was the basis for searching
for the positional candidate genes. Other homozygous regions shared by all affected sheep
spanned 8 or fewer SNP loci. The 136-SNP-region contained the sheep ATP/GTP-binding
protein 1 (AGTPBP1) gene. Mutations in this gene have been shown to be related to
Purkinje cell degeneration (pcd) phenotypes including ataxia in mice.
One missense mutation, c.2909G>C in exon 21 of AGTPBP1, was discovered; it induces an Arg
to Pro substitution (p.Arg970Pro) at amino acid 970, a conserved residue for the catalytic
activity of AGTPBP1. Genotyping of this mutation showed 100% concordance with the recessive
pattern of inheritance in affected, carrier, phenotypically normal and unrelated normal
individuals. This is the first report showing that a mutant AGTPBP1 is associated with a lower
motor neuron disease in a large mammalian animal model. Our finding raises the possibility of
human patients with the same etiology caused by this gene or other genes in the same
pathway of neuronal development.
INTRODUCTION
Motor neuron diseases (MND) are a group of disorders involving selective degeneration
of upper motor neurons (UMN) and/or lower motor neurons (LMN). The incidence of MND
varies from 0.6 to 2.4 per 100,000 person-years in different human ethnic
populations (Lee, 2010). UMN originate in the cerebral cortex or the brain stem and
provide indirect stimulation of the target muscles, while LMN connect the brain stem and
spinal cord to muscle fibers, and provide nerve signals between the UMN and muscles.
Common characteristics of MND are muscle weakness and/or spastic paralysis. Signs of
LMN lesions are represented as muscle atrophy, fasciculations and weakness when
associated LMN degenerate. Symptoms like spastic tone, hyperreflexia and upgoing plantar
reflex signs appear as UMN degenerate (Lee, 2012). The weakness of respiratory or bulbar
muscle can even cause death of MND patients due to respiratory insufficiency (Borasio et al.,
1998). Based on the time of onset, types of motor neuron involvement and detailed clinical
features, MND in humans can be classified into six types of diseases including amyotrophic
lateral sclerosis (ALS), hereditary spastic paraplegia (HSP), primary lateral sclerosis (PLS),
spinal bulbar muscular atrophy (SBMA), spinal muscular atrophy (SMA) and lethal
congenital contracture syndrome (LCCS). MND are considered incurable and no effective
treatment is currently available. However, the gene therapy strategy of performing gene
transfer to patients by utilizing adeno-associated virus (AAV) vectors is a promising
therapeutic approach (Nizzardo et al., 2012). Previous evidence has shown genetic variants
within more than 30 genes might be responsible for MND (Dion et al., 2009). For example,
mutations in the Cu/Zn-binding superoxide dismutase (SOD1) gene have been identified in
about 20% of patients with familial ALS (Rosen et al., 1993). However, genetic causes still
remain unclear for more than half of human patients with MND.
In 1999, an extended family of Romney lambs was found to have a suspected hereditary
MND, characterized by poor muscle tone and progressive weakness from about 1 week of
age, leading to severe tetraparesis, recumbency and muscle atrophy. Lambs remained alert
up to four weeks of age when they were euthanized. The muscles were atrophied and pale,
and histologically had variation in muscle fiber diameter and small angular fibers. Large
foamy macrophages were present in cerebrospinal fluid, and there were degenerate neurons
in the spinal cord leading to a decrease in the number of motor neurons, consistent with LMN
diseases. No histological changes were seen in the cerebellum (Anderson et al., 1999).
Outcross and backcross breeding trials were consistent with this LMN disease having simple
autosomal recessive inheritance.
In order to find the causative locus responsible for this disease, we performed genome-wide
homozygosity mapping and then fine mapped the mutation using a positional candidate gene
approach within the largest homozygous genomic region found in the affected sheep. The goals
of this research were to discover the genetic basis of this disease, to help sheep breeders
avoid matings between carrier animals, and to provide a potential animal model for gene
therapy treatment of human LMN disease.
METHODS
Animals
Approval for DNA and tissue collection from animals was obtained from the Massey University
Animal Ethics Committee (Approval Number 99/135). The Iowa State University Animal Care &
Use Committee did not require additional approvals.
A single Romney putative carrier ram was mated in New Zealand to 130 crossbred Finnish
Landrace ewes to produce 115 daughters that were subsequently backcrossed to their sire.
The backcrosses produced 5 affected sheep out of 47 lambs in the first lambing, and 11 out of
71 in the second lambing. The observed rate of affected sheep was consistent with the 1 in 8
expectation for a homozygous recessive disorder from this mating scheme. Affected offspring
and their dams, now proven to be carriers, formed the basis of the experimental population.
High density SNP genotypes were obtained from 19 affected sheep, including 3 affected lambs
from the original Romney flock, and 11 known carrier sheep. Fine mapping using novel SNPs was
conducted on 52 Romney or Romney/Finn crosses, comprising the 30 animals above with clear
disease status and 22 potential carriers that were phenotypically normal but sired by the
carrier ram. Moreover, 85 control sheep, consisting of 14 normal Romney sheep, 25 crossbred
Texel sheep and 46 crossbred Corriedale sheep, were used for validation of the causative
mutation. These sheep came from unrelated populations without evidence of LMN disease.
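The 1 in 8 expectation quoted above follows from the mating scheme: half of the daughters of a carrier ram mated to non-carrier ewes are expected to be carriers, and a quarter of the backcross lambs from a carrier daughter are expected to be affected. The minimal sketch below reproduces that arithmetic and compares it with the pooled lambing counts using an exact two-sided binomial test. The function name, the choice of test and the assumption that the outcross ewes carried no copy of the mutation are illustrative and are not taken from the original analysis.

```python
from fractions import Fraction
from math import comb

# Assumption: the Finnish Landrace ewes are non-carriers, so a daughter of the
# carrier ram inherits the mutant allele with probability 1/2.
P_DAUGHTER_CARRIER = Fraction(1, 2)
# A carrier x carrier backcross yields an affected lamb with probability 1/4.
P_AFFECTED_GIVEN_CARRIER_DAM = Fraction(1, 4)
EXPECTED_RATE = P_DAUGHTER_CARRIER * P_AFFECTED_GIVEN_CARRIER_DAM  # 1/8


def binomial_two_sided_p(k: int, n: int, p: float) -> float:
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Lambing counts reported in the text: 5 of 47 and 11 of 71.
affected = 5 + 11
lambs = 47 + 71
p_value = binomial_two_sided_p(affected, lambs, float(EXPECTED_RATE))

print(f"expected rate = {EXPECTED_RATE} ({float(EXPECTED_RATE):.3f})")
print(f"observed rate = {affected}/{lambs} ({affected / lambs:.3f})")
print(f"two-sided p   = {p_value:.3f}")
```

A large p-value here only indicates that the pooled counts are compatible with the 1 in 8 expectation; it does not by itself establish the mode of inheritance.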
SNP genotyping
DNA was extracted from blood using the QIAamp DNA Blood Mini Kit (QIAGEN, CA,
USA) and the Roche MagNA Pure automated analyser (Roche, USA). Genotyping of 54,241 evenly
distributed polymorphic SNP markers across the genome was performed using the Ovine SNP50
BeadChip (Illumina, San Diego, CA, USA) by the commercial provider GeneSeek Inc. (Lincoln,
NE, USA). Standard procedures were performed with a PCR and
ligation-free protocol, which routinely achieves high average call rates and accuracy.
Genome wide homozygosity mapping
In order to define the molecular genetic basis for the LMN disease in these Romney sheep, a
search for putative homozygous-by-descent (also named identical by descent, IBD) regions
was initiated by performing a genome wide scan using an in-house UNIX script as described
in our previous publication (Zhao et al., 2011). The SNP genotypes were ordered according
to their genomic location as provided in the Illumina manifest, and used to identify regions of
consecutive homozygous SNP loci that were common to all affected animals. The same
procedure was applied to known carriers to ensure that the identified region was not simply
fixed across all animals.
A threshold of 10 SNPs defining a candidate consecutive homozygous region was chosen after
calculating the expected IBD fragment size for 19 affected lambs produced through the
outcross and backcross experiments, assuming that exactly one crossover event occurred per
meiosis on the chromosome carrying the causal mutation, with uniform probability along the
chromosome expressed in linkage units (cM). The homozygous fragment containing the causative
mutation would have exceeded 0.5 cM, or about 10 consecutive SNPs, with probability > 95%.
The expected size of the IBD region that contains the causative mutation depends in part on
the number of meioses that separate the common ancestor from the affected lambs, and on the
number of affected individuals. The IBD region that included the causative mutation had a 23%
probability of being shorter than 2 cM and a 32% probability of being longer than 5 cM. The
mean expected length of the IBD region was about 4 cM. Genes in the identified IBD regions
were examined for potential involvement using comparative homology between ovine and bovine
genomes as provided through the Ovine Genome Assembly v1.0
(http://www.livestockgenomics.csiro.au/perl/gbrowse.cgi/oar1.0/#search).
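The scan described in this section can be sketched as follows. The original analysis used an in-house UNIX script (Zhao et al., 2011) that is not reproduced here; the Python function below is only an illustrative stand-in. It assumes genotypes are coded as two-letter strings ("AA", "AB", "BB", with "--" for a missing call) already ordered by genomic position, and it treats a missing call as breaking a run. The function name, variable names and toy data are hypothetical.

```python
from typing import Dict, List, Tuple

HOMOZYGOUS = {"AA", "BB"}


def shared_homozygous_runs(
    genotypes: Dict[str, List[str]], min_snps: int = 10
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, inclusive, of runs of at least
    `min_snps` consecutive SNPs at which every animal carries the same
    homozygous genotype."""
    animals = list(genotypes)
    n_snps = len(genotypes[animals[0]])
    runs: List[Tuple[int, int]] = []
    start = None
    for i in range(n_snps):
        calls = {genotypes[a][i] for a in animals}
        shared = len(calls) == 1 and calls <= HOMOZYGOUS
        if shared and start is None:
            start = i                      # a new run begins here
        elif not shared and start is not None:
            if i - start >= min_snps:      # close the run that just ended
                runs.append((start, i - 1))
            start = None
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))   # run reaching the last SNP
    return runs


# Toy data (not real genotypes): two affected lambs and one carrier, six SNPs.
affected = {
    "lamb1": ["AA", "AA", "BB", "AA", "AB", "BB"],
    "lamb2": ["AA", "AA", "BB", "AA", "AA", "BB"],
}
carriers = {"ewe1": ["AA", "AB", "AB", "AB", "AA", "BB"]}

affected_runs = shared_homozygous_runs(affected, min_snps=3)
carrier_runs = shared_homozygous_runs(carriers, min_snps=3)
# Simple exact-match filter: drop runs that are also homozygous in carriers.
candidates = [r for r in affected_runs if r not in carrier_runs]
print(candidates)
```

With real data, `min_snps` would be set to the 10-SNP threshold and the indices mapped back to the positions in the Illumina manifest.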
Positional candidate gene sequencing
Primer3 software (version 0.4.0) and the reference sequences from Ovine Genome Assembly
v1.0 were used to design PCR primers flanking the coding regions of ovine AGTPBP1
(Supplementary Table S1). Among the many genes in the identified homozygous regions,
AGTPBP1 was considered to be the best candidate gene because of its role in neural
function, as reported in several publications (Harris et al., 2000; Fernandez-Gonzalez et al.,
2002). PCR was performed using a 10 µl reaction mixture and a standard program as
previously described (Zhao et al., 2011), with an annealing temperature of 58ºC or 60ºC.
Negative controls were included in all reactions to check for contamination.
PCR products were verified by 1.5% agarose gel electrophoresis and treated with ExoSAP-IT®
(Affymetrix, USA) before the products from 2 affected and 2 carrier sheep were pooled. The
pooled PCR products were sequenced commercially using a 3730xl DNA Analyzer (Applied
Biosystems, USA). Sequencher software (version 3.0) was used for sequence alignment
and comparison.
Mutation analyses
Since annotation of the ovine genome has not yet been completed, the bovine AGTPBP1
reference mRNA sequence (GenBank: XM_001252536.3) was used as a query in cross-species
BLAST searches to identify sheep ESTs. Both the bovine AGTPBP1 mRNA and the corresponding
ovine ESTs served as references to locate the coding regions within the ovine AGTPBP1 gene.
The AGTPBP1 protein sequence was predicted using the successfully amplified and re-sequenced
exonic regions of the gene from the sheep population, the bovine AGTPBP1 reference mRNA
sequence and the ovine ESTs. In order to understand the functional annotation of this protein,
the conserved domains within the sheep protein sequence were searched by querying the
complete protein sequence against the Conserved Domains database (database: CDD-40526 PSSMs)
in NCBI (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), with the maximum number of
hits set to 500. Multiple sequence alignments provide a basis for conserved domain models.
The specific hits and superfamilies were reported for the queried protein.
Moreover, the conserved protein residues involved in conserved features such as binding or
catalysis were annotated when the query sequence residues were in the Entrez Protein
database. The sequence alignments of AGTPBP1 proteins from multiple species, including
human, mouse, chicken, frog, purpuratus, sea squirt, trichoplax, zebrafish and sheep, were
displayed to show the conserved feature sites. The fold structures of the conserved domain of
AGTPBP1 in sheep (residues 850-1150) and mouse (residues 800-1110) were predicted
using the 3D-PSSM program (http://www.sbg.bio.ic.ac.uk/3dpssm/index2.html). The known
crystal structures of carboxypeptidases in the PDB database helped to generate the predicted
structure of AGTPBP1. Sequence alignment was performed on the ClustalW2 server
(http://www.ebi.ac.uk/Tools/msa/clustalw2/) among sheep and mouse AGTPBP1, human
carboxypeptidase A2 (CPA2; PDB ID: 1DTD) and cattle carboxypeptidase A (CPA; PDB ID: 1HDU).
This increased our confidence in the conserved features of the AGTPBP1 secondary structure
and in the amino acids critical for carboxypeptidase catalysis. A three-dimensional
macromolecular structure of the conserved domain was searched for and downloaded using
homologous sequences with confirmed crystal structures
(http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml). Each SNP variant was analyzed to
check whether it could introduce a missense or nonsense mutation. The identified SNPs were
then genotyped using restriction enzyme digestion (PCR-RFLP: Polymerase Chain
Reaction-Restriction Fragment Length Polymorphism) on the affected and carrier sheep as well
as on other normal controls.
Restriction enzyme digestion (PCR-RFLP)
The sequences of the PCR primer pair “LMN_new” (Supplementary Table S1) were
5’-TGTTAGCACCTCCTCTTTGCT-3’ and 5’-AGCAGACCCTTGGCATGATA-3’. DNA fragments from the PCR
reactions were subjected to restriction enzyme digestion with AciI (New England Biolabs
Inc., USA). The missense mutation in AGTPBP1 removes a recognition site of the AciI enzyme.
The 10 µl reaction mix contained 1 µl of NEB buffer 3, 0.5 units of AciI and 2 µl of PCR
product. The reaction mix was incubated at 37ºC overnight, and the products were analyzed on
a 2.0% (w/v) ultra-pure agarose gel.
RESULTS
Detection of IBD regions
The whole genome scan discovered 209 homozygous fragments consisting of 3 or more
consecutive SNPs shared by all 19 affected lambs. The size distribution of common
homozygous fragments in these lambs is shown in Figure 1. Only one large homozygous
segment exceeded the 10-SNP threshold. It consisted of 136 consecutive SNP loci that were
homozygous in all 19 affected sheep, whereas the 11 carriers exhibited heterozygosity in this
segment. The region started at SNP s56017.1 and extended to SNP OAR2_36428027.1.
This segment is physically positioned on the short arm of ovine chromosome 2 (OAR 2)
from 29,284,415 bp to 36,428,027 bp, covering a region of about 7.1 Mb (Figure 2a and b).
Based on the bovine reference sequences, this segment likely encompassed 36 genes (Figure 2d
and Supplementary Table S2). A gene called ATP/GTP binding protein, transcript
variant 1 (AGTPBP1), also named Nna1 (nervous system nuclear protein induced by
axotomy protein 1 homolog), was the most plausible candidate due to its involvement in
axonal regeneration and its causal relationship with the mouse pcd phenotype (Harris et al.,
2000; Fernandez-Gonzalez et al., 2002). The ovine AGTPBP1 gene spans about 0.17 Mb and
encodes a 1,225-amino-acid intracellular protein that contains a predicted zinc
carboxypeptidase domain near its C-terminus (Harris et al., 2000). This gene was initially
cloned from regenerating spinal cord neurons of the mouse (supplied by OMIM 606830).
Gene coding region re-sequencing and mutation analyses
Functional mutations in candidate genes were examined by re-sequencing all available exons
using primers based on the adjacent intronic sequences of these genes. The ovine AGTPBP1
gene was predicted to have 25 exons based on the bovine reference gene and protein
sequences. All exons of this gene except exons 11, 12 and 17 were successfully
amplified, and three exonic SNPs were identified. The first exonic SNP (c.2909G>C)
was located at the 6th bp of exon 21 of the sheep AGTPBP1 gene. The other two were
located at the 93rd and 99th bp of exon 22. An additional 17 SNPs were identified in either
introns or the 3’ UTR of this gene. These were excluded as causative loci because
none were located in splice donor/acceptor sites or other important regulatory
regions, and none were in concordance with the sheep phenotypes (data not shown). After
comparing our findings with the bovine coding sequence of AGTPBP1, we observed that the
first exonic SNP (c.2909G>C) introduced a missense substitution (Arg>Pro) at amino acid 970
(p.Arg970Pro) (Figure 2e), and the third exonic SNP (c.3133T>C) was also a
missense mutation, causing a Pro to Ser substitution (p.Pro1045Ser). The second exonic SNP
(c.3126C>T) was a synonymous mutation and did not change any amino acid. A conserved
domain named M14_Nna1 (CDD: cd06906), spanning amino acid residues 859 to 1136
of sheep AGTPBP1, was predicted in this study using the “CD-search” option available in
NCBI.
This domain is a zinc-binding carboxypeptidase domain, which hydrolyzes the peptide bonds at
the C-terminal part of certain amino acids in polypeptide chains. The sequence alignment
among multiple species demonstrated 2 specific feature hits: metallocarboxypeptidase
activity and zinc binding. Nine conserved amino acid residues were involved in these two
feature hits (Supplementary Figure S1 and Figure S2; Gomis-Rüth et al., 1999; Aloy et al.,
2001). Three conserved residues (His920, Glu923 and His1017), forming the motif “HXXE…H”,
are responsible for zinc binding, and all 9 conserved residues (His920, Glu923, Arg970,
Asn979, Arg980, His1017, Gly1018, Met1027 and Glu1102) are important for substrate binding
and catalysis of this protein (Supplementary Figure S1 and Figure 3a). In terms of the size
and relative positioning of β-sheets and α-helices, the overall secondary structures of
AGTPBP1 in sheep and mouse predicted by the 3D-PSSM program were the same as those of other
metallocarboxypeptidases in human and cattle. Features, conserved amino acids and motifs are
located in similar positions. For example, the well-known zinc-binding HXXE…H motif lies
between the second β-sheet and the first α-helix, as reported (Wang et al., 2006). Six
residues (His920, Glu923, Arg970, Asn979, Arg980 and Glu1102) remain conserved among these
proteins (Supplementary Figure S3). The first exonic missense mutation (c.2909G>C),
identified in exon 21, changes the middle nucleotide of the codon “CGC” and thereby replaces
Arg970, one of the conserved residues responsible for substrate binding and catalysis, with
Pro970 (codon “CCC”) (Figure 3b). Molecular modeling based on the crystal structure of a
homologous protein, a putative carboxypeptidase (PDB ID: 3K2K), also showed that Arg970 forms
part of a loop (the short yellow section of the middle loop in Figure 3c), which might be a
key site for substrate binding or catalytic activity. The amino acid residue affected by the
exonic missense mutation in exon 22 (c.3133T>C; p.Pro1045Ser) varies considerably among
species (Figure 3b). This suggests that this missense mutation may not affect the function of
AGTPBP1, as the residue is involved neither in zinc binding nor in substrate binding and
catalysis.
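The relationship between the coding-sequence position c.2909 and residue 970 can be checked with a few lines of code. The sketch below maps a 1-based coding position to its codon, applies the substitution to the codon reported in the text ("CGC"), and translates both codons with the standard genetic code. The helper names are illustrative only, and only the codon quoted in the text is used; no other ovine sequence is assumed.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def locate(cdna_pos: int):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitution_class(ref: str, alt: str) -> str:
    """Transition if both bases are purines or both are pyrimidines, else transversion."""
    return "transition" if (ref in PURINES) == (alt in PURINES) else "transversion"


def mutate(codon: str, pos_in_codon: int, alt: str) -> str:
    """Replace the base at the 1-based position within a codon."""
    return codon[: pos_in_codon - 1] + alt + codon[pos_in_codon:]


codon_number, pos_in_codon = locate(2909)   # c.2909G>C
ref_codon = "CGC"                           # codon reported in the text
alt_codon = mutate(ref_codon, pos_in_codon, "C")
print(codon_number, pos_in_codon)
print(ref_codon, CODON_TABLE[ref_codon], "->", alt_codon, CODON_TABLE[alt_codon])
print(substitution_class("G", "C"))
```

Position 2909 falls on the second base of codon 970, and a G>C change is a purine-to-pyrimidine substitution, so the helper classifies it as a transversion. The same `locate` helper places c.3126 on the third base of codon 1042 and c.3133 on the first base of codon 1045.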
The Arg970Pro mutation was genotyped by PCR-RFLP (Figure 2e). The 611 bp “LMN_new” PCR
fragment amplified from normal sheep contained one AciI restriction site and was cut into 2
fragments of 519 bp and 92 bp. The Arg970Pro mutation removed this restriction site, so the
611 bp fragment amplified from the affected lambs remained intact after digestion.
Therefore, the PCR fragment amplified from sheep with a genotype of “GG” was cut into 2
fragments of 519 bp and 92 bp; the one with a genotype of “GC” gave 3 fragments of 611 bp,
519 bp and 92 bp; and the one with a genotype of “CC” gave a single fragment of 611 bp. Our
genotyping results showed that all 19 affected sheep were “CC”, the 11 known carriers were
“GC”, and the 22 phenotypically normal but possible carrier sheep were either “GC” or “GG”.
In contrast, the 85 unrelated control sheep were all “GG” (Table 1). This mutation was 100%
concordant with the recessive pattern of inheritance in affected, carrier, phenotypically
normal and unrelated normal individuals. The two exonic SNPs in exon 22 (c.3126C>T and
c.3133T>C) were genotyped by direct sequencing. However, neither of these two SNPs was
perfectly concordant with disease status based on the genotype counts (Table 1). We
therefore excluded these two SNPs as causative mutations for this defect on the basis of
their genotyping results and mutation analyses. Taken together, our results strongly suggest
that the Arg970Pro substitution is the causative mutation for LMN disease in Romney sheep
and that it could act by decreasing the function of the AGTPBP1 protein.
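The banding patterns described for the PCR-RFLP assay translate directly into a genotype lookup. The sketch below encodes the three expected patterns (611 bp uncut; 519 bp and 92 bp when the AciI site is present) and calls a genotype from a list of observed band sizes. The function name and the 15 bp sizing tolerance are assumptions made for illustration; they are not part of the published protocol.

```python
FULL_LENGTH = 611          # uncut "LMN_new" amplicon, bp
CUT_FRAGMENTS = (519, 92)  # products when the AciI site is present, bp

EXPECTED_BANDS = {
    "GG": frozenset(CUT_FRAGMENTS),                   # both alleles cut
    "GC": frozenset((FULL_LENGTH,) + CUT_FRAGMENTS),  # one allele cut, one intact
    "CC": frozenset((FULL_LENGTH,)),                  # neither allele cut
}


def call_genotype(observed_bands, tolerance_bp=15):
    """Return the genotype whose expected band pattern matches the observed
    band sizes (bp) within `tolerance_bp`, or None if nothing matches."""
    for genotype, expected in EXPECTED_BANDS.items():
        if len(observed_bands) != len(expected):
            continue
        if all(abs(o - e) <= tolerance_bp
               for o, e in zip(sorted(observed_bands), sorted(expected))):
            return genotype
    return None


print(call_genotype([520, 90]))        # pattern expected for an unaffected non-carrier
print(call_genotype([612, 515, 95]))   # pattern expected for a carrier
print(call_genotype([610]))            # pattern expected for an affected lamb
```

Returning `None` for an unexpected pattern (for example, a partial digest) keeps ambiguous lanes from being silently assigned a genotype.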
DISCUSSION
The AGTPBP1 gene encodes a protein that is a member of the cytosolic
carboxypeptidase subfamily (M14D subfamily) (Kalinina et al., 2007; Rodriguez de la Vega
et al., 2007). This protein plays a role in protein turnover by cleaving proteasome-generated
peptides into amino acids (Berezniuk et al., 2010), enzymatically removes gene-encoded
glutamic acids or tyrosine from the C terminus of proteins (Rogowski et al., 2010; Kalinina
et al., 2007), and also plays a role in neuronal bioenergetics, at least in mouse brains
(Chakrabarti et al., 2010). Similar to other cytosolic carboxypeptidases, the mouse Nna1
protein is broadly expressed in brain, pituitary, eye, testis and other tissues (Kalinina et
al., 2007). Mutations or abnormal expression of Nna1 have been found to cause a Purkinje cell
degeneration (pcd) phenotype in mice, characterized by degeneration of Purkinje cells, mitral
cells of the olfactory bulb, thalamic neurons and retinal photoreceptor cells as well as
degeneration of sperm (Fernandez-Gonzalez et al., 2002; Mullen et al., 1976). Twelve
alleles in mouse mutants related to Agtpbp1 have been summarized by the Jackson
Laboratory (http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:2159437).
Among these, loss of function of Agtpbp1 was indicated as being responsible for the pcd3J and
pcd5J mouse mutants (Fernandez-Gonzalez et al., 2002; Wang and Morgan, 2007; Chakrabarti
et al., 2008). Expression of full-length Nna1 successfully rescued Purkinje cell loss and
ataxic behavior in pcd3J mice, and retinal photoreceptor loss and Purkinje cell degeneration
in pcd5J mice (Wang et al., 2006; Chakrabarti et al., 2008).
In this study, the missense mutation c.2909G>C in exon 21 of AGTPBP1 induces an Arg to
Pro substitution (p.Arg970Pro) at amino acid 970 (R970). Based on the crystal structure of
human CPA2, it has been reported that arginine 127 (R127) in CPA2 is an electrophilic site
that, along with glutamate 270, is crucial for catalytic activity (Mangani et al., 1994;
Garcia-Saez et al., 1997; Christianson et al., 1989). Similarly, the homologous arginine
residue R127 (numbered according to mature bovine carboxypeptidase A1) in C. elegans
AGBL4 (ATP/GTP binding protein-like 4) has been predicted to stabilize the oxyanion hole
in the S1’ site of the carboxypeptidase domain (Rodriguez de la Vega et al., 2007).
Sequence comparison and structural modeling of sheep AGTPBP1 with other
carboxypeptidases predicted that R970 is located in a region known as the substrate-binding
pocket and catalytic center of the metallocarboxypeptidase, as in mouse Nna1 (Supplementary
Figure S3; Wang et al., 2006). These results reinforce the view that R970 in sheep is a highly
conserved site that is critical for catalytic activity. Therefore, we concluded that
the Arg970Pro substitution could be a candidate causative mutation, possibly altering the
characteristics of the AGTPBP1 protein and very likely impairing its catalytic
activity.
Compared with the phenotype of the original pcd mouse model, the affected sheep do not have
apparent degeneration of Purkinje cells but show their own characteristic features, including
degeneration of neurons in the ventral horns of the spinal cord and brain stem, Wallerian
degeneration of motor nerve fibers and atrophy of skeletal muscles. The phenotype of these
sheep is more similar to that of the Sod1 transgenic mice (mSOD1), which were
constructed as animal models for human ALS patients (Gurney et al., 1994). It is interesting
to note that the Lox (lysyl oxidase) gene, encoding a protein that plays an important role in
the formation and repair of extracellular matrix, was found to be up-regulated in both mSOD1
mice and Agtpbppcd-sid mice (Li et al., 2004; Li et al., 2010; Lucero and Kagan, 2006). The
Agtpbppcd-sid mutant was characterized as a deletion of Agtpbp1 exon 7 that produced a stop
codon in exon 8. Absence of the Agtpbp1 protein decreases dendrite development of
cerebellar Purkinje cells by changing the expression of downstream components, including
Lox propeptide and the NF-κB signaling pathway in mice (Li et al., 2010). Therefore, we
hypothesize that the p.Arg970Pro substitution might impair the catalytic activity of AGTPBP1
and decrease its ability to turn over LOX protein. The overabundant LOX protein would then
cause abnormal development of dendrites in LMN and lead to the phenotypes shown in these
sheep.
In order to ascertain that the Arg970Pro substitution is truly the causative mutation,
measuring mRNA expression of the genes AGTPBP1 and LOX and their enzymatic activities
in the spinal cord, brain stem and cerebellar tissues would be desirable. This could exclude
the possibility that the Arg970Pro mutation in AGTPBP1 is only a coincidental finding
from the genetic mapping study or that this mutation disrupts expression of another gene in
the same region of the genome. In addition, site-directed mutagenesis of the mouse Nna1 gene
at arginine 962 (R962), the residue homologous to R970 in sheep, to generate transgenic mice,
followed by rescue of the phenotype with the intact full-length Nna1 gene, would provide
further evidence that the Arg970Pro substitution is a causative mutation in sheep.
The missense mutation c.2909G>C (p.Arg970Pro) in AGTPBP1 appears to be a spontaneous
mutation arising in a single sheep family. In order to avoid spreading this disease
throughout the sheep population, a simple genetic test can be designed based on our
results to identify and remove carriers of the defective c.[2909C] allele, and so decrease
the risk of economic losses in the future.
Regenerating motor neurons expressing protein NNA1 in human nervous tissues indicated
that NNA1 was a key factor for motor neuronal degeneration (Harris et al., 2000). To date,
no human cases have been reported with spontaneous mutations in AGTPBP1 that would
implicate the gene as a cause of an LMN disease, nor have such mutations been reported in
other large mammalian species. Mice and hamsters are convenient animal models for studies
of human neural diseases (Mashimo et al., 2009; Akita et al., 2007). However, there are still
significant structural and functional differences between rodents and large mammalian species.
Even among rodents, the distribution of these genes can differ. Hamsters with
undetectable Nna1 mRNA in the brain did not exhibit neurodegenerative features
as severe as those of mouse mutants, which could possibly be explained by the different
distributions of Nna1 and other Nna1-like genes in the brains of mice and hamsters
(Kalinina et al., 2007; Akita et al., 2007; Akita and Arai, 2009). Sheep have a large brain
and a long life span relative to rodents, and a gestation period similar to that of humans.
Our finding in sheep, a large mammalian species, raises the possibility that similar disease
in human patients may also result from mutations in this gene or in other genes in the same
pathway of neuronal development. Since there is no effective treatment available for motor
neuron diseases, the identification of molecular pathogenetic targets will shed light on the
development of gene therapy for this type of disease (Nizzardo et al., 2012). Here we
reported that LMN disease originating in a Romney ram was likely caused by a mutation in the
AGTPBP1 (NNA1) gene. The affected sheep could be used as a valuable animal model for gene
therapy if human patients are successfully diagnosed in the future.
ACKNOWLEDGEMENTS
The authors appreciate the financial support provided by State of Iowa and Hatch Funds and
the support from the College of Agriculture and Life Sciences at Iowa State University.
Animal studies and diagnoses were funded by Massey University. H.T.B. was partially
funded by the National Research Centre for Growth and Development.
Conflict of interest
The authors declare no conflict of interest.
Supplementary information
Supplementary information is available at Heredity’s website.
Data archiving
Data deposited in the Dryad repository: doi:10.5061/dryad.8fp2pc88
REFERENCES
Akita K, Arai S (2009). The ataxic Syrian hamster: an animal model homologous to the pcd
mutant mouse? Cerebellum 8: 202-210.
Akita K, Arai S, Ohta T, Hanaya T, Fukuda S (2007). Suppressed Nna1 gene expression in
the brain of ataxic Syrian hamsters. J Neurogenet 21: 19-29.
Aloy P, Companys V, Vendrell J, Aviles FX, Fricker LD, Coll M et al. (2001). The crystal
structure of the inhibitor-complexed carboxypeptidase D domain II and the modeling of
regulatory carboxypeptidases. J Biol Chem 276: 16177-16184.
Anderson PD, Parton KH, Collett MG, Sargison ND, Jolly RD (1999). A lower motor neuron
disease in newborn Romney lambs. N Z Vet J 47: 112-114.
Berezniuk I, Sironi J, Callaway MB, Castro LM, Hirata IY, Ferro ES et al. (2010).
CCP1/Nna1 functions in protein turnover in mouse brain: Implications for cell death in
Purkinje cell degeneration mice. Faseb J 24: 1813-1823.
Borasio GD, Gelinas DF, Yanagisawa N (1998). Mechanical ventilation in amyotrophic lateral
sclerosis: a cross-cultural perspective. J Neurol 245 Suppl 2: S7-12.
Chakrabarti L, Eng J, Martinez RA, Jackson S, Huang J, Possin DE et al. (2008). The
zinc-binding domain of Nna1 is required to prevent retinal photoreceptor loss and cerebellar
ataxia in Purkinje cell degeneration (pcd) mice. Vision Res 48: 1999-2005.
Chakrabarti L, Zahra R, Jackson SM, Kazemi-Esfarjani P, Sopher BL, Mason AG et al.
(2010). Mitochondrial dysfunction in NnaD mutant flies and Purkinje cell degeneration mice
reveals a role for Nna proteins in neuronal bioenergetics. Neuron 66: 835-847.
Christianson DW, Mangani S, Shoham G, Lipscomb WN (1989). Binding of D-phenylalanine and
D-tyrosine to carboxypeptidase A. J Biol Chem 264: 12849-12853.
Dion PA, Daoud H, Rouleau GA (2009). Genetics of motor neuron disorders: new insights
into pathogenic mechanisms. Nat Rev Genet 10: 769-782.
Fernandez-Gonzalez A, La Spada AR, Treadaway J, Higdon JC, Harris BS, Sidman RL et al.
(2002). Purkinje cell degeneration (pcd) phenotypes caused by mutations in the
axotomy-induced gene, Nna1. Science 295: 1904-1906.
Garcia-Saez I, Reverter D, Vendrell J, Aviles FX, Coll M (1997). The three-dimensional
structure of human procarboxypeptidase A2. Deciphering the basis of the inhibition,
activation and intrinsic activity of the zymogen. Embo J 16: 6906-6913.
Gomis-Rüth FX, Companys V, Qian Y, Fricker LD, Vendrell J, Aviles FX et al. (1999).
Crystal structure of avian carboxypeptidase D domain II: a prototype for the regulatory
metallocarboxypeptidase subfamily. EMBO J 18: 5817-5826.
Gurney ME, Pu H, Chiu AY, Dal Canto MC, Polchow CY, Alexander DD et al. (1994).
Motor neuron degeneration in mice that express a human Cu,Zn superoxide dismutase
mutation. Science 264:1772-1775.
Harris A, Morgan JI, Pecot M, Soumare A, Osborne A, Soares HD (2000). Regenerating
motor neurons express Nna1, a novel ATP/GTP-binding protein related to zinc
carboxypeptidases. Mol Cell Neurosci 16: 578-596.
Kalinina E, Biswas R, Berezniuk I, Hermoso A, Aviles FX, Fricker LD (2007). A novel
subfamily of mouse cytosolic carboxypeptidases. Faseb J 21: 836-850.
Lee CN (2012). Reviewing evidences on the management of patients with motor neuron
disease. Hong Kong Med J 18:48-55.
Li J, Gu X, Ma Y, Calicchio ML, Kong D, Teng YD et al. (2010). Nna1 mediates Purkinje
cell dendritic development via lysyl oxidase propeptide and NF-kappaB signaling. Neuron 68:
45-60.
Li PA, He Q, Cao T, Yong G, Szauter KM, Fong KS et al. (2004). Up-regulation and altered
distribution of lysyl oxidase in the central nervous system of mutant SOD1 transgenic mouse
model of amyotrophic lateral sclerosis. Brain Res Mol Brain Res 120:115-122.
Lucero HA, Kagan HM (2006). Lysyl oxidase: an oxidative enzyme and effector of cell
function. Cell Mol Life Sci 63:2304-2316.
Mangani S, Ferraroni M, Orioli P (1994). Interaction of carboxypeptidase A with anions:
crystal structure of the complex with the HPO42- anion. Inorg Chem 33: 3421-3423.
Mashimo T, Hadjebi O, Amair-Pinedo F, Tsurumi T, Langa F, Serikawa T et al. (2009).
Progressive Purkinje cell degeneration in tambaleante mutant mice is a consequence of a
missense mutation in HERC1 E3 ubiquitin ligase. PLoS Genet 5: e1000784.
Mullen RJ, Eicher EM, Sidman RL (1976). Purkinje cell degeneration, a new neurological
mutation in the mouse. Proc Natl Acad Sci USA 73: 208-212.
Nizzardo M, Simone C, Falcone M, Riboldi G, Rizzo F, Magri F et al. (2012). Research
advances in gene therapy approaches for the treatment of amyotrophic lateral sclerosis. Cell
Mol Life Sci 69: 1641-1650.
Rodriguez de la Vega M, Sevilla RG, Hermoso A, Lorenzo J, Tanco S, Diez A et al. (2007).
Nna1-like proteins are active metallocarboxypeptidases of a new and diverse M14 subfamily.
Faseb J 21: 851-865.
Rogowski K, van Dijk J, Magiera MM, Bosc C, Deloulme JC, Bosson A et al. (2010). A
family of protein-deglutamylating enzymes associated with neurodegeneration. Cell 143:
564-578.
Rosen DR, Siddique T, Patterson D, Figlewicz DA, Sapp P, Hentati A et al. (1993).
Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic
lateral sclerosis. Nature 362:59-62.
Wang T, Morgan JI (2007). The Purkinje cell degeneration (pcd) mouse: an unexpected
molecular link between neuronal degeneration and regeneration. Brain Res 1140: 26-40.
Wang T, Parris J, Li L, Morgan JI (2006). The carboxypeptidase-like substrate-binding site
in Nna1 is essential for the rescue of the Purkinje cell degeneration (pcd) phenotype. Mol
Cell Neurosci 33: 200-213.
Zhao X, Dittmer KE, Blair HT, Thompson KG, Rothschild MF, Garrick DJ (2011). A novel
nonsense mutation in the DMP1 gene identified by a genome-wide association study is
responsible for inherited rickets in Corriedale sheep. PLoS One 6: e21739.
Table 1. Genotyping results of the three exonic SNPs

                                              Exon 21                   Exon 22
                                              c.2909G>C (p.Arg970Pro)   c.3126C>T                 c.3133T>C (p.Pro1045Ser)
Sheep breed              LMND status             CC   CG   GG    Conc.     TT   CT   CC    Conc.     CC   CT   TT    Conc.
Romney and Romney-sired  Affected                19    0    0     100%     19    0    0     100%     19    0    0     100%
crossbreds               Known carrier            0   13    0     100%      3   10    0    76.9%     10    3    0    23.1%
                         Phenotype normal (a)     0   13    9     100%      1   13    6    95.0%     18    2    0    10.0%
                         Normal                   0    0   14     100%
Texel/Corriedale         Normal                   0    0   71     100%

Conc., rate of concordance.
(a) These include phenotypically normal sheep and suspected sheep that died without a clear
cause.
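The concordance rates in Table 1 can be reproduced from the genotype counts if concordance is defined under a recessive model: affected animals should be homozygous for the allele seen in affected lambs, known carriers should be heterozygous, and normal animals should not be homozygous for that allele. This definition is inferred from the tabulated percentages rather than stated in the text, so the sketch below should be read as an assumption, and the function name is hypothetical.

```python
def concordance(counts: dict, risk_hom: str, status: str) -> float:
    """Percentage of animals whose genotype fits a recessive model.

    counts   -- genotype -> number of animals, e.g. {"TT": 3, "CT": 10, "CC": 0}
    risk_hom -- homozygous genotype observed in the affected lambs
    status   -- "affected", "carrier" or "normal"
    """
    total = sum(counts.values())
    if status == "affected":
        fitting = counts.get(risk_hom, 0)
    elif status == "carrier":
        fitting = sum(n for g, n in counts.items() if g[0] != g[1])
    elif status == "normal":
        fitting = total - counts.get(risk_hom, 0)
    else:
        raise ValueError(f"unknown status: {status}")
    return round(100 * fitting / total, 1)


# Counts taken from Table 1.
print(concordance({"TT": 3, "CT": 10, "CC": 0}, "TT", "carrier"))   # c.3126C>T
print(concordance({"CC": 10, "CT": 3, "TT": 0}, "CC", "carrier"))   # c.3133T>C
print(concordance({"TT": 1, "CT": 13, "CC": 6}, "TT", "normal"))    # c.3126C>T
print(concordance({"CC": 0, "CG": 13, "GG": 9}, "CC", "normal"))    # c.2909G>C
```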
Figure 1. The distribution of identical homozygous fragments in LMND affected lambs.
The counts of identical homozygous fragments shared by all affected lambs are shown
according to the number of SNPs spanning each fragment. Only one
homozygous fragment, spanning 136 SNPs, was identified as common to all 19
affected lambs when the “larger than 10-SNP” threshold was applied.
Figure 2. Work flow chart for discovering the mutation locus of lower motor neuron
disease in Romney sheep. (a) Sheep chromosome 2 (OAR 2). (b) The IBD region
containing the causative locus is located on OAR 2 between 29 and 36 Mb. (c) SNP
locations on the Ovine SNP50 BeadChip in the IBD region. (d) The black boxes show
reference genes encompassed by the sheep IBD region based on the comparative genomic
maps. The gene AGTPBP1 is depicted by a red lined box. (e) Sequence analyses of
AGTPBP1 showed a point mutation in exon 21, resulting in an Arg to Pro substitution.
This missense mutation is located at a residue that is highly conserved and involved in the
catalytic activity of this protein. The PCR-RFLPs show different patterns for the affected
and carrier groups.
Figure 3. The conserved domain structure of a partial protein sequence of AGTPBP1. (a)
The conserved domain includes zinc-binding sites and putative active sites shared by
M14_Nna1-like superfamily genes. (b) Alignment of the predicted sheep AGTPBP1 protein
with the 8 most dissimilar protein sequences, from human, mouse, chicken, frog,
purpuratus, sea squirt, trichoplax and zebrafish. The residues labeled by hash marks
at the top are known to be involved in substrate binding or catalytic activity. The two
missense mutations are labeled on the protein sequence. Upper case amino acids are
aligned, and a red to blue color scale represents the degree of conservation, with red
showing highly conserved residues. Lower case, grey amino acids are unaligned.
Dashes indicate variations in sequence length within the multiple sequence alignment. (c)
Three-dimensional molecular modeling of the conserved domain based on the crystal
structure of a putative carboxypeptidase (PDB ID: 3K2K). The short yellow section in the
middle loop represents the position of the mutated amino acid residue.
Table S1. Primer sequences, annealing temperatures, PCR amplicon information and
targeted coding regions of the AGTPBP1 gene.

| Forward primer | Forward sequence (5'-3') | Reverse primer | Reverse sequence (5'-3') | Annealing temperature (°C) | Targeted exon | Size (bp) |
|---|---|---|---|---|---|---|
| sAGTe1bF | TCAAACATTCTCCTGTGGTTTC | sAGTe1bR | TGCTCAACACTCATTTTCTGC | 60 | Exon 1 | 443 |
| sAGTe2F | CAGGGTAGGATTTTGGCTTG | sAGTe2R | ATGGGGAAAGGAGGACAATC | 60 | Exon 2 | 678 |
| sAGTe3F | AGCAACATCAGTGGAAGGAAC | sAGTe3R | CTCCCATTCACTGGCTCTTC | 60 | Exon 3 | 445 |
| sAGTe4F | GCTCAAGTGATTTGCAAGAATTT | sAGTe4R | GGTTTGAGGGACACTGGAAG | 60 | Exon 4 | 589 |
| sAGTe5F | CTTCCAGTGTCCCTCAAACC | sAGTe5R | CTTCCTTTTTCCATCACATGC | 60 | Exon 5 | 521 |
| sAGTe6F | CCCGCCTCCTACTCTTTCTT | sAGTe6R | TCACCTTTGAGGGAGTGGTT | 60 | Exon 6 | 407 |
| sAGTe7F | GTTGCTTTTGGGCAATTCAC | sAGTe7R | AAGGCCCCAAATATCAAAGG | 60 | Exon 7 | 520 |
| sAGTe8F | TCCTTTGTGAACTCTGGCCTA | sAGTe8R | CATCAATCAACTGCACTAAAAGA | 60 | Exon 8 | 321 |
| sAGTe9F | TGTTTTCAATTTGACTTTACCAGTG | sAGTe9R | CCTATCATTAACAATAACAGGAAGAAA | 60 | Exon 9 | 410 |
| sAGTe10F | GCCATATGAGGTTCTTACAAAGAGA | sAGTe10R | GAAGCCAGCAAGAAGCATGT | 60 | Exon 10 | 519 |
| sAGTe13_1F | GGTGTAATGAATTTTGTGCTGGT | sAGTe13_1R | CAACGTAATTCGGTCCAAGG | 60 | Exon 13 | 815 |
| sAGTe13_2F | AGCAGCTCGGTGATGAAAGT | sAGTe13_2R | TGTCCAATGGCCAGGTTTAT | 60 | Exon 13 | 502 |
| sAGTe14F | TTTACTTGGAGGAAAGTTGTGC | sAGTe14R | AAACTGCAGCATTCATACAAAATC | 60 | Exon 14 | 522 |
| sAGTe15_16F | GTGAGCGAGCAATAATGGTG | sAGTe15_16R | TCCTCTGGATGAACAGACTTCA | 60 | Exon 15 & 16 | 502 |
| sAGTe18F | ACAGAATTTTTGTGTGGATGAAT | sAGTe18R | GAGATTCAATGCTTCTCATTTAAACA | 60 | Exon 18 | 421 |
| sAGTe19F | TGTTTGAGATCTGGATGCATTT | sAGTe19R | GGCAGCTCCCTCATATTCAA | 60 | Exon 19 | 540 |
| sAGTe20F | ATGAAATCAAGGTAATGTTAAAGAACT | sAGTe20R | ACACAGGACAGCCACAAAAA | 60 | Exon 20 | 633 |
| sAGTe21F | TTGGAGGGTCCAGATAGCAC | sAGTe21R | TGCTTTTGAAACTAACATACAGCTT | 60 | Exon 21 | 611 |
| LMN_newF | TGTTAGCACCTCCTCTTTGCT | LMN_newR | AGCAGACCCTTGGCATGATA | 58 | Exon 21 | 611 |
| sAGTe22F | ACCCTTGGGACAATGATGAA | sAGTe22R | CAGAATAGCAAAATGGCAACA | 60 | Exon 22 | 400 |
| sAGTe23F | TTTTTATCTTTGCAGACTTTGCCTA | sAGTe23R | TTCATAAGAACCCAAAGCAACC | 60 | Exon 23 | 429 |
| sAGTe25_1F | AACCCAAATTACTGCTTGCAT | sAGTe25_1R | CCAAACAGGTCCAAGATGCT | 60 | Exon 25 | 522 |
| sAGTe25_2F | CTTGAACCCTTTGCCATCAC | sAGTe25_2R | GGTTTCAAGGCAAACTCTGG | 60 | Exon 25 | 783 |
Table S2. The list of ovine positional candidate genes based on the bovine reference gene sequences.

| Bovine gene accession no. | Bovine ref. gene symbol | Bovine ref. gene name | Ovine chromosome position |
|---|---|---|---|
| XM_586318 | LOC509375 | PREDICTED: Bos taurus similar to Ninjurin 1 | OAR2:29285324..29294958 |
| NM_001038082 | NINJ1 | Bos taurus ninjurin 1 | OAR2:29285341..29296154 |
| XM_590060 | SUSD3 | PREDICTED: Bos taurus similar to sushi domain containing 3 | OAR2:29338280..29380982 |
| NM_001024527 | FGD3 | Bos taurus FYVE, RhoGEF and PH domain containing 3 | OAR2:29382607..29413924 |
| XM_599340 | IPPK | PREDICTED: Bos taurus similar to inositol 1,3,4,5,6-pentakisphosphate 2-kinase | OAR2:29570653..29607558 |
| NM_001034597 | ECM2 | Bos taurus extracellular matrix protein 2, female organ and adipocyte specific | OAR2:29648909..29688207 |
| NM_173947 | OMD | Bos taurus osteomodulin | OAR2:29744485..29759383 |
| NM_001101069 | IARS | Bos taurus isoleucyl-tRNA synthetase | OAR2:29878441..29959955 |
| NM_173946 | OGN | Bos taurus osteoglycin | OAR2:29762217..29777791 |
| XM_001253324 | NOL8 | PREDICTED: Bos taurus nucleolar protein 8 | OAR2:29845850..29871806 |
| XM_580638 | ZNF484 | PREDICTED: Bos taurus similar to zinc finger protein 484 | OAR2:29995325..30004700 |
| XM_001788193 | LOC616883 | PREDICTED: Bos taurus similar to zinc finger protein 782 | OAR2:30316185..30318955 |
| NM_001076436 | C8H9orf21 | Bos taurus chromosome 9 open reading frame 21 ortholog | OAR2:30350587..30352681 |
| XM_877619 | CDC14B | PREDICTED: Bos taurus similar to CDC14 homolog B, transcript variant 6 | OAR2:30476264..30592054 |
| NM_001081523 | HABP4 | Bos taurus hyaluronan binding protein 4 | OAR2:30610577..30646729 |
| XM_001788164 | LOC100140121 | PREDICTED: Bos taurus similar to rCG47065 | OAR2:30685815..30689928 |
| XM_613307 | ZNF367 | PREDICTED: Bos taurus similar to zinc finger protein 367, transcript variant 1 | OAR2:30688798..30713324 |
| NM_001076439 | HSD17B3 | Bos taurus hydroxysteroid (17-beta) dehydrogenase 3 | OAR2:30790734..30845612 |
| NM_001082606 | LOC508357 | Bos taurus chromosome 9 open reading frame 102 ortholog | OAR2:31108137..31147487 |
| NM_001103310 | LOC789977 | Bos taurus hypothetical LOC789977 | OAR2:31292157..31295300 |
| XM_599250 | PTCH1 | PREDICTED: Bos taurus patched homolog 1 (Drosophila) | OAR2:31555711..31623497 |
| NM_174316 | FANCC | Bos taurus Fanconi anemia, complementation group C | OAR2:31850742..32286795 |
| XM_610259 | AP-O (LOC531757) | PREDICTED: Bos taurus similar to Aminopeptidase O (AP-O) | OAR2:32301777..32654570 |
| NM_001046164 | FBP2 | Bos taurus fructose-1,6-bisphosphatase 2 | OAR2:32877326..32920047 |
| XM_613544 | DAPK1 | PREDICTED: Bos taurus similar to death-associated protein kinase 1, transcript variant 1 | OAR2:33007170..33228793 |
| NM_001083686 | CTSL2 | Bos taurus cathepsin L2 | OAR2:32942781..32949974 |
| XM_593294 | GAS1 | Bos taurus similar to growth arrest-specific 1 | OAR2:33971137..33971720 |
| XM_590860 | ZCCHC6 | PREDICTED: Bos taurus similar to zinc finger, CCHC domain containing 6, transcript variant 1 | OAR2:34660860..34718709 |
| NM_001034470 | ISCA1 | Bos taurus iron-sulfur cluster assembly 1 homolog (S. cerevisiae) | OAR2:34725688..34738850 |
| XM_001787825 | LOC785863 | PREDICTED: Bos taurus similar to chromosome 9 open reading frame 153 | OAR2:34778116..34782313 |
| XM_595216 | GOLM1 | PREDICTED: Bos taurus similar to golgi membrane protein 1 | OAR2:34874490..34921202 |
| XM_868391 | MAK10 | PREDICTED: Bos taurus MAK10 homolog, amino-acid N-acetyltransferase subunit, (S. cerevisiae) | OAR2:34929288..35016263 |
| XR_042634 | LOC782560 | PREDICTED: Bos taurus misc_RNA | OAR2:34929660..35016263 |
| XR_028207 | LOC785119 | PREDICTED: Bos taurus misc_RNA | OAR2:35074125..35075486 |
| XM_001252536 | AGTPBP1 | PREDICTED: Bos taurus similar to AGTPBP1 protein, transcript variant 1 | OAR2:35163069..35333039 |
| NM_001075225 | NTRK2 | Bos taurus neurotrophic tyrosine kinase, receptor, type 2 | OAR2:35990235..36232516 |
Feature (#): His920, Glu923
sheep_A      859  LQKLESahn-PQQIYFRKDVLCETLSGNSCPLVTITAMPESNYYEHIcq----FRNRPYVFLSARVHPGETNASWVMKGT  933
sheep_N      859  LQKLESahn-PQQIYFRKDVLCETLSGNSCPLVTITAMPESNYYEHIcq----FRNRPYVFLSARVHPGETNASWVMKGT  933
mice         793  LQKLESahn-PQQIYFRKDVLCETLSGNICPLVTITAMPESNYYEHIcq----FRTRPYIFLSARVHPGETNASWVMKGT  867
frog         853  LQKLESlhs-PQQIYFRQEVLCETLGGNGCPVITITAMPESNYYEHVyq----FRNRPYIFLTSRVHPGETNASWVMKGT  927
purpuratus   694  LSKLQAslgeFSDIYYRKQSLCLSLGGNECPVLTITANPTTLDKEGVlq----FRCRRYLFLSARVHPGESNSSYIMKGT  769
zebrafish    785  LQKLSAlc--TPEIYYRQEDLCETLGGNGCPLLTITAMPESSSDEHIsq----FRSRPVIFLSARVHPGETNSSWVMKGS  858
human        859  LQKLESahn-PQQIYFRKDVLCETLSGNSCPLVTITAMPESNYYEHIch----FRNRPYVFLSARVHPGETNASWVMKGT  933
chicken      876  LQKLESmhn-PQQIYFRQDALCETLGGNICPIVTITAMPESNYYEHIcq----FRNRPYIFLSARVHPGETNASWVMKGT  950
sea squirt   902  LQNLMSnid-TSTMYVRQQTLCNTLSGNSVPVLTITSMPISDNAGSAieavheLRSRPYIFLSARVHPGETNASWTMRGT  980
trichoplax   218  LEHIRNvad-NNSIYFKCQELCLTLNGNVCNLMTITNSPNKERLNDDky----VPRRPYIFLSARVHPGESNSSWIMKGL  292

Feature (#): Arg970, Asn979, Arg980
sheep_A      934  LEYLMSn-------------nPTAQSLRESYIFKIVPMLNPDGVINGNHPCSLSGEDLNRQWQSPNPDLHPTIYHAKGLL  1000
sheep_N      934  LEYLMSn-------------nPTAQSLRESYIFKIVPMLNPDGVINGNHRCSLSGEDLNRQWQSPNPDLHPTIYHAKGLL  1000
mice         868  LEYLMSn-------------sPTAQSLRESYIFKIVPMLNPDGVINGNHRCSLSGEGLNRQWQSPNPELHPTIYHAKGLL  934
frog         928  LEFLMGs-------------sATAQSLRESYIFKIVPMLNPDGVINGNHRCSLSGEDLNRQWQNPNSDLHPTIYHTKGLL  994
purpuratus   770  LKFLMSt-------------hPTAQALREIFIFKIVPMLNPDGVINGSHRCSLSGEDLNRRWLQPIKELHPTIYHTKGLL  836
zebrafish    859  LEFLMSc-------------sPQAQSLRQSYIFKIMPMLNPDGVINGNHRCSLSGEDLNRQWQNPNAELHPTIYHAKSLL  925
human        934  LEYLMSn-------------nPTAQSLRESYIFKIVPMLNPDGVINGNHRCSLSGEDLNRQWQSPSPDLHPTIYHAKGLL  1000
chicken      951  LEYLMSs-------------nPSAQSLRESYIFKIIPMLNPDGVINGNHRCSLSGEDLNRQWQNPNPDLHPTIYHAKGLL  1017
sea squirt   981  LKLLLTppasqdqpediriiaSIAEDLRKSYIFKIIPMLNPDGVINGHHRCSLSGEDLNRTWLDPNPQFFPTIYHTKGLL  1060
trichoplax   293  LDFITSd-------------dDCAIQLRESYIFKIIPMLNPDGVVNGCHRCSLSGHDLNRCWISPDPRIHPTIYHTKGLI  359

Feature (#): His1017, Gly1018, Met1027
sheep_A     1001  QYLAAVKRLPLVYCDYHGHSRKKNVFMYGCSiketvwhtndnaapcdvvedvGYRTLPKILSHIAPAFCMSSCSFVVEKS  1080
sheep_N     1001  QYLAAVKRLPLVYCDYHGHSRKKNVFMYGCSiketvwhtndnaapcdvvedvGYRTLPKILSHIAPAFCMSSCSFVVEKS  1080
mice         935  QYLAAVKRLPLVYCDYHGHSRKKNVFMYGCSiketvwhthdnsascdivegmGYRTLPKILSHIAPAFCMSSCSFVVEKS  1014
frog         995  QYLSAIKRVPLVYCDYHGHSRKKNVFMYGCSiketvwhtnanaascdmveesGYKTLPKVLNQIAPAFSMSSCSFVVEKS  1074
purpuratus   837  QYLKLIERSPLVYCDFHGHSRKKNVFMYGNSaamshspedl-ikfgipsgvsGAKTLPRILSSIAPAFSHSNCSFAVDKG  915
zebrafish    926  QYLRATGRTPLVFCDYHGHSRKKNVFMYGCSiketmwqssvntstcdlnedlGYRTLPKLLSQMTPAFSLSSCSFVVERS  1005
human       1001  QYLAAVKRLPLVYCDYHGHSRKKNVFMYGCSiketvwhtndnatscdvvedtGYRTLPKILSHIAPAFCMSSCSFVVEKS  1080
chicken     1018  QYLAAIKRLPLVYCDYHGHSRKKNVFMYGCAiketmwhtnvntascdlmedpGYRVLPKILSQTAPAFCMGSCSFVVEKS  1097
sea squirt  1061  QYLQSIQKAPLVYCDYHGHSRKMNVFMYGCShkdsaaaget-sttvggpddtGFKTLPKLLNHFAPSFSLKGCSYVVEKS  1139
trichoplax   360  QYMVTIGKSPLIFCDFHGHSRRKNIFTYACYpytntn---------hraedqMLRALPRALQEVSPVFSYQLCSFAVEKC  430

Feature (#): Glu1102
sheep_A     1081  KESTARVVVWREIGVQRSYTMESTLCGCDQGKYKQLQIGTRELEEMGAKFCVGLLR  1136
sheep_N     1081  KESTARVVVWREIGVQRSYTMESTLCGCDQGKYKQLQIGTRELEEMGAKFCVGLLR  1136
mice        1015  KESTARVVVWREIGVQRSYTMESTLCGCDQGRYKQLQIGTRELEEMGAQFCVGLLR  1070
frog        1075  KESTARVVVWKEIGVQRSYTMESTLCGCDQGKYKDLQIGTKELEEMGAKFCVGLLR  1130
purpuratus   916  KESAARVVVWREIGVARSYTMESTYCGFDQGKYKQLQVTTAMLEEMGQHFCEGLYK  971
zebrafish   1006  KETTARVVVWREIGVQRSYTMESTLCGCDQGKYKQLQIGTAELEEMGSQFCLALLR  1061
human       1081  KESTARVVVWREIGVQRSYTMESTLCGCDQGKYKQLQIGTRELEEMGAKFCVGLLR  1136
chicken     1098  KESTARVVVWREIGVQRSYTMESTLCGCDQGKYKQLQIGTKELEEMGAKFCVGLLR  1153
sea squirt  1140  KVTTARVVVWKEIGVARSYTMESTYCGCDQGRYRQLQVGTKELEEMGARFVEVLLH  1195
trichoplax   431  KESTARVVLWRELCILRSYTMESTYCGMDQGVYKQYHINTTALEDMGKDFCRGLLK  486
Figure S1. Conserved amino acids predicted by “CD-search”. Alignment of the
complete predicted sheep AGTPBP1 protein with the 8 most dissimilar protein sequences
from human, mouse, chicken, frog, purpuratus, sea squirt, trichoplax and zebrafish
predicted 2 specific feature hits consisting of 9 conserved amino acid residues.
His920, Glu923 and His1017, labeled by hash-marks on the top, form the zinc-binding
motif “HXXE…H”, conserved feature 1. Conserved feature 2 represents the
metallocarboxypeptidase active sites, consisting of all 9 conserved residues including
the zinc-binding residues. These residues are His920, Glu923, Arg970, Asn979, Arg980,
His1017, Gly1018, Met1027 and Glu1102, which are labeled by hash-marks on the top.
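The HXXE part of the zinc-binding motif and the residue numbering can be checked directly against the sheep sequence in Figure S1. The sketch below is a small illustration, not part of the CD-search analysis. The fragment is the ungapped sheep_N sequence for residues 859 to 933 transcribed from the figure, and the helper name and the regular expression `H..E` are assumptions introduced here.

```python
import re

# Ungapped sheep_N residues 859-933, transcribed from the first alignment block
# of Figure S1 (lower-case unaligned residues converted to upper case).
FRAGMENT_START = 859
FRAGMENT = (
    "LQKLESAHNPQQIYFRKDVLCETLSGNSCPLVTITAMPESNYYEHICQ"
    "FRNRPYVFLSARVHPGETNASWVMKGT"
)


def residue_at(position):
    """Return the amino acid at a 1-based protein position inside the fragment."""
    return FRAGMENT[position - FRAGMENT_START]


# "H..E" is a regular-expression rendering of the HXXE part of the motif.
match = re.search(r"H..E", FRAGMENT)
motif_position = FRAGMENT_START + match.start()

print(residue_at(920), residue_at(923), motif_position)
```

In this fragment the first HXXE match begins at position 920, consistent with the His920 and Glu923 assignments in the caption.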
sheep   MSKLKVIPEKSLANNSRIVGLLAQLEKINTESAESDTARYVTAKILHLAQSQEKTRREMT 60
cattle  MSKLKVIPEKSLTNNSRIVGLLAQLEKINTESTESDTARYVTSKILHLAQSQEKTRREMT 60
human   MSKLKVIPEKSLTNNSRIVGLLAQLEKINAEPSESDTARYVTSKILHLAQSQEKTRREMT 60
mouse   MSKLKVVGEKSLTNSSRVVGLLAQLEKINTDSTESDTARYVTSKILHLAQSQEKTRREMT 60

sheep   AKGSTGMEVLLSTLENTKDLQTTLNILSILVELVSAGGGRRASFLVSKGGSQILLQLLMN 120
cattle  AKGSTGMEVLLSTLENTKDLQTTLNILSILVELVSAGGGRRASFLVSKGGSQILLQLLMN 120
human   AKGSTGMEILLSTLENTKDLQTTLNILSILVELVSAGGGRRVSFLVTKGGSQILLQLLMN 120
mouse   TKGSTGMEVLLSTLENTKDLQTVLNILSILIELVSSGGGRRASFLVAKGGSQILLQLLMN 120

sheep   ASKESPLNEELMVQIHSILAKIGPKDKKFGVKARINGALNITLNLVKQNLQNHRLVLPCL 180
cattle  ASKESPLNEELMVQIHSILAKIGPKDKKFGVKARINGALNITLNLVKQNLQNHRLVLPCL 180
human   ASKESPPHEDLMVQIHSILAKIGPKDKKFGVKARINGALNITLNLVKQNLQNHRLVLPCL 180
mouse   ASKDSPPHEEVMVQTHSILAKIGPKDKKFGVKARVNGALTVTLNLVKQHFQNYRLVLPCL 180

sheep   QLLRVYSANSVNSVSLGKNGVVELMFKIIGPFSKKNSSLMKVALDTLAALLKSKTNARRA 240
cattle  QLLRVYSANSVNSVSLGKNGVVELMFKIIGPFSKKNSNLMKVALDTLAALLKSKTNARRA 240
human   QLLRVYSANSVNSVSLGKNGVVELMFKIIGPFSKKNSSLIKVALDTLAALLKSKTNARRA 240
mouse   QLLRVYSTNSVNSVSLGKNGVVELMFKIIGPFSKKNSGLMKVALDTLAALLKSKTNARRA 240

sheep   VDRGYVQVLLTIYVDWHRHDNRHRNMLIRKGILQSLKSVTNIKLGRKAFIDANGMKILYN 300
cattle  VDRGYVQVLLTIYVDWHRHDNRHRNMLIRKGILQSLKSVTNIKLGRKAFIDANGMKILYN 300
human   VDRGYVQVLLTIYVDWHRHDNRHRNMLIRKGILQSLKSVTNIKLGRKAFIDANGMKILYN 300
mouse   VDRGYVQVLLTIYVDWHRHDNRHRNMLIRKGILQSLKSVTNIKLGRKAFIDANGMKILYN 300

sheep   TSQECLAVRTLDPLVNTSSLIMRKCFPKNRLPLPTIKSSFHFQLPVIPVTGPVAQLYSLP 360
cattle  TSQECLAVRTLDPLVNTSSLIMRKCFPKNRLPLSTIKSSFHFQLPVIPVTGPVAQLYSLP 360
human   TS----------------------------------------QLPVIPVTGPVAQLYSLP 320
mouse   TSQECLAVRTLDPLVNTSSLIMRKCFPKNRLPLPTIKSSFHFQLPIIPVTGPVAQLYSLP 360

sheep   PEVDDVVDESDDNDDIDLEAENETENEDDLDQNFKNDDIETDINKLKPQQEPGRTIEELK 420
cattle  PEVDDVVDESDDNDDIDLEAENETENEDDLDQNFKNDDIETDINKLKPQQEPGRTIEELK 420
human   PEVDDVVDESDDNDDIDVEAENETENEDDLDQNFKNDDIETDINKLKPQQEPGRTIEDLK 380
mouse   PEVDDVVDESDDNDDIDLEVENELENEDDLDQSFKNDDIETDINKLRPQQVPGRTIEELK 420
Figure S2. Characteristics of the AGTPBP1 protein. Letters highlighted in green denote
the aspartic acid-rich domain; pink corresponds to the ATP-binding domain; red to the
zinc-binding site; and blue to the carboxypeptidase catalytic site and substrate-binding
sites. The entire enzymatic domain is indicated by bold lettering.
[Upper panel: schematic of the AGTPBP1 protein with a residue scale marked at 200, 400, 600, 800 and 1000]

Agtpbp1_mouse  ICPLVTITAMPESNYYEHICQFR-TRPYIFLSARVHPGETNASWVMKGTLEYLMS---NS
AGTPBP1_sheep  SCPLVTITAMPESNYYEHICQFR-NRPYVFLSARVHPGETNASWVMKGTLEYLMS---NN
CPA2_human     PMNVLKF------------STGG-DKPAIWLDAGIHAREWVTQATALWTANKIVSDYGKD
CPA_cattle     PIYVLKF------------STGGSNRPAIWIDLGIHSREWITQATGVWFAKKFTENYGQN

* marks Arg970 of sheep AGTPBP1 (the R in INGNHRCSLS)
Agtpbp1_mouse  PTAQSLRESYIFKIVPMLNPDGV-----INGNHRCSLS---------GEGLNRQW----Q
AGTPBP1_sheep  PTAQSLRESYIFKIVPMLNPDGV-----INGNHRCSLS---------GEDLNRQW----Q
CPA2_human     PSITSILDALDIFLLPVTNPDGYVFSQTKNRMWRKTRSKVSAGSLCVGVDPNRNWDAGFG
CPA_cattle     PSFTAILDSMDIFLEIVTNPNGFAFTHSENRLWRKTRSKVTSSSLCVGVDANRNWDAGFG

Agtpbp1_mouse  SP----NP---ELH-------PTIYHAKGLLQYLAAVKRLPLVYCDYHGHSRKKNVFMYG
AGTPBP1_sheep  SP----NP---DLH-------PTIYHAKGLLQYLAAVKRLPLVYCDYHGHSRKKNVFMYG
CPA2_human     GPGASSNPCSDSYHGPSANSEVEVKSIVDFIKSHG----KVKAFIILH--S-YSQLLMFP
CPA_cattle     KAGASSSPCSETYHGKYANSEVEVKSIVDFVKNHG----NFKAFLSIH--S-YSQLLLYP

Agtpbp1_mouse  CSIKETVWHTHDNSASCDIVEGMGYRTLPKILSHIAPAFCMSSCSFVVEKSKESTARVVV
AGTPBP1_sheep  CSIKETVWHTNDNAASCDVVEDVGYRTLPKILSHIAPAFCMSSCSFVVEKSKESTARVVV
CPA2_human     YGYKCT--KLDD-FDELSEVAQKAAQSLSRLHGTKY--KVGPICSVIYQASGGSIDWSYD
CPA_cattle     YGYT-T-QSIPD-KTELNQVAKSAVAALKSLYGTSY--KYGSIITTIYQASGGSIDWSYN

Agtpbp1_mouse  WREIGVQRSYTMESTLCGCDQGRYKGLQIGTRELEEMGAQFCVGLLRLKRLTSSLEYNLP
AGTPBP1_sheep  WREIGVQRSYTMESTLCGCDQGKYKGLQIGTRELEEMGAKFCVGLLRLKRLTSPLEYNLP
CPA2_human     Y---GIKYSFAFELR----DTGRYG-FLLPARQILPTAEETWLGLKAIMEHVRDHPY--
CPA_cattle     Q---GIKYSFTFELR----DTGRYG-FLLPASQIIPTAQETWLGVLTIMEHTVNN-----
Figure S3. Secondary structure modeling of the AGTPBP1 protein indicates critical
amino acid residues. The fold structure of the carboxypeptidase (CP) domain (orange
box in upper panel) of the AGTPBP1 protein in sheep and mouse was predicted using
the 3D-PSSM software. The sequences of CPA2_human (PDB ID: 1DTD) and
CPA_cattle (PDB ID: 1HDU), which have known tertiary structures, were also included
in the alignment. Green highlighting indicates β-sheet and blue highlighting indicates
α-helix. The conserved amino acids critical for substrate binding and catalysis
are in red (His920, Glu923, Arg970, Asn979, Arg980 and Glu1102). The location
of arginine 970 in the sheep AGTPBP1 protein is marked with an asterisk above it. The
numbering is based on the sheep AGTPBP1 protein.
CHAPTER 4. IN A SHAKE OF A LAMB’S TAIL: USING GENOMICS TO
UNRAVEL A CAUSE OF CHONDRODYSPLASIA IN TEXEL SHEEP
A paper published in Animal Genetics 2012;43 Suppl 1:9-18.
Xia Zhao1, Suneel K Onteru1, Susan Piripi2,3, Keith G Thompson2, Hugh T Blair2, Dorian J
Garrick1,2, Max F Rothschild1
1 Department of Animal Science and Center for Integrated Animal Genomics, Iowa State
University, Ames, IA 50011, USA
2 Institute of Veterinary, Animal & Biomedical Sciences, Massey University, Palmerston
North 4474, New Zealand
3 College of Veterinary Medicine, Oregon State University, Corvallis, OR 97331, USA
Running title: the SLC13A1 gene responsible for Texel chondrodysplasia
Keywords: deletion; SLC13A1 gene; chondrodysplasia; Texel sheep; genetic disease
SUMMARY
Chondrodysplasia in Texel sheep is a recessively inherited disorder characterized by
dwarfism and angular deformities of the forelimbs. A genome-wide association study using
the Illumina OvineSNP50 BeadChip on 15 sheep diagnosed as affected and 8 carriers
descended from 3 affected rams was conducted to uncover the genetic cause. A homozygous
region of 25 consecutive SNP loci was identified in all affected sheep, covering a region of
1 Mbp on ovine chromosome 4. Seven positional candidate genes, including solute carrier
family 13, member 1 (SLC13A1), were identified and used to search for new single nucleotide
polymorphisms (SNPs) for fine mapping of the causal locus. The SLC13A1 gene, encoding a
sodium/sulfate transporter, was the primary candidate gene due to similar phenotypes
observed in the Slc13a1 knockout mouse model. We discovered a 1-bp deletion of T
(g.25513delT) at position 107 bp of exon 3 in the SLC13A1 gene. Genotyping by direct
sequencing and restriction fragment length polymorphism analysis for this mutation showed
that all 15 affected sheep were “g.25513delT/g.25513delT”, the 8 carriers were
“g.25513delT/T” and 54 normal controls were “T/T”. The mutation g.25513delT shifts the
open reading frame of SLC13A1, introducing a stop codon and truncating the C-terminal
amino acids. It was concluded that the g.25513delT mutation in the SLC13A1 gene was
responsible for the chondrodysplasia seen in these Texel sheep. This knowledge can be used
to identify carriers of the defective g.[25513delT] allele in order to avoid at-risk matings,
improve animal welfare and decrease economic losses.
INTRODUCTION
Nearly 30 years have passed since the seminal research of Soller and colleagues (Soller &
Beckman 1983; Beckman & Soller 1983), which described the use of genetic markers for
genetic improvement of crops and livestock. Considerable advancements have allowed
animal geneticists to springboard from these original marker papers to the development of
genetic maps, quantitative trait loci analyses and on to genome sequencing and the resultant
development of high-density panels of single nucleotide polymorphisms (SNPs). These
recent tools greatly facilitate the exploration of genetic abnormalities in livestock species.
Chondrodysplasia is a general term for diseases of cartilage development that are present
in human beings and many animal species. Other terms such as achondroplasia,
achondrogenesis, hypochondrogenesis and chondrodystrophy have been used to describe
various forms of chondrodysplasia in the published clinical and research literature
(Thompson et al. 2005; Martínez et al. 2000; Superti-Furga et al. 1999; Zankl et al. 2008;
Hastbacka et al. 1999). Chondrodysplasia is primarily caused by defective genes affecting
normal chondrogenesis and cartilage development; these defects manifest phenotypically as
abnormal shape and structure of the skeleton.
Mutations in several genes related to the synthesis of collagen or proteoglycans, or to
signal-transduction mechanisms, have been reported to be involved in inherited
chondrodysplasia. Three forms of chondrodysplasia in humans, including achondrogenesis 1B,
atelosteogenesis type II and diastrophic dysplasia, are due to mutations in the solute carrier
family 26, member 2 (SLC26A2) gene, which encodes a sulfate transporter protein (DTDST).
The impaired sulfate uptake affects the normal development of cartilage (Superti-Furga et al.
1999; Hastbacka et al. 1996; Hastbacka et al. 1999). A nonsense mutation of the collagen,
type X, alpha 1 (COL10A1) gene has been found to be responsible for Schmid type
metaphyseal chondrodysplasia (MCDS) in a human patient (Woelfle et al. 2011). A
multibreed association study in domestic dogs demonstrated that an evolutionarily recently
acquired fgf4 retrogene, an extra copy of the fibroblast growth factor 4 gene (FGF4), was
strongly associated with chondrodysplasia (Parker et al. 2009). The most common inherited
chondrodysplasia of sheep is the “spider lamb syndrome” (SLS) in the Suffolk breed. The
mutation responsible for the disease is a non-synonymous T > A transversion in the highly
conserved tyrosine kinase II domain of the fibroblast growth factor receptor 3 (FGFR3) gene
(Cockett et al. 1999; Beever et al. 2006).
Chondrodysplasia in Texel sheep has been determined by outcross and backcross
experiments to be a simple autosomal recessive disorder (Byrne et al. 2008). It was
distinguished from SLS and other chondrodysplasias of sheep by gross phenotypic
observations and microscopic analysis (Thompson et al. 2005; Piripi 2008).
Chondrodysplasia in Texel sheep is phenotypically characterized by disproportionate
dwarfism, a stocky appearance with shortened neck and limbs, bilateral varus deformities of
the forelimbs, flexion of the carpal joints, a barrel-shaped chest and a wide-based stance
(Fig. 1A & 1B). Consistent lesions were observed grossly and microscopically in many
cartilaginous tissues, especially in hyaline cartilage. The marked narrowing of the tracheal
lumen (Fig. 1C) in the affected lambs could result in sudden death due to tracheal collapse.
Immunohistochemistry of cartilage in Texel chondrodysplasia demonstrated that both types
II and VI collagen were distributed abnormally in the matrix, suggesting that the extracellular
assembly of these collagen types was disturbed. Moreover, a significantly reduced level of
chondroitin 4-sulfate was observed using capillary electrophoresis in chondrodysplastic
sheep compared to age-matched controls (Piripi et al. 2011). Most cases could be
diagnosed at 14 days of age, but all those showing a characteristic chondrodysplastic
appearance by approximately 4 months of age were diagnosed as affected.
Because the sulfation pattern resembles that of achondrogenesis 1B and diastrophic dysplasia
in humans, the ovine SLC26A2 gene was previously considered a prime candidate gene for
this disease, but no mutations were identified in the 85.4% of exonic DNA sequenced (Byrne
2005). That initial investigation was followed by targeted linkage disequilibrium (LD)
mapping to locate the causative mutation for this disorder. However, it again failed, since the
microsatellite markers were selected only from ovine chromosomes 1, 5, 6, 13 and 22 based
on the limited known locations of functional candidate genes or their linkage to
chondrodysplastic diseases in other species (Piripi 2008).
In order to understand the genetic basis for chondrodysplasia in Texel sheep, we
performed homozygosity mapping through whole-genome analysis using the Illumina
OvineSNP50 BeadChip to identify the genomic region containing the mutation, and then
identified the disease-causing mutation by direct examination of the positional candidate
genes.
MATERIALS AND METHODS
Ethics statement
Approvals dealing with DNA or tissue collection from animals were obtained from the
Massey University Animal Ethics Committee (APPROVAL NUMBER 05/123). Iowa State
University Animal Care & Use Committee did not require additional approvals.
Animals
Three related affected Texel rams were out-crossed to 221 unrelated phenotypically normal
mixed-breed ewes. Then 123 F1 hogget daughters were backcrossed to the same 3 rams, and
these matings produced 46 surviving back-cross lambs. In another mating trial, 22
mixed-breed ewes, which either had chondrodysplasia or had previously produced
chondrodysplastic offspring, were mated to the 3 affected Texel rams and produced 26 lambs.
The 3 affected rams and the 22 ewes with a chondrodysplasia background were purchased
from the Southland property in New Zealand which was the original source of this disease.
A total of 23 lambs were genotyped using the OvineSNP50 BeadChip for genome-wide
homozygosity mapping and the subsequent fine-mapping experiments. These lambs included
11 affected (5 males and 6 females) and 7 presumed carriers (4 males and 3 females) chosen
from the back-cross lambs, and 4 affected females with 1 presumed male carrier chosen from
the lambs produced in the second mating trial. The pedigree information for these 23 lambs
is provided (Fig. S1). Fine mapping also included 54 Romney control sheep from an
unrelated population without evidence of this inherited chondrodysplasia.
DNA extraction
DNA was extracted from samples of spleen collected during necropsy. Most samples were
extracted using the Wizard genomic DNA extraction kit (Promega, Madison, WI, USA).
DNA for an additional 2 samples was extracted with a GenElute mammalian genomic DNA
miniprep kit (Sigma-Aldrich Corp., St. Louis, MO, USA) and a High Pure PCR template
preparation kit (Roche Diagnostics, Mannheim, Germany). The DNA extractions followed
the standard procedures for each kit.
Genome wide homozygosity mapping
DNA samples from the 23 Texel crossbred sheep were genotyped using the Ovine SNP50
BeadChip (Illumina, San Diego, CA, USA) with standard procedures at GeneSeek Inc.,
Lincoln, NE, USA. The homozygosity mapping was conducted using the same in-house IBD
(identical by descent) approach, implemented using command-line UNIX tools, as described
previously (Zhao et al. 2011). This approach identified common homozygous segments in
the affected lambs. Among those segments, it was presumed that one would contain the
mutation associated with chondrodysplasia in Texel sheep. Once the targeted IBD segments
were identified, a list of positional candidate genes was obtained using comparative
maps between the ovine and bovine genomes accessed through the Ovine Genome Assembly
v1.0.
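The idea behind the homozygosity mapping can be sketched as a scan for runs of consecutive SNPs at which every affected animal carries the same homozygous genotype. The code below is a minimal illustration under stated assumptions, not the authors' in-house pipeline, which was implemented with UNIX command-line tools. The genotype coding ("AA", "AB", "BB"), the function name and the toy data are hypothetical; the default minimum run length of 11 reflects the "larger than 10-SNP" threshold mentioned in the text.

```python
def shared_homozygous_runs(genotypes, min_snps=11):
    """Find runs of consecutive SNPs where all animals share one homozygous call.

    genotypes: one list per affected animal, each holding calls such as
    "AA", "AB" or "BB" for the same ordered set of SNPs.
    Returns (first_index, last_index) pairs for runs of at least min_snps SNPs.
    """
    n_snps = len(genotypes[0])
    runs, start = [], None
    for i in range(n_snps):
        calls = {animal[i] for animal in genotypes}
        shared = len(calls) == 1 and next(iter(calls)) in ("AA", "BB")
        if shared and start is None:
            start = i
        elif not shared and start is not None:
            if i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))
    return runs


# Toy data: three affected animals, 20 ordered SNPs, one shared run of 12 SNPs.
animal_1 = ["AB"] * 3 + ["AA"] * 12 + ["AB"] * 5
animal_2 = ["AA"] * 3 + ["AA"] * 12 + ["BB"] * 5
animal_3 = ["BB"] * 3 + ["AA"] * 12 + ["AA"] * 5
runs = shared_homozygous_runs([animal_1, animal_2, animal_3])
print(runs)
```

A real analysis would also need to handle missing calls and map SNP indices back to chromosome positions, which this sketch leaves out.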
Copy number variation (CNV) analysis
The CNV analysis based on the whole-genome SNP chip genotyping data was carried out
with the cnvPartition software, a plug-in within the GenomeStudio program (v1.0,
Illumina). The cnvPartition software estimated the copy number at each SNP probe on
the OvineSNP50 BeadChip within each individual. Three files, the sample sheet, the
intensity file and the manifest file, were required for data evaluation in this program.
Software parameters were kept at the developers’ default values. The CNV value and
confidence for chromosomal regions in each sample were computed. The CNV value
represents the estimated copy number for each allele. The results of the CNV analyses
were viewed as the heat map produced by the cnvPartition software. Each column in the
heat map represents the copy numbers of SNP loci across the genome for one individual
animal.
New genetic variants discovery
In order to amplify exons and adjacent genomic sequences for the sheep positional candidate
genes located in the homozygous region, primers (IDT, Coralville, IA, USA) were designed
using the online Primer3 tool (Rozen & Skaletsky 2000), targeting the sheep genomic
sequences downloaded from the Ovine Genome Assembly v1.0 (Table S1).
PCR was performed in a 10 µl reaction, which included 1x GoTaq reaction buffer
containing 1.5 mM Mg2+ (Promega, Madison, WI, USA), 0.125 mM of each dNTP, 2.5 pM
of each primer, 12.5 ng of genomic DNA and 0.25 units of GoTaq DNA polymerase
(Promega, Madison, WI, USA). The PCR program was 35 cycles of 30 s at 94°C, 30 s at the
corresponding annealing temperature and 30 s at 72°C, with an additional 3 min of
denaturation in the first cycle and 5 min of extension in the last cycle. All reactions included
negative controls to check for the presence of contaminants. PCR products were verified by
1.5% agarose gel electrophoresis and treated with ExoSAP-IT®. The PCR pools from 2
affected sheep, 2 carriers and 2 normal controls were used for commercial re-sequencing on
the 3730xl DNA Analyzer. The sequences were aligned and compared to discover new genetic
variants using Sequencher software version 3.0.
Mutation analyses
The cattle reference SLC13A1 mRNA (GenBank: NM_001193219) was used as a query in
cross-species BLAST searches to identify sheep ESTs. The sheep re-sequencing results
using templates from the affected, carrier and normal control sheep, along with the identified
sheep ESTs, were used to predict a complete ovine SLC13A1 mRNA sequence. The ovine
mRNA sequence was then translated into a protein sequence using the ExPASy
translate tool. The predicted sheep SLC13A1 protein sequence was compared to proteins in
other species including cattle (Bos taurus, GenBank: DAA30458), human (Homo sapiens,
GenBank: NP_071889), mouse (Mus musculus, GenBank: AAH22672), rat (Rattus
norvegicus, GenBank: NP_113839), dog (Canis lupus familiaris, GenBank: XP_539548) and
zebrafish (Danio rerio, GenBank: NP_954975) using the CLUSTALW alignment tool to
check the completeness of the sheep protein sequence and the conserved regions. The
topology prediction of transmembrane domains (TMDs) was undertaken using the online tool
TopPred 0.01. The upper and lower cutoff values were set to 1 and 0.6. Each genetic variant
was analyzed to check whether it was a missense or nonsense mutation or whether it could
shift the open reading frame of the candidate gene. Every suspected genetic variant was
genotyped on all the sheep samples to check whether it was concordant with the recessive
inheritance pattern of the disease.
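The variant classes named above can be illustrated with a short sketch that translates a reference and an altered coding sequence and compares the results. This is not the authors' procedure; the toy sequence, the function names and the simplified rules are assumptions made for illustration. In-frame insertions or deletions, for example, are not treated separately.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(ref_cds, alt_cds):
    """Classify a variant with deliberately simplified rules (illustrative only)."""
    if (len(ref_cds) - len(alt_cds)) % 3 != 0:
        return "frameshift"
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if alt_protein == ref_protein:
        return "synonymous"
    if len(alt_protein) < len(ref_protein):
        return "nonsense"
    return "missense"


# Hypothetical 18-bp coding sequence (not the SLC13A1 sequence): M A L D R stop.
REF = "ATGGCTTTAGATCGATAA"
one_base_deletion = REF[:6] + REF[7:]        # delete one T, as in a delT variant
missense_change = REF[:13] + "C" + REF[14:]  # CGA -> CCA
nonsense_change = REF[:12] + "T" + REF[13:]  # CGA -> TGA

print(classify(REF, one_base_deletion), translate(one_base_deletion))
print(classify(REF, missense_change), translate(missense_change))
print(classify(REF, nonsense_change), translate(nonsense_change))
```

In the toy sequence, deleting a single T moves a stop codon into frame and truncates the protein, which is the same kind of consequence described for g.25513delT.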
Mutation “g.25513delT” genotyping
Direct sequencing and PCR restriction enzyme digestion (PCR-RFLP) were carried out to
genotype the suspected causative mutation “g.25513delT” on all Texel crossbred sheep with
clear chondrodysplasia status as well as 54 Romney control sheep. The restriction enzyme
digestion procedure was as follows: DNA fragments from PCR reactions were digested
with HpyAV (New England Biolabs Inc., USA). The 10 µl reaction mix contained 1 µl of
NEB buffer 4, 3 units of HpyAV (1.5 µl), 0.1 µl of BSA (100X) and 3 µl of PCR products.
The reaction mix was then incubated at 37°C overnight, and the products were analyzed on a
2.5% (w/v) ultra-pure agarose gel.
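The stated volumes can be checked with simple arithmetic. The sketch below assumes that the remainder of the 10 µl reaction is water, which the text does not state; the variable names are hypothetical.

```python
# Volumes (in µl) stated in the text for the HpyAV digest.
components_ul = {
    "NEB buffer 4": 1.0,
    "HpyAV": 1.5,
    "BSA (100X)": 0.1,
    "PCR product": 3.0,
}
TOTAL_UL = 10.0

# Assumption: the unlisted remainder of the 10 µl reaction is water.
water_ul = round(TOTAL_UL - sum(components_ul.values()), 2)

# 3 units of enzyme delivered in 1.5 µl implies this stock concentration.
enzyme_units_per_ul = 3 / components_ul["HpyAV"]

# Final BSA concentration, expressed as a multiple of working (1X) strength.
bsa_final_x = round(100 * components_ul["BSA (100X)"] / TOTAL_UL, 2)

print(water_ul, enzyme_units_per_ul, bsa_final_x)
```

Under that assumption, the listed components account for 5.6 µl and leave 4.4 µl to be made up, with BSA at 1X in the final mix.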
RESULTS
Identity by descent region
The IBD analysis narrowed the defective region down to a single 1 Mb homozygous segment
consisting of 25 consecutive SNP loci on sheep chromosome 4 (OAR 4) in all 15 affected
sheep, by using a larger than 10-SNP threshold in our IBD approach (Fig. S2). This region
was heterozygous for 7 presumed carriers but not for lamb 14-04. Lamb 14-04 was
phenotypically normal but produced affected offspring. It was considered a carrier, but
was homozygous in the 1 Mb segment with the same haplotype as affected sheep. The 25-SNP
homozygous segment was defined by SNP OAR4_92447165.1 to SNP OAR4_93438733.1 and
spanned a chromosomal region between 92.45 Mb and 93.44 Mb (Fig. 2A & 2B). It encompassed
7 genes based on the cow reference sequences, including Ca2+-dependent activator protein
for secretion 2 (CADPS2) (OAR4:92155977..92738612), solute carrier family 13 member 1
(SLC13A1) (OAR4:92888272..92993056), IQ motif and ubiquitin domain containing (IQUB)
(OAR4:93209447..93265179), ring finger protein 148 (RNF148) (OAR4:92554935..92556138),
ankyrin repeat and SOCS box-containing 15 (ASB15) (OAR4:93343952..93353409), leiomodin 2
(cardiac) (LMOD2) (OAR4:93370342..93377214) and Wiskott-Aldrich syndrome-like (WASL)
(OAR4:93431017..93481693) (Fig. 2C). The SLC13A1 gene, spanning over 79 kb in
length, was the most plausible candidate due to its biological functions and previous reports
of its involvement with chondrodysplasia in other species. Most of the subsequent effort
then focused on this gene in further mutation analyses.
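The search for homozygous segments shared by all affected animals can be illustrated with a small sketch. It assumes genotypes coded as "AA", "AB" or "BB" and listed in map order for each animal; the function name, the coding and the toy data are hypothetical and do not reproduce the software used in this study.

```python
def shared_homozygous_runs(genotypes, min_snps=11):
    """Find runs of consecutive SNPs where all animals share one homozygous call.

    genotypes: list of per-animal lists of calls ("AA", "AB", "BB") in map order.
    min_snps:  minimum run length; 11 corresponds to "larger than 10 SNPs".
    Returns a list of (first_index, last_index) tuples.
    """
    n_snps = len(genotypes[0])
    runs, start = [], None
    for i in range(n_snps):
        calls = {animal[i] for animal in genotypes}
        shared = len(calls) == 1 and next(iter(calls)) in ("AA", "BB")
        if shared and start is None:
            start = i                      # a new candidate run begins
        elif not shared and start is not None:
            if i - start >= min_snps:      # keep only runs above the threshold
                runs.append((start, i - 1))
            start = None
    if start is not None and n_snps - start >= min_snps:
        runs.append((start, n_snps - 1))   # run extends to the last SNP
    return runs


# Toy data: 3 animals, 30 SNPs, one shared 12-SNP run and one short 3-SNP run.
n = 30
animal0 = ["AA"] * n
animal1 = ["AB"] * n
for i in list(range(5, 17)) + [20, 21, 22]:
    animal1[i] = "AA"
animal2 = ["AA"] * n
toy_genotypes = [animal0, animal1, animal2]
print(shared_homozygous_runs(toy_genotypes))
```

The short run is discarded by the threshold, which mirrors the way only segments longer than 10 SNPs were retained as candidate regions.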
CNV analysis
The heat map from CNV analyses showed copy numbers of SNP loci for each individual
sheep. No CNVs were discovered that were associated with chondrodysplasia in Texel sheep.
There were only two loci with CNV values from 0 to 0.5 on OAR1 and OAR3, respectively
(the dark red horizontal bars in Fig. S3). However, the 2 CNVs with rare copy numbers
showed up mostly among the first 12 animals on the heat map, and did not occur specifically
in either the affected group or the carrier group (Fig. S3). Therefore, the CNV analysis was
not pursued further.
Mutation analysis
Two corresponding ovine EST entries (GenBank: FE030665.1, FE030666.1) were found
using the cattle reference SLC13A1 mRNA as a query. The putative ovine genomic structure
was determined by alignments of the 2 ovine ESTs, the complete bovine SLC13A1 mRNA
and the human SLC13A1 mRNA. Even though the human SLC13A1 gene contains 15 exons
and spans 83 kb (Lee et al. 2000), the alignments with the
bovine homolog showed the presence of exon 16 in both bovine and ovine SLC13A1 mRNA,
in which exon 16 contained only a long 3’ untranslated region. The “GT-AG” splice site
groups in the ovine SLC13A1 gene support this genomic structure. The translated protein
length ranged from 583 to 597 amino acids in the 7 species. In sheep and cattle, it is 597
amino acids long, 2 amino acids longer than in humans. There are many conserved regions
in this protein shared by the 7 species (Fig. S4). The topology prediction showed that the
predicted SLC13A1 protein comprised 13 transmembrane domains (TMDs) in sheep.
Each peak above the upper cut-off value represents a TMD (Fig. 3).
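How peaks in a hydrophobicity profile can be classified against two cutoffs is sketched below. The profile values are invented, and the labels "certain" and "putative" and the function name are assumptions for illustration. The sketch does not reproduce the TopPred algorithm itself, only the thresholding step with the upper (1) and lower (0.6) cutoffs used here.

```python
def classify_peaks(profile, upper=1.0, lower=0.6):
    """Label each stretch of the profile above `lower` by its peak height.

    A stretch whose maximum exceeds `upper` is labelled "certain";
    a stretch that stays between the two cutoffs is labelled "putative".
    """
    segments, current = [], []
    for value in profile:
        if value > lower:
            current.append(value)          # still inside a candidate stretch
        elif current:
            segments.append(current)       # stretch ended, store it
            current = []
    if current:
        segments.append(current)
    return ["certain" if max(seg) > upper else "putative" for seg in segments]


# Hypothetical profile values, one per window position.
toy_profile = [0.1, 0.7, 1.2, 0.8, 0.2, 0.65, 0.9, 0.3, 1.5]
print(classify_peaks(toy_profile))
```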
Re-sequencing all available exons and the adjacent introns of the SLC13A1 gene in affected,
carrier and normal control sheep identified a mutation potentially responsible for this
disease: a 1-bp deletion of T (JN108880:g.25513delT) at position 107 bp of exon 3 (148 bp
on the “sRFLPdelT” fragment, Table S1). This 1-bp deletion
was 3,751 bp away from the nearest published SNP OAR4_92963794.1 on one side and
27,654 bp from SNP s11336.1 on the other side. An additional 11 novel SNPs in this
candidate gene and 5 genetic variants in the other 6 candidate genes were found to be
either intronic or synonymous, and were excluded as mutations for this defect (Table S1).
The g.25513delT deletion in the SLC13A1 gene shifted the open reading frame and introduced a
premature stop codon (p.Val113Stop) at amino acid 113 of the predicted ovine SLC13A1
protein sequence (Fig. 2D).
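The effect of a single-base deletion on the reading frame can be shown with a toy example. The 15-nucleotide sequence below is invented for illustration and is not the ovine SLC13A1 sequence. The helper names are hypothetical, and only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(seq, index):
    """Return the sequence with the base at `index` removed."""
    return seq[:index] + seq[index + 1:]


wild_type = "ATGGCCTTGACCAAA"          # toy coding sequence: M A L T K
mutant = delete_base(wild_type, 6)     # delete one "T" from the third codon
print(translate(wild_type), translate(mutant))
```

In this toy case the deletion turns the third codon into TGA, so translation ends after two residues. The g.25513delT mutation has the same kind of effect on the predicted ovine protein.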
Genotyping of mutation g.25513delT
Genotyping all the sheep, including unrelated controls, by direct sequencing showed that all
the affected Texel sheep were homozygous for the deletion of “T”
(g.25513delT/g.25513delT) in exon 3 of the SLC13A1 gene, the 8 carriers including lamb
14-04 were heterozygous with a single “T” allele (g.25513delT/T) and the pooled normal
controls were all homozygous for the “T” allele (T/T) (Fig. 4). The g.25513delT mutation
created an HpyAV restriction endonuclease recognition site on the PCR
fragment “sRFLPdelT”, which was used to distinguish the affected, carriers and normal
controls.
In the PCR-RFLP experiment, PCR fragment “sRFLPdelT” with a
“g.25513delT/g.25513delT” genotype was cut into 2 fragments with lengths of 192 bp and
156 bp; the one with a “g.25513delT/T” genotype was cut into 4 fragments with lengths of
349 bp, 348 bp, 192 bp and 156 bp; and the one with a “T/T” genotype could not be cut and
kept the original fragment size of 349 bp. The PCR-RFLP genotyping results
were consistent with the results deduced from direct sequencing. This g.25513delT mutation
was the only polymorphism 100% concordant with the recessive pattern of inheritance in
affected, carriers and unrelated normal individuals (Table 1).
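The band patterns reported above can be turned into a simple genotype call. The sketch below assumes band sizes have already been estimated from the gel. The function name, the size tolerance and the "no call" label are hypothetical choices for the example, not part of the published protocol.

```python
UNCUT_BP = 349            # undigested "sRFLPdelT" fragment carrying the T allele
CUT_BP = (192, 156)       # HpyAV digestion products of the deletion allele


def call_genotype(bands, tolerance=5):
    """Call the g.25513delT genotype from estimated band sizes (in bp)."""

    def present(size):
        # A band counts as present if any observed band is within tolerance.
        return any(abs(band - size) <= tolerance for band in bands)

    has_uncut = present(UNCUT_BP)
    has_cut = all(present(size) for size in CUT_BP)
    if has_cut and has_uncut:
        return "g.25513delT/T"
    if has_cut:
        return "g.25513delT/g.25513delT"
    if has_uncut:
        return "T/T"
    return "no call"


for observed in ([192, 156], [349, 348, 192, 156], [349]):
    print(observed, call_genotype(observed))
```

Bands of 349 bp and 348 bp cannot be told apart within the tolerance, so the call depends only on whether the uncut band and the two digestion products are present.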
Taken together, our results strongly support the conclusion that the deletion of “T” in exon 3 of
the SLC13A1 gene is the causative mutation for inherited chondrodysplasia in Texel sheep.
DISCUSSION
The use of genetic markers to develop genetic maps or for QTL analysis has come a
long way since early reports suggesting their use (Soller & Beckmann, 1983). With
the advent of the ovine genome sequence and a high density SNP chip, the tools for
analyses of quantitative inherited traits and genetic disorders have improved markedly.
These tools, combined with comparative genomic approaches to determine useful candidate
genes (Rothschild & Soller, 1997), make it possible to find genetic causes of both
quantitative traits and monogenic simple recessive disorders in a relatively short time.
Previous reports showed that mutations in the SLC26A2 gene were responsible for human
achondrogenesis. The structural, microscopic, and biochemical analyses of
chondrodysplasia in Texel sheep suggested that a similar pathogenic mechanism is involved in
this defect. The genes called solute carriers, encoding membrane transport proteins,
especially the sulfate transporters, are prime candidate genes. However, there are almost
300 solute carrier genes distributed throughout the human genome, organized
into 51 gene families (Hediger et al. 2004). Neither low density microsatellite-based
linkage mapping nor the direct candidate gene method was suitable for handling so many
candidates directly, and neither was capable of localizing the mutations underlying this
Mendelian disease.
The recently developed ovine SNP chip, containing more than 50,000 SNP markers
well distributed across the sheep genome, has become the most efficient tool for the
genome-wide dissection of likely mutant loci (Becker et al. 2010). The rationale for this is as
follows. Provided a disease is recessive, accurately diagnosed when present, and the
defective allele is non-existent in that part of the population not descended from the founder,
then regardless of the level of penetrance the affected animals must all be identical by
descent at the causative mutation and most likely in the surrounding area. The expected size
of the IBD region that contains the causative mutation will depend in part upon the number
of meioses that separate the common ancestor from the affected individual, and the number
of affected individuals. In our mapping strategy, a homozygous fragment spanning more
than 10 SNPs was chosen as a threshold for valid candidate regions that needed to be
examined (Zhao et al. 2011). A 10-SNP interval on the sex-average map (Sheep Best
Positions Linkage Map Version 4.7, http://rubens.its.unimelb.edu.au/~jillm/jill.htm) is about
0.64 cM. The probability that a 0.64 cM region would be inherited intact is 1 - 0.0064 = 0.9936,
and the probability that this would occur in all three meioses that separate a founder and a
back-crossed affected individual is 0.9936^3 = 0.9809. The 10-SNP region will only be IBD if it was
inherited intact in all 11 backcross offspring, that outcome having a probability of 0.9809^11
= 0.81. This value is large enough to give us confidence that the causative mutation would be
contained in regions of more than 10 consecutive SNPs and that we should examine those
regions first as they had the highest likelihood of carrying the mutation.
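The probability argument above can be reproduced with a short calculation. The function name and its parameters are hypothetical. The sketch follows the text in treating 1 cM as a 1% chance of recombination per meiosis, which is a reasonable approximation only for short intervals.

```python
def prob_segment_ibd(interval_cm, meioses_per_animal, n_animals):
    """Probability that a segment is inherited intact in every affected animal.

    interval_cm:        genetic length of the segment in centimorgans
    meioses_per_animal: meioses separating the founder from each affected animal
    n_animals:          number of affected backcross offspring
    """
    p_intact_one_meiosis = 1 - interval_cm / 100       # no crossover in the segment
    p_intact_one_animal = p_intact_one_meiosis ** meioses_per_animal
    return p_intact_one_animal ** n_animals


# Values taken from the text: 0.64 cM, three meioses, 11 backcross offspring.
p_all = prob_segment_ibd(0.64, 3, 11)
print(round(p_all, 2))
```

Raising either the number of meioses or the number of affected animals lowers this probability, which is why the threshold was set on a short interval.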
In this study, only one such IBD fragment was detected, which comprised 25 SNP loci, and
the causative mutation was located within this fragment. The results from our current and
previous studies (Zhao et al. 2011) show that our genome-wide homozygosity mapping strategy
has enough power to detect causative loci for rare recessive genetic diseases where
consanguineous matings are available. The candidate gene approach (Rothschild & Soller,
1997) can act as a bridge between the targeted genomic regions and the causative mutations,
which helps to find and verify the mutation. Candidate gene analysis can be accomplished by
direct sequencing or PCR-RFLP tests. The RFLP method was first described by Dr. Soller
and colleagues in 1983 (Soller & Beckmann, 1983; Beckmann & Soller, 1983). It has been
successfully applied for about 30 years for a variety of polymorphism analyses including the
SNP analysis associated with many interesting traits (Rothschild et al. 1996; Zhao et al. 2009;
Zhao et al. 2010, etc.). The candidate gene approach using RFLP tests is usually more
convenient and historically more economical than using direct sequencing. Our work provides
a successful example of combining GWAS and the standard candidate gene analysis.
In this study, the presumed carrier lamb 14-04 was not heterozygous for the 25 consecutive
SNPs in this targeted IBD region as were the other carriers, but was homozygous with the
same haplotype as the affected sheep. This was probably because this lamb carried a copy of
the ancestral genomic region within which the g.25513delT mutation arose in some other
ancestor. The genotyping results of g.25513delT for lamb 14-04 demonstrated that it was in fact
heterozygous, which was consistent with its presumed disease status.
The SLC13A1 gene was considered the most plausible functional candidate gene within
the 1 Mb interval on OAR 4. This gene belongs to the solute-linked carrier gene family 13
(SLC13) and encodes a protein called renal Na+-dependent inorganic sulfate transporter-1
(sodium/sulfate cotransporter, NaS1). DNA sequencing revealed that the 1-bp deletion of “T”
was perfectly associated with the chondrodysplasia phenotype in Texel sheep. We also
demonstrated that this deletion was not observed in any of the 54 normal Romney sheep
controls. We confirmed the presence of this deletion as a mutation in the genomic DNA and
believe that the ovine SLC13A1 mRNA in the affected sheep may undergo nonsense-mediated
mRNA decay, a process which detects nonsense mutations and prevents the expression of
truncated or erroneous proteins. RT-PCR analysis of SLC13A1 mRNA could provide
further evidence to support our conclusions. Unfortunately, most of the originally affected
animals either died due to tracheal collapse or were euthanized at approximately 5 months of
age. New lambs from designed matings between carrier animals would need to be produced
for any future experiments.
Our study is the first report showing that a mutation in the SLC13A1 gene is responsible for
inherited chondrodysplasia in a large mammal species. A previous study showed that a
homologous Slc13a1 knockout mouse model produced clinical phenotypes similar to those of our
affected Texel sheep, including reduced serum sulfate concentration and general growth
retardation throughout adulthood (Dawson et al. 2003). The knockout mouse model further
supports our findings.
The NaS1 co-transporter is located on the brush border membranes of epithelial cells lining the
renal proximal tubule and the gastrointestinal tract. It helps to absorb inorganic sulfate in the
small intestine and to reabsorb it in the kidney proximal tubule to maintain the concentration of
serum sulfate (Markovich, 2001). Sulfate is the fourth most abundant anion in human plasma,
and plays an essential role during growth, development, cellular metabolism, and
bone/cartilage formation. For example, the sulfated proteoglycans, the largest group of
sulfoconjugates in mammals, are required for the maintenance of normal structure of bone and
cartilage (Fernandes et al. 1997). The SLC13 family was believed to be distinct functionally
and structurally from the SLC26 gene family, which constitutes the Na+-independent inorganic
sulfate transporters (Markovich et al. 2008). A recent study has indicated that SLC26A2
protein expression was limited to the microvillus membrane of the kidney proximal tubule
(Chapman & Karniski, 2010). We believe NaS1 and SLC26A2 proteins would share similar
functions during sulfate transport due to their similar localizations. The exact roles
of each of the two proteins in sulfate transport are still uncertain.
Copy number variation analysis is receiving more attention since the development of high
density SNP chips as well as the dramatic improvements in new sequencing technologies.
Several CNVs have been found to be associated with human diseases such as obesity (Chen et al.
2011), autoimmune disease (Mamtani et al. 2010) and autism (Sebat et al. 2007). Moreover,
an extra copy of the FGF4 gene, in the form of an evolutionarily acquired fgf4 retrogene, was
found to be strongly associated with chondrodysplasia in dogs (Parker et al. 2009). With the
intention of determining whether CNVs were associated with this inherited chondrodysplasia
in Texel sheep, we performed a CNV analysis based on the SNP chip genotyping data.
However, no CNVs were specifically associated with this defect. The CNVs
detected on OAR1 and OAR3 using the partition software were probably artifacts of the
SNP chips rather than a consequence of disease status.
In summary, our study indicates that the 1-bp coding deletion of “T” in the ovine SLC13A1 gene
is responsible for this form of inherited chondrodysplasia in Texel sheep. Our results suggest
that NaS1 is essential for maintaining sulfate homeostasis in sheep. The comparison of the gene
expression of SLC13A1, or direct detection of the levels of the NaS1 protein using in situ
hybridization in the kidney tissues, among chondrodysplasia-affected, carrier and normal
Texel sheep should be done in the future. This study confirmed again that using the ovine
high density SNP array with a candidate gene approach is an efficient way to
localize a mutation causing a simple recessive inherited defect in sheep. Similar
methodologies have been used to identify mutations in other recessive inherited defects
in sheep, such as microphthalmia and inherited rickets (Becker et al. 2010; Zhao et al. 2011).
The SLC13A1 mutation can be rapidly applied as a selection marker to identify carriers of the
defective g.25513delT allele, in order to cull carriers or avoid at-risk matings, and so improve
animal welfare and decrease economic losses. Due to the moderate size and the relatively
low cost of maintenance of carriers, sheep with chondrodysplasia could be used as a potential
animal model for future human chondrodysplasia clinical drug development if any similar
human cases are diagnosed in the future.
ACKNOWLEDGEMENTS
The authors appreciate the opportunity for this paper to be included in this journal’s special
issue in honor of Dr. Moshe Soller and thank the editor and referees for their comments. The
authors appreciate the financial support provided by State of Iowa and Hatch Funds and the
support from the College of Agriculture and Life Sciences at Iowa State University. Animal
studies and diagnoses were funded by Massey University. H.T.B. was partially funded by
the National Research Centre for Growth and Development.
Conflicts of Interest
The authors have declared no potential conflicts.
Web URLs
Primer3: http://fokker.wi.mit.edu/primer3/
Ovine Genome Assembly v1.0:
http://www.livestockgenomics.csiro.au/perl/gbrowse.cgi/oar1.0/#search
ExPASy translate tool: http://ca.expasy.org/tools/dna.html
CLUSTALW: http://align.genome.jp/
TopPred 0.01: http://bioweb.pasteur.fr/seqanal/interfaces/toppred.html
REFERENCES
Becker D., Tetens J., Brunner A., Burstel D., Ganter M., Kijas J. & Drogemuller C. (2010)
Microphthalmia in Texel sheep is associated with a missense mutation in the
paired-like homeodomain 3 (PITX3) gene. PLoS One 5, e8689.
Beckmann J.S. & Soller M. (1983) Restriction fragment length polymorphisms in genetic
improvement: methodologies, mapping and costs. Theoretical and Applied Genetics
67, 35-43.
Beever J.E., Smit M.A., Meyers S.N., Hadfield T.S., Bottema C., Albretsen J. & Cockett N.E.
(2006) A single-base change in the tyrosine kinase II domain of ovine FGFR3 causes
hereditary chondrodysplasia in sheep. Animal Genetics 37, 66-71.
Byrne T.J. (2005) Investigation into the inheritance and biochemistry of chondrodysplasia in
Texel sheep. Master’s thesis. Massey University, New Zealand.
Byrne T.J., Blair H.T., Thompson K.G., Stowell K.M. & Piripi S. (2008) Inheritance of
chondrodysplasia in Texel sheep. Proceedings of the New Zealand Society of Animal
Production 68, 20-2.
Chapman J.M. & Karniski L.P. (2010) Protein localization of SLC26A2 (DTDST) in rat
kidney. Histochemistry and Cell Biology 133, 541-7.
Chen Y., Liu Y.J., Pei Y.F., Yang T.L., Deng F.Y., Liu X.G., Li D.Y. & Deng H.W. (2011)
Copy Number Variations at the Prader-Willi Syndrome Region on Chromosome 15
and associations with Obesity in Whites. Obesity (Silver Spring) 19, 1229-34.
Cockett N.E., Shay T.L., Beever J.E., Nielsen D., Albretsen J., Georges M., Peterson K.,
Stephens A., Vernon W., Timofeevskaia O., South S., Mork J., Maciulis A. & Bunch
T.D. (1999) Localization of the locus causing Spider Lamb Syndrome to the distal
end of ovine Chromosome 6. Mammalian Genome 10, 35-8.
Dawson P.A., Beck L. & Markovich D. (2003) Hyposulfatemia, growth retardation, reduced
fertility, and seizures in mice lacking a functional NaSi-1 gene. Proceedings of the
National Academy of Sciences USA 100, 13704-9.
Fernandes I., Hampson G., Cahours X., Morin P., Coureau C., Couette S., Prie D., Biber J.,
Murer H., Friedlander G. & Silve C. (1997) Abnormal sulfate metabolism in vitamin
D-deficient rats. The Journal of Clinical Investigation 100, 2196-203.
Hastbacka J., Kerrebrock A., Mokkala K., Clines G., Lovett M., Kaitila I., de la Chapelle A.
& Lander E.S. (1999) Identification of the Finnish founder mutation for diastrophic
dysplasia (DTD). European Journal of Human Genetics 7, 664-70.
Hastbacka J., Superti-Furga A., Wilcox W.R., Rimoin D.L., Cohn D.H. & Lander E.S. (1996)
Atelosteogenesis type II is caused by mutations in the diastrophic dysplasia
sulfate-transporter gene (DTDST): evidence for a phenotypic series involving three
chondrodysplasias. The American Journal of Human Genetics 58, 255-62.
Hediger M.A., Romero M.F., Peng J.B., Rolfs A., Takanaga H. & Bruford E.A. (2004) The
ABCs of solute carriers: physiological, pathological and therapeutic implications of
human membrane transport proteins - Introduction. Pflügers Archiv 447, 465-8.
Lee A., Beck L. & Markovich D. (2000) The human renal sodium sulfate cotransporter
(SLC13A1; hNaSi-1) cDNA and gene: organization, chromosomal localization, and
functional characterization. Genomics 70, 354-63.
Mamtani M., Anaya J.M., He W. & Ahuja S.K. (2010) Association of copy number variation
in the FCGR3B gene with risk of autoimmune diseases. Genes and Immunity 11, 155-60.
Markovich D. (2001) Physiological roles and regulation of mammalian sulfate transporters.
Physiological Reviews 81, 1499-533.
Markovich D., Romano A., Storelli C. & Verri T. (2008) Functional and structural
characterization of the zebrafish Na+-sulfate cotransporter 1 (NaS1) cDNA and gene
(slc13a1). Physiological Genomics 34, 256-64.
Martínez S., Valdés J. & Alonso R.A. (2000) Achondroplastic dog breeds have no mutations
in the transmembrane domain of the FGFR-3 gene. Canadian Journal of Veterinary
Research 64, 243-5.
Parker H.G., VonHoldt B.M., Quignon P., Margulies E.H., Shao S., Mosher D.S., Spady T.C.,
Elkahloun A., Cargill M., Jones P.G., Maslen C.L., Acland G.M., Sutter N.B., Kuroki
K., Bustamante C.D., Wayne R.K. & Ostrander E.A. (2009) An expressed fgf4
retrogene is associated with breed-defining chondrodysplasia in domestic dogs.
Science 325, 995-8.
Piripi S.A. (2008) Chondrodysplasia of Texel sheep. Doctorate thesis. Massey University,
New Zealand.
Piripi S.A., Williams M.A.K. & Thompson K.G. (2011) On the sulfation pattern of
polysaccharides in the extracellular matrix of sheep with chondrodysplasia. Cartilage
2, 36-9.
Rothschild M., Jacobson C., Vaske D., Tuggle C., Wang L., Short T., Eckardt G., Sasaki S.,
Vincent A., McLaren D., Southwood O., van der Steen H., Mileham A. & Plastow G.
(1996) The estrogen receptor locus is associated with a major gene influencing litter
size in pigs. Proceedings of the National Academy of Sciences USA 93, 201-5.
Rothschild M.F. & Soller M. (1997) Candidate gene analysis to detect traits of economic
importance in domestic livestock. Probe 8, 13-22.
Rozen S. & Skaletsky H.J. (1999) Primer3 on the WWW for general users and for biologist
programmers. In: Bioinformatics Methods and Protocols:Methods in Molecular
Biology (eds. by Misener S & Krawetz SA), pp. 365-86. Humana Press Inc., Totowa,
NJ.
Sebat J., Lakshmi B., Malhotra D., Troge J., Lese-Martin C., Walsh T., Yamrom B., Yoon S.,
Krasnitz A., Kendall J., Leotta A., Pai D., Zhang R., Lee Y.H., Hicks J., Spence S.J.,
Lee A.T., Puura K., Lehtimaki T., Ledbetter D., Gregersen P.K., Bregman J.,
Sutcliffe J.S., Jobanputra V., Chung W., Warburton D., King M.C., Skuse D.,
Geschwind D.H., Gilliam T.C., Ye K. & Wigler M. (2007) Strong association of de
novo copy number mutations with autism. Science 316, 445-9.
Soller M. & Beckmann J.S. (1983) Genetic polymorphisms in varietal identification and
genetic improvement. Theoretical and Applied Genetics 67, 25-33.
Superti-Furga A., Hastbacka J., Wilcox W.R., Cohn D.H., van der Harten H.J., Rossi A.,
Blau N., Rimoin D.L., Steinmann B., Lander E.S. & Gitzelmann R. (1996)
Achondrogenesis type IB is caused by mutations in the diastrophic dysplasia sulphate
transporter gene. Nature Genetics 12, 100-2.
Thompson K.G., Blair H.T., Linney L.E., West D.M. & Byrne T. (2005) Inherited
chondrodysplasia in Texel sheep. New Zealand Veterinary Journal 53, 208-12.
Woelfle J.V., Brenner R.E., Zabel B., Reichel H. & Nelitz M. (2011) Schmid-type
metaphyseal chondrodysplasia as the result of a collagen type X defect due to a novel
COL10A1 nonsense mutation : A case report of a novel COL10A1 mutation. Journal
of Orthopaedic Science 16, 245-9.
Zankl A., Mornet E. & Wong S. (2008) Specific ultrasonographic features of perinatal lethal
hypophosphatasia. American Journal of Medical Genetics 146A, 1200-4.
Zhao X., Dittmer K.E., Blair H.T., Thompson K.G., Rothschild M.F. & Garrick D.J. (2011) A
novel nonsense mutation in the DMP1 gene identified by a genome-wide association
study is responsible for inherited rickets in Corriedale sheep. PLoS ONE 6, e21739.
doi:10.1371/journal.pone.0021739.
Zhao X., Du Z.Q. & Rothschild M.F. (2010) An association study of 20 candidate genes with
cryptorchidism in Siberian Husky dogs. Journal of Animal Breeding and Genetics
127, 327-31.
Zhao X., Du Z.Q., Vukasinovic N., Rodriguez F., Clutter A.C. & Rothschild M.F. (2009)
Association of HOXA10, ZFPM2, and MMP2 genes with scrotal hernias evaluated
via biological candidate gene analyses in pigs. American Journal of Veterinary
Research 70, 1006-12.
Table 1. The genotyping results of the causative mutation (g.25513delT)

Sheep     Phenotype                       Genotype                                                         Rate of
breed     Status            No. of sheep  Direct sequencing        PCR-RFLP                 No. of sheep   concordance
Texel^a   Affected          15            g.25513delT/g.25513delT  g.25513delT/g.25513delT  15             100%
Texel^a   Presumed carrier  8             g.25513delT/T            g.25513delT/T            8              100%
Romney    Normal            54            T/T                      T/T                      54             100%

a These sheep include pure Texel sheep and Texel crossbred sheep.
Figure 1. Chondrodysplasia of Texel sheep. (A) A pair of full sibling lambs at 56 days.
The left lamb (lamb ID: 23-04) was phenotypically normal and its twin brother (lamb
ID: 24-04) was characterized by stunted growth. (B) A thirty-two-day-old
chondrodysplastic lamb with characteristic short neck and wide-based stance. (C)
Sections through the trachea of a chondrodysplastic lamb (bottom) and a normal lamb
(top) illustrating the thickened, flattened cartilage ring in the affected lamb.
Figure 2. Work flow chart for discovering the mutation locus of chondrodysplasia in
Texel sheep. (A) The sheep chromosome 4 (OAR 4). (B) The IBD region containing the
causative locus is located on OAR 4 between 92.45 and 93.44 Mb. (C) Seven cow
reference genes, including SLC13A1, located within the sheep IBD region based on the
comparative genomic maps. (D) The comparisons of the partial SLC13A1 mRNA
sequences and the partial protein sequences between the normal (wild type) and
chondrodysplastic sheep. The 3 underlined nucleotides in the mRNA sequence
highlighted in yellow represent the codon affected by the g.25513delT mutation.
The corresponding amino acids are also underlined in the protein sequences. The
following codon in the sequence of affected sheep, highlighted in red,
is a stop codon represented by “X”. WT: wild type; A: affected sheep
Figure 3. The topology prediction of transmembrane domains (TMDs) of ovine
SLC13A1 protein. The peaks above the cut-off value of 1 represent the TMDs. There
are 13 putative TMDs in this protein.
Figure 4. The direct sequencing results of the causative mutation g.25513delT. All the
affected sheep show two copies of deletion T; the presumed carriers show one copy of
deletion T; the pooled normal controls show no copy of deletion T.
Figure S1. The pedigree information for the 23 sheep genotyped by Ovine SNP chip.
The pedigree was derived from microsatellite results (data not shown). The boxes below
animal identifiers indicate the status of chondrodysplasia. The 3 terminal rams are at
the top-left. They sired dams 3-17 in the out-cross trials, and F1 daughters in the
back-cross and 4 other ewes with a chondrodysplasia background. Arrows indicate parentage
results deduced by observation of maternal behavior soon after birth of lambs and
microsatellite markers. M: male; F: female.
Figure S2. The distribution of homozygous fragments shared by affected lambs.
The counts of homozygous fragments shared by all affected lambs are shown according to
the number of SNPs spanned by each fragment. Only one
homozygous fragment, spanning 25 SNPs, was shared by all 15 affected
lambs (indicated by the blue arrow) when the “larger than 10-SNP” threshold was
applied.
Figure S3. The CNV region display. Each column represents one individual sheep with
the copy numbers of SNP loci across the whole genome. Different colours show different
ranges of copy numbers. The chondrodysplasia status of each sheep is listed at the
top of the display map. A: affected; C: presumed carriers.
[CLUSTALW (1.81) multiple sequence alignment of SLC13A1 protein sequences: mouse, rat, sheep, cattle, human, dog and zebrafish]
Figure S4. The comparison of the SLC13A1 protein sequences among 7 species. The
comparison was conducted using the multiple sequence alignment tool (CLUSTALW).
Table S1. A list of candidate gene symbol, primer sequence, annealing temperature, PCR amplicon information, genetic
variants identified by sequencing and accessed template sequence.
Gene
symbol
SLC13A1
Primer name
Primer sequence
5’AGCCACCTGTGCAAGACAAT3’
5’CGCATTGCTGCCTAGTCTC3’
5’AATAACCATTTGCGGCCCTA3’
sSLCre2R
sSLCdelrF
sSLCdelrR
sRFLPdelTFa
sRFLPdelTRa
sSLCre5_6F
5’AAGTCAAACATTCGCAGGAAA3’
5’TGGATAAGGGATGTGGAACTG3’
5’TGCCTGGATGTGCATTACTT3’
5’TGAATTACACTGCCATACAAATG
A3’
5’GGTCAAAGTTAGGAGCAGCAA3’
5’TCCACCTTCACACACACACA3’
sSLCre5_6R
sSLCre8F
sSLCre8R
sSLCre9F
sSLCre9R
5’TGCAAACTGCCATCACTATG3’
5’CTGTTCCCAGGTCTGCTTGT3’
5’AGAACACGGAACCCAGACAC3
5’CTCGCCACCATATCCCTCTA3’
5’AACCAGAAACAAAGCCAGGA3’
sSLCre10F
sSLCre10R
sSLCre11F
5’CCAGGTAGAAGGATAAGAAGCTG
A3’
5’GCTCCAAGGTCTCTACTCCTGA3’
5’AAGGGAACTTTTGGCCTAGC3’
sSLCre11R
5’AGAGGAAGGGTGCAGTCTGA3’
sSLCre12F
5’AAAATGCAAAGAAGTTTTCCAA3’
sSLCre12R
sSLCre13F
5’GCTGGGACATCATAAAAATGG3’
5’AAAAACCACAGGAAAATATGAAA
GA3’
5’CTGCTGCTTGAACATGGAAT3’
5’CAATCACCATATAAACAACCGAA
A3’
5’GCACGTTTGCTCCAGGTTAT3’
5’GAAACTTCTCTGATGCTCTCCAA3’
sSLCre13R
sSLCre14F
sSLCre14R
sSLCre15F
Variant name
SNP
characteristic
Sequence access
No.
Annealing
Temp. (ºC)
Size
(bp)
Fragment
name
60
472
sSLCe1
no
-
-
JN108880
60
301
sSLCe2
no
-
-
OAR4:9288600092995055
56
411
sSLCdel
349
sRFLPdelT
g.25395C>T
g.25513delTb
g.25513delTb
intronic
exonic
exonic
JN108880
58
C > T (170bp)
Deletion T (288bp)
Deletion T (148bp)
64
510
sSLCe5_6
A > G (181bp)
g.78757A>G
intronic
64
585
sSLCe8
A > G (261bp)
C > T (381bp)
g.78677A>G
g.59017C>T
intronic
intronic
58
711
sSLCre9
C > T (61bp)
A > G (212bp)
g.63875C>T
g.64026A>G
58
609
sSLCre10
no
-
exonic,
synonymous
-
60
545
sSLCre11
G > T (169bp)
g.39802G>T
intronic
g.39732A>C
g.39527C>T
-
intronic
intronic
-
(GenBank/Ovine
1.0)
JN108880
OAR4:9288600092995055
JN108880
JN108880
JN108880
OAR4:9288600092995055
60
535
sSLCre12
A > C (239bp)
C > T (444bp)
no
60
505
sSLCre13
no
-
-
OAR4:9288600092995055
60
507
sSLCre14
A > T (54bp)
g.30951A>T
intronic
OAR4:9288600092995055
58
529
sSLCre15
A > G (27bp)
g.29232A>G
intronic
OAR4:9288600092995055
OAR4:9288600092995055
sSLCre1F
sSLCre1R
sSLCre2F
Genetic variant
(position in the
fragment)
Table S1. (continued)
Annealing
Temp. (ºC)
Size
(bp)
Fragment
name
5’GGCACATAGATGCTCTTGGTT3’
5’TGTTGAGTAGCCAGTTGTGGTT3’
58
567
sSLCre16
no
-
-
5’TTGCAATTGACTTTCCTTGC3’
5’TGGTGGGCAATTCACTTCTT3’
OAR4:9288600092995055
58
339
sCADPS-1
no
-
-
OAR4:9215574592255745
58
405
sCADPS-2
no
-
-
OAR4:9215574592255745
60
515
sCADPS2-3
no
-
-
OAR4:9215574592255745
58
353
sCADPS2-4
no
-
-
OAR4:9215574592255745
5’AGTTCCAGGGAGACCATTCC3’
58
302
sIQUB-1
no
-
-
sIQUB-1R
sIQUB-2F
5’TGATTAAACCACCAGCACCA3’
5’GAATACAGGGCAGGGAAAGG3’
OAR4:9320975293265179
58
320
sIQUB-2
no
-
-
sIQUB-2R
sIQUB-3F
5’GCAATGGTTCATTACTTTTGGA3’
5’CCCCCAAAAGCAGAACTTTA3’
OAR4:9320975293265179
58
392
sIQUB-3
no
-
-
sIQUB-3R
sIQUB-4F
5’GCATATCTACTTCAGAGGGCTTT3’
5’AAATGAAGCCAGTAGCGGAAT3’
OAR4:9320975293265179
58
404
sIQUB-4
no
-
-
sIQUB-4R
sASB15_1F
sASB15_1R
sASB15_2F
sASB15_2R
sASB15_3F
sASB15_3R
sASB15_4F
5’TTTGTTATCATGGAAGAGACTTGC3’
5’CTGTGTGAAGCGATGGTCAG3’
5’GGCACTCGGATTTGTAGGTC3’
5’GGGGACCTACAAATCCGAGT3’
5’GGGGAAAGTGGTCTGTTTGA3’
5’TTGAAAAGGCCAAGAACACA3’
5’CTGTTTGGATTTTGGGAGACA3’
5’TTGCCTGTGGCTCTCATACA3’
OAR4:9320975293265179
58
607
sASB15-1
A > G (150bp)
g.150A>G
exonic
JN108881
58
519
sASB15-2
no
-
-
JN108881
58
404
sASB15-3
no
-
-
JN108881
58
343
sASB15-4
no
-
-
sASB15_4R
5’TTACCAGCACAGCCATACGA3’
OAR4:9334394893353368
Primer
name
Primer sequence
SLC13A1
sSLCre15R
sSLCre16F
sSLCre16R
sCADPS21F
sCADPS21R
sCADPS22F
sCADPS22R
sCADPS23F
sCADPS23R
sCADPS24F
sCADPS24R
sIQUB-1F
IQUB
ASB15
Variant
name
SNP
characteristic
5’CGGGTCAACTTAGCAGGTTT3’
5’TTGCAGACACACCCTCAAAA3’
5’GGACAGAGAAGCCTGGAGTG3’
5’GAACCTTCCTGTTGATGCAAG3’
5’CCATGTTTCCAGATAAATTCTGTTC3’
5’AAATCATGGGCATTCAGACC3’
5’CATGAAAGGCAGCATGAGAA3’
(GenBank/Ovine
1.0)
Gene
symbol
CADPS2
Sequence access
No.
Genetic variant
(position in the
fragment)
Table S1. (continued)
Annealing
Temp. (ºC)
Size
(bp)
Fragment
name
5’TGCAGATGCCCTCATTGTTA3’
58
307
sASB15-5
no
-
-
sASB15_5R
sASB15_6F
5’CCTGTTGAACGGAGCGTAA3’
5’TCCAAACATGCAATCCAGAA3’
58
372
sASB15-6
no
-
-
sASB15_6R
sLMOD2-1F
5’ACTGTATGGCACTGGGGAAG3’
5’AGCTGGACGACATTGAACCT3’
OAR4:9334394893353368
58
434
sLMOD2-1
C > T (207bp)
g.470C>T
intronic
sLMOD2-1R
sLMOD2-2F
5’TCTAAACCCGTGCTTTGCTT3’
5’GGGTGTCAACCCATTTCAAC3’
58
444
sLMOD2-2
G > T (230bp)
no
g.493G>T
-
intronic
-
OAR4:9337023993377211
sLMOD2-2R
sLMOD2-3F
5’GTGCCAACAGGACTTGGTTT3’
5’TGTGAGTTCATCCCTGGTCA3’
58
337
sLMOD2-3
no
-
-
sLMOD2-3R
sLMOD2-4F
5’TGCTCTGGAAAACCCTTGAT3’
5’CAGCGAAGGAGAGGAAAGAA3’
OAR4:9337023993377211
63
430
sLMOD2-4
no
-
-
sLMOD2-4R
sLMOD2-5F
5’GATGTTGACGCTGGTGATGT3’
5’CCTTTGGCTACAGGAGAGGA3’
OAR4:9337023993377211
58
306
sLMOD2-5
no
-
-
sLMOD2-5R
sWASL-1F
5’GGGATTGGGAATAGCTGAGG3’
5’GGGGGTCTCGTCTTTTCTCT3’
OAR4:9337023993377211
63
466
sWASL-1
C > T (248bp)
g.256C>T
intronic
sWASL-1R
sWASL-2F
5’TTGTCCCTCTGCTGTCACTG3’
5’TGTTTTGTGTAAAGGGCCAGA3’
OAR4:9343101793481693
60
494
sWASL-2
no
-
-
sWASL-2R
sWASL-3F
5’TCCATGCTTTTTGCCTTCTT3’
5’CGTTGCTTCCTATTAAGGCGTA3’
OAR4:9343101793481693
58
381
sWASL-3
Deletion A (309bp)
g.5692delA
intronic
sWASL-3R
sWASL-4F
5’TGCTCTGTGTGTAAAGGCTGTT3’
5’AACGCCCATGCTTCAATATC3’
OAR4:9343101793481693
58
399
sWASL-4
no
-
-
sWASL-4R
5’GGGGTTCTGGTTCAGTGTCT3’
OAR4:9343101793481693
Gene
symbol
Primer
name
Primer sequence
ASB15
sASB15_5F
LMOD2
Variant
name
SNP
characteristic
(GenBank/Ovine
1.0)
OAR4:9334394893353368
OAR4:9337023993377211
WASL
Sequence access
No.
Genetic variant
(position in the
fragment)
CHAPTER 5. BAYESIAN INFERENCE IDENTIFIES A CANDIDATE
REGION ASSOCIATED WITH CANINE CRYPTORCHIDISM THAT INCLUDES
THE AMHR2 GENE
A paper prepared to submit to the Journal of Heredity
Xia Zhao, Suneel Onteru, Mahdi Saatchi, Dorian Garrick, Max Rothschild
Department of Animal Science, Iowa State University, IA, 50011 USA.
Key words: AMHR2, cryptorchidism, BayesB analysis
ABSTRACT
Cryptorchidism is a condition whereby one or both testes fail to descend into the scrotal
sac.
It is a common congenital disorder prevalent in many mammalian species.
Cryptorchidism is a complex genetic disease considered to be caused by the interaction of
genetic, epigenetic and environmental factors. No single mutated gene has been found to
cause more than 10% of cryptorchids. Here we performed a genome wide association
study (GWAS) using case-control analysis in PLINK software and the BayesB approach
in GenSel software with a 1Mb window of SNPs or haplotypes. The haplotypes were
constructed from a genealogical tree on the population of 204 Siberian Huskies and its 3
subgroups clustered by genomic relationships to account for population structure. No
significant association was observed in the case-control analyses after FDR corrections in
PLINK. In contrast, the BayesB analysis identified a 1Mb region on dog chromosome 27
(CFA27) containing the AMHR2 (anti-mullerian hormone type II receptor) gene that
explained a high percentage of genetic variance in subgroup 1 (~ 10 fold greater than the
expected value for an infinitesimal model in the SNP window analysis and ~ 9 fold
greater in the haplotype window analysis). It is known that mutations in AMHR2 cause
persistent mullerian duct syndrome (PMDS) in humans and the majority of such patients
with PMDS exhibit cryptorchidism. In conclusion, we identified a putative candidate
region at the 4th Mb on CFA27 (CanFam2.0) harboring a very promising functional
candidate gene AMHR2 associated with the development of cryptorchidism in a subgroup
of our Siberian Husky population.
INTRODUCTION
Testicular descent is an essential developmental stage for normal reproduction in male
individuals of many mammalian species.
Normally, testes descend from an intra-abdominal position down into the scrotal sac.
The scrotal environment, at 2-4 °C below normal body temperature, is a
prerequisite for spermatogenesis in humans and many other
mammals (Klonisch et al. 2004; Thonneau et al. 1998). The approximate timing of
testicular descent varies in different species. In mice, rats and especially dogs, testicular
descent is normally completed postnatally, while in humans, rabbits, pigs and horses, this
process is normally completed before birth (Amann and Veeramachaneni, 2007; Hughes
and Acerini, 2008). The failure of a testis to descend into the scrotal sac is called
cryptorchidism (Greek: hidden testicle). This defect is regarded as one of the most
common congenital human defects. About 3% of full term and 30% of premature infant
boys are born with at least one testicle undescended (Klonisch et al. 2004).
Cryptorchidism has been observed in other mammalian species, such as dogs (Cox et al.
1978), cats (Yates et al. 2003), horses (Cox et al. 1979), pigs (Rothschild et al. 1988),
goats (Singh and Ezeasor 1989), sheep (Smith et al. 2007) and cattle (St Jean et al. 1992),
as well as in wild animals like maned wolves (Burton and Ramsay, 1986), black bears
(Dunbar et al. 1996) and reindeer (Leader-Williams 1979). An undescended testicle is
considered a risk factor for germ-cell tumors and bilateral cryptorchidism can result in
male infertility (Berkowitz et al. 1993; Carizza et al. 1990).
Cryptorchidism is generally thought to arise from a multiplicity of factors, including
genetic, epigenetic and environmental components (Spencer et al. 1991; Klonisch et al.
2004; Thonneau et al. 2003; Hutson and Hasthorpe, 2005; Amann and Veeramachaneni
2007). A number of environmental factors have been recognized to be associated with an
undescended testis in humans, and these factors include low birth weight, premature birth,
maternal alcohol consumption, smoking and caffeine intake, gestational diabetes and prenatal
exposure to pesticides (Hughes and Acerini 2008; Wang and Wang 2002).
Familial analyses show that cryptorchidism is a genetic disorder.
Various genes
implicated in the regulation of testicular descent have been investigated for
cryptorchidism. Testicular descent has been described by a biphasic
model, with a “transabdominal” phase followed by an “inguinoscrotal” phase (Hutson
1985). Androgens might play a role in both phases. Impaired fetal androgen action can
result in cryptorchidism. Genes related to androgen action have been investigated in the
study of cryptorchidism. The androgen receptor gene (AR) is strongly expressed in the
gubernaculum (George and Peterson 1988). Length variants of the trinucleotide repeats
GGN (encoding glycine) and CAG (encoding glutamine) in AR have been
associated with a risk of cryptorchidism in an Iranian population and in
Hispanic boys (Radpour et al. 2007; Davis-Dao et al. 2012). Mice with conditional
inactivation of Ar developed cryptorchidism with testes located in a suprascrotal position
(Kaftanovskaya et al. 2012). Androgens act on testicular descent via the release of a
calcitonin gene-related peptide by the genitofemoral nerve.
Calcitonin gene-related
peptides are encoded by two separate genes named CALCA (CGRP-alpha) and CALCB
(CGRP-beta). Injection of exogenous calcitonin gene-related peptide into the scrotum can
change the direction of gubernacular migration in the mutant trans-scrotal (TS) rat
(Griffiths et al. 1993; Clarnette and Hutson 1999). However, the CGRP gene was not
found to be a major factor for cryptorchidism based on a mutation association analysis in
a human study (Zuccarello et al. 2004).
The swelling and migration of the male gubernaculum, together with relaxation of the cranial
suspensory ligament, allow the testes to move during testicular descent from the
posterior abdominal wall to the internal inguinal ring (Hutson 1985). The gubernaculum
is a continuous column of mesenchyme connecting the testis to the scrotal sac.
Insulin-like factor 3 (INSL3) is expressed in the pre- and postnatal Leydig cells of the testis and
postnatal theca cells of the ovary, and is a member of the insulin-like hormone
superfamily (Zimmermann et al. 1997).
Targeted disruption of this gene prevents
gubernaculum growth and causes bilateral cryptorchidism in mice (Nef and Parada 1999;
Zimmermann et al. 1999).
The receptor of INSL3 is encoded by a gene called
relaxin/insulin-like family peptide receptor 2 (RXFP2), also named the leucine-rich
repeat-containing G protein-coupled receptor 8 gene (LGR8). LGR8 is highly expressed
in the gubernaculum and is responsible for mediating hormonal signals that control
normal testicular descent (Bogatcheva et al. 2003). Mutant mice for this gene exhibit
bilateral intra-abdominal cryptorchidism due to the undifferentiated gubernaculum
(Gorlov et al. 2002). Previous reports indicate that only a minority of affected
cases carry a mutation in either the INSL3 gene or its receptor gene LGR8.
Disruptions of other hormone related genes also show associations with aberrant
testicular descent.
Mullerian inhibiting substance (MIS), or anti-Mullerian hormone
(AMH), a 140-kDa glycoprotein produced by Sertoli cells, is critical for the
mullerian duct regression during male sexual differentiation (Bashir and Wells 1995) and
is associated with the swelling reaction of gubernaculum occurring during the first phase
of testicular descent (Kubota et al. 2002). The mutated gonadotropin-releasing hormone
receptor (Gnrhr) gene is associated with undescended testes in mice (Pask et al. 2005).
In humans, maternal exposure to the synthetic nonsteroidal estrogen, diethylstilbestrol
(DES), results in a high risk of offspring with cryptorchidism (Gill et al. 1979). The
receptor of estrogen, estrogen receptor 1 (ESR1), is one of the ligand-activated
transcription factors and is important in the process of testicular descent (Donaldson et al.
1996; Przewratil et al. 2004). Retracted gonads in Esr1 knockout male mice suggest that
estrogens play a direct role in the scrotal positioning of the testis (Donaldson et al.
1996). Investigations have also been carried out on homeobox (Hox) genes in mice for
cryptorchidism. Homeobox A10 (Hoxa10) is highly expressed in the gubernaculum and
this gene has been found to be essential for the gubernacular migration during testicular
descent (Nightingale et al. 2008). In mice, disruption of this gene and another Hox gene,
called Hoxa11, results in the development of cryptorchidism (Satokata et al. 1995;
Hsieh-Li et al. 1995).
There is a high prevalence (about 14%) of cryptorchidism in Siberian Husky dogs (Zhao
et al. 2010).
Although several studies show association of some candidate genes with
cryptorchidism, the genetic basis of the development of cryptorchidism is still
controversial. Here we performed a genome wide association study in Siberian Husky
dogs using the Illumina CanineHD BeadChip with more than 170,000 single nucleotide
polymorphisms (SNPs) to identify genomic regions and propose
positional candidate genes associated with variation in the incidence of cryptorchidism.
METHODS
Animals and phenotypes
A total of 205 male Siberian Huskies and 12 male Chinooks were included in this study.
Among them, 43 Siberian Huskies were from the UK, 8 from Canada, and the remaining
154 Siberian Huskies and all Chinooks were from the USA. Dogs were diagnosed at 6
months of age by veterinarians or dog owners as being either unilateral or bilateral
cryptorchids, or as normal unaffected dogs. Among the Siberian Huskies, 106 dogs had
cryptorchidism and 99 were diagnosed as normal. Six affected and 6
control Chinooks were included as a control breed for the analysis of population stratification.
Samples and DNA preparation
Buccal samples were collected by owners from 186 Siberian Huskies and 12 male
Chinooks using sterile cytology brushes (Fisher Scientific, Hampton, NH, USA). The
Canine Health Information Center (CHIC) DNA repository provided DNA samples from 19 male
Siberian Huskies with an average concentration of 50 ng/µl. DNA was extracted
from 2 to 3 cytology brushes per dog within a single reaction of the QIAamp DNA Mini
Kit (QIAGEN, Inc., Valencia, CA, USA). The DNA concentrations ranged from 10.0 to
50.0 ng/µl. Samples with DNA concentration less than 20 ng/µl were subjected to
whole-genome amplification (WGA) using the GenomiPhi kit (GE Healthcare, Buckinghamshire,
England) to yield high-quality DNA suitable for the Illumina Infinium canine
genotyping platform.
Data generation and analyses
SNP array genotyping and quality control
DNA samples with a total DNA amount of 700-1,000 ng were sent to GeneSeek, Inc.
(Lincoln, NE, USA) for genotyping on the CanineHD BeadChip. The SNPs with call
rate ≤ 40%, Gentrain score ≤ 40%, and minor allele frequency ≤ 0.01 were excluded from
the data set.
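The exclusion rule above can be sketched as a simple per-SNP filter. This is only an illustration: it assumes that a SNP is dropped when any one of the three criteria is met and that the scores are expressed as fractions. The function names and the 0/1/2 genotype coding are hypothetical and do not reflect the genotyping provider's software.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for genotypes coded as 0, 1 or 2 copies of one allele (None = no call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(call_rate, gentrain, maf,
              min_call_rate=0.40, min_gentrain=0.40, min_maf=0.01):
    """Keep a SNP only if every metric is strictly above its exclusion threshold."""
    return call_rate > min_call_rate and gentrain > min_gentrain and maf > min_maf
```

A SNP with a call rate of 0.99, a GenTrain score of 0.8 and a MAF of exactly 0.01 would be excluded under this reading, because the threshold values themselves are excluded.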
Analyses of population structure
Genotypes of all 217 dogs including 205 Siberian Huskies and 12 Chinooks were
imported into Haploview v4.2 software (Barrett et al. 2005).
Pairwise linkage
disequilibrium (LD) comparisons among markers of each chromosome were performed.
Tagging SNPs were identified using the TAGGER routine implemented in Haploview.
To remove underlying group effects, we stratified the population using the
model-based clustering method in STRUCTURE v2.3 software (Pritchard et al. 2000), which
infers population structure from multi-locus genotypes, here the Illumina 170K CanineHD
BeadChip genotypes for all animals.
The admixture model in STRUCTURE was applied to the
selected tagging SNP markers, with most parameters left at their default values.
Both the burn-in and the post-burn-in Markov chain Monte Carlo (MCMC) runs were
5,000 iterations long, with 3 runs for each K (the number of presumed population
clusters). The presumed value of K was varied from 1 to 7. To detect the true value of K,
the second-order rate of change of the likelihood function with respect to K, denoted
∆K, was used. The highest value of the ∆K distribution was used as an indicator of the
strength of the signal detected by STRUCTURE (Evanno et al., 2005). The dogs were
assigned into different subgroups according to the constitution of Q values (estimated
membership coefficient) in presumed clusters for each individual.
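The ∆K criterion described above can be illustrated with a short sketch. It assumes one common reading of the statistic, namely the absolute second difference of the mean log likelihood across replicate runs divided by the standard deviation of the log likelihood at K. The function name and the input layout (a dictionary of per-run log likelihoods keyed by K) are hypothetical, and the numbers in the usage note are invented for illustration only.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps K to a list of ln P(D|K) values, one per replicate run.
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue  # the second difference is undefined at the ends of the K range
        second_diff = abs(mean(log_likelihoods[k + 1])
                          - 2 * mean(log_likelihoods[k])
                          + mean(log_likelihoods[k - 1]))
        spread = stdev(log_likelihoods[k])
        result[k] = second_diff / spread if spread > 0 else float("inf")
    return result


# Invented log likelihoods for K = 1..5 with 3 runs each (illustration only).
example = {
    1: [-1000.0, -1001.0, -999.0],
    2: [-900.0, -902.0, -898.0],
    3: [-700.0, -701.0, -699.0],
    4: [-690.0, -695.0, -685.0],
    5: [-685.0, -692.0, -680.0],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
```

With K varied from 1 to 7, ∆K is defined only for K = 2 to 6, because the second difference needs both neighbouring values of K.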
Genome wide association studies
Genome wide association studies were carried out on each of the subgroups assigned by
STRUCTURE as well as on all male Siberian Huskies. Three approaches were applied
for the data analyses. The first approach was the case-control analysis implemented in
PLINK v1.06 software (Purcell et al. 2007) using single SNP markers. The linear step-up
procedure of Benjamini and Hochberg FDR control in PLINK was utilized in order to
correct for multiple testing.
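For readers unfamiliar with the linear step-up procedure, the following sketch shows how Benjamini and Hochberg adjusted p-values can be obtained from raw single-SNP p-values. It is a generic textbook implementation written for illustration and is not taken from PLINK; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p-value
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Because each raw p-value is multiplied by the number of tests divided by its rank, a small raw p-value among more than 100,000 SNPs can still yield an adjusted value close to 1 when few other SNPs show comparable signal.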
The second approach was a BayesB model averaging
approach using the GenSel v4R software (http://bigs.ansci.iastate.edu) with the
categorical analysis option, to obtain the variance explained by SNPs in one megabase
(1Mb) genomic windows (which we refer to as the SNP model). The statistical model used for
BayesB was as follows:
$$y_i = \mu + \sum_{j=1}^{k} z_{ij} u_j + e_i \qquad (1)$$
where $y_i$ is the phenotype of animal $i$, $\mu$ is an overall mean, $k$ is the total number of SNPs
in the panel, $z_{ij}$ is the genotype at marker $j$ in individual $i$, $u_j$ is a random substitution effect
for marker $j$ with its own variance $\sigma^2_{u_j} > 0$ (with a probability $1-\pi$) or $u_j = 0$ (with a
probability of $\pi$), and $e_i$ is the random residual for animal $i$, assumed to be
normally distributed $N(0, \sigma^2_e)$. The parameter $\pi$ was assumed to be 0.9999 so as to fit
10-20 markers (0.0001 x 170K) per iteration of the Markov chain in a mixture model for
the estimation of individual SNP effects. A total of 51,050 MCMC iterations, with a burn-in
of 1,000 iterations and an output frequency of 50 iterations, were used for
inference. There were 2,346 unique 1Mb windows across the dog genome based on the
genotyped markers. The number of markers in each 1Mb window varied according to the
genomic regions. It was assumed that under an infinitesimal model all windows would
account for the same variance of about 0.04%. Hence, the 1Mb windows explaining at
least 0.4% of genetic variance, which is 10 fold greater than expected, were considered as
indicative candidate regions for a gene influencing the incidence of cryptorchidism.
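The arithmetic behind these thresholds can be checked with a few lines of code. The sketch below reproduces the expected per-window variance (100% divided equally among 2,346 windows) and the approximate number of markers fitted per iteration under the stated mixture proportion. The function names are hypothetical, and the 5 fold haplotype criterion used in `is_candidate` is the one applied later in this section.

```python
N_WINDOWS = 2346          # unique 1Mb windows reported in the text
PI = 0.9999               # assumed proportion of markers with zero effect
N_SNPS_ON_CHIP = 170_000  # approximate panel size used in the text


def expected_window_pct(n_windows=N_WINDOWS):
    """Percentage of genetic variance per window if all windows contribute equally."""
    return 100.0 / n_windows


def expected_markers_fitted(pi=PI, n_snps=N_SNPS_ON_CHIP):
    """Average number of markers with a non-zero effect per MCMC iteration."""
    return (1.0 - pi) * n_snps


def is_candidate(snp_pct, hap_pct, expected=0.04, snp_fold=10, hap_fold=5):
    """Apply the two-stage criterion: >= 10 fold in the SNP model and >= 5 fold in the haplotype model."""
    return snp_pct >= snp_fold * expected and hap_pct >= hap_fold * expected
```

For example, `expected_window_pct()` gives roughly 0.043, which rounds to the 0.04% used above, and `expected_markers_fitted()` gives about 17, consistent with the stated range of 10-20 markers.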
In order to utilize haplotype information for identifying genes underlying complex traits,
a tree-based association method that captures information about the evolutionary history
(genealogy) of the genome was implemented in the analysis. The third approach, which we
refer to as the BayesB analysis with the haplotype model, was carried out on each of the
subgroups and their combined dataset to further verify regions showing associations with
cryptorchidism in the BayesB analysis with the SNP model.
To perform this analysis,
genotypes for each individual were phased on each chromosome (based on the CanFam
2.0 assembly) using BEAGLE v3.3 software in which a localized haplotype-cluster model
was fitted using an EM algorithm (Browning & Browning 2007). Subsequently, local
phylogenetic trees were constructed for each SNP marker with default values in Blossoc
software (Mailund et al. 2006) and were drawn as a rooted binary tree (Sahana et al.
2011). At the third level of each tree, there are eight possible nodes and all haplotypes
descending from each of the nodes (with the same phylogenetic lineage) were considered
as a cluster. A new incidence matrix was developed to relate each of the genotyped
animals to the detected haplotype clusters and their original SNP genotypes. The same
BayesB model was applied to the new incidence matrix and the genetic variance
explained by each 1 Mb genomic window was obtained using GenSel software. The
windows passing the criteria (> 0.40% of genetic variance) in the SNP model and also
explaining at least 0.20% of genetic variance, which is 5 fold greater than expected, were
retained as candidate regions in this study.
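One way to picture the haplotype incidence matrix is sketched below. This is an assumption-laden illustration, not the procedure actually implemented with Blossoc and GenSel: it assumes that each of an animal's two phased haplotypes has already been assigned to one of the eight third-level clusters (coded 0 to 7) at a given marker, and that a row holds the cluster counts followed by the original SNP genotype. All names are hypothetical.

```python
def cluster_incidence_row(hap_clusters, snp_genotype, n_clusters=8):
    """Count how many of an animal's haplotypes fall in each cluster, then append its SNP genotype."""
    row = [0] * n_clusters
    for cluster in hap_clusters:
        if not 0 <= cluster < n_clusters:
            raise ValueError(f"cluster index {cluster} outside 0..{n_clusters - 1}")
        row[cluster] += 1
    return row + [snp_genotype]


def build_incidence_matrix(animals, n_clusters=8):
    """animals: iterable of ((cluster_hap1, cluster_hap2), snp_genotype) pairs, one per animal."""
    return [cluster_incidence_row(haps, genotype, n_clusters) for haps, genotype in animals]


# Two hypothetical animals at one marker.
matrix = build_incidence_matrix([((0, 3), 1), ((7, 7), 2)])
```

Each row's cluster counts sum to 2 because every diploid animal contributes two haplotypes, so the cluster covariates play the same role in the model as allele counts do in the SNP model.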
RESULTS
There were 143,286 SNPs for GWAS analyses after quality control edits. One Siberian
Husky dog was excluded due to its low genotype call rate and the remaining 105 affected
and 99 controls were included for the GWAS analyses. A total of 47,672 tagging SNP
markers were selected using Haploview (data not shown) and then used to run
simulations in STRUCTURE. Under a model assuming admixture, the results recover 3
population clusters (K=3) according to the ∆K statistic which is based on the rate of
change in the log probability of the data (Figure 1B). The estimated proportion of
ancestry (Q value) derived from the 3 assumed population clusters varied among
individual dogs. The 12 Chinook dogs were distinct from the Siberian Husky dogs
(Figure 1A). Here we re-assigned dog samples into 4 subgroups according to their
constitution of Q values in the 3 population clusters (Figure 1C, Supplementary Table S1).
The first 3 subgroups are Siberian Huskies and subgroup 4 included all 12 Chinooks with
extreme Q values of cluster 3 (0.928 to 1). Due to the small sample size, the Chinook
dogs in subgroup 4 were used only as a control breed for the structure clustering, but not
included in subsequent GWAS. There were 129 dogs in subgroup 1 with a range of Q
values from 0.709 to 1 for cluster 1; the majority of these dogs were collected from the USA
(95%) and 5% were from Canada. Sixty dogs were cryptorchids and 69 were controls in this
subgroup. Subgroup 3 included 20 cryptorchids and 15 control dogs, with high Q values
of cluster 2 (0.700-0.838). Most of these dogs were sampled from the UK (94%).
Individuals in subgroup 2 had intermediate Q values for both clusters 1 and 2. These
dogs were of mixed origins, being sampled from the USA (71%), the UK (24%) and Canada
(5%). There were 41 individuals in subgroup 2, and 26 were cryptorchids. The sample
sources and breeds for each dog subgroup are in Figure 1C and the supplementary Table
S1.
Twelve 1Mb SNP windows showed a high percentage of genetic variance (> 0.4%, 10
fold greater than the expected value) when the BayesB analyses were conducted in
subgroup 1. These regions include one window on each of dog chromosomes CFA1,
CFA3, CFA14 and CFA18, and four windows on each of CFA5 and CFA27. Notably, the
four windows on CFA27 explain 4.67% of genetic variance and one of them located at
the 5th Mb explains 3.24% of genetic variance by itself (Table 1, Figure 2). Furthermore,
three of these four windows on CFA27 still explained a high genetic variance relative
to other regions (> 5 fold the expected value; 0.34%, 0.53% and 0.22% for the 1Mb
windows located at the 4th, 5th and 6th Mb on CFA27, respectively) when the window
approach using the haplotype model was applied in the BayesB analyses for subgroup 1
(Table 1). The windows on CFA1 and CFA14 were both at 50 Mb and were also detected
by the haplotype analysis accounting for genetic variance over 0.2% (Table 1, Figure 2).
A search of the CanFam 2.0 database using the Ensembl browser showed that 74 characterized genes
have been mapped between the 4th and the 6th Mb on CFA27, 8 on the 50th Mb of CFA1,
and 5 on the 50th Mb of CFA14 (Supplementary Table S2). Among them, a candidate
gene named anti-mullerian hormone type II receptor (AMHR2) was found to be located in
the region of the 4th Mb of CFA27 (Figure 2, Supplementary Table S2). Human patients
with mutations in this gene typically develop cryptorchidism (Clarnette et al. 1997).
Moreover, a cluster of HOX-C genes including HOXC4, HOXC5, HOXC6, HOXC8,
HOXC10, HOXC11, HOXC12 and HOXC13 were identified as being located within the
5th Mb window on CFA27 (Table S2).
The results from BayesB analyses with the SNP model for subgroup 2 and subgroup 3
identified six different regions. Of those six 1Mb regions, only two
were also identified as candidate regions with the haplotype model: the
windows at 75 Mb on CFA4 for subgroup 2 (Table 1, Supplementary Figure S1) and 18
Mb on CFA31 for subgroup 3 (Table 1, Supplementary Figure S2). Eight characterized
genes, including CAPSL (calcyphosine-like), IL7R (interleukin 7 receptor) and SPEF2
(sperm flagellar 2), have been mapped to 75 Mb of CFA4 and two types of U6
spliceosomal RNA have been mapped to 18 Mb of CFA31 (Supplementary Table S3). No
genes in these regions have been reported so far to be associated with the
processes of testicular descent or the development of cryptorchidism. The sample size
for these two subgroups is relatively small. More samples will be needed to
discover new loci if dogs with cryptorchidism in these two subgroups indeed
have causative mutations distinct from those of individuals in subgroup 1.
The results from GWAS analyses using all Siberian Huskies (105 affected and 99
controls) show that 1Mb SNP windows on CFA3, 6, 9, 24, 27 and X explain a high
percentage of genetic variance (higher than 0.40%) (Table 1). Among those, four regions
including the 31st and 33rd Mb windows on CFAX, the 56th Mb window on CFA9 and the 13th Mb
window on CFA27 also showed a high percentage of variance (higher than 0.20%) when
the windows were fitted using the haplotype model (Table 1, Supplementary Figure S3).
Comparing the results of this pooled analysis with previous analyses from each of the 3
subgroups separately, we did not identify a common window explaining a high percentage
of genetic variance.
Notably, most of the SNPs with the lowest raw P-values on each chromosome in
the case-control analyses in PLINK were located within the windows detected from the
BayesB analyses (Figure 2A, Supplementary Figure S1A, S2A and S3A), even though
their significance disappeared after FDR corrections (data not shown).
Additionally, we summarized the results from the BayesB analyses for genes that had
previously been shown to be associated with cryptorchidism, including ESR1, NR5A1
(nuclear receptor subfamily 5, group a, member 1), GNRHR, HOXA10, HOXA11, FGFR1
(fibroblast growth factor receptor 1), SOS1 (son of sevenless, drosophila, homolog 1),
WT-1 (Wilms tumor gene 1), INSL3, AMH, CALCA, PROKR2 (prokineticin receptor 2),
LGR8, COL2A1, KAL1 (Kallmann syndrome interval gene 1) and AR (Table 2). No
region covering these genes passed the criteria set in this study for an apparent
association with cryptorchidism, either in the three subgroup analyses or in the pooled
analysis of the Siberian Husky population. The causative loci for cryptorchidism in our
study do not seem to coincide with those previously reported candidate genes.
DISCUSSION
Genetic association studies relate differences in disease status reflected as case and
control groups with differences in allele frequencies at SNPs. Undetected population
substructure in an admixed population can lead to spurious allelic associations (Cardon
and Palmer 2003). A disease is genetically heterogeneous when the same condition is caused
by different genes, or by different alleles of the same gene, in different populations. The
genetic basis for cryptorchidism is likely heterogeneous in different populations. Perhaps
at least 20 genes might be associated with cryptorchidism, since no single mutated gene
has been found to cause more than 10% of cryptorchids (Klonisch et al. 2004; Amann
and Veeramachaneni 2007). Previous studies have shown that only 3-5% of cryptorchid
men carry aberrations in INSL3 or GREAT and < 16% in AR or ESR1 (Amann and
Veeramachaneni 2007). In another instance, a specific haplotype of ESR1 was
found to be associated with cryptorchidism status in a Japanese population (Yoshida et al.
2005), but was associated only with the severity of cryptorchidism in other
populations (Wang et al. 2008).
In such cases, isolating a relatively homogeneous
subpopulation from a heterogeneous population is necessary for cryptorchidism studies.
Here we used STRUCTURE to infer the presence of distinct population clusters and then
re-assigned individuals to 4 new subgroups in order to discover aberrant genes associated
with the condition in each subsample. The re-assigned dog subgroups were consistent for
the most part with their geographic distribution but the regions (windows) identified in
each subgroups were distinctly different in each sub population. It demonstrates that
different genes might contribute to cryptorchidism in different groups of dogs.
Comparing the results of the pooled analyses (all 204 Siberian Huskies) using BayesB
approaches with results from the three subgroups separately, we did not identify a
common window accounting for high genetic variance. This may be because pooling dogs
from different geographical backgrounds blended the allele
frequencies of SNPs associated with the disease in any particular subgroup.
In this study, the AMHR2 gene was considered a promising candidate gene for canine
cryptorchidism in Siberian Husky subgroup 1 based on results from BayesB analyses
using the window approaches. The SNP (BICF2G630137913, 4791435bp on CFA27)
located within the AMHR2 gene was not significantly associated with cryptorchidism in
the single marker analysis in PLINK after FDR corrections (praw = 0.002, pgenome
= 0.9971). This marker is about 0.59Mb away on CFA27 from SNP BICF2G6301380
which has the smallest raw P value in the analyses (praw = 2.39E-05). In the
LD structure for SNPs on CFA27 examined with Haploview, BICF2G630137913 did not show
strong LD with nearby SNPs (r2 = 0.2 with the left SNP BICF2G630137904 or the right
SNP BICF2G630137922). The average LD for BICF2G630137913 with other markers
on CFA27 was very low (r2 = 0.04).
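As a reminder of what the r2 values above measure, the following sketch computes the squared allelic correlation between two biallelic SNPs from phased haplotypes. It is a generic textbook calculation, not Haploview's implementation; the function name and the 0/1 allele coding are assumptions made for illustration.

```python
def r_squared(haplotypes):
    """Compute LD r^2 from phased haplotypes given as (allele_at_snp1, allele_at_snp2) pairs coded 0/1."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

An r2 near 0.04 means that the genotyped SNP carries almost no information about alleles at other loci, which is why a causative variant nearby could go undetected in a single marker test.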
It is therefore not surprising that the window
containing BICF2G630137913 shows a high percentage of genetic variance although no
significant association was observed between this SNP and cryptorchidism in the single
marker analysis using PLINK. If a true causative mutation in AMHR2 is not in
high LD with SNP BICF2G630137913, the single SNP analysis will not capture
the association with the studied disease. Therefore, the cumulative effects of 1Mb SNP
windows or windows with integration of haplotype information would facilitate detection
of regions harboring the causative loci that may not be apparent in single SNP analyses.
To confirm whether AMHR2 is a causative locus for cryptorchidism, expression array
studies and/or fine mapping by sequencing the whole gene are necessary.
The case-control analysis using single SNP tests in the PLINK program is a frequentist
approach that calculates a p-value for the null hypothesis of no association. Multiple
testing corrections such as FDR and Bonferroni adjustment are usually performed (Noble
2009). The sample size and minor allele frequency will strongly affect the results using
this approach (Stephens and Balding 2009).
In this study, no significant
association between SNPs on the BeadChip and the disease remained after FDR
correction in any data set. The relatively small sample size would limit the power of
detecting associated markers or regions for a complex disease. In this study, Bayesian
approaches with SNP or haplotype windows were also used to assess evidence for an
association between genomic regions containing multiple genetic variants and
cryptorchidism. The 1Mb window approach accounts for linkage disequilibrium among
adjacent SNPs and accumulates the effects from multiple genetic variants in any particular
region. This method reduces the chance of spurious associations that can easily occur
from a single locus analysis (Balding, 2006). Incorporating biological information of
genes with GWAS analyses may also help find causative gene loci. Here we found a
candidate gene, AMHR2, that has a known biological function in sex differentiation and is
plausibly related to the development of cryptorchidism.
It is known that anti-mullerian hormone (AMH) is one of the hormones mediating male
sex differentiation by regression of mullerian ducts which would otherwise differentiate
into the uterus and fallopian tubes (Behringer 1994). Double-knockout mouse models for
AMH and AMHR2 suggest that AMH is the only ligand of the AMHR2 receptor (Mishina
et al. 1996).
Mutations in AMH and AMHR2 have been found to cause persistent
mullerian duct syndrome (PMDS), a rare form of male pseudohermaphroditism (Imbeaud
et al. 1995; Messika-Zeitoun et al. 2001). In 60-70% of patients with PMDS the
testes are located in the abdomen at a position analogous to that of the ovaries, without
descending to their normal scrotal position (Clarnette et al. 1997). Many PMDS
patients have been found with the absence of gubernacular structures or feminized
gubernaculum (Clarnette et al. 1997).
The disruptions of AMH or AMHRs may cause
human cryptorchidism by preventing transabdominal descent through failure of the
gubernacular swelling reaction (Clarnette et al. 1997). It was reported that a boy carried
a heterozygous WT1 nonsense mutation had bilateral cryptorchidism, nystagmu and
Wilms’ tumor. The nonsense mutation leads to a truncated protein by losing three zincfinger domains for DNA binding (Terenziani et al. 2009).
The cDNA microarray
analyses on Wt1 (Wilms’ tumor suppressor gene) knockout mice have shown that
inactivation of the Wt1 gene dramatically down-regulates gene expression of Amkr2.
This shows that Amkr2 is a target gene of Wt1 when it plays the role in urogenital and
gonadal development during sex differentiation (Klattig et al. 2007). Therefore, the WT1,
AMHR2 and AMH genes might be within the same pathway to regulate sex differentiation.
Gene disruptions of one of these genes could prompt the development of undescended
testicles in males.
It is known that HOX genes are a group of related genes that determine the regional
identity of an embryo along the anterior-posterior axis and the type of segment structure
(Joseph et al. 2005). The gubernaculum is central to testicular descent, with evidence
suggesting that it elongates to the scrotum like a limb bud (Huynh et al. 2007). The
genes HOXA10 and HOXA11, which are involved in limb bud outgrowth, are expressed within
the gubernaculum (Yuan et al. 2006), and these two genes have been found to play a role
in the developing genito-urinary tract. Hoxa10-/- and Hoxa11-/- mutant male mice present
with cryptorchidism (Satokata et al. 1995; Hsieh-Li et al. 1995). Disrupted function or
ectopic expression of some of the Hoxc genes, such as Hoxc4, Hoxc6 or Hoxc8, has been
considered responsible for the lack of hindlimb phenotypes in chickens (Nelson et
al. 1996). However, there is not sufficient evidence to show a relationship between any
of the HOX-C genes and cryptorchidism.
In conclusion, the important findings of the present GWAS are: (i) the AMHR2 gene was
identified in a putative candidate region associated with canine cryptorchidism by using
the 1Mb window approach and Bayesian inference; (ii) different chromosomal regions, and
thus different genes, appear to be associated with the disease in dogs from pedigrees
from different geographical regions, and population structure clustering will help
correct for stratification; (iii) cryptorchidism is a complex disease and shows genetic
heterogeneity; (iv) single SNP analyses may not capture the genomic regions harboring
the causative loci when there is not enough LD between markers on the chip and nearby
causative mutations, and the 1Mb window approach along with BayesB analysis may help to
resolve this problem; (v) whole genome sequencing of pairs of affected and unaffected
dogs from the same family may facilitate finding causative mutations without needing to
consider LD structure and disease heterogeneity.
Software
BEAGLE: http://faculty.washington.edu/browning/beagle/beagle.html
Blossoc: http://www.daimi.au.dk/~mailund/Blossoc/index.html
GenSel: http://bigs.ansci.iastate.edu
Haploview: http://www.broadinstitute.org/scientific-community/science/programs/medical-and-population-genetics/haploview/haploview
PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/
STRUCTURE: http://pritch.bsd.uchicago.edu/structure.html
Funding
This work was supported by the AKC Canine Health Foundation (Grant No. 1248) and
State of Iowa Funds.
ACKNOWLEDGEMENTS
We appreciate the help with sample collection provided by Sheila E. (Blanker) Morrissey
(DVM), individual dog breeders and owners, Eddie Dziuk (Chief Operating Officer,
CHIC), Caroline Kisko (Secretary, Kennel Club, UK) and Dr. Nancy Bartol (Chinook
Club of America). We also thank Ziqing Weng for help with the BEAGLE program.
REFERENCES
Amann RP, Veeramachaneni DN, 2007. Cryptorchidism in common eutherian mammals.
Reproduction. 133:541-61.
Balding DJ, 2006. A tutorial on statistical methods for population association studies.
Nat Rev Genet. 7:781-791.
Barrett JC, Fry B, Maller J, Daly MJ, 2005. Haploview: analysis and visualization of LD
and haplotype maps. Bioinformatics. 21:263-5.
Bashir MS, Wells M, 1995. Müllerian inhibiting substance. J Pathol. 176:109-10.
Behringer RR, Finegold MJ, Cate RL, 1994. Müllerian-inhibiting substance function
during mammalian sexual development. Cell. 79:415-25.
Behringer RR, 1994. The in vivo roles of müllerian-inhibiting substance. Curr Top Dev
Biol. 29:171-87. Review.
Berkowitz GS, Lapinski RH, Dolgin SE, Gazella JG, Bodian CA, Holzman IR, 1993.
Prevalence and natural history of cryptorchidism. Pediatrics. 92:44-9.
Bogatcheva NV, Truong A, Feng S, Engel W, Adham IM, Agoulnik AI, 2003.
GREAT/LGR8 is the only receptor for insulin-like 3 peptide. Mol Endocrinol. 17:2639-46.
Browning SR, Browning BL, 2007. Rapid and accurate haplotype phasing and missing
data inference for whole genome association studies using localized haplotype clustering.
Am J Hum Genet. 81:1084-97.
Burton M, Ramsay E, 1986. Cryptorchidism in Maned Wolves. J Zoo An Med. 17:133-5.
Cardon LR, Palmer LJ, 2003. Population stratification and spurious allelic association.
Lancet. 361:598-604. Review.
Carizza C, Antiba A, Palazzi J, Pistono C, Morana F, Alarcón M, 1990. Testicular
maldescent and infertility. Andrologia. 22:285-8.
Clarnette TD, Hutson JM, 1999. Exogenous calcitonin gene-related peptide can change
the direction of gubernacular migration in the mutant trans-scrotal rat. J Pediatr Surg.
34:1208-12.
Clarnette TD, Sugita Y, Hutson JM, 1997. Genital anomalies in human and animal
models reveal the mechanisms and hormones governing testicular descent. Br J Urol.
79:99-112.
Cox JE, Edwards GB, Neal PA, 1979. An analysis of 500 cases of equine cryptorchidism.
Equine Vet J. 11:113-6.
Cox VS, Wallace LJ, Jessen CR, 1978. An anatomic and genetic study of canine
cryptorchidism. Teratology. 18:233-40.
Davis-Dao C, Koh CJ, Hardy BE, Chang A, Kim SS, De Filippo R, Hwang A, Pike MC,
Carroll JD, Coetzee GA, 2012. Shorter androgen receptor CAG repeat lengths associated
with cryptorchidism risk among Hispanic white boys. J Clin Endocrinol Metab. 97:E393-9.
Donaldson KM, Tong SY, Washburn T, Lubahn DB, Eddy EM, Hutson JM, Korach KS,
1996. Morphometric study of the gubernaculum in male estrogen receptor mutant mice. J
Androl. 17:91-5.
Dunbar MR, Cunningham MW, Wooding JB, Roth RP, 1996. Cryptorchidism and
delayed testicular descent in Florida black bears. J Wildl Dis. 32:661-4.
Evanno G, Regnaut S, Goudet J, 2005. Detecting the number of clusters of individuals
using the software STRUCTURE: a simulation study. Mol Ecol. 14:2611-20.
George FW, Peterson KG, 1988. Partial characterization of the androgen receptor of the
newborn rat gubernaculum. Biol Reprod. 39:536-9.
Gill WB, Schumacher GF, Bibbo M, Straus FH 2nd, Schoenberg HW, 1979. Association
of diethylstilbestrol exposure in utero with cryptorchidism, testicular hypoplasia and
semen abnormalities. J Urol. 122:36-9.
Gorlov IP, Kamat A, Bogatcheva NV, Jones E, Lamb DJ, Truong A, Bishop CE,
McElreavey K, Agoulnik AI, 2002. Mutations of the GREAT gene cause cryptorchidism.
Hum Mol Genet. 11:2309-18.
Griffiths AL, Middlesworth W, Goh DW, Hutson JM, 1993. Exogenous calcitonin
gene-related peptide causes gubernacular development in neonatal (Tfm) mice with complete
androgen resistance. J Pediatr Surg. 28:1028-30.
Hadziselimovic NO, de Geyter Ch, Demougin P, Oakeley EJ, Hadziselimovic F, 2010.
Decreased expression of FGFR1, SOS1, RAF1 genes in cryptorchidism. Urol Int. 84:353-61.
Hou JW, Tsai WY, Wang TR, 1999. Detection of KAL-1 gene deletion with fluorescence
in situ hybridization. J Formos Med Assoc. 98:448-51.
Hsieh-Li HM, Witte DP, Weinstein M, Branford W, Li H, Small K, Potter SS, 1995.
Hoxa 11 structure, extensive antisense transcription, and function in male and female
fertility. Development. 121:1373-85.
Hughes IA, Acerini CL, 2008. Factors controlling testis descent. Eur J Endocrinol. 159
Suppl 1:S75-82.
Hutson JM, Hasthorpe S, 2005. Abnormalities of testicular descent. Cell Tissue Res.
322:155-8. Review.
Hutson JM, 1985. A biphasic model for the hormonal control of testicular descent. Lancet.
24:419-21.
Huynh J, Shenker NS, Nightingale S, Hutson JM, 2007. Signalling molecules: clues from
development of the limb bud for cryptorchidism? Pediatr Surg Int. 23:617-24.
Imbeaud S, Faure E, Lamarre I, Mattéi MG, di Clemente N, Tizard R, Carré-Eusèbe D,
Belville C, Tragethon L, Tonkin C, 1995. Insensitivity to anti-müllerian hormone due to a
mutation in the human anti-müllerian hormone receptor. Nat Genet. 11:382-8.
Kaftanovskaya EM, Huang Z, Barbara AM, De Gendt K, Verhoeven G, Gorlov IP,
Agoulnik AI, 2012. Cryptorchidism in mice with an androgen receptor ablation in
gubernaculum testis. Mol Endocrinol. 26:598-607.
Klattig J, Sierig R, Kruspe D, Besenbeck B, Englert C, 2007. Wilms' tumor protein Wt1
is an activator of the anti-Müllerian hormone receptor gene Amhr2. Mol Cell Biol.
27:4355-64.
Klonisch T, Fowler PA, Hombach-Klonisch S, 2004. Molecular and genetic regulation of
testis descent and external genitalia development. Dev Biol. 270:1-18.
Kubota Y, Temelcos C, Bathgate RA, Smith KJ, Scott D, Zhao C, Hutson JM, 2002. The
role of insulin 3, testosterone, Müllerian inhibiting substance and relaxin in rat
gubernacular growth. Mol Hum Reprod. 8:900-5.
Leader-Williams N, 1979. Abnormal testes in reindeer, Rangifer tarandus. J Reprod Fertil.
57:127-30.
Mailund T, Besenbacher S, Schierup MH, 2006. Whole genome association mapping by
incompatibilities and local perfect phylogenies. BMC Bioinformatics. 7:454.
Messika-Zeitoun L, Gouédard L, Belville C, Dutertre M, Lins L, Imbeaud S, Hughes IA,
Picard JY, Josso N, di Clemente N, 2001. Autosomal recessive segregation of a
truncating mutation of anti-Müllerian type II receptor in a family affected by the
persistent Müllerian duct syndrome contrasts with its dominant negative activity in vitro.
J Clin Endocrinol Metab. 86:4390-7.
Mishina Y, Rey R, Finegold MJ, Matzuk MM, Josso N, Cate RL, Behringer RR, 1996.
Genetic analysis of the Müllerian-inhibiting substance signal transduction pathway in
mammalian sexual differentiation. Genes Dev. 10:2577-87.
Nef S, Parada LF, 1999. Cryptorchidism in mice mutant for Insl3. Nat Genet. 22:295-9.
Nelson CE, Morgan BA, Burke AC, Laufer E, DiMambro E, Murtaugh LC, Gonzales E,
Tessarollo L, Parada LF, Tabin C, 1996. Analysis of Hox gene expression in the chick
limb bud. Development. 122:1449-66.
Nightingale SS, Western P, Hutson JM, 2008. The migrating gubernaculum grows like a
"limb bud". J Pediatr Surg. 43:387-90.
Noble WS, 2009. How does multiple testing correction work? Nat Biotechnol. 27:1135-7.
Pask AJ, Kanasaki H, Kaiser UB, Conn PM, Janovick JA, Stockton DW, Hess DL,
Justice MJ, Behringer RR, 2005. A novel mouse model of hypogonadotrophic
hypogonadism: N-ethyl-N-nitrosourea-induced gonadotropin-releasing hormone receptor
gene mutation. Mol Endocrinol. 19:972-81.
Pearson JC, Lemons D, McGinnis W, 2005. Modulating Hox gene functions during
animal body patterning. Nat Rev Genet. 6:893-904. Review.
Pritchard JK, Stephens M, Rosenberg NA, Donnelly P, 2000. Association mapping in
structured populations. Am J Hum Genet. 67:170-81.
Przewratil P, Paduch DA, Kobos J, Niedzielski J, 2004. Expression of estrogen receptor
alpha and progesterone receptor in children with undescended testicle previously treated
with human chorionic gonadotropin. J Urol. 172:1112-6.
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P,
de Bakker PI, Daly MJ, 2007. PLINK: a tool set for whole-genome association and
population-based linkage analyses. Am J Hum Genet. 81:559-75.
Radpour R, Rezaee M, Tavasoly A, Solati S, Saleki A, 2007. Association of long
polyglycine tracts (GGN repeats) in exon 1 of the androgen receptor gene with
cryptorchidism and penile hypospadias in Iranian patients. J Androl. 28:164-9.
Rothschild MF, Christian LL, Blanchard W, 1988. Evidence for multigene control of
cryptorchidism in swine. J Hered. 79:313-4.
Sahana G, Mailund T, Lund MS, Guldbrandtsen B, 2011. Local genealogies in a linear
mixed model for genome-wide association mapping in complex pedigreed populations.
PLoS One. 6:e27061.
Sarfati J, Dodé C, Young J, 2010. Kallmann syndrome caused by mutations in the
PROK2 and PROKR2 genes: pathophysiology and genotype-phenotype correlations.
Front Horm Res. 39:121-32. Review.
Satokata I, Benson G, Maas R, 1995. Sexually dimorphic sterility phenotypes in
Hoxa10-deficient mice. Nature. 374:460-3.
Singh A, Ezeasor D, 1989. Ultrastructure of Sertoli cells in scrotal testis of unilaterally
cryptorchid goat. Prog Clin Biol Res. 296:159-64.
Smith KC, Brown PJ, Morris J, Parkinson TJ, 2007. Cryptorchidism in North Ronaldsay
sheep. Vet Rec. 161:658-9.
Spencer JR, Torrado T, Sanchez RS, Vaughan ED Jr, Imperato-McGinley J, 1991. Effects
of flutamide and finasteride on rat testicular descent. Endocrinology. 129:741-8.
Stephens M, Balding DJ, 2009. Bayesian statistical methods for genetic association
studies. Nat Rev Genet. 10:681-90.
St Jean G, Gaughan EM, Constable PD, 1992. Cryptorchidism in North American cattle:
breed predisposition and clinical findings. Theriogenology. 38:951-8.
Terenziani M, Sardella M, Gamba B, Testi MA, Spreafico F, Ardissino G, Fedeli F,
Fossati-Bellani F, Radice P, Perotti D, 2009. A novel WT1 mutation in a 46,XY boy with
congenital bilateral cryptorchidism, nystagmus and Wilms tumor. Pediatr Nephrol.
24:1413-7.
Thonneau P, Bujan L, Multigner L, Mieusset R, 1998. Occupational heat exposure and
male fertility: a review. Hum Reprod. 13:2122.
Thonneau PF, Gandia P, Mieusset R, 2003. Cryptorchidism: incidence, risk factors, and
potential role of environment; an update. J Androl. 24:155-62. Review.
Wada Y, Okada M, Hasegawa T, Ogata T, 2005. Association of severe micropenis with
Gly146Ala polymorphism in the gene for steroidogenic factor-1. Endocr J. 52:445-8.
Wang J, Wang B, 2002. Study on risk factors of cryptorchidism. Zhonghua Liu Xing Bing
Xue Za Zhi. 23:190-3.
Wang Y, Barthold J, Figueroa E, González R, Noh PH, Wang M, Manson J, 2008.
Analysis of five single nucleotide polymorphisms in the ESR1 gene in cryptorchidism.
Birth Defects Res A Clin Mol Teratol. 82:482-5.
Wang Y, Barthold J, Kanetsky PA, Casalunovo T, Pearson E, Manson J, 2007. Allelic
variants in HOX genes in cryptorchidism. Birth Defects Res A Clin Mol Teratol.
79:269-75.
Yates D, Hayes G, Heffernan M, Beynon R, 2003. Incidence of cryptorchidism in dogs
and cats. Vet Rec. 152:502-4.
Yoshida R, Fukami M, Sasagawa I, Hasegawa T, Kamatani N, Ogata T, 2005.
Association of cryptorchidism with a specific haplotype of the estrogen receptor alpha
gene: implication for the susceptibility to estrogenic environmental endocrine disruptors.
J Clin Endocrinol Metab. 90:4716-21.
Yuan FP, Lin DX, Rao CV, Lei ZM, 2006. Cryptorchidism in LhrKO animals and the
effect of testosterone-replacement therapy. Hum Reprod. 21:936-42.
Zhao X, Du ZQ, Rothschild MF, 2010. An association study of 20 candidate genes with
cryptorchidism in Siberian Husky dogs. J Anim Breed Genet. 127:327-31.
Zimmermann S, Schöttler P, Engel W, Adham IM, 1997. Mouse Leydig insulin-like (Ley
I-L) gene: structure and expression during testis and ovary development. Mol Reprod
Dev. 47:30-8.
Zimmermann S, Steding G, Emmen JM, Brinkmann AO, Nayernia K, Holstein AF, Engel
W, Adham IM, 1999. Targeted disruption of the Insl3 gene causes bilateral
cryptorchidism. Mol Endocrinol. 13:681-91.
Zuccarello D, Morini E, Douzgou S, Ferlin A, Pizzuti A, Salpietro DC, Foresta C,
Dallapiccola B, 2004. Preliminary data suggest that mutations in the CgRP pathway are
not involved in human sporadic cryptorchidism. J Endocrinol Invest. 27:760-4.
Figure 1. Re-grouping of dogs based on the cluster membership values estimated by
STRUCTURE. A. Summary plot of estimated Q values (cluster membership
coefficients) for each individual dog collected from the USA, UK and Canada. Each
individual is represented by a single vertical line broken into K colored segments,
with lengths proportional to its membership in each of the K inferred clusters. B.
Identification of the most likely value of K, calculated from 47,672 tagging SNPs.
Delta K was calculated for K=1 through K=6 under a model assuming admixture. K=3 is
the best estimate based on the ∆K statistic in the plot. C. Dogs were assigned to 4
subgroups based on Q values when K=3. The re-grouping of dogs is consistent with
the dogs' geographical distributions. SH: Siberian Husky; CHI: Chinook.
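The ∆K statistic in panel B follows Evanno et al. (2005), where ∆K is the mean over replicate runs of |L(K+1) - 2L(K) + L(K-1)| divided by the standard deviation of L(K), and L(K) is the log-likelihood that STRUCTURE reports for K clusters. The sketch below is only an illustration of that formula: the function name and the log-likelihood values are hypothetical and are not the values obtained in this study. Because the formula needs both K-1 and K+1, the sketch returns ∆K only for the interior values of K.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """lnp maps each K to a list of log-likelihoods, one per replicate run
    (replicates are assumed to be paired by position across K)."""
    ks = sorted(lnp)
    result = {}
    for k in ks[1:-1]:
        # Absolute second-order difference of L(K), computed per replicate.
        second = [abs(up - 2 * mid + down)
                  for up, mid, down in zip(lnp[k + 1], lnp[k], lnp[k - 1])]
        result[k] = mean(second) / stdev(lnp[k])
    return result


# Hypothetical log-likelihoods for three replicate runs at K = 1..5.
lnp = {
    1: [-5000, -5002, -4998],
    2: [-4600, -4605, -4595],
    3: [-4300, -4302, -4298],
    4: [-4290, -4310, -4280],
    5: [-4285, -4320, -4270],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In this toy input the likelihood stops improving sharply after K=3 and the replicate runs at K=3 agree closely, so ∆K peaks at K=3.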
Figure 2. Whole genome association analyses for subgroup 1, comprising 129 mostly
USA-sourced Siberian Huskies (60 cases and 69 controls). A. Results of single
marker analyses in PLINK before false discovery rate (FDR) correction. B.
Results from the GenSel 1Mb SNP window analyses (SNP model). Each spot
is a 1Mb window containing a group of consecutive SNPs. C. Results using
haplotypes and SNP genotypes together in GenSel (SNP & haplotype models). Each
spot is a 1Mb window containing a group of consecutive SNPs and haplotypes. The
red broken arrows link SNPs in PLINK to their 1Mb SNP windows with genetic
variance over 0.40% (10-fold the expected value). The blue broken
arrows show windows repeatedly identified with high variance (> 0.20%, 5-fold
the expected value) when haplotypes were integrated into the analyses in
GenSel. The most promising putative regions for cryptorchidism in this
subpopulation are on CFA27, 1 and 14.
Figure S1. Whole genome association analyses for subgroup 2, comprising 41
mixed-source Siberian Huskies (26 cases and 15 controls). A. Results of single
marker analyses in PLINK before false discovery rate (FDR) correction. B.
Results from the GenSel 1Mb SNP window analyses (SNP model). Each spot
is a 1Mb window containing a group of consecutive SNPs. C. Results using
haplotypes and SNP genotypes together in GenSel (SNP & haplotype models). Each
spot is a 1Mb window containing a group of consecutive SNPs and haplotypes. The
red broken arrows link SNPs in PLINK to their 1Mb SNP windows with genetic
variance over 0.40% (10-fold the expected value). The blue broken
arrows show windows repeatedly identified with high variance (> 0.20%, 5-fold
the expected value) when haplotypes were integrated into the analyses in
GenSel. The most promising putative region for cryptorchidism in subgroup 2 is
on CFA4.
Figure S2. Whole genome association analyses for subgroup 3, comprising 35 mostly
UK-sourced Siberian Huskies (20 cases and 15 controls). A. Results of single
marker analyses in PLINK before false discovery rate (FDR) correction. B.
Results from the GenSel 1Mb SNP window analyses (SNP model). Each spot
is a 1Mb window containing a group of consecutive SNPs. C. Results using
haplotypes and SNP genotypes together in GenSel (SNP & haplotype models). Each
spot is a 1Mb window containing a group of consecutive SNPs and haplotypes. The
red broken arrows link SNPs in PLINK to their 1Mb SNP windows with genetic
variance over 0.40% (10-fold the expected value). The blue broken
arrows show windows repeatedly identified with high variance (> 0.20%, 5-fold
the expected value) when haplotypes were integrated into the analyses in
GenSel. The most promising putative region for cryptorchidism in subgroup 3 is
on CFA31.
Figure S3. Whole genome association analyses for a pooled sample of 204 Siberian
Huskies, of which 105 were diagnosed with cryptorchidism. A. Results of single marker
analyses in PLINK before false discovery rate (FDR) correction. B.
Results from the GenSel 1Mb SNP window analyses (SNP model). Each spot is a
1Mb window containing a group of consecutive SNPs. C. Results using
haplotypes and SNP genotypes together in GenSel (SNP & haplotype models). Each
spot is a 1Mb window containing a group of consecutive SNPs and haplotypes. The
red broken arrows link SNPs in PLINK to their 1Mb SNP windows with genetic
variance over 0.40% (10-fold the expected value). The blue broken
arrows show windows repeatedly identified with high variance (> 0.20%, 5-fold
the expected value) when haplotypes were integrated into the analyses in
GenSel. Windows on CFA9, 27 and X were identified as putative candidate regions
for cryptorchidism in the pooled population.
Table 1. The top 1Mb windows with the highest percentage of genetic variance
obtained from BayesB analyses in different Siberian Husky subgroups

Group                        Window      1Mb window,     1Mb window,
                             (Chr_Mb)    SNP model (%)   SNP & haplotype models (%)
Subgroup 1 (60 A & 69 U)     1_50        0.73            0.22
                             3_86        0.44            0.08
                             5_8         0.54            0.16
                             5_10        0.47            0.07
                             5_12        0.68            0.06
                             5_16        1.04            0.05
                             14_50*      0.4             0.22
                             18_11       0.55            0.18
                             27_4*       0.42            0.34
                             27_5*       3.24            0.53
                             27_6*       0.59            0.22
                             27_18       0.42            0.07
Subgroup 2 (26 A & 15 U)     4_75*       0.56            0.23
                             27_13       0.58            0.16
                             35_26       0.4             0.02
Subgroup 3 (20 A & 15 U)     24_37       0.39            0.04
                             25_14       0.47            0.14
                             31_18*      0.52            0.33
Pooled (subgroup 1,          3_87        0.42            0.15
subgroup 2 and               6_62        0.45            0.16
subgroup 3)                  9_56*       1.42            0.27
(106 A & 99 U)               24_31       0.55            0.11
                             24_36       0.55            0.16
                             27_13*      1.14            0.21
                             27_18       0.43            0.14
                             X_31*       0.71            0.31
                             X_33*       1.1             0.39
                             X_34        0.43            0.15
                             X_35        0.53            0.14

*These windows were considered putative candidate regions for cryptorchidism, with
genetic variance over 0.40% (10-fold the expected value) in the 1Mb SNP window
analyses (SNP model) and again over 0.20% (5-fold the expected value) in the 1Mb
window analyses that integrated haplotypes (SNP & haplotype models). The positional
candidate genes are listed in supplementary Tables S2, S3 & S4.
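The selection rule in the footnote of Table 1 can be written as a small filter. This is a sketch for illustration rather than part of the GenSel analysis: the function name is hypothetical, the rows are the subgroup 1 values copied from Table 1, and the comparison is inclusive because the table reports values rounded to two decimals (window 14_50 is listed at 0.4 and is marked as a candidate).

```python
# Thresholds from the Table 1 footnote, in percent of genetic variance.
SNP_MODEL_MIN = 0.40        # 10-fold the expected value
HAPLOTYPE_MODEL_MIN = 0.20  # 5-fold the expected value

# (window, SNP model %, SNP & haplotype models %) for subgroup 1, from Table 1.
SUBGROUP1 = [
    ("1_50", 0.73, 0.22), ("3_86", 0.44, 0.08), ("5_8", 0.54, 0.16),
    ("5_10", 0.47, 0.07), ("5_12", 0.68, 0.06), ("5_16", 1.04, 0.05),
    ("14_50", 0.4, 0.22), ("18_11", 0.55, 0.18), ("27_4", 0.42, 0.34),
    ("27_5", 3.24, 0.53), ("27_6", 0.59, 0.22), ("27_18", 0.42, 0.07),
]


def candidate_windows(rows, snp_min=SNP_MODEL_MIN, hap_min=HAPLOTYPE_MODEL_MIN):
    """Keep windows whose variance passes the threshold in both models."""
    return [window for window, snp, hap in rows
            if snp >= snp_min and hap >= hap_min]


selected = candidate_windows(SUBGROUP1)
print(selected)
```

Under this reading of the rule, window 1_50 also passes both thresholds, which agrees with CFA1 being named among the putative regions in Figure 2 and with window 1_50 being listed in Table S2. Windows such as 5_16, which have a large variance in the SNP model only, are excluded.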
Table 2. Results of BayesB analyses for 17 genes previously reported to be associated with cryptorchidism

Gene symbol   Chromosome   Gene name                                          Genome position        Window     Ref.
              (CFA)                                                           (CanFam2.0)            (chr_Mb)
ESR1          CFA1         Estrogen receptor 1                                45129993-45409811      1_45       Yoshida et al. 2005
NR5A1 (SF1)   CFA9         nuclear receptor subfamily 5, group A, member 1    61804755-61824397      9_61       Wada et al. 2005
GNRHR         CFA13        gonadotropin-releasing hormone receptor            61190588-61204076      13_61      Pask et al. 2005
HOXA10        CFA14        homeobox A10                                       43301964-43304348      14_43      Satokata et al. 1995
HOXA11        CFA14        homeobox A11                                       43312981-43315443      14_43      Hsieh-Li et al. 1995
FGFR1         CFA16        fibroblast growth factor receptor 1                29991434-30031766      16_30      Hadziselimovic et al. 2010
SOS1          CFA17        son of sevenless homolog 1                         34112601-34186729      17_34      Hadziselimovic et al. 2010
WT-1          CFA18        Wilms tumor 1                                      381195393167266        18_38      Terenziani et al. 2009
INSL3         CFA20        insulin-like factor 3                              48072721-48074397      20_48      Nef and Parada 1999
AMH (MIS)     CFA20        anti-Mullerian hormone                             59926297-59928993      20_59      Behringer et al. 1994
RAF1          CFA20        v-raf-1 murine leukemia viral oncogene homolog 1   8899094-8975879        20_8       Hadziselimovic et al. 2010
CALCA         CFA21        Calcitonin/calcitonin-related polypeptide, alpha   41104822-41108840      21_41      Griffiths et al. 1993; Clarnette and Hutson 1999
PROKR2        CFA24        prokineticin receptor 2                            19351537-19359239      24_19      Sarfati et al. 2010
LGR8          CFA25        Relaxin receptor 2                                 11294671-11340203      25_11      Tomyama et al. 2003
COL2A1        CFA27        collagen, type II, alpha 1                         9768511-9799248        27_9       Zhao et al. 2010
KAL1          CFAX         Kallmann syndrome 1 sequence                       5369941-5428059        X_5        Hou et al. 1999

Percentage of Genetic Variance (%) (1Mb window with BayesB approach)

Gene symbol   Window     Subpop1          Subpop2          Subpop3          Pooled population
              (chr_Mb)   SNP     Haplo.   SNP     Haplo.   SNP     Haplo.   SNP     Haplo.
ESR1          1_45       0.01    0.05     0       0.04     0.03    0.07     0.03    0.03
NR5A1 (SF1)   9_61       0.05    0.01     0.03    0.05     0.08    0.09     0.08    0.06
GNRHR         13_61      0       0.04     0.03    0.02     0.04    0.03     0.01    0.04
HOXA10        14_43      0       0.07     0.08    0.09     0.02    0.03     0       0.05
HOXA11        14_43      0       0.07     0.08    0.09     0.02    0.03     0       0.05
FGFR1         16_30      0.15    0.17     0.12    0.06     0.03    0.05     0.14    0.03
SOS1          17_34      0       0.02     0.07    0.01     0.02    0.03     0.04    0.06
WT-1          18_38      0.09    0.11     0.08    0.04     0.03    0.05     0.04    0.13
INSL3         20_48      0.12    0.1      0.04    0.04     0.1     0.03     0.09    0.07
AMH (MIS)     20_59      0.02    0.05     0.03    0.04     0       0.03     0.02    0.04
RAF1          20_8       0.02    0.04     0.04    0.05     0.06    0.05     0.1     0.06
CALCA         21_41      0.03    0.02     0.04    0.04     0.06    0.04     0.14    0.03
PROKR2        24_19      0.01    0.03     0.07    0.09     0.02    0.06     0       0.04
LGR8          25_11      0.01    0.03     0.02    0.12     0.13    0.05     0.03    0.03
COL2A1        27_9       0.06    0.07     0.06    0.07     0.06    0.03     0.03    0.05
KAL1          X_5        0       0.03     0       0.06     0       0.05     0       0.01
Table S1. Re-grouping of dogs based on the value of cluster membership (K=3)

                                                   Q value                                   Geographical distribution
K=3          Breed   No. of   No. of     Cluster1      Cluster2      Cluster3    USA_SH   Canada_SH   UK_SH   USA_CHI
                     cases    controls
Subgroup 1   SH      60       69         0.709-1       0-0.291       0-0.036     0.95     0.05        0       0
Subgroup 2   SH      26       15         0.304-0.694   0.306-0.696   0-0.092     0.71     0.05        0.24    0
Subgroup 3   SH      20       15         0.162-0.3     0.7-0.838     0-0.024     0.06     0           0.94    0
Subgroup 4   CHI     6        6          0.01-0.031    0-0.041       0.928-1     0        0           0       1

The Q value represents the estimated proportion of ancestry when 3 assumed populations were given. SH: Siberian Husky; CHI: Chinook.
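The Q value ranges in Table S1 suggest a simple assignment rule, sketched below. The cutoff of 0.7 and the function name are assumptions inferred from the reported ranges (subgroup 1 has Cluster1 Q of at least 0.709, subgroup 3 has Cluster2 Q of at least 0.7, subgroup 4 has Cluster3 Q of at least 0.928, and subgroup 2 holds the admixed dogs); the rule actually applied in the study is not stated in this table.

```python
def assign_subgroup(q1, q2, q3, cutoff=0.7):
    """Assign a dog to a subgroup from its three cluster membership
    coefficients. The 0.7 cutoff is an assumption read off Table S1."""
    if q1 >= cutoff:
        return 1  # predominantly Cluster1
    if q2 >= cutoff:
        return 3  # predominantly Cluster2
    if q3 >= cutoff:
        return 4  # predominantly Cluster3
    return 2      # admixed: no cluster reaches the cutoff


# Hypothetical Q vectors, one per subgroup.
examples = [(0.95, 0.05, 0.00), (0.50, 0.45, 0.05),
            (0.20, 0.80, 0.00), (0.02, 0.03, 0.95)]
assigned = [assign_subgroup(*q) for q in examples]
print(assigned)
```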
Table S2. A list of genes in the putative regions for subgroup 1

Window    CFA  Gene Start  Gene End    Ensembl Gene ID     Gene symbol       Description
(chr_Mb)       (bp)        (bp)
1_50th    1    50145766    50258876    ENSCAFG00000000618  ZDHHC14           zinc finger, DHHC-type containing 14
          1    50286593    50286847    ENSCAFG00000028358  7SK               7SK RNA
          1    50340242    50341092    ENSCAFG00000000624  Novel gene        pseudogene Ensembl genes
          1    50436290    50498735    ENSCAFG00000000627  SNX9              sorting nexin 9
          1    50550972    50622090    ENSCAFG00000000644  SYNJ2             synaptojanin 2
          1    50642818    50692684    ENSCAFG00000000647  SERAC1            serine active site containing 1
          1    50711829    50723881    ENSCAFG00000000648  GTF2H5            general transcription factor IIH, polypeptide 5
          1    50919150    50993873    ENSCAFG00000000652  TULP4             tubby like protein 4
          1    50935969    50936076    ENSCAFG00000028188  U6                U6 spliceosomal RNA
14_50th   14   50199420    50200394    ENSCAFG00000003182  NOVEL pseudogene
          14   50211571    50244090    ENSCAFG00000003194  HERPUD2           homocysteine-responsive endoplasmic reticulum-resident ubiquitin-like domain, member 2 protein
          14   50219951    50220703    ENSCAFG00000003199  NOVEL_pseudogene
          14   50373100    50456616    ENSCAFG00000003216  SEPT7L            septin 7-like
          14   50573856    50682597    ENSCAFG00000003227  EEPD1             endonuclease/exonuclease/phosphatase family domain containing 1
          14   50708853    50745061    ENSCAFG00000003232  KIAA0895          KIAA0895
          14   50773901    50815300    ENSCAFG00000003243  ANLN              Actin-binding protein anillin
          14   50856685    50950463    ENSCAFG00000003252  AOAH              Acyloxyacyl hydrolase (neutrophil)
27_4th    27   4008479     4010926     ENSCAFG00000006572  NFE2              nuclear factor (erythroid-derived 2)
          27   4016380     4020656     ENSCAFG00000006580  Novel gene        Uncharacterized protein
          27   4038247     4050570     ENSCAFG00000006593  CBX5              chromobox homolog 5
          27   4090107     4092358     ENSCAFG00000006603  SMUG1             single-strand-selective monofunctional uracil-DNA glycosylase 1
          27   4211932     4213230     ENSCAFG00000006611  HOXC4             homeobox C4
          27   4231278     4232665     ENSCAFG00000006618  HOXC5             homeobox C5
          27   4231753     4231837     ENSCAFG00000028294  cfa-mir-615       cfa-mir-615
          27   4235965     4237469     ENSCAFG00000006622  HOXC6             homeobox C6
          27   4254284     4256431     ENSCAFG00000006626  HOXC8             homeobox C8
          27   4272335     4272444     ENSCAFG00000020562  cfa-mir-196a-2    cfa-mir-196a-2
          27   4274730     4278914     ENSCAFG00000006636  HOXC10            homeobox C10
          27   4288980     4291010     ENSCAFG00000025599  HOXC11            homeobox C11
          27   4307265     4308845     ENSCAFG00000006645  HOXC12            homeobox C12
          27   4318372     4324857     ENSCAFG00000006648  HOXC13            homeobox C13
          27   4408336     4408471     ENSCAFG00000025595  Novel gene        Uncharacterized protein
          27   4465522     4465593     ENSCAFG00000028208  Novel gene        miRNA ncRNAs
          27   4471215     4473053     ENSCAFG00000006653  Novel gene        Uncharacterized protein
          27   4526984     4543023     ENSCAFG00000006699  CALCOCO1          calcium binding and coiled-coil domain 1
          27   4618776     4707507     ENSCAFG00000006726  ATF7              activating transcription factor 7
          27   4716529     4721245     ENSCAFG00000006874  TARBP2            TAR (HIV-1) RNA binding protein 2
          27   4730433     4736128     ENSCAFG00000006918  MAP3K12           mitogen-activated protein kinase kinase kinase 12
Table S2. (continued)

Window    CFA  Gene Start  Gene End    Ensembl Gene ID     Gene symbol       Description
(chr_Mb)       (bp)        (bp)
27_5th    27   4737476     4760860     ENSCAFG00000006821  PCBP2             poly(rC) binding protein 2
          27   4771276     4771641     ENSCAFG00000006927  PRR13             proline rich 13
          27   4790871     4796699     ENSCAFG00000006934  AMHR2             anti-Mullerian hormone receptor, type II
          27   4808070     4846934     ENSCAFG00000006950  SP1               Sp1 transcription factor
          27   4852684     4913472     ENSCAFG00000007016  PFDN5             prefoldin subunit 5
          27   4886171     4888554     ENSCAFG00000025559  SP7               Sp7 transcription factor
          27   4893129     4903977     ENSCAFG00000006963  AAAS              achalasia, adrenocortical insufficiency, alacrimia
          27   4904239     4909825     ENSCAFG00000006981  C12orf10          chromosome 12 open reading frame 10
          27   4915195     4937670     ENSCAFG00000007030  ESPL1             extra spindle pole bodies homolog 1 (S. cerevisiae)
          27   4945566     4947561     ENSCAFG00000007035  MFSD5             major facilitator superfamily domain containing 5
          27   4962069     4981222     ENSCAFG00000007050  RARG              retinoic acid receptor, gamma
          27   4993303     4998519     ENSCAFG00000007065  ITGB7             integrin, beta 7
          27   5002178     5004869     ENSCAFG00000007070  ZNF740            zinc finger protein 740
          27   5028008     5039893     ENSCAFG00000007077  CSAD              cysteine sulfinic acid decarboxylase
          27   5056925     5066187     ENSCAFG00000007085  SOAT2             sterol O-acyltransferase 2
          27   5067628     5071493     ENSCAFG00000007097  IGFBP6            insulin-like growth factor binding protein 6
          27   5084602     5098690     ENSCAFG00000007101  SPRYD3            SPRY domain containing 3
          27   5100630     5115952     ENSCAFG00000007105  TENC1             tensin like C1 domain containing phosphatase (tensin 2)
          27   5123868     5154188     ENSCAFG00000007117  EIF4B             eukaryotic translation initiation factor 4B
          27   5193231     5197229     ENSCAFG00000007154  Novel gene        Uncharacterized protein
          27   5214844     5215090     ENSCAFG00000025408  Novel gene        Uncharacterized protein
          27   5225567     5233056     ENSCAFG00000007199  Novel gene        Uncharacterized protein
          27   5261219     5268719     ENSCAFG00000025402  KRT78             keratin 78
          27   5273786     5283987     ENSCAFG00000007204  KRT79             keratin 79
          27   5294126     5300384     ENSCAFG00000007208  KRT4              keratin 4
          27   5403819     5415836     ENSCAFG00000025382  KRT77             keratin 77
          27   5425745     5458087     ENSCAFG00000007212  K2C1_CANFA        Keratin, type II cytoskeletal 1
          27   5481521     5490471     ENSCAFG00000007233  KRT73             keratin 73
          27   5498594     5510056     ENSCAFG00000025358  KRT72             keratin 72
          27   5523441     5530483     ENSCAFG00000025346  KRT74             keratin 74
          27   5540426     5547762     ENSCAFG00000007278  D6BR72_CANFA      keratin 71
          27   5570885     5576777     ENSCAFG00000007223  KRT5              keratin 5
          27   5595688     5601257     ENSCAFG00000007251  Novel gene        Uncharacterized protein
          27   5613256     5618136     ENSCAFG00000007209  Novel gene        Uncharacterized protein
          27   5625571     5627338     ENSCAFG00000025323  Novel gene        Uncharacterized protein
          27   5638886     5648810     ENSCAFG00000025322  KRT75             keratin 75
          27   5650991     5663478     ENSCAFG00000025318  Novel gene        Uncharacterized protein
          27   5667080     5678048     ENSCAFG00000007281  KRT82             keratin 82
          27   5686336     5686456     ENSCAFG00000028307  5S_rRNA           5S ribosomal RNA
Table S2. (continued)

Window    CFA  Gene Start  Gene End    Ensembl Gene ID     Gene symbol       Description
(chr_Mb)       (bp)        (bp)
27_6th    27   5687503     5695036     ENSCAFG00000007284  KRT84             keratin 84
          27   5709947     5716497     ENSCAFG00000007286  KRT85             keratin 85
          27   5722652     5735581     ENSCAFG00000007293  Novel gene        Uncharacterized protein
          27   5749848     5755811     ENSCAFG00000007311  Novel gene        Uncharacterized protein
          27   5764377     5770662     ENSCAFG00000007307  Novel gene        Uncharacterized protein
          27   5779873     5784692     ENSCAFG00000007300  Novel gene        Uncharacterized protein
          27   5790522     5794208     ENSCAFG00000007318  Novel gene        Uncharacterized protein
          27   5808380     5814164     ENSCAFG00000007320  Novel gene        Uncharacterized protein
          27   5816046     5830969     ENSCAFG00000007322  KRT7              keratin 7
          27   5865148     5884620     ENSCAFG00000007328  KRT80             keratin 80
          27   5931294     5931569     ENSCAFG00000014784  Novel gene        Uncharacterized protein
          27   5934133     5936981     ENSCAFG00000007331  C12orf44          chromosome 12 open reading frame 44
          27   5949972     5957895     ENSCAFG00000007338  NR4A1_CANFA       Nuclear receptor subfamily 4 group A member 1
          27   5996708     6003640     ENSCAFG00000025231  GRASP             GRP1 (general receptor for phosphoinositides 1)-associated scaffold protein
          27   6014634     6032573     ENSCAFG00000007357  ACVR1B            activin A receptor, type IB
          27   6079269     6086578     ENSCAFG00000025220  ACVRL1            activin A receptor type II-like 1
          27   6104064     6106685     ENSCAFG00000007345  ANKRD33           ankyrin repeat domain 33
          27   6179810     6298254     ENSCAFG00000007600  SCN8A             sodium channel, voltage gated, type VIII, alpha subunit
          27   6242075     6242213     ENSCAFG00000027574  U2                U2 spliceosomal RNA
          27   6407456     6462367     ENSCAFG00000007728  Q9GK56_CANFA      Uncharacterized protein
          27   6517095     6538877     ENSCAFG00000007758  GALNT6            UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 6 (GalNAc-T6)
          27   6544212     6733918     ENSCAFG00000007774  CELA1_CANFA       Chymotrypsin-like elastase family member 1
          27   6565180     6596945     ENSCAFG00000007778  BIN2              bridging integrator 2
          27   6619721     6620699     ENSCAFG00000007785  SMAGP             small cell adhesion glycoprotein
          27   6622959     6626272     ENSCAFG00000007787  DAZAP2            DAZ associated protein 2
          27   6656427     6671622     ENSCAFG00000007790  POU6F1            POU class 6 homeobox 1
          27   6682523     6741630     ENSCAFG00000007831  TFCP2             transcription factor CP2
          27   6755279     6763644     ENSCAFG00000007860  CSRNP2            cysteine-serine-rich nuclear protein 2
          27   6767612     6774873     ENSCAFG00000007872  LETMD1            LETM1 domain containing 1
          27   6805741     6820814     ENSCAFG00000007902  SLC11A2           solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2
          27   6831874     6871607     ENSCAFG00000007924  METTL7A           methyltransferase like 7A
          27   6924625     6954071     ENSCAFG00000007928  ATF1              activating transcription factor 1
Table S3. A list of genes in the putative regions for subgroup2 and subgroup3
Group | Window (chr_Mb) | CFA | Gene Start (bp) | Gene End (bp) | Ensembl Gene ID | Gene symbol | Description
Subgroup 2 | 4_75th | 4 | 75800049 | 75809381 | ENSCAFG00000018745 | CAPSL | calcyphosine-like
Subgroup 2 | 4_75th | 4 | 75832871 | 75860982 | ENSCAFG00000018752 | IL7R | interleukin 7 receptor
Subgroup 2 | 4_75th | 4 | 75896215 | 76060543 | ENSCAFG00000018767 | SPEF2 | sperm flagellar 2
Subgroup 2 | 4_75th | 4 | 75133068 | 75211061 | ENSCAFG00000018704 | Q9N280_CANFA | excitatory amino acid transporter 1
Subgroup 2 | 4_75th | 4 | 75464205 | 75489129 | ENSCAFG00000018711 | RANBP3L | RAN binding protein 3-like
Subgroup 2 | 4_75th | 4 | 75494657 | 75536151 | ENSCAFG00000018721 | NADKD1 | NAD kinase domain containing 1
Subgroup 2 | 4_75th | 4 | 75551591 | 75571161 | ENSCAFG00000018725 | SKP2 | S-phase kinase-associated protein 2, E3 ubiquitin protein ligase
Subgroup 2 | 4_75th | 4 | 75591977 | 75621896 | ENSCAFG00000018731 | LMBRD2 | LMBR1 domain containing 2
Subgroup 2 | 4_75th | 4 | 75658019 | 75676510 | ENSCAFG00000023144 | Novel gene | Uncharacterized protein
Subgroup 2 | 4_75th | 4 | 75680288 | 75711566 | ENSCAFG00000018738 | Novel gene | Uncharacterized protein
Subgroup 2 | 4_75th | 4 | 75751766 | 75776502 | ENSCAFG00000018743 | Novel gene | Uncharacterized protein
Subgroup 3 | 31_18th | 31 | 18049292 | 18049396 | ENSCAFG00000021702 | U6 | U6 spliceosomal RNA
Subgroup 3 | 31_18th | 31 | 18936469 | 18936573 | ENSCAFG00000027925 | U6 | U6 spliceosomal RNA
Table S4. A list of genes in the putative regions for all Siberian Huskies (including subgroup1, 2 and 3)
Window (chr_Mb) | CFA | Gene Start (bp) | Gene End (bp) | Ensembl Gene ID | Gene symbol | Description
9_56th | 9 | 56770082 | 56817552 | ENSCAFG00000019945 | ASS1 | argininosuccinate synthase 1
9_56th | 9 | 56765474 | 56765884 | ENSCAFG00000019944 | HBXIP | hepatitis B virus x interacting protein
9_56th | 9 | 56458287 | 56458469 | ENSCAFG00000024407 | QRFP | pyroglutamylated RFamide peptide
9_56th | 9 | 56831407 | 56978021 | ENSCAFG00000019950 | Novel gene | Uncharacterized protein
9_56th | 9 | 55986530 | 56045429 | ENSCAFG00000019908 | PRRC2B | proline-rich coiled-coil 2B
9_56th | 9 | 56125301 | 56139605 | ENSCAFG00000019911 | PPAPDC3 | phosphatidic acid phosphatase type 2 domain containing 3
9_56th | 9 | 56151597 | 56164178 | ENSCAFG00000019912 | FAM78A | family with sequence similarity 78, member A
9_56th | 9 | 56187199 | 56289132 | ENSCAFG00000019920 | NUP214 | nucleoporin 214kDa
9_56th | 9 | 56292948 | 56418185 | ENSCAFG00000019926 | AIF1L | allograft inflammatory factor 1-like
9_56th | 9 | 56316975 | 56360237 | ENSCAFG00000019927 | LAMC3 | laminin, gamma 3
9_56th | 9 | 56419461 | 56448416 | ENSCAFG00000019930 | FIBCD1 | fibrinogen C domain containing 1
9_56th | 9 | 56465925 | 56487400 | ENSCAFG00000019938 | ABL1 | c-abl oncogene 1, non-receptor tyrosine kinase
9_56th | 9 | 56495457 | 56496470 | ENSCAFG00000023093 | Novel gene | Uncharacterized protein
9_56th | 9 | 56612649 | 56621679 | ENSCAFG00000019941 | EXOSC2 | exosome component 2
9_56th | 9 | 56629556 | 56644738 | ENSCAFG00000023076 | PRDM12 | PR domain containing 12
9_56th | 9 | 56662039 | 56688826 | ENSCAFG00000019942 | FUBP3 | far upstream element (FUSE) binding protein 3
9_56th | 9 | 56235131 | 56235229 | ENSCAFG00000021075 | U6 | U6 spliceosomal RNA
27_13th | 27 | 13293117 | 13305402 | ENSCAFG00000009652 | TWF1 | twinfilin, actin-binding protein, homolog 1 (Drosophila)
27_13th | 27 | 13326156 | 13343992 | ENSCAFG00000009700 | IRAK4 | interleukin-1 receptor-associated kinase 4
27_13th | 27 | 13356837 | 13379958 | ENSCAFG00000009729 | PUS7L | pseudouridylate synthase 7 homolog (S. cerevisiae)-like
27_13th | 27 | 13504107 | 13665614 | ENSCAFG00000009744 | ADAMTS20 | ADAM metallopeptidase with thrombospondin type 1 motif, 20
27_13th | 27 | 13189041 | 13189531 | ENSCAFG00000025198 | Novel gene | pseudogene Ensembl genes
27_13th | 27 | 13159473 | 13160267 | ENSCAFG00000024514 | Novel gene | protein coding Ensembl genes
X_31st | X | 31935641 | 31936435 | ENSCAFG00000024521 | Novel gene | protein coding Ensembl gene
X_31st | X | 31622005 | 31623093 | ENSCAFG00000013845 | PCBP1 | poly(rC) binding protein 1
X_31st | X | 31853711 | 31854505 | ENSCAFG00000024210 | Novel gene | Uncharacterized protein
X_31st | X | 30996456 | 31056775 | ENSCAFG00000013840 | CXorf59 | chromosome X open reading frame 59
X_31st | X | 31148254 | 31150659 | ENSCAFG00000023630 | Novel gene | Uncharacterized protein
X_31st | X | 31277509 | 31342758 | ENSCAFG00000023600 | CXorf30 | chromosome X open reading frame 30
X_31st | X | 31820228 | 31821353 | ENSCAFG00000023558 | Novel gene | Uncharacterized protein
X_33rd | X | 33390642 | 33413474 | ENSCAFG00000014051 | Novel gene | Uncharacterized protein
X_33rd | X | 33498639 | 33499190 | ENSCAFG00000014056 | MID1IP1 | MID1 interacting protein 1
X_33rd | X | 33792251 | 33794292 | ENSCAFG00000014059 | Novel gene | Uncharacterized protein
X_33rd | X | 33006703 | 33055120 | ENSCAFG00000014003 | RPGR_CANFA | X-linked retinitis pigmentosa GTPase regulator
X_33rd | X | 33080945 | 33149321 | ENSCAFG00000014023 | OTC | ornithine carbamoyltransferase
CHAPTER 6. GENERAL CONCLUSIONS
Mapping the causative mutations responsible for genetic diseases is the primary goal of medical
geneticists. Precisely mapped genetic variants can contribute to the future use of gene therapy to
treat patients. Therefore, understanding the mode of inheritance of a given disease and choosing a
suitable mapping strategy to identify causative mutations are very important. Moreover,
given that animals and humans share many genetic diseases, animal models such as dogs and
sheep are useful for investigating gene functions and human diseases. By examining gene
alterations in these animal models, one can map the loci for a given disease and so help to
clarify the causal relationship between mutations in homologous genes and the
corresponding human disease.
The relationship between phenotypes and genotypes
The classical forward genetics approach allows researchers to locate, on its chromosome, the
gene associated with a trait of interest. The genotype is the genetic makeup of an individual
with respect to the specific genetic character under consideration. Genotyping several
individuals provides information on the differences among individuals in a population.
Combined with various mapping strategies, genotype data help to map the causative or major
genes responsible for the traits being studied. Over the past several decades, much effort has
gone into improving genotyping techniques in order to obtain accurate genotypes at lower
cost. Researchers have realized, however, that improving the quality of genotypes alone is not
enough to map traits in a population with a high success rate if the measurement and
categorization of phenotypes are imprecise, for both quantitative and categorical traits. In
this dissertation, we have successfully mapped three monogenic sheep disorders (Chapters 2, 3
and 4). The prerequisites for success in these studies were an accurate diagnosis of the
disease status of each sheep and a very clear mode of inheritance for each disorder. Without
these, we would not have so quickly located the candidate genes, performed the fine mapping
and eventually found the putative causative mutations. By contrast, in the canine
cryptorchidism study in this dissertation (Chapter 5), dogs in which one or both testicles
were absent from the scrotal sac were diagnosed as cryptorchid.
It is known that disruption of different regulatory factors during testicular descent leaves
the gonads in various positions inside the body, such as the abdominal cavity close to the
kidneys, near the internal or external inguinal ring, or in the inguinal canal. Therefore, the
different locations of the undescended testicle in affected dogs have likely increased the
complexity and heterogeneity of this disease and made gene mapping difficult. Since different
hormones regulate the different phases of testicular descent, it is important to know the
exact details of cryptorchidism in each dog in order to find the genes associated with this
disorder efficiently. Improving the measurement and categorization of phenotypes therefore
becomes a new challenge for many studies that use a forward genetics approach.
Comparative genomics for monogenic disorders and complex diseases
Finding the genetic variants associated with a disease is just a start. Providing evidence
that a genetic variant is a causative mutation requires a great deal of further work. Studies
in comparative genomics, for example biochemical or genetic studies of the homologous gene in
other species, provide detailed information on the functions of the genes under study. Gene
knock-out mouse models are likely to show phenotypes similar to those of human patients with
loss of function of the same genes. However, many knock-out mice exhibit apparently different
phenotypes, or no phenotype at all, because of physiological differences between humans and
mice. Selecting a suitable animal model is therefore important for the study of human
diseases. Dogs, sheep or other animal species whose disease characteristics resemble those of
human patients offer an alternative to mouse models as model systems for human diseases.
Comparing the expression patterns of genes in the same pathway between species can give hints
about the key genes affecting a phenotype of interest. For example, testicular descent is a
prerequisite for normal spermatogenesis in many mammalian species, and cryptorchidism is a
failure of normal testicular descent. Interestingly, testicular descent does not occur in
elephants. Comparing gene expression, coding sequences and translated protein sequences among
elephants, cryptorchid dogs and normal dogs to produce a list of genes involved in testicular
descent in dogs might therefore yield interesting findings.
From candidate gene approach to whole genome scan
Scientists have applied the candidate gene approach for several decades to search for
causative mutations or DNA markers associated with a given trait. This approach requires that
the biological functions of the candidate genes be clear. Reverse genetics, which uses
techniques such as mutagenesis, gene targeting, or targeting induced local lesions in genomes
(TILLING) to engineer a change or disruption in the DNA, can provide information on how a
gene influences phenotypes. If there is not sufficient evidence of gene function, it is hard
to decide where in the genome to start mapping the causative gene. Whole genome scans using
different forms of genetic markers may solve this problem. SNP array based whole genome
association studies have been widely applied to the study of disease traits. The higher the
density of markers, the higher the resolution that can be obtained for mapping associated
genes. SNPs are the DNA markers with the highest density in the genome. Because they are
widely distributed across the genome, association signals can be detected on almost every
chromosome.
From DNA marker assisted mapping to whole genome sequencing
Advances in sequencing technology have made direct sequencing of genomic or coding sequences
on a genome-wide scale a reality. Compared with traditional mapping approaches based on
linkage and linkage disequilibrium, whole genome sequencing is less abstract and more
straightforward, because the researcher can observe almost any change in the DNA sequence.
This helps scientists to search quickly for disease loci without devoting too much effort to
preparing populations suited to a specific marker assisted mapping approach. However, DNA
sequencing generates large volumes of genomic sequence data relatively rapidly, and the many
genetic changes identified across the genome increase the difficulty of finding the true
causative mutation. Monogenic disorders usually have a clear mode of inheritance and the
corresponding phenotypes are mostly caused by mutations that disrupt protein function.
Therefore, whole genome sequencing can be an efficient method to discover genes causing
monogenic disorders. To map causative mutations and risk alleles for common complex
diseases, a larger sample size is still preferred in order to find the common genetic variants
responsible for common disease. DNA marker assisted mapping and whole genome sequencing can
be applied together to identify genes contributing to diseases efficiently and economically.
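As an illustration of how a clear mode of inheritance narrows the search after whole genome
sequencing, the following minimal Python sketch filters a list of variants under an assumed
fully penetrant recessive model. The record layout, the sample names and the set of effect
labels are hypothetical and are not taken from the studies in this dissertation.

```python
# Effect labels treated as likely to disrupt protein function (illustrative assumption).
DISRUPTIVE_EFFECTS = {"nonsense", "missense", "frameshift", "splice_site"}


def candidate_variants(variants, affected, unaffected):
    """Return IDs of variants that fit a fully penetrant recessive disorder.

    Each variant is a dict with keys "id", "effect" and "genotypes", where
    "genotypes" maps a sample name to its count of alternate alleles (0, 1 or 2).
    """
    candidates = []
    for variant in variants:
        genotypes = variant["genotypes"]
        # Every affected animal must be homozygous for the alternate allele.
        affected_ok = all(genotypes[s] == 2 for s in affected)
        # No unaffected animal may be homozygous for the alternate allele.
        unaffected_ok = all(genotypes[s] < 2 for s in unaffected)
        if affected_ok and unaffected_ok and variant["effect"] in DISRUPTIVE_EFFECTS:
            candidates.append(variant["id"])
    return candidates


# Hypothetical data: two affected animals and two unaffected relatives.
variants = [
    {"id": "var1", "effect": "nonsense",
     "genotypes": {"aff1": 2, "aff2": 2, "dam": 1, "sib": 0}},
    {"id": "var2", "effect": "synonymous",
     "genotypes": {"aff1": 2, "aff2": 2, "dam": 1, "sib": 1}},
    {"id": "var3", "effect": "missense",
     "genotypes": {"aff1": 2, "aff2": 1, "dam": 1, "sib": 0}},
]
print(candidate_variants(variants, ["aff1", "aff2"], ["dam", "sib"]))  # prints ['var1']
```

Only the first variant survives: the second is not predicted to alter the protein, and the
third is heterozygous in one affected animal. A filter of this kind would be much less
suitable for a complex disease, where no single genotype pattern is expected.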
In many cases, the genetic basis underlying phenotypes is not simple. The mutations commonly
found are likely to be changes in DNA sequence. However, more and more evidence shows that
epigenetic changes, such as changes in methylation patterns or in histone modifications, play
important roles in the development of specific phenotypes. Therefore, it is necessary to
screen for epigenetic changes in the genome in disease studies. Robust and cheaper
technologies for this purpose are under development by commercial companies. A new era in
genetic disease mapping is expected when powerful tools become available that integrate
genetic and epigenetic knowledge to discover the causative mutations responsible for genetic
disorders in humans and other species.
FUTURE RESEARCH
In the sheep industry, our findings on the three sheep disorders can be used to develop
genetic tests based on the mutations we discovered, so that affected individuals and carriers
can be identified and excluded from breeding to avoid at-risk matings. Additionally, affected
sheep and carriers can be retained as animal models for the study of related human diseases.
Gene expression and protein analyses of these causative loci can be carried out in the sheep
models to further understand the gene functions involved in normal bone and cartilage
metabolism as well as in motor neuron activity.
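To make the notion of an at-risk mating concrete, the sketch below enumerates the expected
offspring genotypes at a single biallelic locus under Mendelian segregation. The allele
labels "M" (mutant) and "W" (wild type) and the function names are illustrative assumptions
and do not describe any specific genetic test.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(sire, dam):
    """Punnett square: expected genotype proportions for one biallelic locus.

    Genotypes are two-letter strings such as "MW"; allele order does not matter.
    """
    # Each parent passes on one of its two alleles with equal probability.
    counts = Counter("".join(sorted(a + b)) for a, b in product(sire, dam))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def is_at_risk(sire, dam, mutant="M"):
    """A mating is at risk if it can produce homozygous-mutant offspring."""
    return offspring_distribution(sire, dam).get(mutant * 2, Fraction(0)) > 0


print(is_at_risk("MW", "MW"))  # carrier x carrier: prints True
print(is_at_risk("MW", "WW"))  # carrier x non-carrier: prints False
```

For a recessive disorder, a carrier by carrier mating is expected to give one quarter affected
offspring, whereas a carrier by non-carrier mating gives none but passes the mutant allele to
half of the offspring on average.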
First, future research on cryptorchidism should include fine mapping of the promising
candidate gene AMHR2 by screening its promoter region, exons and splice regions to discover
causative mutations responsible for cryptorchidism. Gene expression analysis of AMHR2 should
also be done using gubernaculum tissue from affected and normal dogs to provide solid
evidence for a causal relationship between this gene and cryptorchidism. Second, collecting
more Siberian Husky samples is desirable to increase the power of the experiment. Multiple
offspring from closely related dog families will help to decipher the genetic basis of
cryptorchidism in dogs with a similar genetic background. Third, whole genome sequencing can
be done with a small group of dogs. Whole genome sequencing of pairs of affected and
unaffected dogs from the same family may facilitate finding causative mutations without the
need to consider LD structure and disease heterogeneity.
CONCLUSIONS
In conclusion, we have found putative causative mutations for the three monogenic sheep
diseases: a nonsense mutation in DMP1 for inherited rickets, a missense mutation in AGTPBP1
for lower motor neuron disease, and a 1 base pair deletion in SLC13A1 for chondrodysplasia.
This conclusion is based on the 100% concordance of genotypes with a recessive pattern of
inheritance in affected, carrier and phenotypically normal individuals, as well as on the
known functions of these genes in other species.
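The concordance criterion mentioned above can be sketched in a few lines of Python. This is
only an illustration under the assumption of a fully penetrant autosomal recessive disorder;
the allele labels "M" (mutant) and "W" (wild type), the status names and the example records
are hypothetical and are not data from this dissertation.

```python
# Genotypes allowed for each disease status under a fully penetrant recessive model.
EXPECTED = {
    "affected": {"MM"},
    "carrier": {"MW"},       # obligate carriers, e.g. parents of affected animals
    "normal": {"WW", "MW"},  # phenotypically normal animals may carry one copy
}


def normalise(genotype):
    """Sort the alleles so that "WM" and "MW" count as the same genotype."""
    return "".join(sorted(genotype))


def concordance_rate(records):
    """Fraction of (status, genotype) records that fit the recessive model."""
    if not records:
        raise ValueError("no records supplied")
    hits = sum(1 for status, genotype in records
               if normalise(genotype) in EXPECTED[status])
    return hits / len(records)


records = [("affected", "MM"), ("carrier", "WM"), ("normal", "WW"), ("normal", "MW")]
print(concordance_rate(records))  # prints 1.0
```

A single affected animal that is heterozygous, or a single normal animal that is homozygous
for the mutant allele, would pull the rate below 1.0 and argue against the variant being
causative under this model.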
Moreover, by applying Bayesian inference, we identified a candidate region in the 4th Mb
window on dog chromosome 27 (CFA27) associated with the development of cryptorchidism in a
subgroup of our Siberian Husky population. Based on studies of human diseases, the candidate
region harbors a very promising functional candidate gene, AMHR2. Once significantly
associated markers or causative mutations for cryptorchidism are identified in AMHR2, this
information could be quickly applied to design a genetic test for dog breeding programs,
both to cull dogs carrying the deleterious allele and to test other breeds. Moreover, our
finding implicates AMHR2 in normal testicular descent. This result will shed light on future
studies of genes in the same pathway in the development of human cryptorchidism.
ACKNOWLEDGEMENTS
The completion of my dissertation and subsequent Ph.D. has been a long journey. The first
three research projects (Chapters 2, 3 and 4) in my dissertation were completed smoothly,
even though the breakthrough on the first one took me more than half a year. A large
amount of time and effort went into the last project (Chapter 5). Enormous problems arose
during this study. I felt deeply sad, lost confidence in this project many times and got
writer’s block before starting to write this dissertation. Eventually, I finished my
dissertation, but not alone. I could not have succeeded without the invaluable support of
many people. Without these supporters, especially the select few I’m about to mention, I
may not have gotten to where I am today.
Firstly, special thanks go to Dr. Max F. Rothschild, my major professor, for his gentle
encouragement and help in pushing me through the first to the last chapter of my dissertation;
for his guidance not only on my study but also on the real life aspect of being a strong and
successful person.
I would also like to give a heartfelt, special thanks to Dr. Dorian Garrick, my co-major
professor. His patience, flexibility, knowledgeable insight and genuine caring enabled me to
complete my four research projects.
I am very grateful to the remaining members of my study committee, Dr. Sue Lamont, Dr.
Matthew Ellinwood and Dr. Peng Liu. Their academic support and input and personal
cheering are greatly appreciated.
Thanks too to the members of the Rothschild lab group (Dr. Danielle Gorbach, Dr.
Zhi_Qiang Du, Dr. Suneel K. Onteru, Dr. Fan Bin, Agatha Ampaire and Muhammed
Walugembe) for your kind concern and support. Thanks to Dr. Rothschild’s family,
especially Denise for the countless times she hosted the members of the lab group at her
home.
Next I would like to thank several of my good friends: Dawn Koltes, James Koltes, Xinhua
Xiao and Difei Shen. They have always shown warm care, concern and support over these
years, and encouraged me with their best wishes.
I want to sincerely thank my parents-in-law for their love and support. Without their
help in taking care of my daughter, I would not have been able to complete my study and this
dissertation. Thanks also go to my parents and my younger sister’s family for taking
responsibility for family affairs in China over the past several years.
Finally, I must thank my husband Chuanhe for his unfailing soulful love that gave me the
constant inspiration to fulfill my study. He was always there cheering me up and stood by
me through the good and bad times. Many thanks go to my daughter Jasmine. She is such a
sweet girl and her smile is my central motivation to finish my Ph.D. study.
There is no doubt that numerous people were involved in making my Ph.D. study possible.
To the countless others who have been with me and helped me along the way, I apologize for
not specifically mentioning you, but please know that your friendship, help and guidance are
just as greatly appreciated. Thank you all.