Artificial Selection on Brain-Expressed Genes during the Domestication of Dog

Yan Li,1,2 Bridgett M. vonHoldt,3 Andy Reynolds,4 Adam R. Boyko,5 Robert K. Wayne,6 Dong-Dong Wu,*,2 and Ya-Ping Zhang*,1,2

1 Laboratory for Conservation and Utilization of Bio-Resource, Yunnan University, Kunming, China
2 State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
3 Department of Ecology and Evolutionary Biology, University of California, Irvine
4 Department of Biological Statistics and Computational Biology, Cornell University
5 Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University
6 Department of Ecology and Evolutionary Biology, University of California, Los Angeles
*Corresponding authors: E-mail: [email protected]; [email protected].
Associate editor: Joshua Akey

Abstract
Domesticated dogs have many unique behaviors not found in gray wolves that have augmented their interaction and communication with humans. The genetic basis of such unique behaviors in dogs remains poorly understood. We found that genes within regions highly differentiated between outbred Chinese native dogs (CNs) and wolves show high bias for expression localized to brain tissues, particularly the prefrontal cortex, a specific region responsible for complex cognitive behaviors. In contrast, candidate genes showing high population differentiation between CNs and German Shepherd dogs (GSs) did not demonstrate significant expression bias. These observations indicate that these candidate genes highly expressed in the brain have rapidly evolved. This rapid evolution was probably driven by artificial selection during the primary transition from wolves to ancient dogs and was consistent with the evolution of dog-specific characteristics, such as behavior transformation, for thousands of years.

Key words: artificial selection, dog domestication, brain evolution, behavioral evolution.
© The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
Mol. Biol. Evol. 30(8):1867–1876 doi:10.1093/molbev/mst088 Advance Access publication May 8, 2013

Introduction
As the premier domesticated animal in human society, the domesticated dog (Canis familiaris) is both a companion and a powerful model for the study of morphology, disease, and behavior (Ostrander and Wayne 2005; Wayne and Ostrander 2007). One of the critical behavioral traits that differentiate the domesticated dog from wolves is that of interspecific communication with humans. For example, dogs can locate hidden food items by following human pointing gestures or tracking human gazes in a manner that is much more sophisticated than those of primates or wolves (Hare et al. 2002; Miklósi et al. 2003; Hare and Tomasello 2005; Tomasello and Kaminski 2009; Topál et al. 2009). Even more surprising is the observation that young puppies, before any major interaction with humans, display developmental levels of social–cognitive skills that parallel those seen in human children (Hare et al. 2002; Tomasello and Kaminski 2009; Topál et al. 2009). In contrast to dogs, wolves reared by humans are not as skilled as dogs in following human signals, suggesting that such social–cognitive skills are derived traits and evolved during the process of domestication in dogs (Hare et al. 2002; Miklósi et al. 2003; Hare and Tomasello 2005). Research with domesticated foxes (for reviews, see Trut 1999, 2001) suggests that this social–cognitive ability might have initially appeared as an incidental by-product of behavioral selection for tameness toward humans (Hare et al. 2005).
It was hypothesized more than 50 years ago that this behavioral transformation toward tameness was the primary target of selection during domestication and was produced through gradual changes of developmental regulation (Belyaev 1969; Trut et al. 2009), for example, temporal parameters of developmental maturation of sensory systems and locomotion (Scott 1962; Fox 1971; Serpelle and Jagoe 1997). These changes increased the developmental period sensitive to human conditioning, which enhanced social adaptation and the development of human temperament-like traits (Trut et al. 2004; Hare and Tomasello 2005; Kukekova, Temnykh, et al. 2011). Indeed, a reduced fearful–aggressive response with increased exploration was observed in both dogs and domestic foxes and was lacking in their respective untamed relatives (Trut et al. 2004).

Following directly from this hypothesis, it could be inferred that the expected underlying genetic changes are not in the genes themselves but rather in their regulation. A previous comparison of brain-specific gene expression profiles across domesticated dogs, gray wolves, and coyotes found that gene expression in the hypothalamus of domestic dogs had rapidly diverged from that of the wolf and coyote, and this divergence was suspected to be a result of selection for tameness and other neuroendocrine responses to domestication (Saetre et al. 2004). Further, examination of the brains of tame and aggressive foxes showed notable differences in gene expression in the amygdala, frontal cortex, and hypothalamus (Lindberg et al. 2005), with significant biochemical changes in neurotransmitter metabolism in the brain of the tamed fox (Popova et al. 1997; Trut 2001). However, candidate gene studies have failed to identify genetic variants associated with dog behavior (Masuda et al. 2004; Takeuchi et al. 2005; Ogata et al.
2006), with this failure attributed to the limitations of the candidate gene approach for investigating the genetic mechanisms of phenotypically complex behavioral traits (Kukekova et al. 2008). In contrast, a population-based approach identified a candidate canine locus linked to dog behavior, which was independently verified as an orthologous region in the tame fox model (vonHoldt et al. 2010; Kukekova, Trut, et al. 2011). Further quantification of differential gene expression in the brains of tame and aggressive foxes revealed a likely candidate (HTR2C) that may be linked to this behavioral transformation, as the gene functions in the serotonergic and dopaminergic signaling pathway of the prefrontal cortex (Kukekova, Johnson, et al. 2011). Despite these efforts, the genetic basis of the behavioral transformation during the domestication of dogs is far less well understood than that of morphological traits such as body mass, coat type, hair color, and orientation (Ostrander and Wayne 2005; Wayne and Ostrander 2007).

In this report, we study how artificial selection has targeted and shaped the expression profiles of a specific subset of developmental genes during the evolution of dogs. We find evidence that directional selection has driven the evolution of brain-associated genes during the domestication of dogs.

Results
Intensive selective breeding occurred very recently in dogs and followed primary selection for characteristics such as behavior transformation, which had been occurring for thousands of years. To avoid potentially misleading conclusions derived from the extremely small gene pool that resulted from breed creation and mating rules, outbred dogs, which retain great genetic diversity from ancient dogs, would be the best model for exploring the genetic foundation of dog domestication.
Previous studies have found that dogs in South China have among the highest levels of canine mtDNA and Y chromosome diversity, implying high levels of nuclear diversity (Savolainen et al. 2002; Pang et al. 2009; Brown et al. 2011; Ding et al. 2012). We therefore used outbred Chinese native dogs (CNs) to represent dog diversity before the recent creation of purebred breeds, although our sampled individuals likely represent only part of that diversity, and 22 wild gray wolves from Northern China and the Mongolian plain for comparison. The single-nucleotide polymorphism (SNP) genotypes of these dogs and wolves were assayed with Affymetrix v2 canine SNP mapping array chips. German Shepherd dogs (GSs) have distinct breed behavioral characteristics and are famous for their intelligence and obedience (American Kennel Club 1998). To determine whether the genetic basis of the behavior transformation from wolf to primitive dog to modern purebred dog breeds was associated with specific genetic changes, we also compared the genome-wide divergence between the outbred CNs and eight purebred GSs imported from Germany.

A total of 48,445 SNPs passed quality control filters, yielding a call rate of 95.3% and a concordance rate of over 99.9% for the biological replicated samples. Over 90% of SNPs are within 111 kb of another SNP (median = 23 kb). Missingness differed significantly between wolves and dogs for 1,143 SNPs (P < 0.05). After false discovery rate (FDR) multiple correction (Benjamini and Hochberg 2000), seven SNPs with abnormally high rates of missingness between the wolves and dogs were excluded from the analyses. Finally, after controlling for genetic relatedness among individuals, 20 CNs, 14 wild gray wolves, and 8 GSs remained for the downstream analyses.
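To make the multiple-testing step above concrete, the following is a minimal sketch of a step-up FDR adjustment. It assumes the classic Benjamini-Hochberg procedure; the adaptive variant cited in the text (Benjamini and Hochberg 2000) additionally estimates the number of true null hypotheses, so its adjusted values can differ. The function name and the toy P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return step-up (classic BH) adjusted P values in the input order.

    Assumption: this is the plain BH procedure, not the adaptive
    variant used in the paper.
    """
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-SNP missingness P values.
    toy = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(toy))
```

A SNP would then be flagged when its adjusted value falls below the chosen FDR level.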
Based on a pruned subset of SNPs (23,816 SNPs) that were in linkage equilibrium with each other, three distinct clusters formed with K = 3, each consisting solely of the individuals from one population (supplementary fig. S1, Supplementary Material online). To investigate how the dog genomic landscape changed between outbred dogs and a modern breed, we quantified pairwise SNP linkage disequilibrium (LD). GSs showed a long LD decay pattern, nearly the same as that reported in a previous study (Boyko et al. 2010) and consistent with breed-forming history (supplementary fig. S2, Supplementary Material online). LD in CNs was much shorter than in GSs, indicating the lack of a bottleneck due to breed formation, but slightly greater than in wolves, consistent with a bottleneck in dogs during the primary domestication event (supplementary fig. S2, Supplementary Material online). Additionally, the average extended haplotype homozygosity (EHH) value at a distance of 500 kb from the core region was negatively correlated with haplotype diversity within the core region in GSs, consistent with the ROH (runs of homozygosity) pattern observed in breed dogs (Boyko et al. 2010). In contrast, and similar to human populations (Auton et al. 2009), CNs and wolves did not show this negative correlation with haplotype diversity (supplementary fig. S3, Supplementary Material online). Furthermore, we analyzed haplotype sharing between dogs and wolves for 500-kb haplotype windows containing 5 and 10 SNPs drawn at random, similar to a previous study (vonHoldt et al. 2010), and found that the CNs had much more haplotype sharing with wolves than GSs did, indicating that CNs have a rich genetic diversity (supplementary fig. S4, Supplementary Material online).

Differentiation between Dog and Wolf Populations
The hypothesis that behavioral transformation was the necessary prerequisite for domestication (Hare et al.
2005) is supported by the finding of differential gene expression profiles in the brains of domesticated and wild canines (Saetre et al. 2004; Lindberg et al. 2005; Kukekova, Johnson, et al. 2011). Therefore, we focused on exploring the expression profile and biological function of the genes identified as being targets of artificial selection. We conducted a pairwise population differentiation (FST) analysis between CNs and the gray wolves and identified 1,878 SNPs within the top 5% of the distribution (FST > 0.50, mean FST = 0.63), indicative of strong selection. These SNPs mapped to 610 putative genes, which showed no significant gene ontology (GO) category enrichment, suggesting a potential effect of SNP ascertainment bias. However, some GO categories, such as the locomotory behavior ontology, were enriched with 31 genes when the top 10% FST outliers were considered (FDR-corrected P < 0.02, table 1).

We explored the potential biological function of these FST-outlier genes based on their expression levels in 10 different dog tissues. A brain-biased expression pattern was observed, with the highest expression level and lowest ranking value in the brain (fig. 1). However, a detailed neurological-function analysis of these FST-outlier candidate genes was limited by the lack of expression profiles from different regions of the dog brain and the failure to detect expression of many genes due to the incompleteness of the dog expression array. Because orthologs in dog and human share similar tissue enrichment and tissue-selective expression patterns (Briggs et al. 2011), with expression in brain being even more similar (supplementary fig. S5, Supplementary Material online), the expression pattern of genes in the human brain can be used as a proxy for the expression pattern of orthologs in the dog.
We therefore examined the expression pattern of the human orthologs of these FST-outlier genes to get an approximate localization of their functional brain regions (fig. 2). This analysis showed that the normalized mean expression level of these FST-outlier genes was highest in the prefrontal cortex (fig. 2A). Correspondingly, the mean ranking value was lowest in the prefrontal cortex (fig. 2B). The tissue expression bias of these genes was also highest in the prefrontal cortex, followed by 10 other brain-associated tissues (Student’s t-test and Mann–Whitney test, P < 0.05) (fig. 2C); however, none of these tissues remained significant after multiple test correction (the adaptive step-up FDR test [Benjamini and Hochberg 2000]). For example, the significance of the expression bias for the FST-outlier genes was 0.002 in the prefrontal cortex (Mann–Whitney test), but the adjusted P value was 0.135. The most differentiated SNP between wolves and CNs (chr11.43544043, FST = 0.9698) is located within the gene FOCAD/KIAA1797, which encodes a focal adhesion protein and functions as a tumor suppressor in gliomas (Brockschmidt et al. 2012). The human ortholog of this gene is expressed most highly in brain-related regions, for example, the pineal, prefrontal cortex, amygdala, pituitary, and hypothalamus, although it is also highly expressed in B lymphoblasts (data from http://biogps.org).

In total, 615 FST-outlier candidate genes were identified between GSs and CNs. Of these genes, 75 overlapped with the set of genes differentiated between wolves and CNs. Five genes (EIF3L, SCFD2, C1QTNF7, PEMT, and MED9) contain the most differentiated SNPs (chr10.29844427, chr13.48921799, chr3.67712095, chr5.44958674, and chr5.44978999, FST = 1) between CNs and GSs. PEMT has its highest expression level in the digestive system in humans, whereas EIF3L and MED9 are expressed most highly in the immune system.
The other two genes showed modest expression among a number of different tissues (data from http://biogps.org). Although they overlap with the FST-outlier genes between wolves and CNs, the 615 FST-outlier candidate genes between GSs and CNs did not show a higher expression level in the brain based on the expression data from 10 dog tissues (supplementary fig. S6, Supplementary Material online). These genes did show a tendency for higher expression levels in brain-related tissues relative to other tissues based on the human expression data (fig. 3). However, no statistical significance in this pattern of expression was detected, and the normalized expression levels were much lower than those of genes differentiated between the gray wolf and the native dog (the mean normalized expression level of the top 10 brain-associated tissues was 1.016 and 1.055, respectively; Mann–Whitney test P < 10^-5).

As mentioned by Boyko et al. (2009), village dogs lived as human commensals and were not subjected to the intense artificial selection and breed-formation practices of breed dogs. Although, to our knowledge, there is no clear definition of the indigenous degree of village dogs, CNs developed genetic signatures characteristic of their geographic location, similar to that of ancient human populations. Because previous studies have found that native dogs in South China have almost the highest levels of canine mtDNA and Y chromosome diversity, implying high levels of nuclear diversity as well (Savolainen et al. 2002; Pang et al. 2009; Brown et al. 2011; Ding et al. 2012), we expect that the genetic structure of CNs should be most similar to that of the ancient dogs. To validate the power of CNs as a comparison model, we compared our candidate differentiated regions between the CNs and GSs with the selected regions specific to the GSs compiled by Akey et al. (2010). Among the 20 regions identified by Akey et al. with top selective signatures in GSs differentiating them from nine other breeds, 17 were found to contain highly differentiated SNPs identified in our analysis, with several SNPs in each region. Additionally, the SNP associated with ear erectness (Boyko et al. 2010; Vaysse et al. 2011) was also identified among our top 5% FST outliers between wolves and CNs (chr10.11440860, FST = 0.61). Therefore, as expected, the CNs represent nearly the full genetic diversity of dogs.

Table 1. Over-Represented GO Categories among the Top 10% Differentiated Genes between Gray Wolves and CNs.
Category     Description                                                  Gene Number: All   TdG   E[TdG]   P           Corrected P Value
Biological process
GO:0007610   Behavior                                                     347                42    20.58    8.5088E-6   7.1815E-3
GO:0007626   Locomotory behavior                                          224                31    13.29    9.2094E-6   7.1815E-3
GO:0060122   Inner ear receptor stereocilium organization and biogenesis  10                 5     0.59     1.4247E-4   4.6290E-2
Molecular function
GO:0005509   Calcium ion binding                                          583                65    34.58    6.1476E-7   2.3970E-3
GO:0030675   Rac GTPase activator activity                                8                  5     0.47     3.5000E-5   2.2744E-2
GO:0019199   Transmembrane receptor protein kinase activity               107                18    6.35     5.4135E-5   2.8736E-2
GO:0004713   Protein tyrosine kinase activity                             476                49    28.24    1.1614E-4   4.5285E-2
Cellular component
GO:0005578   Proteinaceous extracellular matrix                           177                27    10.50    5.6258E-6   7.1815E-3
GO:0032421   Stereocilium bundle                                          19                 8     1.13     6.2473E-6   7.1815E-3
GO:0031012   Extracellular matrix                                         269                33    15.96    5.8961E-5   2.8736E-2
GO:0032420   Stereocilium                                                 14                 6     0.83     8.5220E-5   3.6919E-2
GO:0044421   Extracellular region part                                    543                54    32.21    1.3086E-4   4.6290E-2
NOTE: All, all genes apportioned to each category or a descendant category; TdG, genes containing top 10% differentiated SNPs; E[TdG], expected TdG number in each category.

FIG. 1. Relative expression level and ranking value in 10 dog tissues of genes showing high population differentiation between wolves and CNs.
[Figure 2: panels A and B each list the top 25 tissues/cells; panel C plots log(P) per tissue/cell for the t-test and the Mann–Whitney test.]

FIG. 2. Expression pattern of human one-to-one orthologs of dog genes showing high population differentiation between wolves and CNs. (A) Top 25 tissues/cells with high expression level. The value in each tissue/cell was calculated as the averaged log 2 transformed expression values of genes showing high population differentiation in the tissue/cell divided by the averaged log 2 transformed genome-wide expression in the tissue/cell. (B) Top 25 tissues/cells with lowest ranking values. The expression levels of each gene in 84 tissues/cells were sorted from 1st to 84th. The ranking values in each tissue/cell for genes were further averaged and then normalized by dividing by the genome-wide averaged ranking values. (C) The log 10 transformed statistical significances of the expression levels of genes showing high population differentiation higher than genome-wide expression level in each tissue/cell by Student’s t-test and Mann–Whitney test.
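The two summary statistics described in the caption of figure 2 (panels A and B) can be illustrated with a small sketch. This is an interpretation of the caption, not the authors' code: it assumes rank 1 means the tissue in which a gene is most highly expressed, that all expression values are positive, and that ties are broken by tissue name. All function names, gene names, and values are hypothetical.

```python
import math


def normalized_expression(expr, candidates):
    """Per tissue: mean log2 expression of candidate genes divided by
    the genome-wide mean log2 expression (panel A of figure 2).

    expr maps gene -> {tissue: positive expression value}.
    """
    tissues = sorted(next(iter(expr.values())))

    def mean_log2(genes, tissue):
        return sum(math.log2(expr[g][tissue]) for g in genes) / len(genes)

    all_genes = list(expr)
    return {t: mean_log2(candidates, t) / mean_log2(all_genes, t)
            for t in tissues}


def tissue_ranks(profile):
    """Rank tissues for one gene; 1 = highest expression (assumption)."""
    ordered = sorted(profile, key=lambda t: (-profile[t], t))
    return {t: i + 1 for i, t in enumerate(ordered)}


def normalized_ranking(expr, candidates):
    """Per tissue: mean rank of candidate genes divided by the
    genome-wide mean rank (panel B of figure 2)."""
    ranks = {g: tissue_ranks(p) for g, p in expr.items()}
    tissues = sorted(next(iter(expr.values())))

    def mean_rank(genes, tissue):
        return sum(ranks[g][tissue] for g in genes) / len(genes)

    return {t: mean_rank(candidates, t) / mean_rank(list(expr), t)
            for t in tissues}


if __name__ == "__main__":
    # Toy expression matrix with two tissues and three genes.
    toy = {
        "g1": {"brain": 8, "liver": 2},
        "g2": {"brain": 4, "liver": 2},
        "g3": {"brain": 2, "liver": 8},
    }
    print(normalized_expression(toy, ["g1"]))
    print(normalized_ranking(toy, ["g1"]))
```

Under these assumptions, a brain-biased candidate set gives an expression ratio above 1 and a ranking ratio below 1 in brain tissues, which is the pattern the text reports for the prefrontal cortex.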
[Figure 3: panels A and B each list the top 25 tissues/cells; panel C plots P per tissue/cell for the t-test and the Mann–Whitney test.]

FIG. 3. Expression pattern of human one-to-one orthologs of dog genes showing high population differentiation between CNs and GSs. (A) Top 25 tissues/cells with high expression level. The value in each tissue/cell was calculated as the averaged log 2 transformed expression values of genes showing high population differentiation in the tissue/cell divided by the averaged log 2 transformed genome-wide expression in the tissue/cell. (B) Top 25 tissues/cells with lowest ranking values. The expression levels of each gene in 84 tissues/cells were sorted from 1st to 84th. The ranking values in each tissue/cell for genes were further averaged and then normalized by dividing by the genome-wide averaged ranking values. (C) The statistical significances of the expression levels of genes showing high population differentiation higher than genome-wide expression level in each tissue/cell by Student’s t-test and Mann–Whitney test.

Candidate Regions under Artificial Selection in Native Dogs
Individual high values for FST may reflect population structure (Weir et al. 2005). In addition, we may have missed genes under selection due to differences in the density of the SNP markers.
Consequently, we scanned the dog and wolf genomes using a sliding window analysis to search for significantly differentiated segments (500 kb) with window-averaged FST values beyond the 99th quantile of the empirical distribution (see Materials and Methods) (fig. 4). We found 94 regions extending for at least 500 kb that show significantly high population differentiation between the gray wolves and the CNs (P < 0.01), and these regions rarely overlap (13 regions) with regions of high differentiation between the GSs and CNs (99 regions). Among the 94 identified regions, a fragment on chromosome 19 is the largest and extends over 1.6 Mb (P < 0.01). Within this fragment, the FST peak (chr19.6599631, FST = 0.84) was located about 6 kb downstream from CCRN4L, a gene that regulates the circadian clock, with maximal levels in the early evening, and that has also been implicated in lipid metabolism, adipogenesis, glucose homeostasis, inflammation, and osteogenesis (for a review, see Stubblefield et al. 2012).

Archeological and genetic evidence suggests that the dog was domesticated 16,000 or more years ago (Clutton-Brock 1995; Sablin and Khlopachev 2002; Pang et al. 2009). Genomic regions under recent strong selection should demonstrate extended haplotype homozygosity (Sabeti et al. 2002, 2006). Here, we defined core regions following a previous method (Gabriel et al. 2002) and employed the parameters of EHH and relative EHH (REHH) to search for regions under selection in the CN population. This test is designed for detecting selection using genotyped SNP data, is robust to ascertainment bias and to the choice of genetic markers used, and is powerful for the detection of very recent selection (Sabeti et al. 2006). Among the 94 highly differentiated regions identified above, six showed significantly higher EHH and REHH values 500 kb upstream or downstream of their core regions (95th percentile) (table 2).
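The window-averaged FST scan described at the start of this subsection could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' Perl/R pipeline: per-SNP FST values are taken as given, the cutoff is a plain nearest-rank empirical quantile (the bootstrap resampling described in Materials and Methods is omitted), and all function names and toy values are hypothetical.

```python
import math


def window_means(positions, fst, window=500_000, step=100_000):
    """Average per-SNP FST in sliding windows along one chromosome.

    positions: SNP coordinates in bp, sorted ascending.
    fst: per-SNP FST values in the same order.
    Returns a list of (window_start, mean_fst, n_snps); empty windows
    are skipped.
    """
    results = []
    if not positions:
        return results
    start = 0
    last = positions[-1]
    while start <= last:
        end = start + window
        vals = [f for p, f in zip(positions, fst) if start <= p < end]
        if vals:
            results.append((start, sum(vals) / len(vals), len(vals)))
        start += step
    return results


def empirical_quantile(values, q):
    """Nearest-rank quantile of a list of numbers (0 < q <= 1)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1))
    return ordered[k]


def outlier_windows(windows, q=0.99):
    """Windows whose mean FST lies strictly above the q-th quantile."""
    cutoff = empirical_quantile([mean for _, mean, _ in windows], q)
    return [w for w in windows if w[1] > cutoff]


if __name__ == "__main__":
    # Toy chromosome with three SNPs.
    pos = [50_000, 150_000, 620_000]
    vals = [0.2, 0.4, 0.9]
    wins = window_means(pos, vals)
    print(wins)
    print(outlier_windows(wins, q=0.25))
```

Overlapping outlier windows would then be merged into contiguous regions; that merging step is left out of the sketch.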
Among the six regions, four contain only a single highly differentiated SNP located within a single gene (including the one mentioned earlier, CCRN4L, which affects circadian rhythms) (table 2). One region is located in an intergenic region on chromosome 5 and has four highly differentiated SNPs. Another contains a highly differentiated SNP located within two overlapping genes (CRYAB and FDXACB1), with the major core haplotype of this region (including SNPs at positions 24,267,730 and 24,381,098 on chromosome 5) being “CT” (cytosine and thymine), which has a frequency of 85% in CNs but is rare in wolves (7%). At 500 kb downstream of the core region, EHH reached 0.58 (the 95th percentile in the frequency range of 0.85–0.90 is 0.44) and REHH reached 4.36 (the 99th percentile in the frequency range of 0.85–0.90 is 3.47). Although the pathological consequences of mutations in CRYAB are unknown, elevated expression occurs in many neurological diseases (Hagemann et al. 2009), suggesting a role in the nervous system.

FIG. 4. Genome-wide FST distribution. (A) The FST between wolves and CNs. (B) The FST between GSs and CNs. The 99th, 99.9th, and 99.99th percentile lines are drawn.

Table 2. Summary of Highly Differentiated SNPs with Significant EHH and REHH Values.
SNP                                                       Gene Symbol          Description                                                                      Core Region
chr2.9229099                                              ENSCAFG00000014021                                                                                    9,229,099–9,919,076
chr4.37752306                                             BMPR-1A              Bone morphogenetic protein receptor type-1A                                      37,752,306–38,013,294
chr5.5080128, chr5.5160480, chr5.5160498, chr5.5323685                                                                                                          5,080,128–5,323,685
chr5.24267730                                             CRYAB, FDXACB1       Crystallin, alpha B; ferredoxin-fold anticodon binding domain containing 1       24,267,730–24,381,098
chr18.18141067                                            SRPK2                SRSF protein kinase 2                                                            18,131,350–18,210,935
chr19.6599631                                             CCRN4L               CCR4 carbon catabolite repression 4-like                                         6,599,631–7,020,181
NOTE: Highly differentiated SNPs included in each core region with significant EHH and REHH values are listed, together with the genes located within 5,000 bp upstream or downstream of these SNPs. Core regions were defined by the method in the Sweep program.

Discussion
Behavior evolution is important for domestication. The primary key for early domestication is the transformation of negative defensive reactions toward humans (the fearful–aggressive response) into positive reactions, which implies physiological changes in the systems that govern neurochemical production (Trut 1999). Specifically, such physiological changes have been characterized by a fearful response and reduced locomotion in a novel environment and increased glucocorticoids that regulate the fear response by mediating metabolism of the neurotransmitter serotonin (Serpelle and Jagoe 1997; Trut 1999; Korte 2001; Trut et al. 2004). Behavior evolution is mostly attributable to brain evolution. For example, parallel brain-associated expression differences were identified in domesticated dogs relative to gray wolves (Saetre et al. 2004). Here, although no GO overrepresentation was observed among the top 5% FST-outlier genes between wolves and CNs, the locomotory behavior ontology was enriched with 31 genes when the top 10% FST outliers were considered (table 1). More strikingly, the top 5% FST-outlier genes were expressed most preferentially in the prefrontal cortex, which is responsible for both emotional and rational aspects of decision making (Striedter 2005). In addition, we identified a highly differentiated region between wolves and CNs (P < 0.01) containing a serotonin receptor (HTR4). The serotonin system is particularly noteworthy as the density of serotonin receptor 1A (HTR1A) was significantly lower in the hypothalamus of tame foxes (Popova et al.
1997), whereas serotonin receptor 2C (HTR2C) was consistently overexpressed in specific brain regions in tame foxes and tame rats (Popova et al. 2010; Kukekova, Johnson, et al. 2011). Further, the binding index of serotonin receptor 2A (HTR2A) in cortical regions is strongly related to impulsive aggression in dogs (Peremans et al. 2005).

Previous findings have associated specific patterns of gene expression and serotonin regulation in the dog brain with behaviors likely selected for during the domestication process (Popova et al. 1997; Saetre et al. 2004; Peremans et al. 2005; Kukekova, Johnson, et al. 2011). Our finding of highly differentiated genes between wolves and dogs with brain-biased expression provides new genetic evidence consistent with behavioral transformation at the onset of domestication. We reasoned that it might be attributable to 1) artificial selection for specific behavioral traits in domesticated dogs, such as tameness, temperament, and cognitive ability, or 2) relaxation of selective constraint due to a population bottleneck. For the latter, however, it would be difficult to generate a group of genes involved in a similar function that are more divergent than the genomic background, although its influence could not be absolutely excluded. Unlike the genes that are highly differentiated between wolves and CNs, divergent genes between GSs and CNs did not show a significant brain-biased expression pattern. During dog breed formation, intense selection pressures differed among breeds and were likely focused on other aspects of phenotype generation and morphology diversification, such as body mass, coat color, and hair (Wayne and Ostrander 2007).

However, there are several important caveats to our study. First, both DNA sequence and the expression patterns evolve.
We used expression data of the human one-to-one orthologs to infer and evaluate the expression pattern of dog genes in different tissues, especially in the brain-related regions. Here, we reasoned that the human and canine genes have high correlation in their expression profiles, given that conserved one-to-one orthologs are often associated with conserved expression patterns under stabilizing selection (Gilad et al. 2006). Genome-wide expression data from various canine brain-related regions would undoubtedly help to confirm the correspondence of the brain-biased expression pattern. Second, the genome-wide SNP data were ascertained primarily from the dog genome assemblies and likely underrepresent the genetic diversity in the gray wolf genome. Indeed, there are significant differences between the frequency spectrum of the actual genotyped SNPs and the calculated corrected frequency spectrum, suggesting the effect of ascertainment bias (supplementary fig. S7, Supplementary Material online). Although the sliding window analysis provides a reasonable correction for single SNP-based diversity estimates, multiple sequenced genomes are needed for the study of artificial selection and identification of the key genes responsible for domestication. Indeed, a very recent study based on whole-genome sequencing data found evidence of artificial selection on a gene having a function in nervous system development during dog domestication (Axelsson et al. 2013), which supports our finding that artificial selection, rather than SNP ascertainment bias, drives population differentiation of brain-biased genes between wolves and dogs.

Conclusion
We found that genes showing population differentiation between wolves and native dogs based on the population genetics data showed brain-biased expression.
These results indicate that during the primary transition from wolves to ancient dogs, genes expressed in the brain evolved rapidly, driven by artificial selection, consistent with the evolution of dog-specific behaviors during domestication.

Materials and Methods

SNP Genotyping and Calling

We genotyped 21 CNs, 22 wild gray wolves, and 8 GSs using Affymetrix v2 Canine arrays. The CNs are outbred village dogs collected across geographical regions (eight provinces) of China, most of which had some specialized morphology (see supplementary fig. S8, Supplementary Material online). The Canine array includes 25.5% of the SNPs found in the boxer dog genome assembly, 11.4% of SNPs from a comparison between the boxer and poodle assemblies, 59.9% of SNPs from a comparison of boxer to low-coverage sequencing from nine other breeds, and 3.2% of SNPs from a comparison of the dog and wolf sequences. More than 127,000 SNP markers with the highest signal-to-noise intensity ratios were scanned. Because the BRLMM-P algorithm consistently overcalls heterozygous genotypes, we instead called SNPs with the multidimensional genotype intensity clustering algorithm described in a previous study (Boyko et al. 2010). We compared the missing rate for each SNP between dogs and gray wolves using Fisher's exact test in PLINK (Purcell et al. 2007). In addition, SNPs that failed exact tests for Hardy-Weinberg equilibrium at P < 0.001 or had more than 10% missing data within each population were excluded using PLINK (Purcell et al. 2007). Individuals with more than 10% missing genotype data were removed as well. A pairwise similarity matrix among individuals was calculated. Known parental relationships from the recorded pedigree in our data set were used to calibrate an identity-by-state threshold (0.84), and pairs with values below this level were judged as unrelated. Pairwise LD was summarized robustly with the method described previously (Boyko et al. 2010).
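As an illustration of the 10% missing-data filters described in this section, the following minimal Python sketch applies them to a toy genotype matrix. It is not the PLINK implementation used in the study. The function names, the use of `None` for a failed call, and the order of the two filters (SNPs first, then individuals on the surviving SNPs) are assumptions made for illustration, and the Hardy-Weinberg exact test is omitted.

```python
MISSING = None  # assumed marker for a failed genotype call


def missing_rate(calls):
    """Fraction of entries in `calls` that are missing."""
    return sum(c is MISSING for c in calls) / len(calls)


def filter_by_missingness(geno, pop_labels, max_missing=0.10):
    """Return (kept individual indices, kept SNP indices).

    geno[i][j] is the genotype of individual i at SNP j (0, 1, 2 or MISSING);
    pop_labels[i] is the population of individual i.
    """
    pops = sorted(set(pop_labels))
    keep_snps = []
    for j in range(len(geno[0])):
        # Keep a SNP only if its missing rate is within the threshold
        # in every population.
        rates = [
            missing_rate([geno[i][j] for i, p in enumerate(pop_labels) if p == pop])
            for pop in pops
        ]
        if max(rates) <= max_missing:
            keep_snps.append(j)
    # Screen individuals on the surviving SNPs (assumed order of operations).
    keep_inds = [
        i for i, row in enumerate(geno)
        if keep_snps and missing_rate([row[j] for j in keep_snps]) <= max_missing
    ]
    return keep_inds, keep_snps


# Toy data: 4 individuals x 4 SNPs, two populations (hypothetical values).
geno = [
    [0, 1, None, 2],
    [1, 2, 0, 2],
    [2, 0, 1, 1],
    [0, 0, 2, None],
]
pops = ["dog", "dog", "wolf", "wolf"]
inds, snps = filter_by_missingness(geno, pops)
print(inds, snps)
```

With real array data the same logic would run over tens of thousands of SNPs, where a vectorized implementation or PLINK itself is the practical choice.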
Population Structure Analysis

The population structure was inferred using Structure software (Pritchard et al. 2000; Falush et al. 2003, 2007; Hubisz et al. 2009), with K = 3, based on a pruned subset of 23,816 SNPs that were in linkage equilibrium with each other.

Population Differentiation (FST) Analysis

FST values of each SNP were calculated as described previously (Weir and Cockerham 1984; Akey et al. 2002) to evaluate the degree of differentiation between populations. Negative values have no biological interpretation and were arbitrarily set to 0. The FST values of SNPs on the X chromosome were simply adjusted as $\hat{X} = X(\bar{A}/\bar{X})$, where $\bar{A}$ and $\bar{X}$ denote the averaged FST values among the autosomes and the X chromosome, respectively, and $X$ is the FST value of the SNP on the X chromosome. FST-outlier SNPs were extracted as those above the 95th percentile. A SNP was mapped to a gene if it was located within 5,000 bp upstream or downstream of that gene. Because the hitchhiking effect under strong artificial selection over a short period of time would lead to a rapid allele frequency increase at linked loci, we also focused on specific regions with high differentiation between populations. Here, we performed a sliding-window analysis in which FST values were averaged in 500-kb windows with a step of 100 kb. This procedure may decrease the stochastic variation inherent in single-locus estimates of population structure (Weir et al. 2005). We calculated the significance thresholds at the 0.99, 0.999, and 0.9999 confidence levels by bootstrap resampling 1,000 times (sampling size = 2,000). The analyses were done with Perl and R programs.

EHH/REHH Analysis

The haplotypes of each chromosome were inferred by fastPHASE (Scheet and Stephens 2006). EHH and REHH values 500 kb upstream and downstream of each core region, defined by the method of Gabriel et al. (2002), were calculated for each core haplotype using SWEEP software (http://www.broadinstitute.org/mpg/sweep/) (Sabeti et al. 2002).
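To make the EHH statistic concrete, the sketch below computes it for one core haplotype from phased chromosomes, following the usual definition from Sabeti et al. (2002): the probability that two randomly chosen chromosomes carrying the core haplotype are identical across the extended interval. This is an illustrative re-implementation, not the SWEEP code. The function name `ehh`, the string encoding of haplotypes, and the toy data are assumptions.

```python
from collections import Counter


def ehh(haplotypes, core_sites, extended_sites, core_allele):
    """EHH of one core haplotype over a set of extended sites.

    haplotypes: list of equal-length strings, one per phased chromosome
    core_sites: column indices that define the core region
    extended_sites: column indices covering the core plus flanking markers
    core_allele: the core haplotype of interest, as a string over core_sites
    """
    carriers = [
        h for h in haplotypes
        if "".join(h[i] for i in core_sites) == core_allele
    ]
    n = len(carriers)
    if n < 2:
        return float("nan")  # undefined with fewer than two carriers
    # Group carriers by their extended haplotype.
    counts = Counter("".join(h[i] for i in extended_sites) for h in carriers)
    # Pairs of carriers that are identical over the extended sites,
    # divided by all possible pairs of carriers.
    same_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return same_pairs / (n * (n - 1) // 2)


# Toy phased data (hypothetical): column 0 is the core, columns 1-2 flank it.
haps = ["AAB", "AAB", "AAC", "ABB", "BAB"]
print(ehh(haps, [0], [0, 1], "A"))
print(ehh(haps, [0], [0, 1, 2], "A"))
```

EHH decays as the interval is extended, because more markers give carriers more chances to differ. REHH then compares the EHH of one core haplotype with that of the other core haplotypes at the same locus.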
The genome-wide data of each population were used as an empirical distribution to calculate the statistical P value.

GO Analysis

We used the Biological Networks Gene Ontology tool, BiNGO 2.44 (Maere et al. 2005), a Java-based tool implemented as a plugin for Cytoscape (Shannon et al. 2003), to determine which GO categories are statistically overrepresented in each set of selected genes. The GO term accession numbers for all genes in the canine genome were downloaded from Ensembl (www.ensembl.org, version 62); this complete annotation was used as the reference set. Predominant functional themes of each gene set of interest on the GO hierarchy were reported if the P value obtained from the hypergeometric test with Benjamini and Hochberg FDR correction was less than 0.05.

Analysis on the Expression of Dog Genes and Their Human Orthologs

The genomic expression data from 10 different dog tissues (Affymetrix Canine Version 2.0 array) were downloaded from Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) with accession number GSE20113. According to the chip annotation, probes that might hybridize to multiple loci across the genome were discarded. The expression value of each gene was averaged across four biological replicates. The one-to-one human orthologs of dog genes were obtained using the BioMart tool in Ensembl (http://ensembl.org). Human gene expression data (Human U133A Gene Atlas) in 84 tissues or cells were downloaded from BioGPS (http://biogps.gnf.org/#goto=welcome) (Wu et al. 2009) with GEO accession number GSE1133. According to the chip annotation, probes that might hybridize to multiple loci across the genome were discarded. Expression values were log2 transformed. For genes having more than one probe, the expression in each tissue was averaged. The expression similarity of orthologous dog and human genes was estimated using Pearson and Spearman correlations in each matched tissue in R.
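The per-tissue comparison of dog and human expression can be sketched as follows. The study performed this step in R; the Python below is only an illustrative equivalent that uses the standard library. The nested-dictionary layout, the function names, and the toy values are assumptions. Spearman correlation is computed as the Pearson correlation of average ranks, which is the standard treatment of ties.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def tissue_similarity(dog_expr, human_expr):
    """Correlate dog and human expression within each matched tissue.

    dog_expr / human_expr: {tissue: {ortholog_id: log2 expression}}
    Returns {tissue: (pearson_r, spearman_rho)} for tissues present in both.
    """
    result = {}
    for tissue in sorted(set(dog_expr) & set(human_expr)):
        genes = sorted(set(dog_expr[tissue]) & set(human_expr[tissue]))
        d = [dog_expr[tissue][g] for g in genes]
        h = [human_expr[tissue][g] for g in genes]
        result[tissue] = (pearson(d, h), spearman(d, h))
    return result


# Toy values (hypothetical); only "liver" is present in both species.
dog = {"liver": {"g1": 1.0, "g2": 2.0, "g3": 3.0}}
human = {
    "liver": {"g1": 2.0, "g2": 4.0, "g3": 7.0},
    "heart": {"g1": 5.0, "g2": 1.0, "g3": 3.0},
}
similarity = tissue_similarity(dog, human)
print(similarity)
```

Reporting both coefficients is useful here because Spearman depends only on the ordering of genes and is therefore less sensitive to a few very highly expressed genes than Pearson.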
To avoid bias from differences in overall expression among tissues, the expression levels of differentiated genes were normalized by dividing by the average whole-genome expression level in each tissue. Further, to test whether the tissue-biased expression pattern was driven by a few genes with very high expression, we estimated a rank value. Specifically, we sorted the expression levels of each gene in different tissues from highest to lowest. Then, the ranks of the selected gene set were averaged within each tissue, as were those of all genes in the genome. Similarly, the average rank of the selected genes was normalized by dividing by the whole-genome background. In addition, we assessed whether the expression level of the selected genes was significantly higher than the genome-wide expression level in each tissue using Student's t-test and the Mann–Whitney test.

Supplementary Material

Supplementary figures S1–S7 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors thank Prof. David Irwin for revising the manuscript. This work was supported by grants from the National Basic Research Program of China (973 Program, Grant 2007CB815702), National Natural Science Foundation of China (31061160189), and the National Science Foundation (US) to R.K.W. (for the wolf SNP genotyping).

References

American Kennel Club. 1998. The complete dog book. New York: Howell Book House. Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107:1160–1165. Akey JM, Zhang G, Zhang K, Jin L, Shriver MD. 2002. Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 12:1805–1814. Auton A, Bryc K, Boyko AR, Lohmueller KE, Novembre J. 2009. Global distribution of genomic diversity underscores rich complex history of continental human populations. Genome Res. 19:795–803.
Axelsson E, Ratnakumar A, Arendt M-L, Maqbool K, Webster MT, Perloski M, Liberg O, Arnemo JM, Hedhammar A, Lindblad-Toh K. 2013. The genomic signature of dog domestication reveals adaptation to a starch-rich diet. Nature 495:360–364. Belyaev DK. 1969. Domestication of animals. Science 5:47–52. Benjamini Y, Hochberg Y. 2000. On the adaptive control of the false discovery rate in multiple testing with independent statistics. J Behav Educ Stat. 25:60–83. Boyko AR, Boyko RH, Boyko CM, et al. (15 co-authors). 2009. Complex population structure in African village dogs and its implications for inferring dog domestication history. Proc Natl Acad Sci U S A. 106: 13903–13908. Boyko AR, Quignon P, Li L, Schoenebeck JJ, Degenhardt JD, Lohmueller KE, Zhao K, Brisbin A, Parker HG. 2010. A simple genetic architecture underlies morphological variation in dogs. PLoS Biol. 8:e1000451. Briggs J, Paoloni M, Chen QR, Wen X, Khan J, Khanna C. 2011. A compendium of canine normal tissue gene expression. PLoS One 6: e17107. Brockschmidt A, Trost D, Peterziel H, et al. (22 co-authors). 2012. KIAA1797/FOCAD encodes a novel focal adhesion protein with tumour suppressor function in gliomas. Brain 135:1027–1041. Brown SK, Pedersen NC, Jafarishorijeh S, Bannasch DL, Ahrens KD, Wu J, Okon M, Sacks BN. 2011. Phylogenetic distinctiveness of Middle Eastern and Southeast Asian village dog Y chromosomes illuminates dog origins. PLoS One 6:e28496. Clutton-Brock J. 1995. Origins of the dog: domestication and early history. Cambridge: Cambridge University Press. Ding ZL, Oskarsson M, Ardalan A, Angleby H, Dahlgren LG, Tepeli C, Kirkness E, Savolainen P, Zhang Y-P. 2012. Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA. Heredity 108:507–514. Falush D, Stephens M, Pritchard JK. 2003.
Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164:1567–1587. Falush D, Stephens M, Pritchard JK. 2007. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes. 7:574–578. Fox M. 1971. Behavior of wolves, dogs and related canids. New York: Harper-Row. Gabriel SB, Schaffner SF, Nguyen H, et al. (18 co-authors). 2002. The structure of haplotype blocks in the human genome. Science 296: 2225–2229. Gilad Y, Oshlack A, Rifkin SA. 2006. Natural selection on gene expression. Trends Genet. 22:456–461. Hagemann TL, Boelens WC, Wawrousek EF, Messing A. 2009. Suppression of GFAP toxicity by alpha-B-crystallin in mouse models of Alexander disease. Hum Mol Genet. 18:1190–1199. Hare B, Brown M, Williamson C, Tomasello M. 2002. The domestication of social cognition in dogs. Science 298:1634–1636. Hare B, Plyusnina I, Ignacio N, Schepina O, Stepika A, Wrangham R, Trut L. 2005. Social cognitive evolution in captive foxes is a correlated byproduct of experimental domestication. Curr Biol. 15:226–230. Hare B, Tomasello M. 2005. Human-like social skills in dogs? Trends Cogn Sci. 9:439–444. Hubisz MJ, Falush D, Stephens M, Pritchard JK. 2009. Inferring weak population structure with the assistance of sample group information. Mol Ecol Notes. 9:1322–1332. Korte SM. 2001. Corticosteroids in relation to fear, anxiety and psychopathology. Neurosci Biobehav Rev. 25:117–142. Kukekova A, Temnykh S, Johnson J, Trut L, Acland G. 2011. Genetics of behavior in the silver fox. Mamm Genome. 23:164–177. Kukekova AV, Johnson JL, Teiling C, et al. (17 co-authors). 2011. Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes). BMC Genomics 12:482. Kukekova AV, Trut LN, Chase K, et al. (11 co-authors). 2008. Measurement of segregating behaviors in experimental silver fox pedigrees. Behav Genet. 38:185–194. 
Kukekova AV, Trut LN, Chase K, et al. (14 co-authors). 2011. Mapping loci for fox domestication: deconstruction/reconstruction of a behavioral phenotype. Behav Genet. 41:593–606. Lindberg J, Bjornerfeldt S, Saetre P, Svartberg K, Seehuus B, Bakken M, Vila C, Jazin E. 2005. Selection for tameness has changed brain gene expression in silver foxes. Curr Biol. 15:915–916. Maere S, Heymans K, Kuiper M. 2005. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 21:3448–3449. Masuda K, Hashizume C, Kikusui T, Takeuchi Y, Mori Y. 2004. Breed differences in genotype and allele frequency of catechol o-methyltransferase gene polymorphic regions in dogs. J Vet Med Sci. 66: 183–187. Miklósi A, Kubinyi E, Topál J, Gácsi M, Virányi Z, Csányi V. 2003. A simple reason for a big difference: wolves do not look back at humans, but dogs do. Curr Biol. 13:763–766. Ogata N, Hashizume C, Momozawa Y, Masuda K, Kikusui T, Takeuchi Y, Mori Y. 2006. Polymorphisms in the canine glutamate transporter-1 gene: identification and variation among five dog breeds. J Vet Med Sci. 68:157–159. Ostrander EA, Wayne RK. 2005. The canine genome. Genome Res. 15: 1706–1716. Pang J-F, Kluetsch C, Zou X-J, et al. (14 co-authors). 2009. mtDNA data indicate a single origin for dogs south of Yangtze river, less than 16,300 years ago, from numerous wolves. Mol Biol Evol. 26: 2849–2864. Peremans K, Audenaert K, Hoybergs Y, Otte A, Goethals I, Gielen I, Blankaert P, Vervaet M, van Heeringen C, Dierckx R. 2005. The effect of citalopram hydrobromide on 5-HT2A receptors in the impulsive-aggressive dog, as measured with 123I-5-I-R91150 SPECT. Eur J Nucl Med Mol Imaging. 32:708–716. Popova NK, Kulikov AV, Avgustinovich DF, Voitenko NN, Trut LN. 1997. Effect of domestication of the silver fox on the main enzymes of serotonin metabolism and serotonin receptors. Genetika 33: 370–374. Popova NK, Naumenko VS, Kozhemyakina RV, Plyusnina IZ. 2010.
Functional characteristics of serotonin 5-HT2A and 5-HT2C receptors in the brain and the expression of the 5-HT2A and 5-HT2C receptor genes in aggressive and non-aggressive rats. Neurosci Behav Physiol. 40:357–361. Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945–959. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, De Bakker PIW, Daly MJ. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81:559–575. Sabeti PC, Reich DE, Higgins JM, et al. (17 co-authors). 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419:832–837. Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES. 2006. Positive natural selection in the human lineage. Science 312:1614–1620. Sablin MV, Khlopachev GA. 2002. The earliest ice age dogs: evidence from Eliseevichi 1. Curr Anthropol. 43:795–798. Saetre P, Lindberg J, Leonard JA, Olsson K, Pettersson U, Ellegren H, Bergström TF, Vilà C, Jazin E. 2004. From wild wolf to domestic dog: gene expression changes in the brain. Mol Brain Res. 126: 198–206. Savolainen P, Zhang YP, Luo J, Lundeberg J, Leitner T. 2002. Genetic evidence for an East Asian origin of domestic dogs. Science 298: 1610–1613. Scheet P, Stephens M. 2006. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 78:629–644. Scott JP. 1962. Critical periods in behavioral development. Science 138: 949–958. Serpell J, Jagoe JA. 1997. Early experience and the development of behavior. In: Serpell J, editor. The domestic dog: its evolution, behavior and interactions with people. Cambridge: Cambridge University Press. p. 79–102.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504. Striedter GF. 2005. Principles of brain evolution. Sunderland (MA): Sinauer Associates. Stubblefield JJ, Terrien J, Green CB. 2012. Nocturnin: at the crossroads of clocks and metabolism. Trends Endocrinol Metab. 23:326–333. Takeuchi Y, Hashizume C, Chon EM, Momozawa Y, Masuda K, Kikusui T, Mori Y. 2005. Canine tyrosine hydroxylase (th) gene and dopamine beta-hydroxylase (dbh) gene: their sequences, genetic polymorphisms, and diversities among five different dog breeds. J Vet Med Sci. 67:861–867. Tomasello M, Kaminski J. 2009. Like infant, like dog. Science 325: 1213–1214. Topál J, Gergely G, Erdöhegyi Á, Csibra G, Miklósi Á. 2009. Differential sensitivity to human communication in dogs, wolves, and human infants. Science 325:1269–1272. Trut L, Oskina I, Kharlamova A. 2009. Animal evolution during domestication: the domesticated fox as a model. Bioessays 31: 349–360. Trut LN. 1999. Early canid domestication: the farm-fox experiment. Am Sci. 87:160–169. Trut LN. 2001. Experimental studies in early canid domestication. In: Ruvinsky A, Sampson J, editors. The genetics of the dog. New York: CABI Publishing. p. 15–41. Trut LN, Plyusnina IZ, Oskina IN. 2004. An experiment on fox domestication and debatable issues of evolution of the dog. Russ J Genet. 40:644–655. Vaysse A, Ratnakumar A, Derrien T, et al. (22 co-authors). 2011. Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping. PLoS Genet. 7:e1002316. vonHoldt BM, Pollinger JP, Lohmueller KE, et al. (36 co-authors). 2010. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication. Nature 464:898–902. Wayne RK, Ostrander EA. 2007.
Lessons learned from the dog genome. Trends Genet. 23:557–567. Weir B, Cockerham C. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358–1370. Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG. 2005. Measures of human population structure show heterogeneity among genomic regions. Genome Res. 15:1468–1476. Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW 3rd. 2009. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol. 10:R130.