Download A genome-wide association scan in pig identifies novel regions

Published December 2, 2014 A genome-wide association scan in pig identifies novel regions associated with feed efficiency trait1 Goutam Sahana,*2 Veronika Kadlecová,*† Henrik Hornshøj,* Bjarne Nielsen,‡ and Ole F. Christensen* *Department of Molecular Biology and Genetics, Aarhus University, Blichers Alle 20, Postboks 50, DK-8830 Tjele, Denmark; †Department of Animal Husbandry, Czech University of Life Sciences Prague, 165 21 Prague 6-Suchdol, Czech Republic; and ‡Pig Research Centre, Danish Agriculture and Food Council, Axeltorv 3, DK-1609 Copenhagen V, Denmark ABSTRACT: Feed conversion ratio (FCR) is an economically important trait in pigs, and feed accounts for a significant proportion of the costs involved in pig production. In this study we used a high-density SNP chip panel, Porcine SNP60 BeadChip, to identify the association between FCR and SNP markers and to study the genetic architecture of the trait. After quality control, a total of 30,847 SNP that could be mapped to the 18 porcine autosomes (SSC) using the pig genome assembly 10.2 were used in the analyses. Deregressed estimated breeding value was used as the response variable. A total of 3,071 Duroc pigs had both FCR data and genotype data. The linkage disequilibrium (r2) between adjacent markers was 0.56. Two association mapping approaches were used: a linear mixed model (LMM) based on single-locus regression analysis and a Bayesian variable selection approach (BVS). A total of 79 significant (P < 0.0001) SNP associations on 6 chromosomes were identified by LMM analyses. Out of these, 10 SNP crossed the genome-wide significance threshold. These 10 SNP were all located on SSC 4 and 14. In the BVS analysis, a total of 44 SNP located on 12 chromosomes had posterior probability more than or equal to 0.05 (i.e., Bayes factor ≥ 10). Thirteen SNP were identified by both LMM and BVS. These 13 SNP were located on 4 chromosomes: SSC 4, 7, 8, and 14. Hypoxia inducible factor 1, alpha subunit inhibitor (HIF1AN) and ladybird homeobox 1 (LBX1) are 2 possible candidate genes affecting FCR on SSC 4 and 14, respectively. The study provides a list of SNP associated with FCR and also offers valuable information on the genetic architecture and candidate genes for this trait. Key words: feed conversion ratio, gene mapping, genome-wide association study, pig © 2013 American Society of Animal Science. All rights reserved. INTRODUCTION Feed conversion ratio (FCR) is an economically important trait in pigs, and feed accounts for a significant proportion of the costs involved in pig production. Therefore, selection that improves FCR has been an important part of the breeding goal for commercial pig breeding (Okine et al., 2004). Genomic selection has been implemented in pig breeding (Ostersen et al., 1The work was partly funded by the Danish Ministry of Food, Agriculture and Fisheries (grant 3405-11-0279). The authors gratefully acknowledge Luc Janss for help on the use of Bayz software and the Pig Research Centre, the Danish Agricultural and Food Council, for the data used in the study. 2Corresponding author: [email protected] Received July 12, 2012. Accepted November 29, 2012. J. Anim. Sci. 2013.91:1041–1050 doi:10.2527/jas2012-5643 2011; Christensen et al., 2012) for improvement of economic traits including FCR. However, knowledge of QTL influencing FCR may further improve genetic selection models by using prior knowledge of chromosomal regions harboring QTL. Several QTL affecting economic traits in pigs have previously been reported, primarily on the basis of within-family linkage analysis using microsatellite markers. The results are listed in the Pig Quantitative Trait Locus database (pig QTLdb) located at www.animalgenome.org/cgi-bin/QTLdb/SS/ index (Hu and Reecy, 2007; Hu et al., 2010). However, availability of the PorcineSNP60 BeadChip (Illumina, San Diego, CA), which contains more than 62,000 SNP spread over entire porcine genome, gives us the opportunity to carry out association studies to map chromosomal regions associated with economically important traits in pigs at the population level. 1041 1042 Sahana et al. Several QTL for FCR discovered using linkage studies with microsatellite markers have been listed in the pig QTLdb on various SSC (9, 13, 16, 18, and X). Such linkage information can be used only for within-family selection, and the linkage QTL intervals are generally wide and therefore are not suitable for candidate gene searches. However, association mapping (also known as linkage disequilibrium mapping) with dense SNP markers can map QTL segregating in a population, and as it uses historic recombinations, the QTL intervals become much narrower than in a linkage study (Risch, 2000). Furthermore, there are often pronounced disagreements between results obtained from different genome-wide association studies in different populations. There could be multiple possible reasons for these disagreements, for example, the original discovery was a false positive, the QTL is segregating in 1 population, or lack of statistical SRZHU%RORUPDDHWDO7KHUHIRUHFRQ¿UPDWLRQ of published QTL is necessary in the population under study before such information is included in genomic VHOHFWLRQ DV SULRU NQRZOHGJH RU IRU ¿QHPDSSLQJ 47/ with an aim to identify candidate genes that underlie the QTL. The objective of this study was to identify SNP associated with FCR in Danish Duroc pigs. 0$7(5,$/6$1'0(7+2'6 Animal Care and Use Committee approval was not obtained for this study because no animal was handled in this experiment. et al. (2011). In brief, the deregression procedure of Garrick et al. (2009) adjusts for ancestral information, such that the deregressed EBV only contain their own information and the information of the descendant on each animal. Deregressed EBV have unequal variances and therefore should be used in a weighted analysis. The weight for the ith animal was estimated using Garrick et al. (2009) as 1− h2 wi = ⎡⎛ 1 − ri2 ⎞ 2 ⎤ ⎢⎜ c + 2 ⎟ h ⎥ ri ⎠ ⎦ ⎣⎝ , where c = the part of the genetic variance not explained by markers, h2 = the heritability of the trait, and ri2 = the reliability of the deregressed EBV of the ith animal. The average reliability of the deregressed EBV was 0.21 (SD = 0.13). The parameter c was assumed to be 0.1, DQGWKHKHULWDELOLW\IRU)&5ZDVXVLQJ2VWHUVHQHW al. (2011). Determining the proportion of genetic variance not captured by markers is not straightforward as it will vary with the trait, the population, and also the marker density. Christensen et al. (2012) reported that a proportion in the range of 0.1 to 0.4 is a good assumption. Jensen et al. (2012) reported that 77.2% of the total additive genetic variance was explained by markers in cattle (using Bovine SNP50 BeadChip).We have chosen IRUSDUDPHWHUFWREHFRQVLVWHQWZLWK2VWHUVHQHWDO (2011). However, the aim of the present study is to map the major QTL, and therefore, the choice of value for SDUDPHWHUFZLOOQRWKDYHDPDMRULQÀXHQFHRQWKHFRQclusions of this study. Animals and Phenotypes The Duroc pig data were collected as part of the genetic evaluation system in Denmark. All data were supplied by the Pig Research Centre, Danish Agriculture and Food Council, Axeltorv, Denmark. The pedigree was traced back to 1984; no animals were imported into the system, and nearly all great-grandparents of animals with records were known. There was a total of 371,233 individuals in the pedigree, and a total of 30,091 animals were recorded for FCR. However, the pedigree connected to the genotyped animals had only 11,146 individuals. The trait analyzed was FCR (feed intake/BW gain) in the interval 30 to 100 kg, recorded from 1992. The animals with FCR records were all males, and most litters only had 1 or 2 animals with FCR records. The EBV for FCR were calculated by single-trait animal model, ZLWK UHJUHVVLRQ HIIHFW RI VWDUW ZHLJKW ¿[HG HIIHFW RI herd-week-section, random effect of pen, and a random DGGLWLYHJHQHWLFHIIHFWDVLQ2VWHUVHQHWDO7KH phenotype used for association analysis was deregressed estimated breeding values for FCR. The details of the HVWLPDWLRQ RI GHUHJUHVVHG (%9 DUH JLYHQ E\ 2VWHUVHQ SNP Array Genotyping Marker data were obtained for 3,071 animals selected as those having the largest accuracy for the EBV (born in 1998 to 2010) using the Illumina Porcine SNP60 BeadChip (Illumina). The SNP data were editHG XVLQJ WKH VDPH FULWHULD DV LQ 2VWHUVHQ HW DO Each animal had a call rate greater than 0.95, a minor allele frequency greater than 0.05, a call-frequency score greater than 0.95, a genCall score greater than 0.65, and a heterozygote frequency that did not deviate 1 from Hardy-Weinberg expectations by more than 4 pq , where p and q are the allele frequencies at the marker. The term 14 pq corresponds to a one-fourth SD unit when assuming a binomial distribution of alleles. A total of 33,756 SNP passed these quality control criteria, among which 30,847 SNP could be mapped to the pig genome assembly version 10.2 (SSC10.2; NCBI GenBank ID GCA_000003025.4) using GMAP alignment software :X DQG:DWDQDEH 2QO\ WKH 613 ZLWK NQRZQ genome positions were used for association analyses, because without the known assembly position, it will not Association studies for feed conversion ratio in pigs EHSRVVLEOHWROLQNVLJQL¿FDQWO\DVVRFLDWHG613WRFDQdidate genes. The missing genotypes and the haplotype phases were determined using the software fastPHASE (Scheet and Stephens, 2006), and linkage disequilibrium (r2) between adjacent markers were estimated using the Haploview software (Barrett et al., 2005). Statistical Analyses Association Analyses. The 2 methods used to associate SNP and phenotypic data are described below. 7KH ¿UVW DSSURDFK ZDV D 613E\613 DQDO\VLV ZKHUH HDFK613ZDV¿WWHGVHSDUDWHO\LQDOLQHDUPL[HGPRGHO (/00) using Yu et al. (2005). Complex familial relationship is the primary confounding factor in genomewide association studies (GWAS) in livestock population. Linear mixed model analysis, which includes the relationship among individuals through a polygenic effect, is able to control the false-positives due to family structure (Yu et al., 2005; Sahana et al., 2010). The second approach we used was Bayesian variable selection (%96; George and McCulloch, 1993). Linear Mixed Model. The association between the SNP and the phenotype was assessed by a single-locus regression analysis for each SNP separately, using a linear mixed model (Yu et al., 2005). The model was as follows: y = 1μ + mg + Zu + e , [1] where \ is the vector FCR phenotypes (deregressed EBV), 1 is a vector of ones with length equal to the number of observations, μ is the general mean, m is a vector ZLWK JHQRW\SLF LQGLFDWRUV í RU DVVRFLDWLQJ UHcords to the marker effect, g is a scalar of the associated additive effect of the SNP, Z is an incidence matrix relating phenotypes to the corresponding random polygenic effect, u is a vector of the random polygenic effect with the normal distribution N(0, AT2u ), where A is the additive relationship matrix and σ2u is the polygenic variance, and e is a vector of random environmental deviates with the normal distribution N (0, W −1σ2e ) , where σe2 is the error variance and W is the diagonal matrix containing weights of the deregressed estimated breeding valXHV7KHPRGHOZDV¿WWHGE\5(0/XVLQJWKHVRIWZDUH DMU (Madsen and Jensen, 2011), and testing was done using a Wald test against a null hypothesis of g = 0. Bayesian Variable Selection. This method is based on specifying a mixture distribution for SNP effects, and DOO613PDUNHUVDUH¿WWHGVLPXOWDQHRXVO\LQWKHPRGHO (George and McCulloch, 1993). The markers with heavily shrunken effects were placed in 1 group, effectively removing them from the model, and a second group with more mildly shrunken effects was used to identify the most plausible multi-SNP model(s). 1043 It was assumed that most markers had very small effects on the trait (99.5% of SNP in this analysis) and only a few markers (0.5%) had large effects. Therefore, among the total 30,847 SNP, a priori about 154 SNP were assumed to have an effect on FCR. The allocation of each SNP to either of these 2 distributions was done using an indicator variable in Gibbs sampling. The averaged mixture indicator estimated a posterior probability for that SNP to come from the distribution with large effects, which was interpreted as the probability for the presence of an associated marker or QTL. The analysis was performed using Bayz software (Janss, 2011), and the variances of the 2 mixture components were estimated. This approach can therefore be viewed as a Bayesian multiple regression model selection method. The model is as follows: y = 1μ + Mb + e , [2] where \ȝDQGe are as described using Eq. [1], M is the design matrix with dimension n × j allocating phenotypes to marker effects (n is the number of animals, j is the number of markers), E is a vector of marker ef2 2 fects, with b π 0 N (0, σ β 0 I ) + π 1 N (0, σ β 1 I ), where p0 2 2 = 99.5, p1 = 0.05, and σ β 0 = 0.01 × σ β 1 . Similar to Eq. [1], the variances of residuals were proportional to the reciprocals of the weights of the deregressed EBV. The Gibbs sampling chain was run for 50,000 iterations, and WKH¿UVWLWHUDWLRQVZHUHGLVFDUGHGDVEXUQLQ 6LJQL¿FDQW $VVRFLDWLRQV DQG )'5 The genomeZLGH VLJQL¿FDQW DVVRFLDWLRQ DW WKH VLJQL¿FDQFH level after Bonferroni multiple testing correction was a P-value of 1.6 × 10-6, i.e., –log10(P-value) > 5.79. For /00ZH¿UVWH[DPLQHGWKHQXPEHURI613WKDWKDGJHQRPHZLGHVLJQL¿FDQWDVVRFLDWLRQZLWK)&5+RZHYHU the tests conducted here were not independent and were linked because of linkage disequilibrium (/') between markers. The Bonferroni correction may result in too stringent a threshold and hence many false-negative UHVXOWV 7KHUHIRUH ZH DOVR FRQVLGHUHG D OLEHUDO VLJQL¿cant threshold, and a SNP was considered to have a sigQL¿FDQWDVVRFLDWLRQZLWK)&5LILWV±ORJ10(P-value) was greater than 4 (i.e., P-value < 0.0001). The false discovHU\UDWHIRUWKLVQRPLQDOVLJQL¿FDQWWKUHVKROGFRQVLGHUHG in LMM analyses was calculated using Benjamini and Hochberg (1995) as m × P/S, where m is the number of tests, P is the PYDOXHVXVHGWRFDOODQDVVRFLDWLRQVLJQL¿FDQWDQG6LVWKHQXPEHURI613GHFODUHGVLJQL¿FDQWDW this threshold. The calculated false discovery rate may be overestimated as the tests conducted are not independent because of LD between markers. For BVS, the SNP with posterior probability of the mixture indicator larger than 0.05, which corresponds to a Bayes factor (posteri- 1044 Sahana et al. than 0.05, which corresponds to a Bayes factor (posterior probability/prior probability) of ≥10.0, were reported as having an effect on the phenotype. Chromosomal Mapping of Genes near the Significant SNP. To provide information regarding identity and function of genes at mapped SNP markers, the chromosomal positions of a locally created set of Augustus gene predictions (http://gbi.agrsci.dk/pig/sscrofa10_2_annotation) was used. A list of genes closest to the significant SNP was extracted, allowing a maximum distance of 1 Mb between SNP and genes. Putative gene identities were established by BLAST homology search to known human gene transcripts downloaded from the NCBI RefSeq sequence database. The biological function of genes and possible relation to feed efficiency was subsequently investigated. RESULTS Minor Allele Frequencies and Linkage Disequilibrium The association analyses were carried out for the 30,847 SNP that could be mapped on the 18 porcine autosomes (SSC) using pig genome assembly SSC10.2. The number of SNP on a chromosome ranged from 1,420 on SSC18 to 7,027 on SSC1. The minor allele frequencies of the selected SNP were uniformly distributed between 0.05 and 0.50 (Supplementary Fig. 1; see http:// journalofanimalscience.org). This indicates that the SNP panel used in this study represents mainly the common variation in the genome. The linkage disequilibrium (r2) was studied on chromosome (SSC14), and the results are presented in Fig. 1. The average distance between adjacent SNP was 64 kb, and the average r2 between adjacent SNP was 0.56. The average r2 values at 100, 1,000, and 10,000 kb distance were 0.43, 0.30, and 0.08, respectively. The average r2 at 50 Mb was 0.02 on this chromosome. Figure 1. Decay of average linkage disequilibrium (r2) on chromosome 14 in Duroc pigs at various distances (in million base pairs, from 0 to 50 Mb). Association Results The SNP showing genome-wide significant association with FCR in the LMM analyses are presented in Table 1, and the Manhattan plot is shown in Fig. 2. The distribution of allele substitution effects are presented in Fig. 3. There were a total of 79 significant associations (P < 0.0001) located on 6 chromosomes (Supplementary Table 1. Single nucleotide polymorphism showing genome-wide significant associations with feed conversion ratio (FCR) in Duroc pigs using the linear mixed model (LMM) approach1 SNP Chr. Pos(Bp) MAF b.value SE(b) -Log10(p-value) Percent of DEBV_var CASI0010164 4 63896889 0.456 0.026 0.005 6.708 1.04 ALGA0081097 4 64003194 0.456 −0.026 0.005 6.705 1.04 MARC0014536 14 120439294 0.452 0.025 0.005 6.086 0.96 MARC0111695 14 121418432 0.456 −0.026 0.005 6.691 1.04 ASGA0066126 14 121556657 0.453 0.026 0.005 6.566 1.04 MARC0012665 14 121885221 0.496 0.028 0.005 7.793 1.22 MARC0005041 14 122096887 0.497 −0.028 0.005 7.670 1.22 ASGA0066146 14 122124455 0.496 −0.029 0.005 7.891 1.31 ALGA0081147 14 122390163 0.500 −0.029 0.005 8.044 1.31 H3GA0042111 14 123496090 0.493 0.027 0.005 7.033 1.13 1Chr. = chromosome; Pos(Bp) = position on the chromosome (in base pair); MAF = minor allele frequency; b.value = allele substitution effect; SE(b) = SE for the b.value; Percent of DEBV_var = % of variance for deregressed EBV explained by the SNP. 1045 Association studies for feed conversion ratio in pigs Table 1; see http://journalofanimalscience.org). The largest number of significant (P < 0.0001) SNP was located on SSC16 (44 SNP), followed by SSC14 (19 SNP). Ten SNP located on 2 chromosomes, SSC4 (2 SNP) and SSC14 (8 SNP), crossed the genome-wide significant threshold. Five significant SNP were located within 161.8 kb on SSC4, and 2 of them showed genome-wide significant association. All 5 SNP were in high LD (Fig. 4), indicating that they all are likely linked to the same QTL. The variances of deregressed EBV explained by individual significant SNP ranged between 0.56% and 1.31% (Supplementary Table 1). We considered a SNP to have significant association with FCR if its P-value was less than 0.0001 in the LMM analyses. Therefore, the false discovery in our study was (0.0001 × 30847)/79 = 3.9%, meaning 4% of the 79 associations discovered by LMM analyses are likely to be false. In the BVS analysis, a total of 44 SNP located on 12 chromosomes had posterior probability greater than or equal to 0.05 (i.e., Bayes factor of 10 or more; Table 2, Fig. 5). Out of these SNP, 13 were also identified by LMM analyses. These 13 SNP were located Figure 2. Manhattan plot showing association with feed conversion ratio trait for all the SNP used in the analysis in Duroc pigs. The horizontal line indicates genome-wide significant threshold. See online version for figure in color. on 4 chromosomes: SSC4 (3 SNP), SSC7 (1 SNP), SSC8 (1 SNP), and SSC14 (8 SNP). The distribution of allele substitution effects for the BVS model is presented in Supplementary Fig. 2 (see http://journalofanimalscience.org). Table 3 presents the genes located closest to the significant SNP discovered by LMM and/or BVS approaches. It was not possible to detect a specific candidate gene for all the QTL regions because some of them were wide (>5 Mb), but some candidate genes could be selected on the basis of the known gene function in other species. Candidate genes for the QTL on SSC4 (63.8 Mb) and SSC14 (121 Mb) are discussed in the next section. DISCUSSION Extent of Linkage Disequilibrium The LD observed in this Duroc pig population was quite high (average r2 = 0.56 between adjacent markers). The average r2 at 1,000 kb distance was 0.30; therefore, the currently used SNP chip should be sufficient for a whole-genome scan (Meuwissen et al., 2001). However, high LD at long distance limits fine mapping the QTL as SNP quite far from the actual QTL position will also show association because of the extent of LD. Badke et al. (2012) have studied LD in 4 pig breeds, including Duroc, using the same SNP chip used in the present study. In general, the estimates of LD at different distances reported by Badke et al. (2012) were less than those estimated in our study (for example, the r2 between adjacent SNP was 0.46). One of the main reasons for these discrepancies could be the populations used to estimate LD. Although Badke et al. (2012) used unrelated trios (separated for at least 2 generations), we used all the individuals included in this GWAS study irrespective of their relationship. This might have resulted in the presence of multiple copies of the same haplotype that are identical by descent because of the inclu- Figure 3. Distribution of allele substitution effects obtained in SNP-by-SNP analyses in linear mixed model approach. See online version for figure in color. 1046 Sahana et al. sion of closely related individuals in our sample. Du et al. (2007) estimated r2 from 4,500 SNP markers in 6 commercial lines of pigs and found estimates of average r2 = 0.51 for markers less than 0.1 centimorgan (cM; ~100 kb). Uimari and Tapio (2011) also used the Illumina Porcine SNP60 BeadChip and reported an average r2 of 0.43 and 0.46 for adjacent markers in Finnish Landrace and Yorkshire pig populations. Associations with FCR We used 2 approaches for association mapping, LMM analyses and a Bayesian variable selection approach. Although the LMM does SNP-by-SNP analysis, BVS fits all the markers simultaneously. In this study 13 SNP were identified by both LMM and BVS approaches. The LMM identified more SNP (79) associated with FCR compared with BVS (44 SNP). As BVS considered all the markers simultaneously, given that a SNP is included in a model, there is little chance that a second SNP, in LD with the first SNP, will also be included in a sample. However, if the SNP are in high LD, it is possible that 1 among the SNP in strong LD is picked in a different iteration of the Markov Chain Monte Carlo (MCMC) run, thereby relegating the QTL association to a region rather than to a specific SNP (Sahana et al., 2010). For example, on SSC14, 8 SNP in close proximity had a Bayes factor of >10, but the highest posterior probability for an individual SNP was only 0.218. We also observed high LD among these SNP (Fig.6). A similar phenomenon was also observed on SSC16, where 11 closely located SNP had a Bayes factor of >10 Figure 4. Linkage disequilibrium (r2) plot for the markers on chromosome 4 showing association with feed conversion ratio in Duroc pigs. See online version for figure in color. but none of them individually had high posterior probability. In the present study the number of chromosomal regions (1 or more closely located significant SNP) showing significant associations with FCR were clustered on the basis of physical distances to 8 QTL in LMM and 19 for BVS. One reason for the lower number of QTL discovered by LMM could be the stringent significant threshold in LMM, whereby some QTL were missed. There was a marked difference between LMM and BVS regarding the location of the QTL on SSC16. Although the significant SNP in LMM were located between 32.4 and 38.9 Mb, the significant SNP for BVS were at 77.8 to 84.2 Mb. There was suggestive evidence of association at around the BVS QTL in LMM also, but none of these SNP crossed the significance threshold set in the study for multiple testing corrections. There is a possibility that 2 QTL for FCR are segregating on this chromosome. The detected QTL for FCR from the present study were compared with the previously published QTL and/or genes for FCR from the same chromosome as observed in the present study. Direct comparison between data obtained in this study and those from previous QTL studies is hindered by the fact that locations given in centimorgans on different genome assemblies do not necessarily reflect the same physical location on the genome. Therefore, the physical locations on the QTL (in Mb) as given in the SSC10.2 build in the pig QTLdb (www.animalgenome.org/cgi-bin/QTLdb/SS/index) were used to compare to the results of previous studies. The majority of the earlier QTL studies were based on within-family linkage analysis, and therefore, QTL intervals were generally large. In the present study, the most promising QTL for FCR was located at 63.8 to 64.0 Mb on SSC4. Fontanesi et al. (2010a) reported an association of the cathepsin K (CTSK) gene located at 102 Mb on SSC4 affecting the G:F ratio. Fontanesi et al. (2010b) also studied the cathepsin S (CTSS) gene located on SSC4 at 102 Mb, where QTL for meat and fat deposition were reported. They found that polymorphism in CTSS genes was associated with feed:BW gain ratio and ADG. Cepica et al. (2003) reported QTL associated with longer carcass, greater BW, and greater meat content at 89 to 98 Mb on SSC4 near the genes V-ATPase, ATP1A2, and ATP1B1. Therefore, the QTL identified on SSC4 in this study is a novel one. Zhang et al. (2009) reported a QTL for FCR on SSC7 at 34 Mb. Insulin-like growth factor 2 mRNA binding protein (IGF2BP) 1 is located at 59.3 Mb on SSC7. In the present study the significant SNP was located at 52.5 Mb by both LMM and BVS. Although IGF2BP1 cannot be ruled out, it is most unlikely that the QTL we observed is due to this gene. Liu et al. (2007) reported a Association studies for feed conversion ratio in pigs QTL for FCR on SSC16, but the location of the QTL was very unspecific (1 to 68 Mb). On SSC16 LMM identified a QTL at 32 to 38 Mb with the peak at 35 Mb, whereas the QTL identified by BVS was at 77.7 to 84.1 Mb. It is possible that there are 2 QTL segregating on this chromosome and the QTL intervals were overlapping from Table 2. Single nucleotide polymorphisms showing association with feed conversion ratio (FCR) in Duroc pigs in a Bayesian variable selection approach (Bayes factor ≥ 10)1 SNP CHR. POS(BP) MAF POST.PROB DIAS0003220 2 144806634 0.247 0.051 INRA0046679 4 63841307 0.473 0.065 CASI0010164 4 63896889 0.456 0.050 MARC0049861 4 63988781 0.473 0.060 ALGA0026782 4 100674729 0.125 0.050 ALGA0026797 4 100869004 0.129 0.054 M1GA0006282 4 113676430 0.426 0.053 H3GA0013913 4 114077157 0.455 0.078 ASGA0024642 5 14387665 0.427 0.068 ALGA0033360 5 91226878 0.500 0.058 ASGA0030530 7 2277165 0.246 0.050 M1GA0009803 7 28080480 0.235 0.140 SIRI0000155 7 28872346 0.235 0.179 M1GA0010288 7 52545007 0.218 0.088 MARC0055215 8 83821187 0.382 0.086 ASGA0043216 9 60564587 0.320 0.059 MARC0068186 10 52697274 0.264 0.051 ALGA0074502 14 5375680 0.341 0.056 H3GA0042050 14 120546415 0.495 0.070 ASGA0066126 14 121556657 0.453 0.054 MARC0015087 14 121582671 0.491 0.055 MARC0012665 14 121885221 0.496 0.183 MARC0005041 14 122096887 0.497 0.158 ASGA0066146 14 122124455 0.496 0.154 ALGA0081147 14 122390163 0.500 0.218 H3GA0042111 14 123496090 0.493 0.090 ASGA0093353 15 15507497 0.434 0.083 ASGA0084877 15 38003646 0.292 0.055 ASGA0069422 15 49225855 0.233 0.051 H3GA0047121 16 77781169 0.306 0.066 M1GA0021168 16 77794510 0.310 0.064 H3GA0047156 16 78321022 0.375 0.053 ALGA0092260 16 82729055 0.476 0.061 ALGA0092264 16 82766361 0.471 0.060 ASGA0074743 16 82855966 0.472 0.060 M1GA0021418 16 82877833 0.472 0.061 ALGA0107462 16 82915871 0.472 0.063 ASGA0102627 16 83908627 0.496 0.061 ASGA0074775 16 84145049 0.469 0.064 ASGA0074804 16 84160066 0.392 0.101 ALGA0093843 17 26830847 0.224 0.058 ALGA0093863 17 27104034 0.453 0.201 MARC0056150 18 57909445 0.234 0.051 ALGA0098916 18 58907611 0.307 0.054 1Chr. = chromosome; Pos(Bp) = position on the chromosome (in base pair); MAF = minor allele frequency; post.prob = posterior probability for the SNP being included in the model. 1047 linkage analyses. This could be the reason for Liu et al. (2007) reporting such a large QTL interval. Houston et al. (2005) suggested a QTL at 12 Mb on SSC13 for daily feed intake. Insulin-like growth factor 2 mRNA binding protein 2 is located at 93.4 Mb on SSC13. Therefore, the QTL identified by LMM at 28.2 Mb is an additional genetic factor on SSC13 affecting FCR. Candidate Genes for Feed Efficiency We examined the genes located close to the associated SNP. Some of the QTL regions are wide (>5 Mb), and several genes are located within those QTL regions. Therefore, it is not possible to pinpoint a specific candidate gene for such a large QTL region. The strongest evidence of QTL for FCR was observed on SSC4 and SSC14. Here there were 2 strong candidate genes that might be responsible for the QTL effects in these 2 genomic regions affecting feed efficiency. On SSC4, the SNP CASI0010164 at 63896889 bp showing the strongest association [−log10(P-value) = 6.708] with FCR for this QTL is located within the gene Hypoxia inducible factor 1, alpha subunit inhibitor (HIF1AN), which is located at 63736245 to 63907065 bp. Another SNP, ALGA0081092, located within the HIF1AN gene, which is also associated with FCR (Supplementary Table 1). Zhang et al. (2010), using mice with a null mutation in the factor-inhibiting HIF-1α (FIH) gene, found that hypoxia-inducible factors (HIF) Figure 5. The association results for the SNP using the Bayesian variable selection approach. (top) The estimate of allele substitution effects of the SNP and (bottom) the posterior probabilities of the SNP being in the model. See online version for figure in color. 1048 Sahana et al. Table 3. Genes located closest the associated SNP1 Chr. Start_Bp End_Bp 2 144718313 144880260 4 63736245 63907065 4 63909866 63916434 4 63954630 63997673 4 87134024 87185018 4 4 4 100703017 100750827 100845127 100982954 113175592 113353702 4 4 113658484 113701147 114052764 114163466 5 14214605 14290330 5 91180096 91262406 7 2293813 2311064 7 28038952 28077106 7 28810803 28891854 7 52535543 52554486 8 20474676 20536888 8 24729340 24731185 8 83616671 83722239 9 61475647 61539927 10 52583372 13 25517874 52733037 25618569 13 28066059 28221607 13 28488958 28607562 14 4720242 4748260 14 119905252 119985534 14 120054422 120080877 14 120166395 120214754 14 120274015 120322538 Gene_name Gene_symbol Sparc/osteonectin, cwcv SPOCK1 and kazal-like domains proteoglycan (testican) 1 Hypoxia inducible factor 1, HIF1AN alpha subunit inhibitor NADH dehydrogenase NDUFB8 (ubiquinone) 1 beta subcomplex, 8, 19kDa, nuclear gene encoding mitochondrial protein Wingless-type MMTV integra- WNT8B tion site family, member 8B Minichromosome maintenance MCM4 complex component 4, transcript variant 1 CD5 molecule-like CD5L Fc receptor-like 4 FCRL4 Mannosidase, alpha, MAN1A2 class 1A, member 2 CD101 molecule CD101 Immunoglobulin IGSF3 superfamily, member 3 (IGSF3), transcript variant 1 NUAK family, NUAK1 SNF1-like kinase, 1 PREDICTED: chromosome C12orf63 12 open reading frame 63 Chromosome 6 C6orf145 open reading frame 145 Pre-B-cell PBX2 leukemia homeobox 2 Butyrophilin-like 2 BTNL2 (MHC class II associated) Minichromosome maintenance MCM3 complex component 3 Recombination signal RBPJ binding protein for immunoglobulin kappa J region (RBPJ), transcript variant 1 Protocadherin 7, PCDH7 transcript variant c Doublecortin-like kinase 2, DCLK2 transcript variant 1 V-etserythroblastosis virus ETS1 E26 oncogene homolog 1 (avian), transcript variant 3 FERM domain containing 4A FRMD4A Sodium channel, SCN5A voltage-gated, type V, alpha subunit, transcript variant 3 Unc-51-like ULK4 kinase 4 (C. elegans) Trafficking protein, TRAK1 kinesin binding 1 (TRAK1), transcript variant 2 ATPase, H+ ATP6V1B2 transporting, lysosomal 56/58kDa, V1 subunit B2 Cyclin M1 CNNM1 Glutamic-oxaloacetic GOT1 transaminase 1, soluble (aspartate aminotransferase 1)) NK2 homeobox 3 NKX2-3 Ectonucleoside triphosphate ENTPD7 diphosphohydrolase 7 Chr. Start_Bp End_Bp 14 120345005 120484143 Gene_name Gene_symbol CutC copper transporter CUTC homolog (E. coli) 14 120490521 120575847 Dynamin binding protein DNMBP 14 121295795 121368923 Hypoxia inducible factor 1, HIF1AN alpha subunit inhibitor 14 121481667 121599455 Paired box 2 (PAX2), PAX2 transcript variant d 14 121897333 121899278 Ladybird homeobox 1 LBX1 14 122360746 122380773 Deleted in primary ciliary DPCD dyskinesia homolog (mouse) 14 122729561 122905102 Chromosome 10 C10orf76 open reading frame 76 14 123492265 123505827 Tripartite motif TRIM8 containing 8 (TRIM8) 14 126126955 126181469 Sortilin-related VPS10 SORCS3 domain containing receptor 3 15 15281842 15301048 Speckle-type POZ protein-like SPOPL 15 37991566 38019080 Rho guanine nucleotide ARHGEF10 exchange factor (GEF) 10 16 32537265 32559273 ISL LIM homeobox 1 ISL1 16 33953237 33955108 Pelota homolog (Drosophila) PELO 16 33979957 34130428 Integrin, alpha 1 (ITGA1) ITGA1 16 34164917 34262520 Integrin, alpha ITGA2 2 (CD49B, alpha 2 subunit of VLA-2 receptor) 16 34271233 34290806 Molybdenum MOCS2 cofactor synthesis 2 (MOCS2), transcript variant 3 16 34779010 34784235 Follistatin, transcript FST variant FST317 16 34843343 34908030 NADH dehydrogenase NDUFS4 (ubiquinone) Fe-S protein 4, 18kDa (NADH-coenzyme Q reductase) (NDUFS4), nuclear gene encoding mitochondrial protein 16 35316886 35475423 ADP-ribosylation factor-like 15 ARL15 16 35788147 35838239 Heat shock 27kDa protein 3 HSPB3 16 35962856 35966677 Sorting nexin 18 (SNX18), SNX18 transcript variant 2 16 36288841 36296526 Endothelial ESM1 cell-specific molecule 1 (ESM1), transcript variant 2 16 37296881 37396018 Ankyrin repeat domain 55 ANKRD55 16 37530326 37613274 Interleukin 31 receptor A IL31RA (IL31RA), transcript variant 1 16 38138052 38210961 Mitogen-activated protein MAP3K1 kinase kinasekinase 1 16 38869597 38871285 Actin, beta-like 2 ACTBL2 16 77759785 77802067 Solute carrier SLC36A1 family 36 (proton/amino acid symporter), member 1 16 78291978 78338684 TNFAIP3 interacting protein 1 TNIP1 16 82659680 82792978 ADAM metallopeptidase with ADAMTS16 thrombospondin type 1 motif, 16 16 84587087 84601558 Iroquois homeobox 2, IRX2 transcript variant 1 17 27581211 27672350 MACRO domain MACROD2 containing 2 (MACROD2), transcript variant 1 18 57998005 58008669 Inhibin, beta A INHBA 18 58822206 58966286 Chromosome 7 open reading C7orf10 frame 10, transcript variant 4 1Chr. = chromosome; Start_Bp = staring position of the gene; End_Bp = end position of the gene. Association studies for feed conversion ratio in pigs 1049 Figure 6. Linkage disequilibrium (r2) plot for the markers on chromosome 14 showing association with feed conversion ratio from 11.99 to 12.6 Mb region in Duroc pigs. is an essential regulator of metabolism: mice lacking FIH exhibit reduced BW, increased metabolic rate, hyperventilation, and improved glucose and lipid homeostasis and are resistant to high-fat-diet-induced BW gain and hepatic steatosis. Neuron-specific loss of FIH influenced some of the major metabolic phenotypes of the global null animals: those mice have reduced BW, increased metabolic rate, and enhanced insulin sensitivity and are also protected against high-fat-diet-induced BW gain. Their results demonstrated that FIH acts to a significant degree through the nervous system to regulate metabolism. On SSC14, the ladybird homeobox 1 (LBX1) gene located at 121897333 to 121899278 bp is a strong candidate gene for FCR. There were 4 genome-wide significant SNP very closely located (121885221 to 122390163 bp) to the LBX1 gene. In mice, this gene is a key regulator of muscle precursor cell migration and is required for the acquisition of dorsal identities of forelimb muscles (Watanabe et al., 2011). They generated conditional Lbx1-null mice using the Cre-loxP system. In Lbx1-null mouse lines, Pax3-expressing limb muscle precursor cells were seriously reduced during embryonic development, and eventually the limb extensor muscles were lost after birth. The LBX1 gene can have an effect on muscle growth in pigs and thereby on FRC, which is a ratio between body growth and feed intake. The marker effects are overestimated in LMM because only 1 marker is fitted at a time. However, BVS considered all the markers simultaneously, and therefore, it is expected to give more accurate effect size estimates. In BVS, there were only 7 SNP with allele substitution effects (absolute values) larger than 0.001 (Supplementary Fig. 2). The maximum variance explained by a marker was only 0.00137% in SD units on deregressed EBV. If we assume the closely located SNP picked up by BVS represent 1 QTL, then the present study was able to identify approximately 20 QTL affecting FCR. No QTL with large effect size points toward a highly polygenic nature of the trait. This study provides a list of SNP associated with FCR and also offers valuable information on the genetic architecture and candidate genes for this trait. The information about SNP or regions showing association with FCR may be used as prior information in genomic selection and also for identifying candidate genes affecting FCR. Hypoxia inducible factor 1, alpha subunit inhibitor and LBX1 are 2 possible candidate genes affecting FCR on SSC4 and SSC14, respectively. As we observed high LD between adjacent markers, it will be necessary to analyze other pig populations to come closer to the candidate genes responsible for the identified QTL effects. LITERATURE CITED Badke, Y. M., R. O. Bates, C. W. Ernst, C. Schwab, and J. P. Steibel. 2012. Estimation of linkage disequilibrium in four US pig breeds. BMC Genomics 13:24. 1050 Sahana et al. Barrett, J. C., B. Fry, J. Maller, and M. J. Daly. 2005. Haploview: Analysis and visualization of LD and haplotype maps. Bioinformatics 21:263–265. Benjamini, Y., and Y. Hochberg. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57:289–300. Bolormaa, S., B. J. Hayes, K. Savin, R. Hawken, W. Barendse, P. F. Arthur, R. M. Herd, and M. E. Goddard. 2011. Genome-wide association studies for feedlot and growth traits in cattle. J. Anim. Sci. 89:1684–1697. Cepica, S., A. Stratil, M. Kopecny, P. Blazkova, J. Schröffel Jr., R. Davoli, L. Fontanesi, G. Reiner, H. Bartenshlager, and G. Moser. 2003. Linkage and QTL mapping for Sus scrofa chromosome 4. J. Anim. Breed. Genet. 120:28–37. Christensen, O. F., P. Madsen, B. Nielsen, T. Ostersen, and G. Su. 2012. Single-step methods for genomic evaluation in pigs. Animal 6:1565–1571. Du, F., A. Clutter, and M. Lohuis. 2007. Characterizing linkage disequilibrium in pig populations. Int. J. Biol. Sci. 3:166–178. Fontanesi, L., E. Scotti, L. Buttazzoni, S. Dall’Olio, R. Davoli, and V. Russo. 2010a. A single nucleotide polymorphism in the porcine cathepsin K (CTSK) gene is associated with back fat thickness and production traits in Italian Duroc pigs. Mol. Biol. Rep. 37:491–495. Fontanesi, L., C. Speroni, L. Buttazzoni, E. Scotti, L. N. Costa, R. Davoli, and V. Russo. 2010b. Association between cathepsin L (CTSL) and cathepsin S (CTSS) polymorphisms and meat production and carcass traits in Italian Large White pigs. Meat Sci. 85:331–338. Garrick, D. J., J. F. Taylor, and R. L. Fernando. 2009. Deregressing estimated breeding values and weighting information for genomic regression analyses. Genet. Sel. Evol. 41:55. George, E. I., and R. E. McCulloch. 1993. Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88:881–889. Houston, R. D., C. S. Haley, A. L. Archibald, and K. A. Rance. 2005. A QTL affecting daily feed intake maps to chromosome 2 in pigs. Mamm. Genome 16:464–470. Hu, Z.-L., C. A. Park, E. R. Fritz, and J. M. Reecy. 2010. QTLdb: A comprehensive database tool building bridges between genotypes and phenotypes. Proc. 9th World Congr. Genet. Appl. Livest. Prod. Leipzig, Germany. Hu, Z.-L., and J. M. Reecy. 2007. Animal QTLdb: Beyond a repository: A public platform for QTL comparisons and integration with diverse types of structural genomic information. Mamm. Genome 18:1–4. Janss, L. 2011. Bayz manual version 2.02. Janss Biostat., Leiden, Netherlands. Jensen, J., G. Su, and P. Madsen.2012. Partitioning additive genetic variance into genomic and remaining polygenic components for complex traits in dairy cattle. BMC Genet. 13:44. Liu, G., D. G. Jennen, E. Tholen, H. Juengst, T. Kleinwächter, M. Hölker, D. Tesfaye, G. Un, H. J. Schreinemachers, and E. Murani. 2007. A genome scan reveals QTL for growth, fatness, leanness and meat quality in a Duroc-Pietrain resource population. Anim. Genet. 38: 241–252 Madsen, P., and J. Jensen. 2011. A user’s guide to DMU, a package for analyzing multivariate mixed models, version 6, Release 5. http://www.dmu.agrsci.dk/dmuv6_guide.5.0.pdf. (Accessed 5 January 2011). Meuwissen, T. H., B. J. Hayes, and M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 157:1819–1829. Okine, E. K., J. A. Basarab, L. A. Goonewardene, and P. Mir. 2004. Residual feed intake and feed efficiency: Differences and implications. Proc. FL Rumin. Nutr. Symp., Gainesville, FL. p. 27–38. Ostersen, T., O. F. Christensen, M. Henryon, B. Nielsen, G. Su, and P. Madsen. 2011. Deregressed EBV as the response variable yield more reliable genomic predictions than traditional EBV in purebred pigs. Genet. Sel. Evol. 43:38. Risch, N. J. 2000. Searching for genetic determination in the new millennium. Nature 405:847–856. Sahana, G., B. Guldbrandtsen, L. Janss, and M. S. Lund. 2010. Comparison of association mapping methods in a complex pedigreed population. Genet. Epidemiol. 34:455–462. Scheet, P., and M. Stephens. 2006. A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78:629–644. Uimari, P., and M. Tapio. 2011. Extent of linkage disequilibrium and effective population size in Finnish Landrace and Finnish Yorkshire pig breeds. J. Anim. Sci. 89:609–614. Watanabe, S., S. Matsushita, M. Hayasaka, and K. Hanaoka. 2011. Generation of a conditional null allele of Lbx1. Genesis 49:803–810. Wu, T. D., and C. K. Watanabe. 2005. GMAP: A genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21:1859–1875. Yu, J., G. Pressoir, W. H. Briggs, I. Vroh Bi, M. Yamasaki, J. F. Doebley, M. D. McMullen, B. S. Gaut, D. M. Nielsen, J. B. Holland, S. Kresovich, and E. S. Buckler. 2005. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 38:203–208. Zhang, N., Z. Fu, S. Linke, J. Chicher, J. J. Gorman, D. Visk, G. G. Haddad, L. Poellinger, D. J. Peet, F. Powell, and R. S. Johnson. 2010. The asparaginyl hydroxylase factor inhibiting HIF-1alpha is an essential regulator of metabolism. Cell Metab. 11:364–378. Zhang, Z. Y., J. Ren, D. R. Ren, J. W. Ma, Y. M. Guo, and L. S. Huang. 2009. Mapping quantitative trait loci for feed consumption and feeding behaviors in a White Duroc × Chinese Erhualian resource population. J. Anim. Sci. 87: 3458–3463.

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download A genome-wide association scan in pig identifies novel regions