The value of hierarchical Bayes models on genetic evaluation of multiple-breed beef cattle
populations
Fernando F. Cardoso
Michigan State University, East Lansing, MI 48824
Introduction
Crossbreeding is an important tool to increase the efficiency of beef production through
heterosis and complementarity between breeds (Gregory, 1999). This is one of the reasons that
has led to an increasing proportion of the beef cattle populations being composed of crossbred
animals. Crossbreeding and selection are synergistic key factors for improving production in the long
term. The response to selection is proportional to the accuracy of predictions of genetic merit
(Falconer and Mackay, 1996). These predictions of genetic merit for crossbred animals depend
on reliable estimates of breed-composition specific means, individual deviations from these
means, and covariances between related animals (Fernando, 1999); however, genetic evaluations
of multiple-breed populations are complicated by the different genetic backgrounds and degrees
of crossing present in these populations.
The complexity of the biological and environmental issues involved makes the task
challenging and demands the effort of several research groups. Bayesian statistics, in this
context, provide a set of flexible tools and a general framework to tackle this task (Sorensen and
Gianola, 2002). Hierarchical Bayes models (HBMs) can handle virtually any level of complexity
that is present in the population of interest and are particularly useful when records are correlated
(Hobert, 2000), as typical of related animals. Moreover, HBMs allow for optimal combination of
information present in the data with previous inferences from the literature to estimate the
parameters of interest (e.g. genotypic means). The current state-of-the-art multiple-breed genetic
evaluation model for beef cattle in the United States uses an HBM to incorporate prior knowledge
on heterosis (Klei et al., 1996).
The objectives of this paper were to review the current state of knowledge on the major
issues involved in the prediction of performance and genetic merit of multiple-breed beef cattle
populations, and to describe how HBMs and Bayesian inference can be employed to tackle these
issues.
Review of literature
Genotypic means. Several approaches have been considered to estimate means of
genotypes or breed-composition groups in multiple-breed populations. The simplest strategy
involves including breed-composition in the definition of the contemporary group (CG) and
estimating heterotic effects jointly with the CG effects. However, this method reduces the
number of possible direct comparisons and connectedness in the population, since animals with
different compositions are considered different contemporary groups even when they are raised
together under the same management and environmental conditions (Klei et al., 1996). More
parsimonious models are obtained by estimating breed-composition means as a function of
additive (breed proportion) and non-additive (degree of allelic and non-allelic interaction)
genetic coefficients. If heterosis is primarily due to dominance (allelic interaction) with no
epistasis, then it is proportional to heterozygosity (proportion of heterozygotes at individual loci)
(Gregory, 1999). Dickerson (1969; 1973), however, introduced the concept of “recombination
loss” to explain deviations from the heterozygosity found in crossbred individuals. The
recombination loss is equal to “the average fraction of independently segregating pairs of loci in
the gametes from both parents which are expected to be non-parental combinations” (Dickerson,
1969). The effect of recombination loss is attributable to the loss of favorable epistatic
combinations present in the gametes from purebreds as a result of long-term selection. Kinghorn
(1987) proposed several hypotheses and models to account for “epistatic loss” in crossbred
populations, and Wolf et al. (1995) proposed a general model based on the two-loci theory to
account for dominance and epistatic effects.
Confounding and multicollinearity between the coefficients for genetic effects
complicate the estimation of dominance effects separately from epistatic effects, such that most
of the models proposed for multiple-breed evaluations are based only on dominance effects
(Cunningham, 1987; Klei et al., 1996; Miller and Wilton, 1999; Sullivan et al., 1999).
Accounting for additive and heterotic mean effects on genetic evaluations can be
accomplished by using information in the literature to pre-adjust records (Roso and Fries, 1998;
Sullivan et al., 1999), provided that the published estimates are reliable and applicable to the
population being evaluated; by estimating these mean effects solely from the data of the
population under investigation (Arnold et al., 1992; Miller and Wilton, 1999), or by
simultaneously using information from the literature combined with data information, as in the
benchmark model used currently in the U.S. beef industry (Klei et al., 1996; Quaas and Pollak,
1999).
Let g be a genotype (breed composition) composed of B breeds; let $\alpha_b$ be the proportion of
genes from the bth source; and let $\delta_{bb}$ and $\delta_{bb'}$ be the probabilities that, at a randomly chosen locus
from an individual in g, one allele is from the bth source and the other allele is, respectively, from
the bth or the b'th source population. A general model assumed for the mean of g ($\mu_g$), based on the
two-loci theory and absence of inbreeding, is as follows (Wolf et al., 1995):
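$$
\begin{aligned}
\mu_g = \mu &+ \sum_{b=1}^{B} \alpha_b A_b + \sum_{b=1}^{B}\sum_{b' \ge b} \delta_{bb'} D_{bb'} + \sum_{b=1}^{B} \alpha_b^2 AA_{bb} + 2\sum_{b=1}^{B-1}\sum_{b'=b+1}^{B} \alpha_b \alpha_{b'} AA_{bb'} \\
&+ \sum_{b=1}^{B}\sum_{b'=1}^{B}\sum_{b'' \ge b'} \alpha_b \delta_{b'b''} AD_{bb'b''} + \sum_{b=1}^{B}\sum_{b' \ge b} \delta_{bb'}^2 DD_{bb'bb'} \\
&+ \sum_{b=1}^{B}\sum_{b' \ge b}\sum_{b''=1}^{B}\sum_{b''' \ge b''} \delta_{bb'} \delta_{b''b'''} DD_{bb'b''b'''}, \quad b'' \ne b \text{ or } b''' \ne b',
\end{aligned}
$$
[1]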
where $\mu$ is the overall mean, $A_b$ is the additive effect, $D_{bb'}$ is the dominance effect, $AA_{bb'}$ is the
additive × additive effect, $AD_{bb'b''}$ is the additive × dominance effect and $DD_{bb'b''b'''}$ is the
dominance × dominance effect. The indices refer to the source populations. The extension of [1]
to other effects is naturally done by adding analogous terms referring to the extra effects; e.g. $A_b^M$
would be the additive maternal effect. The coefficients in [1] can be obtained from the parental
generation as follows: $\alpha_b = 0.5(\alpha_b^P + \alpha_b^M)$, $\delta_{bb} = \alpha_b^P \alpha_b^M$, $\delta_{bb'} = \alpha_b^P \alpha_{b'}^M + \alpha_{b'}^P \alpha_b^M$, for b = 1,…, B;
b' = 1,…, B; and b < b'. Here P and M denote paternal and maternal groups, respectively.
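The coefficient formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function name `breed_coefficients`, the tuple representation of parental breed proportions and the example mating are all assumptions made here, and the symbols follow the $\alpha$/$\delta$ notation of the coefficients in [1].

```python
from itertools import combinations

def breed_coefficients(sire, dam):
    """Return (alpha, delta) for an offspring given parental breed proportions.

    sire, dam: sequences of length B holding the proportion of genes from each
    source breed in the paternal (P) and maternal (M) parent.
    alpha[b]       = 0.5 * (sire[b] + dam[b])
    delta[(b, b)]  = sire[b] * dam[b]
    delta[(b, b2)] = sire[b] * dam[b2] + sire[b2] * dam[b]   for b < b2
    """
    B = len(sire)
    alpha = [0.5 * (sire[b] + dam[b]) for b in range(B)]
    delta = {(b, b): sire[b] * dam[b] for b in range(B)}
    for b, b2 in combinations(range(B), 2):
        delta[(b, b2)] = sire[b] * dam[b2] + sire[b2] * dam[b]
    return alpha, delta

# Hypothetical two-breed example: an F1 sire mated to a purebred dam of breed 1.
alpha, delta = breed_coefficients(sire=(0.5, 0.5), dam=(1.0, 0.0))
print(alpha)   # breed proportions of the backcross offspring
print(delta)   # probabilities of breed origin for the two alleles at a locus
```

In this example the $\delta$ values sum to one and $\alpha_1 = \delta_{11} + 0.5\,\delta_{12}$, which are the relationships used to restrict the model.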
Equation [1] is clearly overparameterized; thus some restriction on the parameters must
be applied in order to make them estimable. These restrictions are based on the relationship
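between the coefficients, namely $\sum_{b=1}^{B} \alpha_b = 1$, $\sum_{b=1}^{B}\sum_{b' \ge b} \delta_{bb'} = 1$ and $\alpha_b = \delta_{bb} + 0.5\sum_{b' \ne b} \delta_{bb'}$. For example,
in a two population scenario, a restricted model would be given by:
$$\mu_g = \mu + \alpha_1 A_1 + \delta_{12} D_{12} + 2\alpha_1\alpha_2 AA_{12} + \alpha_1\delta_{12} AD_{112} + \delta_{12}^2 DD_{1212},$$
[2]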
which has 6 parameters and requires at least six genotypic groups to be estimable. This model is
equivalent to (or a reparameterization of) the models proposed by Mather and Jinks (1971) and
Hill (1982).
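The estimability requirement for [2] can be checked numerically. The sketch below is an illustration under assumptions made here: the six genotypic groups (two purebreds, F1, F2 and the two backcrosses) and the helper names `covariates` and `rank` are not from the paper. It builds the covariate row of [2] for each group and computes the rank of the resulting matrix with exact fractions.

```python
from fractions import Fraction as F

def covariates(sire_a1, dam_a1):
    """Covariates of model [2] for offspring of parents whose breed-1
    proportions are sire_a1 and dam_a1 (two breeds, so breed 2 is 1 - a1)."""
    a1 = (sire_a1 + dam_a1) / 2
    a2 = 1 - a1
    d12 = sire_a1 * (1 - dam_a1) + (1 - sire_a1) * dam_a1
    # Columns: mu, A1, D12, AA12, AD112, DD1212
    return [F(1), a1, d12, 2 * a1 * a2, a1 * d12, d12 ** 2]

def rank(rows):
    """Matrix rank by exact Gaussian elimination on Fractions."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

half = F(1, 2)
groups = {  # (sire, dam) breed-1 proportions for each hypothetical group
    "P1": (F(1), F(1)), "P2": (F(0), F(0)), "F1": (F(1), F(0)),
    "F2": (half, half), "BC1": (half, F(1)), "BC2": (half, F(0)),
}
X = [covariates(s, d) for s, d in groups.values()]
print(rank(X))      # rank with all six groups
print(rank(X[:3]))  # rank with only the purebreds and the F1
```

With all six groups the matrix has full column rank, whereas purebreds and F1 alone support only three of the six parameters. Full rank does not rule out the near-collinearity among coefficients that arises in field data.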
Another important aspect to consider in a crossbred population is that the relationship
between performance and contribution of each breed may not be linear in all ranges of
compositions and can also be environmentally dependent (Arthur et al., 1999; Long, 1980). As
an example, the combination of Bos indicus (for fitness) and Bos taurus (for production) that
optimizes performance will vary according to the quality of the environment in terms of
management and climate. Larger percentages of Bos taurus will be more suitable to temperate
environments, while animals with a larger proportion of Bos indicus blood are expected to perform
better in tropical environments. If we partition the different environments into R regions, the
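mean of a breed-composition group raised in the rth region can be written as:
$$\mu_{g,r} = \mu + \alpha_1 A_1 + \delta_{12} D_{12} + 2\alpha_1\alpha_2 AA_{12} + \alpha_1\delta_{12} AD_{112} + \delta_{12}^2 DD_{1212} + \mathrm{Region}_r + \alpha_1 A_1\mathrm{Region}_r, \quad r = 1, \ldots, R.$$
[3]
Here $\mathrm{Region}_r$ represents the effect of the rth class of region and $A_1\mathrm{Region}_r$ the interaction
between the additive effect and that region. Conceivably higher order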
genetic effects could also interact with region effects (Arthur et al., 1999), yielding a
straightforward extension of [3]. On the other hand, simpler models can be obtained by setting
some of the effects of the general model in [1] equal to zero. For instance, a model including
solely additive and dominance effects is obtained when we let all $AA_{bb'}$, $AD_{bb'b''}$, and $DD_{bb'b''b'''}$
effects be equal to zero. The models of Dickerson (1973) and Kinghorn (1980) based on the
concept of recombination loss are also simpler versions of [1]. Their models are equivalent in a
two-breed population (Wolf et al., 1995) but Kinghorn’s parameterization has a better biological
interpretation. Kinghorn’s hypothesis X for recombination loss assumes that each locus codes for
a different component of a dimorphic enzyme and that epistatic loss is proportional to the probability
that two alleles, one chosen at random from each locus, come from different breeds. This is equivalent to
assuming that recombination loss is due to between-breed additive × additive effects, and the
model can be written as:
$$\mu_g = \mu + \alpha_1 A_1 + \delta_{12} D_{12} + 2\alpha_1\alpha_2 AA_{12}.$$
[4]
Animal additive genetic effects. The genetic value of an animal can be determined by the
mean of its breed-composition or genotypic group plus an individual deviation from its group
(Arnold et al., 1992; Elzo, 1994; Klei et al., 1996; Sullivan et al., 1999). Deviations are due to
additive and non-additive genetic effects. Additive effects or breeding values indicate the
deviation from the population mean expected in the offspring of an individual when it is mated
at random to another individual in the population, while non-additive effects are useful to
determine specific combining abilities between individuals (Falconer and Mackay, 1996). These
deviations are determined by the performance of an individual and its relatives; therefore, it is
important to properly account for covariances between relatives when predicting genetic value of
crossbred animals.
Theory to estimate the covariance between crossbred animals was presented by Lo et al.
(1993) for an additive model and by Lo et al. (1995) for an additive and dominance model.
Under the additive model, (co)variances are modeled as a function of breed specific additive
variances and variances due to the segregation between breeds. These segregation variances
represent the additional variance observed in F2 individuals compared to the F1’s (Lo et al.,
1993). These methods derive genetic means and covariances between crossbred and purebred
individuals from “identity modes” and based upon the probability that related individuals share
alleles that are identical-by-descent (IBD). The additive and dominance model is derived for a
two-breed and their crosses scenario (Lo et al., 1995). This model has an exact theoretical
derivation and can accommodate the presence of inbreeding, but requires a relatively larger
number of variance components to be estimated (up to 25 in the former when inbreeding is
present). Simplifications arise when the population is composed only of the two pure breeds and
F1’s (Lo et al., 1997), and this model has been applied to swine data (Lutaaya et al., 2001).
For more general crossbreeding schemes, the dominance model (Lo et al., 1995) can be
cumbersome due to the large number of dispersion parameters to be estimated, while the additive
model (Lo et al., 1993) can be implemented without great difficulty. An alternative formulation
of the additive model with a regression approach to account for non-additive effects and a
sire-maternal grandsire model implementation was proposed by Elzo (1994) and applied to
multiple-breed data (Elzo et al., 1998; Elzo and Wakeman, 1998). Recently, Birchmeier et al. (2002)
proposed a REML algorithm to estimate additive breed and segregation variances under a typical
animal model and general pedigree structure. Yet, several recently proposed models (Klei et al.,
1996; Miller and Wilton, 1999; Quaas and Pollak, 1999; Roso and Fries, 1998; Sullivan et al.,
1999) assume that all breeds have the same additive genetic variance and there is no variance due
to segregation between breeds in advanced crosses. A model including additive and non-additive
breed-composition means and additive individual deviations may offer a parsimonious model for
genetic evaluation of multiple-breed populations.
Bayesian inference and hierarchical models in multiple-breed genetic evaluations. The
milestone paper that introduced Bayesian inference to animal breeding research is credited to
Gianola and Fernando (1986). The most striking, and perhaps controversial, difference between
Bayesian and classical (or frequentist) inference is that the former allows the incorporation of
prior knowledge (Blasco, 2001). From a practical point of view, if significant prior information
is available, ignoring it seems ill-advised, especially when the complexity of the problem is
high and the information in the data is limited.
Hierarchical or multistage models are used in Bayesian inference to functionally describe
complex problems through a series of nested levels or sub-models (Sorensen and Gianola, 2002).
Distributional assumptions and parameter values associated with these distributions
(hyperparameters in Bayesian terminology) are used to integrate prior knowledge in the analyses.
Henderson’s mixed model equations (Henderson, 1973), widely used in animal breeding, are
an example of a two-stage model, as seen below in the illustration of a hierarchical multiple-breed
animal model (HMBAM). The first stage of this model is the distribution of y, a vector of n
phenotypic records, presented as follows:
$$\mathbf{y} \mid \boldsymbol{\beta}, \mathbf{g}, \mathbf{a}, \sigma_e^2 \sim N\left(\mathbf{X}_1\boldsymbol{\beta} + \mathbf{X}_2\mathbf{g} + \mathbf{Z}\mathbf{a}, \mathbf{I}\sigma_e^2\right)$$
[5]
where $\boldsymbol{\beta}$ is a vector of non-genetic “fixed” effects (e.g. gender, age of dam, contemporary groups,
etc.); g is a vector of “fixed” genetic effects (as in [1] to [4]); a is a vector of q animal
additive genetic effects; and $\mathbf{X}_1$, $\mathbf{X}_2$ and $\mathbf{Z}$ are known incidence matrices. The elements of $\mathbf{X}_2$ are
determined by the coefficients of genetic effects specified above ($\alpha$’s and $\delta$’s). Finally, $\sigma_e^2$
represents the residual variance, assumed to be homogeneous across breed groups.
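The second stage of the model states the prior knowledge on all parameters in $\boldsymbol{\beta}$, g and a
contributing to the mean (location) of the normal distribution assumed on y [5], and is
represented by:
$$\boldsymbol{\beta} \mid \boldsymbol{\beta}_o, \mathbf{V}_\beta \sim N\left(\boldsymbol{\beta}_o, \mathbf{V}_\beta\right),$$
[6]
$$\mathbf{g} \mid \mathbf{g}_o, \mathbf{V}_g \sim N\left(\mathbf{g}_o, \mathbf{V}_g\right),$$
[7]
and
$$\mathbf{a} \mid \mathbf{G}_a \sim N\left(\mathbf{0}, \mathbf{G}_a\right),$$
[8]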
where $\boldsymbol{\beta}_o$ and $\mathbf{g}_o$ are prior means and $\mathbf{V}_\beta$, $\mathbf{V}_g$ and $\mathbf{G}_a$ are prior variance matrices. The values of $\boldsymbol{\beta}_o$
and $\mathbf{g}_o$ can be elicited from the literature, and would be particularly relevant when the data do
not have a suitable structure to adequately estimate the effects for all levels or combinations of
levels, as in the cases of unbalanced distributions of records in the subclasses and
multicollinearity, which are often the case for $\mathbf{g}_o$. These “fixed” genetic effects determining the
genotypic means are generally difficult to estimate solely from the data due to confounding and
multicollinearity (Birchmeier et al., 2002; Klei et al., 1996), but reliable estimates may be
available from the literature (Gregory, 1999). The use of informative priors reduces confounding
among correlated effects (Quaas and Pollak, 1999). The variance specifications $\mathbf{V}_\beta$ and $\mathbf{V}_g$,
typically diagonal, are used to state the uncertainty about $\boldsymbol{\beta}_o$ and $\mathbf{g}_o$. If these variances are set to
very large values, there is little confidence in the prior means and the inference will be
basically driven by the data; on the other hand, if the elements of $\mathbf{V}_\beta$ and $\mathbf{V}_g$ are set to very small
values, then the impact of the prior means on the resulting estimates will be large. This is the
specification adopted by the benchmark model used in today’s Simmental genetic evaluation
(Quaas and Pollak, 1999). The prior assignment on a is based on the additive genetic
variance-covariance between animals represented by $\mathbf{G}_a$. The matrix $\mathbf{G}_a$ contains elements as defined by
Lo et al. (1993). These elements can be computed by the tabular method provided that the
variance of crossbred individuals is computed as:
$$\mathrm{Var}(a_j) = \sum_{b=1}^{B} \alpha_b^j \sigma_{(a)b}^2 + 2\sum_{b=1}^{B-1}\sum_{b'>b} \left(\alpha_b^s\alpha_{b'}^s + \alpha_b^d\alpha_{b'}^d\right)\sigma_{(a)bb'}^2 + 0.5\,\mathrm{cov}\left(a_j^s, a_j^d\right),$$
[9]
where $a_j^s$ and $a_j^d$ represent, respectively, the additive genetic effect of the sire and the dam of j;
$\sigma_{(a)b}^2$ is the additive variance of breed b; and $\sigma_{(a)bb'}^2$ is the variance due to the segregation
between breeds b and b'. This proposition differs from that of Klei et al. (1996), in which all
breeds are assumed to have the same genetic variance and there is no variance due to segregation
between breeds.
Following Quaas (1988), Lo et al. (1993) showed that the inverse of the additive
covariance matrix $\mathbf{G}_a^{-1}$ can be computed as:
$$\mathbf{G}_a^{-1} = \left(\mathbf{I} - \mathbf{P}\right)' \boldsymbol{\Omega}_a^{-1} \left(\mathbf{I} - \mathbf{P}\right),$$
where $\boldsymbol{\Omega}_a$ is a diagonal matrix whose jth diagonal element is defined as:
$$(\omega_a)_j = \mathrm{Var}(a_j) - 0.25\left[\mathrm{Var}\left(a_j^s\right) + \mathrm{Var}\left(a_j^d\right)\right] - 0.5\,\mathrm{cov}\left(a_j^s, a_j^d\right),$$
[10]
which is a linear function of breed additive variances ($\sigma_{(a)b}^2$’s) and segregation variances
($\sigma_{(a)bb'}^2$’s).
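The tabular rules and the factorization of the inverse can be illustrated on a toy pedigree. The following is a minimal sketch under stated assumptions, not the implementation used in the cited work: the pedigree, the variance values and the function names are hypothetical, every animal has either both parents known or both unknown, parents precede offspring, and P is taken to hold 0.5 in the sire and dam columns of each non-base animal (the usual construction in Quaas-type factorizations, not spelled out in the text).

```python
def build_G(sire, dam, alpha, var_breed, var_seg):
    """Tabular construction of G_a, P and the diagonal of Omega_a.

    sire[j], dam[j]: parent indices of animal j, or None for base animals.
    alpha[j]: breed proportions of animal j.
    var_breed[b]: additive variance of breed b.
    var_seg[(b, b2)]: segregation variance between breeds b < b2.
    """
    n, B = len(sire), len(var_breed)
    G = [[0.0] * n for _ in range(n)]
    P = [[0.0] * n for _ in range(n)]
    omega = [0.0] * n
    for j in range(n):
        s, d = sire[j], dam[j]
        within = sum(alpha[j][b] * var_breed[b] for b in range(B))
        if s is None:                       # base animal: unrelated to earlier ones
            G[j][j] = omega[j] = within
            continue
        for i in range(j):                  # covariances with earlier animals
            G[i][j] = G[j][i] = 0.5 * (G[i][s] + G[i][d])
        seg = sum((alpha[s][b] * alpha[s][b2] + alpha[d][b] * alpha[d][b2]) * v
                  for (b, b2), v in var_seg.items())
        G[j][j] = within + 2.0 * seg + 0.5 * G[s][d]                      # [9]
        omega[j] = G[j][j] - 0.25 * (G[s][s] + G[d][d]) - 0.5 * G[s][d]   # [10]
        P[j][s] += 0.5
        P[j][d] += 0.5
    return G, P, omega

def inverse_from_factors(P, omega):
    """Compute (I - P)' Omega^{-1} (I - P) without inverting G directly."""
    n = len(omega)
    T = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]
    return [[sum(T[k][i] * T[k][j] / omega[k] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical pedigree: four purebred base animals, two F1 and one F2.
sire = [None, None, None, None, 0, 2, 4]
dam = [None, None, None, None, 1, 3, 5]
alpha = [(1, 0), (0, 1), (1, 0), (0, 1), (0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]
G, P, omega = build_G(sire, dam, alpha, var_breed=[100.0, 40.0], var_seg={(0, 1): 15.0})
G_inv = inverse_from_factors(P, omega)

n = len(sire)
product = [[sum(G_inv[i][k] * G[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n))
print(G[6][6], omega[6], max_dev)   # F2 variance, its Omega element, deviation from identity
```

The product of the factored inverse with the tabular matrix recovers the identity up to rounding, which is the property that lets the mixed model equations be built without ever inverting $\mathbf{G}_a$ explicitly.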
The third stage of the model corresponds to prior information on variance components.
This information is introduced via scaled inverted chi-square prior distributions, defined as
follows:
$$p\left(\sigma_e^2 \mid \nu_e, s_e^2\right) \propto \left(\sigma_e^2\right)^{-\frac{\nu_e + 2}{2}} \exp\left(-\frac{\nu_e s_e^2}{2\sigma_e^2}\right),$$
[11]
$$p\left(\sigma_{(a)b}^2 \mid \nu_{(a)b}, s_{(a)b}^2\right) \propto \left(\sigma_{(a)b}^2\right)^{-\frac{\nu_{(a)b} + 2}{2}} \exp\left(-\frac{\nu_{(a)b} s_{(a)b}^2}{2\sigma_{(a)b}^2}\right), \quad b = 1, \ldots, B;$$
[12]
$$p\left(\sigma_{(a)bb'}^2 \mid \nu_{(a)bb'}, s_{(a)bb'}^2\right) \propto \left(\sigma_{(a)bb'}^2\right)^{-\frac{\nu_{(a)bb'} + 2}{2}} \exp\left(-\frac{\nu_{(a)bb'} s_{(a)bb'}^2}{2\sigma_{(a)bb'}^2}\right), \quad b = 1, \ldots, B; \; b' = b+1, \ldots, B.$$
[13]
Here, the $s_e^2$, $s_{(a)b}^2$ and $s_{(a)bb'}^2$ hyperparameters represent prior values, respectively, for $\sigma_e^2$, $\sigma_{(a)b}^2$
and $\sigma_{(a)bb'}^2$; and the $\nu_e$, $\nu_{(a)b}$ and $\nu_{(a)bb'}$ hyperparameters state the prior degrees of belief or
certainty about $s_e^2$, $s_{(a)b}^2$ and $s_{(a)bb'}^2$, respectively. Although prior information for segregation
variances is limited (Birchmeier et al., 2002; Elzo and Wakeman, 1998), there is extensive
information available on breed specific variances (Koots et al., 1994; Meyer, 1992) that could be
incorporated in the analysis through [12].
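To get a feel for what a scaled inverted chi-square prior implies, one can draw from it by simulation. The sketch below is an illustration only: it relies on the standard result that $\nu s^2 / X$ with $X \sim \chi^2_\nu$ has the density in [11] to [13], and the values of `nu` and `s2` are hypothetical rather than taken from the paper.

```python
import random

def draw_scaled_inv_chi2(nu, s2, rng):
    """One draw from a scaled inverted chi-square with nu degrees of belief
    and prior value s2.

    If X ~ chi-square(nu), then nu * s2 / X has density proportional to
    sigma2 ** (-(nu + 2) / 2) * exp(-nu * s2 / (2 * sigma2)).
    """
    chi2 = rng.gammavariate(nu / 2.0, 2.0)   # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return nu * s2 / chi2

rng = random.Random(2003)
nu, s2 = 10, 60.0                            # hypothetical hyperparameters
draws = [draw_scaled_inv_chi2(nu, s2, rng) for _ in range(50_000)]
mc_mean = sum(draws) / len(draws)
print(mc_mean, nu * s2 / (nu - 2))           # Monte Carlo mean vs. theoretical mean (nu > 2)
```

Larger values of `nu` concentrate the draws around `s2`, which is how the degrees of belief express confidence in the prior value.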
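The product of [5], [6], [7], [8], [11], [12] and [13] yields the joint posterior density,
which is a function of all unknowns in the model given the data y and all hyperparameters,
represented as follows:
$$
\begin{aligned}
p&\left(\boldsymbol{\beta}, \mathbf{g}, \mathbf{a}, \sigma_{(a)b}^2, \sigma_{(a)bb'}^2, \sigma_e^2 \mid \boldsymbol{\beta}_o, \mathbf{V}_\beta, \mathbf{g}_o, \mathbf{V}_g, \nu_e, \nu_{(a)b}, \nu_{(a)bb'}, s_e^2, s_{(a)b}^2, s_{(a)bb'}^2, \mathbf{y}\right) \\
&\propto \left(\sigma_e^2\right)^{-n/2} \exp\left[-\frac{1}{2\sigma_e^2}\left(\mathbf{y} - \mathbf{X}_1\boldsymbol{\beta} - \mathbf{X}_2\mathbf{g} - \mathbf{Z}\mathbf{a}\right)'\left(\mathbf{y} - \mathbf{X}_1\boldsymbol{\beta} - \mathbf{X}_2\mathbf{g} - \mathbf{Z}\mathbf{a}\right)\right] \\
&\times \exp\left[-.5\left(\boldsymbol{\beta} - \boldsymbol{\beta}_o\right)'\mathbf{V}_\beta^{-1}\left(\boldsymbol{\beta} - \boldsymbol{\beta}_o\right)\right] \exp\left[-.5\left(\mathbf{g} - \mathbf{g}_o\right)'\mathbf{V}_g^{-1}\left(\mathbf{g} - \mathbf{g}_o\right)\right] \left|\mathbf{G}_a\right|^{-1/2} \exp\left[-.5\,\mathbf{a}'\mathbf{G}_a^{-1}\mathbf{a}\right] \\
&\times \left(\sigma_e^2\right)^{-\frac{\nu_e + 2}{2}} \exp\left(-\frac{\nu_e s_e^2}{2\sigma_e^2}\right) \prod_{b=1}^{B} \left(\sigma_{(a)b}^2\right)^{-\frac{\nu_{(a)b} + 2}{2}} \exp\left(-\frac{\nu_{(a)b} s_{(a)b}^2}{2\sigma_{(a)b}^2}\right) \\
&\times \prod_{b=1}^{B-1}\prod_{b'>b} \left(\sigma_{(a)bb'}^2\right)^{-\frac{\nu_{(a)bb'} + 2}{2}} \exp\left(-\frac{\nu_{(a)bb'} s_{(a)bb'}^2}{2\sigma_{(a)bb'}^2}\right)
\end{aligned}
$$
[14]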
Inferences (e.g. estimation of genotypic means or prediction of breeding values) are
derived from this joint posterior density [14]. There are two main avenues to obtain the estimates:
1) an empirical Bayes approach, in which the modes of all parameters are obtained by iterative
methods, such as the Expectation-Maximization (EM) algorithm (Dempster et al., 1977), and
approximate “large sample” standard errors are derived from the information matrix; 2) a fully Bayes
approach, in which Markov chain Monte Carlo (MCMC), a simulation-intensive algorithm, is
used to derive marginal densities, yielding “exact” small sample inference on any parameter
(Gilks, 1996). The Metropolis-Hastings algorithm (Hastings, 1970; Metropolis, 1953) and the
Gibbs sampler (Gelfand and Smith, 1990; Geman and Geman, 1984) are the most common
MCMC strategies used in animal breeding. A large number of cycles are generated and samples
are saved. Eventually, the Gibbs sampler converges to the joint posterior distribution. Values
drawn after convergence are considered random samples from the joint posterior distribution and
used to draw inference (e.g. calculate means, modes, medians, standard errors, credibility sets,
etc) (Sorensen and Gianola, 2002).
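The mechanics of the Gibbs sampler can be shown on a deliberately small model. The sketch below is not the HMBAM sampler: it assumes a single mean and a single variance, $y_i \sim N(\mu, \sigma^2)$, with a normal prior on $\mu$ and a scaled inverted chi-square prior on $\sigma^2$, and all names, simulated records and hyperparameter values are hypothetical. Each cycle draws one parameter from its full conditional distribution given the current value of the other.

```python
import random

def gibbs(y, mu0, v0, nu, s2, n_iter, burn_in, rng):
    """Gibbs sampler for y_i ~ N(mu, sigma2), mu ~ N(mu0, v0) and
    sigma2 ~ scaled inverted chi-square(nu, s2).
    Returns the (mu, sigma2) samples saved after burn-in."""
    n, total = len(y), sum(y)
    mu, sigma2 = total / n, s2                 # starting values
    samples = []
    for it in range(n_iter):
        # mu | sigma2, y is normal: precision-weighted mix of data and prior mean
        prec = n / sigma2 + 1.0 / v0
        mean = (total / sigma2 + mu0 / v0) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # sigma2 | mu, y is scaled inverted chi-square with nu + n degrees of freedom
        ss = sum((yi - mu) ** 2 for yi in y)
        sigma2 = (nu * s2 + ss) / rng.gammavariate((nu + n) / 2.0, 2.0)
        if it >= burn_in:
            samples.append((mu, sigma2))
    return samples

rng = random.Random(1)
y = [rng.gauss(5.0, 2.0) for _ in range(200)]  # simulated records
draws = gibbs(y, mu0=0.0, v0=1e6, nu=4, s2=1.0, n_iter=5000, burn_in=1000, rng=rng)
post_mu = sum(m for m, _ in draws) / len(draws)
post_s2 = sum(s for _, s in draws) / len(draws)
print(post_mu, post_s2)                        # posterior means of mu and sigma2
```

The saved draws can be summarized by means, medians or intervals in the same way as for the full model. In practice convergence would be checked rather than assumed from a fixed burn-in.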
Fully Bayesian methods have been used in the last decade for inference in animal
breeding problems in several applications, including variance component estimation (Jensen et
al., 1994; Wang et al., 1994b), predictions of selection response (Sorensen et al., 1994; Wang et
al., 1994a) and in threshold models for categorical data (Sorensen et al., 1995; Wang et al.,
1997). The possibility of combining prior and data information, and the ability to provide exact
small sample inference, make Bayesian methods attractive for animal breeding and genetics
problems, especially when the number of parameters exceeds the number of observations.
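Using either an empirical or fully Bayes approach, inference on location parameters ($\boldsymbol{\beta}$, g
and a) of the hierarchical model described above is derived from the following mixed model
equations:
$$
\begin{bmatrix}
\mathbf{X}_1'\mathbf{X}_1 + \mathbf{V}_\beta^{-1}\sigma_e^2 & \mathbf{X}_1'\mathbf{X}_2 & \mathbf{X}_1'\mathbf{Z} \\
\mathbf{X}_2'\mathbf{X}_1 & \mathbf{X}_2'\mathbf{X}_2 + \mathbf{V}_g^{-1}\sigma_e^2 & \mathbf{X}_2'\mathbf{Z} \\
\mathbf{Z}'\mathbf{X}_1 & \mathbf{Z}'\mathbf{X}_2 & \mathbf{Z}'\mathbf{Z} + \mathbf{G}_a^{-1}\sigma_e^2
\end{bmatrix}
\begin{bmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\mathbf{g}} \\ \hat{\mathbf{a}} \end{bmatrix}
=
\begin{bmatrix}
\mathbf{X}_1'\mathbf{y} + \mathbf{V}_\beta^{-1}\boldsymbol{\beta}_o\sigma_e^2 \\
\mathbf{X}_2'\mathbf{y} + \mathbf{V}_g^{-1}\mathbf{g}_o\sigma_e^2 \\
\mathbf{Z}'\mathbf{y}
\end{bmatrix}
$$
[15]
Note that if $\mathbf{G}_a^{-1}$ based on Lo et al. (1993) is replaced by the classical $\mathbf{A}^{-1}/\sigma_a^2$, with A being a
numerator relationship matrix, the equations in [15] become equivalent to the ones proposed by Klei
et al. (1996). Additionally, if $\mathbf{V}_\beta^{-1} = \mathbf{0}$, then [15] equates to Henderson’s mixed model equations
(Henderson, 1973).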
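The role of the prior variance in [15] is easiest to see in a scalar case. The sketch below assumes a model with a single “fixed” genetic effect g and no other effects, so that [15] collapses to one equation; the records, the prior mean and the variances are hypothetical values chosen for illustration.

```python
def mme_single_effect(y, g_prior, v_prior, sigma2_e):
    """Solve the scalar version of [15] for a model with one effect g only:
    (n + sigma2_e / v_prior) * g_hat = sum(y) + g_prior * sigma2_e / v_prior."""
    ratio = sigma2_e / v_prior
    return (sum(y) + g_prior * ratio) / (len(y) + ratio)

y = [12.0, 15.0, 9.0, 14.0]          # hypothetical records with mean 12.5
for v in (1e8, 10.0, 1e-8):          # vague, moderate and very sharp prior
    print(v, mme_single_effect(y, g_prior=30.0, v_prior=v, sigma2_e=40.0))
```

With a very large prior variance the solution approaches the data mean, with a very small one it approaches the prior mean, and intermediate values give a compromise weighted by the ratio of residual to prior variance.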
The HMBAM was applied to analyze 22,717 post-weaning gain (PWG) records of a beef
cattle population under genetic evaluation in Brazil, consisting of Herefords and crosses
Hereford × Nelore (Cardoso and Tempelman, 2003). Including base animals, 40,082 animals
were in the pedigree, pertaining to 15 different herds. Results were compared to those from a
standard animal model (AM) that assumes equal breed variances and no variance due to
segregation. MCMC was the estimation method used for both models. Posterior means (±
standard deviation) in kg for the fixed genetic effects in g obtained by HMBAM, using
Kinghorn’s parameterization [4] (Kinghorn, 1980) and non-informative priors, were -29.3 ± 10.1
for the additive (A) effect (representing the proportion of Nelore genes), 36.7 ± 5.0 for
dominance (D) and -30.8 ± 9.0 for the A × A interaction effect. As expected, D favorably affected
PWG while the A × A interaction had an adverse effect. The authors, however, were unable to fit the
two-loci model (Hill, 1982; Wolf et al., 1995) due to extremely high correlations between coefficients
of genetic effects: ranging from 0.92 between A × A and D × D to a maximum of 0.99 between
D and A × D. A similar situation was observed by Birchmeier et al. (2002), who used a model
with only A and D as fixed genetic effects and no epistatic effects. Certainly, incorporation of
prior information available from the literature on fixed genetic effects in the analyses of poorly
structured datasets, such as those above, will be helpful in accurately predicting performance in these
populations. The models based on recombination loss (Dickerson, 1973; Kinghorn, 1980;
Kinghorn, 1987) present an interesting compromise between the two-loci model (Hill, 1982;
Wolf et al., 1995) and the purely dominance models, allowing for epistatic effects with fewer
parameters (only one in a two-breed scenario); however, the availability of reliable estimates of
epistatic effects is still limited (Arthur et al., 1999; Koch et al., 1985).
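As a worked illustration of model [4], the reported posterior means can be plugged into the coefficients for a few breed compositions. This is a sketch under assumptions made here: it uses point estimates only and ignores their posterior uncertainty, the function name and the chosen matings are hypothetical, and breed 1 is taken to be Nelore so that the pure Hereford mean corresponds to $\mu$.

```python
def genotype_deviation(nelore_sire, nelore_dam, A=-29.3, D=36.7, AA=-30.8):
    """Expected PWG deviation (kg) from the pure Hereford mean under model [4],
    given the Nelore proportions of the sire and the dam. The default effects
    are the posterior means quoted in the text."""
    a1 = 0.5 * (nelore_sire + nelore_dam)                 # Nelore proportion
    d12 = nelore_sire * (1 - nelore_dam) + (1 - nelore_sire) * nelore_dam
    return a1 * A + d12 * D + 2 * a1 * (1 - a1) * AA

matings = {
    "Hereford": (0.0, 0.0),
    "F1": (0.0, 1.0),
    "F2": (0.5, 0.5),
    "3/4 Hereford": (0.5, 0.0),
}
for name, (s, d) in matings.items():
    print(name, round(genotype_deviation(s, d), 3))
```

Under these point estimates the full heterozygosity of the F1 more than offsets the additive and A × A terms, while the F2 loses half of the dominance contribution and keeps the same A × A term.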
Nelore and Hereford additive genetic variances of PWG in kg² obtained by Cardoso and
Tempelman (2003) using HMBAM differed substantially. Herefords had a posterior mean
genetic variance of 93.1 with a 95% posterior probability interval (PPI) of [70.1, 118.0] whereas
the corresponding values for the Nelore were 33.8 and [20.6, 52.5]. The AM estimate of a
common genetic variance had an intermediate value of 60.4 between the Nelore and Hereford
variances and PPI of [44.4, 77.5]. The posterior mean variance due to the segregation between
these breeds was 15.1 with a 95% PPI of [5.0, 33.8], having a magnitude of about 45% of the
Nelore genetic variance, but represented only 16% of Hereford genetic variance. These
percentages are considerably larger than those found for birth and weaning weight of crosses of
Angus and Brahman in Florida, ranging from 1.4 to 3.1% (Elzo and Wakeman, 1998). The
magnitude of the segregation variance relative to the Hereford genetic variance (16%) found by
Cardoso and Tempelman (2003) was, however, similar in magnitude to the results obtained for
birth weight of Hereford-Nelore crosses in Argentina (16.5%) (Birchmeier et al., 2002). In this
data set, the Nelore genetic variance was 73.5% of the magnitude of the Hereford genetic
variance for birth weight, while in Cardoso and Tempelman (2003) Nelores had a genetic variance for
PWG that was only 36.2% that of Herefords. The advantage of HMBAM is the flexibility in
modeling the genetic variability of the different breed composition groups of crossbred
populations. With HMBAM the genetic variance of each genotypic group is a function of breed
specific variances and the segregation variance; for example, the genetic variance of the F2
group is obtained by $0.5\sigma_{(a)1}^2 + 0.5\sigma_{(a)2}^2 + \sigma_{(a)12}^2$ (Lo et al., 1993), whereas a common genetic
variance is attributed to all compositions in AM. This will affect the dispersion of the genetic
values and the accuracy of predictions, as is clear from the different heritabilities obtained by
Cardoso and Tempelman (unpublished data) for the different genotypes (Table 1). The benefits
of HMBAM over AM are, however, dependent on the magnitude of difference among breed
specific genetic variances and of the segregation variance.
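The dependence of the genetic variance on genotype can be made concrete with the posterior means quoted above. The sketch below evaluates [9] for a few Hereford-Nelore compositions, assuming unrelated parents (zero covariance term); the function name and the chosen genotypes are illustrative, and the point estimates are used without their uncertainty.

```python
def additive_variance(alpha, alpha_sire, alpha_dam, var_breed, var_seg, cov_parents=0.0):
    """Equation [9]: breed-weighted additive variance plus segregation variance
    generated by crossbred parents plus half the covariance between parents."""
    within = sum(a * v for a, v in zip(alpha, var_breed))
    seg = sum((alpha_sire[b] * alpha_sire[b2] + alpha_dam[b] * alpha_dam[b2]) * v
              for (b, b2), v in var_seg.items())
    return within + 2.0 * seg + 0.5 * cov_parents

var_breed = (93.1, 33.8)        # Hereford, Nelore posterior means (kg²)
var_seg = {(0, 1): 15.1}        # segregation variance posterior mean (kg²)
H, N, F1 = (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)

v_f1 = additive_variance(F1, H, N, var_breed, var_seg)
v_f2 = additive_variance(F1, F1, F1, var_breed, var_seg)
v_bc = additive_variance((0.75, 0.25), F1, H, var_breed, var_seg)
print(round(v_f1, 3), round(v_f2, 3), round(v_bc, 3))   # F1, F2, 3/4 Hereford
```

The F1 carries no segregation variance because both parents are purebred, whereas the F2 adds the full segregation variance to the same breed-weighted average, in line with the F2 expression given in the text.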
Expected progeny differences (EPD) in a multiple-breed scenario are a function of fixed
and random additive genetic effects (Arnold et al., 1992; Elzo, 1994; Klei et al., 1996; Sullivan
et al., 1999). The coefficient for the fixed effect (A, D, A × A, etc.) will depend on the mate’s
genotype and therefore, comparison between candidates for selection should be made for specific
breed compositions of the mates. The additive genetic effect corresponds to the general
combining ability of the individual and does not depend on the genotype of the mates. The
determination of specific combining abilities requires the estimation of non-additive genetic
variances. Even though theory for an additive and dominance two-breed genetic model is
available (Lo et al., 1995), this model requires a larger number of variance components to be
estimated and may be cumbersome for practical applications. A simplified formulation for
general crossbreeding scenarios including additive effects and a regression approach to account for
non-additive effects (sire × dam genotype interaction) is also available (Elzo, 1994).
Conclusions and implications to genetic improvement of beef cattle
The accurate prediction of performance of crossbred animals is one of the most important
factors ultimately determining the success of breeding programs in the industry today, since a
substantial portion of the beef is produced from crossbred animals. These predictions require
reliable estimates of genotypic means and breeding values for complex multiple-breed
populations. Bayesian inference provides a general framework for optimal merging of
information derived from data with prior knowledge to achieve this task.
Hierarchical Bayes models are extremely powerful, yet flexible, tools available to the
breeders for better describing the biological and environmental complexities behind the
performance of beef crosses. The implementation of a more realistic modeling of the additive
genetic variability and correlation between relatives on crossbred populations (Lo et al., 1993)
will help to improve accuracy of genetic predictions and, consequently, selection response
(Falconer and Mackay, 1996). Increased genetic progress is a key factor helping producers to
achieve efficiency in their systems.
Finally, beef cattle performance is, in general, measured across diverse production
systems and environments, with data quality often compromised by the occurrence of recording
error, preferential treatment and/or the effect of injury or disease. Hierarchical models present a
general framework to tackle problems arising from the nature of field data structure; a variety of
multistage propositions have been advocated to handle issues such as heterogeneity of variance
(Foulley et al., 1992; Foulley and Quaas, 1995; Gianola et al., 1992; SanCristobal et al., 1993),
outlying observations (Rosa, 1999; Stranden and Gianola, 1998, 1999), and uncertain paternity
as typical of multiple-sire mating (Cardoso and Tempelman, in press; Foulley et al., 1987; Henderson,
1988), for instance. These situations can be addressed individually, but there is no conceptual
difficulty in handling them jointly. These critical issues arising from field data could be naturally
incorporated into HMBAM by adding levels of complexity to its hierarchy.
Bibliography
Arnold, J. W., J. K. Bertrand, and L. L. Benyshek. 1992. Animal-model for genetic evaluation of
multibreed data. Journal of Animal Science 70: 3322-3332.
Arthur, P. F., H. Hearnshaw, and P. D. Stephenson. 1999. Direct and maternal additive and
heterosis effects from crossing Bos Indicus and Bos Taurus cattle: Cow and calf
performance in two environments. Livestock Production Science 57: 231-241.
Birchmeier, A. N., R. J. C. Cantet, R. L. Fernando, C. A. Morris, F. Holgado, A. Jara, and M. S.
Cristal. 2002. Estimation of segregation variance for birth weight in beef cattle. Livestock
Production Science 76: 27-35.
Blasco, A. 2001. The Bayesian controversy in animal breeding. Journal of Animal Science 79:
2023-2046.
Cardoso, F. F., and R. J. Tempelman. 2003. Bayesian estimation of breed-specific and
segregation genetic variances applied to a Nelore-Hereford population. Journal of Animal
Science 81 Suppl. 1: (In press).
Cardoso, F. F., and R. J. Tempelman. (In press). Bayesian inference on genetic merit under
uncertain paternity. Genetics Selection Evolution.
Cunningham, E. P. 1987. Crossbreeding - the Greek temple model. Journal of Animal Breeding
and Genetics-Zeitschrift Fur Tierzuchtung Und Zuchtungsbiologie 104: 2-11.
Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data
via the EM algorithm. Journal of the Royal Statistical Society Series B-Methodological 39:
1-38.
Dickerson, G. E. 1969. Experimental approaches in utilising breed resources. Animal Breeding
Abstracts 37: 191-202.
Dickerson, G. E. 1973. Inbreeding and heterosis in animals. In: Proceedings of the Animal
Breeding and Genetics Symposium in Honour of Dr. J. L. Lush, Champaign, IL. p 54-77.
Elzo, M. A. 1994. Restricted maximum-likelihood procedures for the estimation of additive and
nonadditive genetic variances and covariances in multibreed populations. Journal of
Animal Science 72: 3055-3065.
Elzo, M. A., and D. L. Wakeman. 1998. Covariance components and prediction for additive and
nonadditive preweaning growth genetic effects in an Angus-Brahman multibreed herd.
Journal of Animal Science 76: 1290-1302.
Falconer, D. S., and T. F. C. Mackay. 1996. Introduction to quantitative genetics. 4 ed. Longman
Group Ltd, Harlow.
Fernando, R. L. 1999. Theory for analysis of multi-breed data. In: Seventh Genetic Prediction
Workshop, Kansas City, MO. p 1-16.
Foulley, J. L., M. S. Cristobal, D. Gianola, and S. Im. 1992. Marginal likelihood and Bayesian
approaches to the analysis of heterogeneous residual variances in mixed linear Gaussian
models. Computational Statistics & Data Analysis 13: 291-305.
Foulley, J. L., D. Gianola, and D. Planchenault. 1987. Sire evaluation with uncertain paternity.
Genetics Selection Evolution 19: 83-102.
Gelfand, A. E., and A. F. M. Smith. 1990. Sampling-based approaches to calculating marginal
densities. Journal of the American Statistical Association 85: 398-409.
Geman, S., and D. Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian
restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6:
721-741.
Gianola, D., and R. L. Fernando. 1986. Bayesian methods in animal breeding theory. Journal of
Animal Science 63: 217-244.
Gianola, D., J. L. Foulley, R. L. Fernando, C. R. Henderson, and K. A. Weigel. 1992. Estimation
of heterogeneous variances using empirical Bayes methods - theoretical considerations.
Journal of Dairy Science 75: 2805-2823.
Gilks, W. R., S. Richardson, and D. J. Spiegelhalter. 1996. Markov chain Monte Carlo in practice.
Chapman and Hall, New York.
Gregory, K. E., L. V. Cundiff, and R. M. Koch. 1999. Composite breeds to use heterosis and breed
differences to improve efficiency of beef production. Tech. Bulletin No. 1875,
MARC-USDA-ARS, Clay Center, NE.
Hastings, W. K. 1970. Monte Carlo sampling methods using Markov chains and their applications.
Biometrika 57: 97-109.
Henderson, C. R. 1973. Sire evaluation and genetic trends. In: Animal breeding and genetics
symposium in Honor of Dr. Jay L. Lush, Champaign, IL. p 10-41.
Hill, W. G. 1982. Dominance and epistasis as components of heterosis. Zeitschrift Fur
Tierzuchtung Und Zuchtungsbiologie-Journal of Animal Breeding and Genetics 99: 161-168.
Hobert, J. P. 2000. Hierarchical models: A current computational perspective. Journal of the
American Statistical Association 95: 1312-1316.
Jensen, J., C. S. Wang, D. A. Sorensen, and D. Gianola. 1994. Bayesian-inference on variance
and covariance components for traits influenced by maternal and direct genetic-effects,
using the Gibbs sampler. Acta Agriculturae Scandinavica Section A-Animal Science 44:
193-201.
Kinghorn, B. 1980. The expression of recombination loss in quantitative traits. Zeitschrift Fur
Tierzuchtung Und Zuchtungsbiologie-Journal of Animal Breeding and Genetics 97: 138-143.
Kinghorn, B. P. 1987. The nature of 2-locus epistatic interactions in animals - evidence from
Sewall Wright's guinea-pig data. Theoretical and Applied Genetics 73: 595-604.
Klei, L., R. L. Quaas, E. J. Pollak, and B. E. Cunningham. 1996. Multiple-breed evaluation. In:
Beef Improvement Federation 28th Annual Research Symposium & Annual Meeting,
Birmingham, AL. p 93-105.
Koch, R. M., G. E. Dickerson, L. V. Cundiff, and K. E. Gregory. 1985. Heterosis retained in
advanced generations of crosses among Angus and Hereford cattle. Journal of Animal
Science 60: 1117-1132.
Koots, K. R., J. P. Gibson, C. Smith, and J. W. Wilton. 1994. Analyses of published genetic
parameter estimates for beef cattle production traits. 1. Heritability. Animal Breeding
Abstracts 62: 309-338.
Lo, L. L., R. L. Fernando, R. J. C. Cantet, and M. Grossman. 1995. Theory for modeling means
and covariances in a 2-breed population with dominance inheritance. Theoretical and
Applied Genetics 90: 49-62.
Lo, L. L., R. L. Fernando, and M. Grossman. 1993. Covariance between relatives in multibreed
populations - additive-model. Theoretical and Applied Genetics 87: 423-430.
Lo, L. L., R. L. Fernando, and M. Grossman. 1997. Genetic evaluation by BLUP in two-breed
terminal crossbreeding systems under dominance. Journal of Animal Science 75: 2877-2884.
Long, C. R. 1980. Crossbreeding for beef-production - experimental results. Journal of Animal
Science 51: 1197-1223.
Lutaaya, E., I. Misztal, J. W. Mabry, T. Short, H. H. Timm, and R. Holzbauer. 2001. Genetic
parameter estimates from joint evaluation of purebreds and crossbreds in swine using the
crossbred model. Journal of Animal Science 79: 3002-3007.
Mather, K., and J. L. Jinks. 1971. Biometrical genetics. 2 ed. Chapman and Hall, London.
Metropolis, N., A. W. Rosenbluth, A. H. Teller, and E. Teller. 1953. Equation of state calculations
by fast computing machines. Journal of Chemical Physics 21: 1087.
Miller, S. P., and J. W. Wilton. 1999. Genetic relationships among direct and maternal
components of milk yield and maternal weaning gain in a multibreed beef herd. Journal
of Animal Science 77: 1155-1161.
Quaas, R. L. 1988. Additive genetic model with groups and relationships. Journal of Dairy
Science 71: 1338-1345.
Quaas, R. L., and E. J. Pollak. 1999. Application of a multi-breed genetic evaluation. In: Seventh
Genetic Prediction Workshop, Kansas City, MO. p 30-34.
Rosa, G. J. M. 1999. Robust mixed linear models in quantitative genetics: Bayesian analysis via
Gibbs sampling. In: International symposium on animal breeding and genetics, Vicosa,
MG, Brazil. p 133-159.
Roso, V. M., and L. A. Fries. 1998. Maternal and individual heterozygosities and heterosis on
preweaning gain of Angus x Nelore calves. In: World Congress On Genetics Applied To
Livestock Production, Armidale. Communication no 23:105.
Sorensen, D. A., S. Andersen, D. Gianola, and I. Korsgaard. 1995. Bayesian-inference in
threshold models using Gibbs sampling. Genetics Selection Evolution 27: 229-249.
Sorensen, D. A., and D. Gianola. 2002. Likelihood, Bayesian and MCMC methods in
quantitative genetics. 1 ed. Springer-Verlag New York, Inc., New York.
Sorensen, D. A., C. S. Wang, J. Jensen, and D. Gianola. 1994. Bayesian-analysis of genetic
change due to selection using Gibbs sampling. Genetics Selection Evolution 26: 333-360.
Stranden, I., and D. Gianola. 1999. Mixed effects linear models with t-distributions for
quantitative genetic analysis: A Bayesian approach. Genetics Selection Evolution 31: 25-42.
Sullivan, P. G., J. W. Wilton, S. P. Miller, and L. R. Banks. 1999. Genetic trends and breed
overlap derived from evaluations of beef cattle for multiple-breed genetic growth traits.
Journal of Animal Science 77: 2019-2027.
Wang, C. S., D. Gianola, D. A. Sorensen, J. Jensen, A. Christensen, and J. J. Rutledge. 1994a.
Response to selection for litter size in Danish Landrace pigs - a Bayesian analysis.
Theoretical and Applied Genetics 88: 220-230.
Wang, C. S., J. J. Rutledge, and D. Gianola. 1994b. Bayesian-analysis of mixed linear-models
via Gibbs sampling with an application to litter size in Iberian pigs. Genetics Selection
Evolution 26: 91-115.
Wolf, J., O. Distl, J. Hyanek, T. Grosshans, and G. Seeland. 1995. Crossbreeding in farm
animals. 5. Analysis of crossbreeding plans with secondary crossbred generations.
Journal of Animal Breeding and Genetics-Zeitschrift Fur Tierzuchtung Und
Zuchtungsbiologie 112: 81-94.
Table 1. Posterior mean, standard deviation (std), and 2.5% and 97.5% percentiles of direct
additive heritability of post-weaning gain (PWG) for different genotypes, obtained by
hierarchical multiple-breed animal model (HMBAM) and animal model (AM).
Model    Genotype    Mean    Std     2.5%    97.5%
AM       Overall     0.15    0.02    0.11    0.19
HMBAM    Nelore      0.09    0.02    0.06    0.14
HMBAM    Hereford    0.22    0.03    0.17    0.27
HMBAM    F1          0.16    0.02    0.12    0.20
HMBAM    F2          0.19    0.03    0.15    0.25
HMBAM    BC1         0.14    0.02    0.11    0.19
HMBAM    BC2         0.21    0.02    0.16    0.25
HMBAM    Adv38(a)    0.20    0.03    0.16    0.26
(a) Advanced generation of 3/8 Nelore and 5/8 Hereford composition.
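The summaries reported in Table 1 (posterior mean, standard deviation, and 2.5% and 97.5% percentiles) are the kind of quantities usually computed directly from MCMC draws. The Python sketch below shows one way to do this. It assumes, for simplicity, that heritability is the ratio of additive variance to the sum of additive and residual variance. The function name and the simulated draws are hypothetical; they are neither the paper's samples nor necessarily its exact definition of genotype-specific heritability.

```python
import numpy as np

def summarize_heritability(sigma2_a, sigma2_e):
    """Summarize posterior draws of h2 = sigma2_a / (sigma2_a + sigma2_e)."""
    sigma2_a = np.asarray(sigma2_a, dtype=float)
    sigma2_e = np.asarray(sigma2_e, dtype=float)
    # One heritability value per saved MCMC draw (assumed simple ratio).
    h2 = sigma2_a / (sigma2_a + sigma2_e)
    lower, upper = np.percentile(h2, [2.5, 97.5])
    return {"mean": h2.mean(), "std": h2.std(ddof=1), "p2.5": lower, "p97.5": upper}

# Made-up draws for illustration only; these are not the paper's samples.
rng = np.random.default_rng(0)
draws_a = rng.gamma(shape=50.0, scale=2.0, size=5000)
draws_e = rng.gamma(shape=200.0, scale=2.5, size=5000)
summary = summarize_heritability(draws_a, draws_e)
```

Because the ratio is computed draw by draw, the percentiles reflect the joint uncertainty in both variance components rather than a plug-in of their separate point estimates.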