But what are genomic (additive) relationships?
Andres Legarra
[email protected]
INRA UMR GenPhySE, Toulouse, France

What I want to show in this talk
• Give an overview of some estimators of relationships in a historical context
• Explain why “all genomic relationships are equal”

Kinship
It obviously comes from Latin “parentes”

So what is kinship?
• Socially it has a “pedigree” interpretation
  • e.g. “all royal families are related”
• However, pedigrees “go back forever”
• We need a more rigorous definition

True relationships
• Two individuals are genetically identical (for a trait) if they carry the same genotype at the causal QTLs or genes
• This is a biological fact
  • if I share the blood group 00 with somebody I am “like” his twin
• The genetics of one locus for two diploid individuals can be described using Gillois’ identity coefficients

Genes and phenotypes
• Heredity seems to act in a linear manner
• Fisher (1918) explained why this is so:
  • The « substitution » effect of one allele is the regression of phenotype on genotype
  • $a = (\mathbf{z}'\mathbf{z})^{-1}\mathbf{z}'\mathbf{y}$, with $\mathbf{z} = (0, 1, 2)'$, e.g. for genotypes $(CC, CT, TT)'$
• Under most realistic models of gene action and mutation, this « substitution » effect explains a large part of the genetic variance of a trait
  • Even if biology is complex (interactions)

Why additive relationships
• Diploids transmit alleles, not genotypes, to their offspring
• Additive substitution effects describe adequately the genetic superiority of the offspring of one (selected) parent mated at random to a population
• This is the reason why we use Breeding Values for genetic improvement
• In practice, even if mating is not at random, they are very good guides
  • See next talk for consideration of matings « not at random »
• Additive substitution effects => additive relationships

Relationships
• Relationships were conceived as standardized covariances (Fisher, Wright)
  • $Cov(u_i, u_j) = Rel_{ij}\,\sigma_u^2$
  • $Rel_{ij}$: “some” relationship
  • $\sigma_u^2$: « some » variance component
• Genetic relationships are due to shared (Identical By State) alleles at causal genes
• These genes are unknown (and many will likely remain so)
• Use proxies
  • Pedigree relationships
  • Marker relationships

Classical view of a pedigree
• Base animals are drawn from a very large, unselected population
• All 2n founder alleles are different
  • E.g. for blood group we would have 10 different alleles (there are actually 3)
• These assumptions are false but work reasonably well

Pedigree relationships
• (Malécot, Wright)
• Kinship or coancestry $\phi_{ij}$ of $i$ and $j$: probability that one allele taken at random from each individual is identical by descent
• Additive relationship $A_{ij} = 2\phi_{ij}$
[Figure: example pedigree annotated with the values 0, 0.5, 0.25, 0.25]

Computation of relationship coefficients from pedigree
$a_{XY} = 2\Delta_1 + \Delta_3 + \Delta_5 + \Delta_7 + \tfrac{1}{2}\Delta_8$
$d_{[\ldots]} = \Delta_7$
$d_{[\ldots]} = \Delta_1$
$c_{XY} = \Delta_1 + \tfrac{1}{2}\Delta_5$
$c_{YX} = \Delta_1 + \tfrac{1}{2}\Delta_3$
• e.g. following Karigl (1981)

[Figure 2. The pedigree of the Jicaque Indians Julio and Mencha.]

Table 1. Detailed coefficients of identity for four pairs involving Julio, Mencha and their progeny.

Coefficient  Julio-Mencha  Progeny-progeny  Julio-progeny  Mencha-progeny
 1           0.01025       0.06580          0.03467        0.03467
 2           0.02393       0.04333          0.08252        0.00000
 3           0.02490       0.04333          0.00000        0.08252
 4           0.02393       0.04333          0.06665        0.06665
 5           0.02490       0.04333          0.08228        0.08228
 6           0.00708       0.00729          0.00000        0.00000
 7           0.05103       0.02383          0.00000        0.00000
 8           0.05103       0.02383          0.00000        0.00000
 9           0.05127       0.24713          0.08228        0.06665
10           0.10937       0.15900          0.29248        0.00000
11           0.16992       0.15900          0.00000        0.29248
12           0.00708       0.00729          0.06665        0.08228
13           0.05103       0.02383          0.00000        0.29248
14           0.05103       0.02383          0.29248        0.00000
15           0.34326       0.08582          0.00000        0.00000

Source: Garcia-Cortes, Genet Sel Evol 2015

Pedigree relationships: A
• Systematic “tabular” rules to compute any $A_{ij}$ (Emik & Terrill 1949)
• The whole array of $A_{ij}$ is arranged in a matrix $\mathbf{A}$.
• $\mathbf{A}^{-1}$ is very sparse and easy to create and manipulate (Henderson 1976)
• Extraordinary development of whole-pedigree methods in livestock genetics
  • E.g. computing inbreeding for 15 generations including $10^6$ sheep takes minutes

Pedigree relationships: A
• Early uses of markers employed them to infer pedigrees or relationships
  • Gather markers, then reconstruct pedigrees, then construct A
• In conservation genetics, molecular markers have often been used to estimate pedigree relationships
  • Either estimates of $A_{xy}$, or estimates of « the most likely relation » (son-daughter, cousins, whatever)
  • Li and Horvitz 1953, Cockerham 1969, Ritland 1996, Caballero & Toro 2002, and many others
• With abundant marker data we can do better than this

The infinitesimal world was a happy one
• There was a mythical large, unrelated “base” population from which everything originated
• Chromosomes “did not exist” for BLUPers
• Genes were Identical By State ONLY if they were identical by descent (IBD)
• Relationships could be computed from pedigree
• Dominance and additivity could “easily” be considered
• Only inbreeding resulted in complications

The genomic world is a cruel one
• What is a genomic base population?
• Chromosomes exist and are finite
• Markers are “measures” of DNA that are “read” (and cost money!!)
• We estimate relationships
• …yet we do not agree on a common definition of relationships
  • (see recent reviews:
  • E A Thompson, doi:10.1534/genetics.112.148825,
  • Speed & Balding, doi:10.1038/nrg3821)

Realized relationships
• Identical By Descent relationships based on pedigree are average relationships which assume infinite loci.
• « Real » IBD relationships $\tilde{\mathbf{A}}$ are a bit different due to finite genome size (Hill and Weir, 2011)
• Therefore $\mathbf{A}$ is the expectation of realized relationships $\tilde{\mathbf{A}}$
• SNPs are more informative than $\mathbf{A}$.
• Two full sibs might have a correlation of 0.4 or 0.6
• You need many markers to get these « fine relationships »

Traditional Pedigree
[Figure: three-generation pedigree. Animal; Sire and Dam; Sire of Sire, Dam of Sire, Sire of Dam, Dam of Dam]
Interbull annual meeting 2007, VanRaden 2007

Genomic Pedigree
[Figure]

Haplotype Pedigree
atagatcgatcg
ctgtagcgatcg
ctgtagcttagg
agatctagatcg
agggcgcgcagt
ctgtctagatcg
cgatctagatcg
atgtcgcgcagt
cggtagatcagt
agagatcgcagt
agagatcgatct
atgtcgctcacg
atggcgcgaacg
ctatcgctcagg

Genotype Pedigree
Count number of second allele
121101011110
111211120200
101121101111
122221121111
101101111102
011111012011
121120011010
0 = homozygous for first allele (alphabetically)
1 = heterozygous
2 = homozygous for second allele (alphabetically)

Comparison of expected and observed variances: relationship/sharing
4401 full sib pairs, 400-800 markers

         Expected   Observed
Mean     0.5        0.498
SD       0.039      0.036
Range               0.37 - 0.63

Source: Visscher et al.
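The 0/1/2 coding rule of the Genotype Pedigree slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four haplotypes below are made up (they are not the ones on the slide), loci are assumed biallelic, and the helper names `alleles_per_locus` and `gene_content` are hypothetical.

```python
def alleles_per_locus(haplotypes):
    """For each locus, return the sorted list of alleles observed in the sample."""
    loci = []
    for column in zip(*haplotypes):          # one column = one locus
        observed = sorted(set(column))
        if len(observed) > 2:
            raise ValueError("this sketch assumes biallelic loci")
        loci.append(observed)
    return loci


def gene_content(hap_a, hap_b, loci):
    """Code one diploid individual as the number of copies (0, 1, 2) of the
    alphabetically second allele at each locus."""
    codes = []
    for a, b, observed in zip(hap_a, hap_b, loci):
        second = observed[-1]                # alphabetically last allele
        codes.append((a == second) + (b == second))
    return codes


# Hypothetical haplotypes, four loci each (illustration only).
haps = ["agat", "ctca", "agca", "ctct"]
loci = alleles_per_locus(haps)

print(gene_content(haps[0], haps[1], loci))  # [1, 1, 1, 1]
print(gene_content(haps[2], haps[3], loci))  # [1, 1, 2, 1]
print(gene_content(haps[0], haps[2], loci))  # [0, 0, 1, 1]
```

At a locus that is monomorphic in the sample the "second allele" is not defined; the sketch then simply counts the only allele present, so such loci would normally be filtered out beforehand.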
Slide from W G Hill

Genomic relationships
Can be seen as
• Estimators of whole-genome IBD relationships
  • Ritland, 1996, VanRaden 2007 and many others
  • One such software from the pre-SNP era is SPAGeDi (Hardy and Vekemans)
• Relationships at the QTL loci
  • e.g. Nejati-Javaremi et al., 1997
• Both at the same time
• The estimator of VanRaden 2007, 2008 (== Yang et al. 2010) has received great attention

Genomic relationships
• In practice, most of these estimators behave very similarly to each other
• First, because they involve some form of distance across genotypes, and most distances are very similar
• Second, because in a covariance/BLUP/REML/linear world they are often mathematically equivalent

IBS and IBD
• IBS at markers ($\tilde{r}_{ij}$) is a frequently used estimator of realized IBD ($\tilde{A}_{ij}$)
• Individuals can be identical by IBD or by IBS at the founders:
  $\tilde{r}_{ij} = \tilde{A}_{ij} + (2 - \tilde{A}_{ij})(p^2 + q^2)$
• Thus, IBS is biased upwards with respect to IBD.
• This has given rise to a bunch of estimators, with a common problem: where to get p from.
• For a detailed account, see Toro et al. (2011, Genet Sel Evol)

The quantitative genetics of markers
• Consider gene content coding {AA, Aa, aa} as $m = \{0, 1, 2\}$
• Cockerham, 1969:
  • For two individuals, the covariance of their gene contents is $Cov(m_i, m_j) = \tilde{A}_{ij}\, 2pq$
• In other words, two related individuals will show similar genotypes at the markers
• This leads to the method of covariances of VanRaden (2008)

VanRaden’s “first G”
$\mathbf{G} = \frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{2\sum p_i q_i}$
• $\mathbf{M}$: genotypes {0, 1, 2}
• $-2\mathbf{P}$: shifted to refer to the average of a population with allele frequencies p
• $2\sum p_i q_i$: scaled to refer to the genetic variance of a population with allele frequencies p
• If base allelic frequencies are used, G is an unbiased and efficient estimator of IBD realized relationships

Some properties of G
$\mathbf{G} = \frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{2\sum p_i q_i}$
• If p are computed from the sample
• In HWE & Linkage Equilibrium
  • Average of Diag(G) = 1
  • Average(G) = 0
• With average inbreeding F
  • Average of Diag(G) = 1 + F

Genotype   Frequency
AA         $q^2 + pqF$
Aa         $2pq(1 - F)$
aa         $p^2 + pqF$

Some intriguing properties of G
• If p are computed from the data
  • This implies that E(Breeding Values) = 0
• Positive and negative inbreeding
  • Some individuals are more heterozygous than the average of the population (OK, no biological problem)
• Positive and negative genomic relationships
  • This implies that individuals i and j are more distinct than an average pair of individuals in the data
  • Fixing negative estimates of relationships to 0 is wrong practice

Not positive definite
• Strandén & Christensen (2011) showed that if the p’s are averages across the sample then G is not positive definite (has no inverse)
• We could use BLUP equations with non-inverted G (Henderson, 1984) => see exercises
• Instead, we use $\mathbf{G} = 0.99\,\frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{2\sum p_i q_i} + 0.01\,\mathbf{I}$ or something similar
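The “first G” and the properties listed for it can be checked numerically. The sketch below is not taken from the talk: the genotypes are simulated in Hardy-Weinberg and linkage equilibrium with made-up allele frequencies, the function name `vanraden_g` is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def vanraden_g(M, p=None):
    """G = (M - 2P)(M - 2P)' / (2 * sum(p_i * q_i)) for an
    individuals-by-markers matrix M of 0/1/2 gene contents."""
    M = np.asarray(M, dtype=float)
    if p is None:
        p = M.mean(axis=0) / 2.0          # allele frequencies from the sample
    Z = M - 2.0 * p                       # shift each marker column by 2 * p_i
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


# Simulated data (assumption): 20 unrelated individuals, 500 independent markers.
rng = np.random.default_rng(1)
p_true = rng.uniform(0.2, 0.8, size=500)
M = rng.binomial(2, p_true, size=(20, 500))

G = vanraden_g(M)                         # p computed from the sample

print("average of Diag(G):", np.trace(G) / len(G))       # close to 1
print("average of G:      ", G.mean())                   # zero up to rounding
print("rank of G:         ", np.linalg.matrix_rank(G))   # less than 20: singular

# Blending with the identity, as on the slide, gives an invertible matrix.
G_blend = 0.99 * G + 0.01 * np.eye(len(G))
print("smallest eigenvalue of blended G:", np.linalg.eigvalsh(G_blend).min())
```

Passing base-population frequencies through the `p` argument instead of leaving it empty removes the exact zero average and changes the reference point of the relationships, which is the compatibility issue the talk returns to later.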
[Slide shows the first page of a paper:]
Volume 41, Number 1, January–February 2001. PERSPECTIVES
What If We Knew All the Genes for a Quantitative Trait in Hybrid Crops?
Rex Bernardo*

ABSTRACT
Plant genomics programs are expected to decipher the sequence and function of genes controlling important traits. Most of the important traits in crops are quantitative and are controlled jointly by many loci. What if we knew all the genes for a quantitative trait in hybrid crops? Will genomics enhance hybrid crop breeding, which currently involves selection on the basis of phenotypes rather than gene information? With maize (Zea mays L.) as a model species, I found through computer simulation that gene information is most useful in selection when few loci (e.g., 10) control the trait. With many loci (≥50), the least squares estimates of gene effects become imprecise. Gene information consequently improves selection efficiency among hybrids by only 10% or less, and actually becomes detrimental to selection as more loci become known. Increasing the population size and trait heritability to improve the estimates of gene effects also improves phenotypic selection, leaving little room for improvement of selection efficiency via gene information. The typical reductionist approach in genomics therefore has limited potential for enhancing selection for quantitative traits in hybrid crops.

… “cherry-pick” as many desirable genes as possible into one single-cross hybrid. It becomes increasingly difficult to accumulate all the desirable genes into one hybrid if the inbreds differ at an increasingly large number of loci. Consequently, the effects of the individual genes need to be quantified for the information to be useful in selection (Kennedy et al., 1992). In other words, a maize breeder would need to know how many grams per kilogram of oil each gene for kernel oil contributes.

Selection in hybrid crops, such as maize, oilseed rape (Brassica napus L.), hybrid rice (Oryza sativa L.), rye (Secale cereale L.), sorghum (Sorghum bicolor L.
Moench), sugar beet (Beta vulgaris L.), and sunflower (Helianthus annuus L.), is performed among testcrosses of recombinant inbreds and among hybrids (Fehr, 1987, p. 2, 5–6). Best linear unbiased prediction on the basis of trait phenotypes (T-BLUP; Henderson, 1985) is particularly useful for selecting improved single-cross hybrids (Bernardo, 1996). Selection, however, can be on the basis of both trait values and known genes (via trait and gene best linear unbiased prediction, i.e., TGBLUP) if some of the genes are known, or on gene information alone (via standard multiple regression) if all the genes are known (Kennedy et al., 1992).

Breeders have successfully improved crops despite not knowing the genes affecting quantitative traits. The numbers of genes controlling quantitative traits in different crops are yet unknown, although rough esti-

Details
• Instead of trying to estimate QTL effects, we could use identity by state relationships at the QTL loci

IBS relationships at the QTL
[Slide shows a scanned, partly cropped page of a paper:]

$TA_l = 2 \times \frac{\sum_{i=1}^{2}\sum_{j=1}^{2} I_{ij}}{4} = \frac{1}{2}\sum_{i=1}^{2}\sum_{j=1}^{2} I_{ij}$   (1)

where $I_{ij}$ is the identity of the $i$th allele of the first individual with the $j$th allele of the second ($I_{ij}$ takes the value of 1 if the two alleles are identical and zero if they are not); TA is the total allelic relationship. The coefficient of 2 emphasizes that TA is twice the coefficient of relationship (Malécot, 1948) and is analogous to the numerator relationship (Wright, 1922). Total allelic identity of individuals x and y averaged over L loci affecting the trait is then

$TA_{xy} = \frac{\sum_{l=1}^{L} TA_l}{L} = \frac{\sum_{l=1}^{L}\sum_{i=1}^{2}\sum_{j=1}^{2} I_{lij}}{2L}$   (2)

As an example, consider the following two individuals, Individual 1 and Individual 2 […]

TA matrix for six individuals:

      1     2     3     4     5     6
1    1.8    .8   1.2   1.6   1.2   1.6
2     .8   1.4   1.0   1.2   1.2   1.2
3    1.2   1.0   1.2   1.2   1.0   1.2
4    1.6   1.2   1.2   1.8   1.4   1.8
5    1.2   1.2   1.0   1.4   1.4   1.4
6    1.6   1.2   1.2   1.8   1.4   1.8

The elements of TA matrix follow directly from application of Equation 2 to the information on genotype at the five loci. Although individuals 3 […] are all progeny of a single pair of parents, hence full sibs, their total allelic relationship ranges from 1.[…] to 1.8, in contrast to the value of .5 for all progeny estimated from pedigree information. Their parents have also a total allelic relationship of .8 with each other as compared with zero relationship in the pedigree method. Total allelic relationship would [not] change even if there were no pedigree relations among these six individuals. There are two principal differences between the method of total allelic relationship and the stand[…]

• This relationship can also be obtained in a mathematical form without counting as (Toro et al. 2011):
  $\tilde{r}_{ij} = m_i m_j - m_i - m_j + 2$

IBS relationships at the markers
• These relationships are twice probabilities, and hence oscillate between 0 and 2 (no negative values)
• Because we do not know all QTLs, we use dense markers instead
Hence:
• $\mathbf{G}_{IBS}$ is a genomic relationship matrix based on Identity By State at the markers
• And $\mathbf{G}_{IBS}$ is also a biased estimate of realized IBD relationships $\tilde{\mathbf{A}}$ (as we saw before)

Estimation of marker effects
• If every marker is a QTL, a way to estimate marker effects $\mathbf{a}$ is to use a Bayesian method called RRBLUP, SNPBLUP, “Ridge”…
  $\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{M}\mathbf{a} + \mathbf{e}$
  or, taking $\mathbf{M} - 2\mathbf{P} = \mathbf{Z}$,
  $\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a} + \mathbf{e}$

From marker effects to covariances
• Under reasonable assumptions,
• $Var(\mathbf{a}) = \mathbf{I}\,\frac{\text{genetic variance}}{2\sum p_i q_i} = \mathbf{I}\sigma_a^2$
• Define genetic value as $\mathbf{u} = \mathbf{Z}\mathbf{a}$
• Covariance of genetic values is $Cov(\mathbf{u}) = \mathbf{Z}\mathbf{Z}'\,\frac{\text{genetic variance}}{2\sum p_i q_i}$
So that the relationship matrix for this model is again
$\mathbf{G} = \frac{\mathbf{Z}\mathbf{Z}'}{2\sum p_i q_i}$

Recapitulation
• By three ways (estimation of realized relationships, IBS at the QTL, RRBLUP) we arrive at similar or identical models, and in particular at
  $\mathbf{G} = \frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{2\sum p_i q_i}$
• This is because the same concepts are used over and over
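• But are these GBLUPs really the same?

GBLUP == RRBLUP
• The equivalence is tersely shown in VanRaden 2008 and fully shown in Strandén and Garrick 2009 (JDS)
• More equivalences are shown in Strandén and Christensen 2011 (GSE)
• Shifting of genotypes in $\mathbf{Z} = \mathbf{M} - 2\mathbf{P}$
  • Irrelevant if there is an overall mean or fixed effect in the model
  • EBVs are shifted by a constant
• Scaling $\mathbf{G}$ by $2\sum p_i q_i$ and the marker effect prior variance $\sigma_a^2$ must be equivalent, i.e. $\sigma_a^2 = \frac{\text{genetic variance}}{2\sum p_i q_i}$
  • Often, this is not done because variances are estimated by, e.g., REML, which explains the minor differences

GBLUP == RRBLUP
• We can jump from GBLUP to RRBLUP
  $\hat{\mathbf{u}} = \mathbf{Z}\hat{\mathbf{a}}$
  $\hat{\mathbf{a}} = \frac{1}{2\sum p_i q_i}\mathbf{Z}'\mathbf{G}^{-1}\hat{\mathbf{u}}$

GBLUP == GBLUP
• For all matrices of the kind $\mathbf{G} = \frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{2\sum p_i q_i}$
• Changing allele frequencies in $\mathbf{P}$ shifts EBVs by a constant
• Changing allele frequencies in $\frac{1}{2\sum p_i q_i}$ “scales” $\mathbf{G}$
• But we can compensate through a change in the “genetic variance”
  • E.g. $\begin{pmatrix} 1.1 & 0.55 \\ 0.55 & 1.1 \end{pmatrix} 10 = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix} 11$
• So, if variances are estimated by REML or « corrected » according to $2\sum p_i q_i$, results should be identical: see exercises

GBLUP == GBLUP with IBS
• In fact, $\mathbf{G}_{IBS} = \frac{(\mathbf{M} - 2\mathbf{P})(\mathbf{M} - 2\mathbf{P})'}{n} + \mathbf{1}\mathbf{1}'$, with $n$ the number of markers
  • With $\mathbf{P}$ containing 0.5
• So, $\mathbf{G}_{IBS}$ is a “VanRaden style” G matrix
• Again, using the right variance components we get the same EBVs

OK, so what should I use?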
• Does not matter much, all models are equivalent
  • If using REML or Bayesian methods you get the variance components right
  • If you use pre-estimated variance components you want to use comparable variances
  • This is a bit tricky but in most cases the “default” G works just fine
• Comparing variance components and $h^2$ across G’s gets tricky
  • E.g. $\mathbf{G}_{IBS}$ overestimates $h^2$ with respect to A
  • See Legarra 2016, TPB, for these aspects

Compatibility of marker and pedigree relationships
• Populations evolve with time, but genotypes came years after pedigree recording started
• Genomic Predictions are shifted from Pedigree Predictions
• Compatibility is achieved if both relationships refer to the same genetic base:
  • Same average BV at the base
  • Same genetic variance at the base
• This is an area of quite active work for SSGBLUP

Finally, the Single Step GBLUP
• You want to combine G in a part of the population and A in all the population
• But in fact, A contains information about the “likely” genotypes of animals that have not been genotyped (e.g., the daughter of an animal “AA” will receive an “A” allele)

Covariances of all individuals
Legarra et al. 2009; Aguilar et al., 2010; Christensen & Lund, 2010
Let $\mathbf{u}_1$ be the non genotyped and $\mathbf{u}_2$ the genotyped individuals, and let
$\mathbf{A} = \begin{pmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{pmatrix}$
Then
$Var\begin{pmatrix} \mathbf{u}_1 \\ \mathbf{u}_2 \end{pmatrix} = \mathbf{H} = \begin{pmatrix} \mathbf{H}_{11} & \mathbf{H}_{12} \\ \mathbf{H}_{21} & \mathbf{H}_{22} \end{pmatrix} = \begin{pmatrix} \mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} + \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G} \\ \mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G} \end{pmatrix}$

Covariances of all individuals
• $\mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21}$: this is the variance of prediction of genotypes from genotyped to non-genotyped
• $\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21}$: this is the error in the prediction
• $\mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G}$: the prediction « generates » a covariance
• $\mathbf{G}$ comes from genotypes

• Incredibly, $\mathbf{H}^{-1}$ is very simple:
  $\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{pmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1} \end{pmatrix}$
  • $\mathbf{A}^{-1}$: inverse of the regular pedigree relationship matrix
  • $\mathbf{G}^{-1}$: correcting for genomic relationships…
  • $-\mathbf{A}_{22}^{-1}$: …and avoiding « double counting »

Fun
• Relationships across technically unrelated populations

[Figure: the seven dairy sheep breeds. Lacaune, Manech Tête Rousse, Latxa Cara Negra - Euskadi, Basco Bearnaise, Manech Tête Noire, Latxa Cara Negra - Navarre, Latxa Cara Rubia]

PCA
• First component distinguishes Lacaune from the rest
• Two Lacaune sub-populations corresponding to two AI centers (that do not exchange)
• LCR-MTR overlap (recent exchanges of rams)
• LCNNAF between MTN and LCNEUS (exchanges but less frequent)
• BB isolated

[Figure: PCA of 7 dairy sheep breeds. Axes: PC 1, PC 2. Groups: BB, LCN-EUS, MTR, LCR, LCN-NA, MTN, Lacaune]

As a conclusion
• Markers have changed our way of thinking about relationships
• We can copy many, but not all, concepts from pedigree relationships to genomic relationships
  • E.g. relationships are not bound to probabilities
• Most derivations are plain transpositions from classic papers
• Most information is “out there” but reading and linking the different pieces of information takes time
• Take your data set and have fun

More info:
• Financing through INRA SelGen metaprogram
• Notes in my web page
• UGA Ignacy Misztal’s course (http://nce.ads.uga.edu )
• and… http://icms.org.uk/workshops/statistical
  In particular, see David Balding talk

Aguilar, I., I. Misztal, D. L. Johnson, A. Legarra, S. Tsuruta et al., 2010 Hot topic: a unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J Dairy Sci 93: 743-752.
Aguilar, I., I. Misztal, A. Legarra and S. Tsuruta, 2011 Efficient computations of genomic relationship matrix and other matrices used in the single-step evaluation. Journal of Animal Breeding and Genetics 128: 422-428.
Emik, L. O., and C. E. Terrill, 1949 Systematic procedures for calculating inbreeding coefficients. J Hered 40: 51-55.
Li, C. C., and D. G. Horvitz, 1953 Some methods of estimating the inbreeding coefficient. Am J Hum Genet 5: 107-117.
Cockerham, C. C., 1969 Variance of gene frequencies. Evolution 23: 72-84.
Ritland, K., 1996 Estimators for pairwise relatedness and individual inbreeding coefficients. Genetical Research 67: 175-185.
Caballero, A., and M. A. Toro, 2002 Analysis of genetic diversity for the management of conserved subdivided populations. Conservation Genetics 3: 289.
VanRaden, P. M., 2008 Efficient methods to compute genomic predictions. J Dairy Sci 91: 4414-4423.
Hill, W. G., and B. S. Weir, 2011 Variation in actual relationship as a consequence of Mendelian sampling and linkage. Genet Res (Camb): 1-18.
Yang, J., B. Benyamin, B. P. McEvoy, S. Gordon, A. K. Henders et al., 2010 Common SNPs explain a large proportion of the heritability for human height. Nat Genet 42: 565-569.
VanRaden, P. M., C. P. V. Tassell, G. R. Wiggans, T. S. Sonstegard, R. D. Schnabel et al., 2009 Invited review: reliability of genomic predictions for North American Holstein bulls. J Dairy Sci 92: 16-24.
Legarra, A., I. Aguilar and I. Misztal, 2009 A relationship matrix including full pedigree and genomic information. J Dairy Sci 92: 4656-4663.
Christensen, O. F., and M. S. Lund, 2010 Genomic prediction when some animals are not genotyped. Genet Sel Evol 42: 2.
Tanner, M. A., and W. H. Wong, 1987 The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528-540.
Gengler, N., P. Mayeres and M. Szydlowski, 2007 A simple method to approximate gene content in large pedigree populations: application to the myostatin gene in dual-purpose Belgian Blue cattle. animal 1: 21-28.
McPeek, M. S., X. Wu and C. Ober, 2004 Best linear unbiased allele-frequency estimation in complex pedigrees. Biometrics 60: 359-367.
Christensen, O. F., 2012 Compatibility of pedigree-based and marker-based relationship matrices for single-step genetic evaluation. Genetics Selection Evolution 44: 37.
Lourenco, D., I. Misztal, S. Tsuruta, I. Aguilar, T. Lawlor et al., 2014 Are evaluations on young genotyped animals benefiting from the past generations? Journal of Dairy Science 97: 3930-3942.
Chen, C., I. Misztal, I. Aguilar, S. Tsuruta, S. Aggrey et al., 2011 Genome-wide marker-assisted selection combining all pedigree phenotypic information with genotypic data in one step: An example using broiler chickens. Journal of Animal Science 89: 23-28.
Vitezica, Z., I. Aguilar, I. Misztal and A. Legarra, 2011 Bias in genomic predictions for populations under selection. Genetics Research: In press.
Jacquard, A., 1970 Genetic structures of populations. Structures genetiques des populations.
General review: Legarra, A., O. F. Christensen, I. Aguilar and I. Misztal, 2014 Single Step, A General Approach For Genomic Selection. Livestock Science.

Thanks
• Leopoldo Alfonso, Catherine Bastien for the invitation
• My long-time collaborators and discussers in this, I Misztal, I Aguilar, M A Toro, L A Garcia-Cortes, L Varona, Z G Vitezica, O F Christensen, P M VanRaden, J M Elsen, A Ricard and many others
• Financing through INRA SelGen metaprogram