QTL analysis in Mouse Crosses
How many genes? Mapping mouse traits
Lecture 1, Statistics 246
January 20, 2004

Aim of today’s and Thursday’s lecture
To review basic Mendelian genetics and the basics of recombination, and then to see how genes contributing to qualitative and quantitative traits are mapped using data from crosses of inbred strains of mice.

2 Genetic background
2.1 Loci and markers
We need the following notions from Mendelian genetics: autosomes, sex chromosomes, genotypes, phenotypes, loci, alleles, homozygous, heterozygous, dominant, recessive, (fully) inbred, markers.

Our markers are microsatellites
A:  ..AGTCCACACACACACACATGT..
    ..AGTCCACACACACACACATGT..
H:  ..AGTCCACACACACACACATGT..
    ..AGTCCACACACACACACACACACATGT..
B:  ..AGTCCACACACACACACACACACATGT..
    ..AGTCCACACACACACACACACACATGT..
PCR and electrophoresis
Desirable: to call the genotypes (A, H, or B) automatically.
Problems: stutters and noise, variability of the patterns, etc.

Similarity sorting
[Figure: gel traces, unsorted and sorted; correlation matrix]
This is a useful technique to enhance the presentation of gel traces and assist manual examination.

Genotype calling
This is a statistical pattern recognition problem:
• Fit mixture models
• Discriminant analysis
[Figure: genotype classes A, H, B]

JnoTyper: software implementation in Java

2.2 Inbred strains and their crosses
Our main players are C57BL/6 (BL for black, abbreviated B6), a robust strain that has been around for about 90 years, and the NOD (non-obese diabetic) mouse, a delicate, diabetes-prone strain discovered in 1990. Coat colours: agouti is standard, B6 is black, NOD is albino (i.e. white).

Normal (wild-type) mouse coat: color = agouti, a grizzled color of fur resulting from the barring of each hair in several alternating dark and light bands.
Black mouse: C57BL/6 strain
Albino mouse: non-obese diabetic (NOD) strain

Coat color loci in mice
Four main loci: A, B, C and D
• Locus A: agouti
• Locus B: black
• Locus C (known as Tyr): albinism
• Locus D: dilution gene

Alleles at the Agouti (A) locus
• Ay, lethal dominant yellow
• Avy, viable yellow
• Aw, white-bellied agouti
• A, agouti or wild type
• At, black and tan
• Am, mottled agouti
• a, non-agouti
• ae, extreme non-agouti
A and a are a dominant/recessive allele pair.

Alleles at the Albino (C) locus
• C, full color gene
• cch, chinchilla
• ch, himalayan
• c, albino gene
C and c are a dominant/recessive pair of alleles.

Alleles at A and C interact (called epistasis in genetics)
Here x stands for either allele at the locus.
• If the mouse is aaCx, it is not agouti and not albino (in our case, a black mouse).
• If the mouse is AxCx, it is agouti and not albino.
• If the mouse is xxcc, it is albino whatever the alleles at the agouti locus are, because they are irrelevant.

Crosses
We will denote the NOD mice by A and the B6 mice by B. The same notation will denote the two homozygotes at a polymorphic marker. Two main crosses interest us, both following the first filial generation, or F1, which we denote by A × B → H. Here H denotes heterozygote, which is the case for our F1s. The backcross (BC) is arrived at via H × B → BC, or a variant, while the F2 intercross is given by H × H → F2.

2.3 Data
• An F2 intercross was performed starting with C57BL/6 and NOD parental lines.
• We have 133 female mice at the F2 generation. Only females were used because males fight, and this influences other (quantitative blood) phenotypes of interest.
• They were genotyped at 153 microsatellite markers spanning all 19 autosomes and the X chromosome. We also have coat color and a few white blood cell phenotypes.
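
As a small illustration of the crosses defined in section 2.2, the sketch below enumerates the equally likely allele pairs passed on by two parents and tabulates the offspring genotypes. It is not part of the lecture: the names GAMETES, classify and offspring_distribution are made up for this example, and it assumes a single locus at which a heterozygote transmits each of its two alleles with probability 1/2.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Alleles that each parental genotype can transmit (assumed notation:
# A = NOD homozygote, B = B6 homozygote, H = heterozygote).
GAMETES = {"A": "A", "B": "B", "H": "AB"}


def classify(allele1, allele2):
    """Return the genotype code (A, H or B) for a pair of alleles."""
    return allele1 if allele1 == allele2 else "H"


def offspring_distribution(parent1, parent2):
    """Genotype probabilities among the offspring of parent1 x parent2."""
    pairs = list(product(GAMETES[parent1], GAMETES[parent2]))
    counts = Counter(classify(x, y) for x, y in pairs)
    return {g: Fraction(n, len(pairs)) for g, n in counts.items()}


print("F1 (A x B):", offspring_distribution("A", "B"))
print("BC (H x B):", offspring_distribution("H", "B"))
print("F2 (H x H):", offspring_distribution("H", "H"))
```

Under these assumptions every F1 is H, the backcross gives H and B in equal proportions, and the F2 intercross gives A : H : B in the proportions 1 : 2 : 1.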

A small portion of the data (beginning)
Header: 133 = #individuals, 153 = #loci, 7 = #traits. Each later line starts with a marker name; the next column is the data from mouse 1, and so on.

data type f2 intercross
133 153 7
*D10M106 BBABBBBBHBBABBBBAABBBB-BABABABBABBBBBBBBBBBBB-BBBBBBABBAAABBBBBBBBB-HBABABB-ABBBBAB-BBBABABBB-BBBBBCBCBCBHBBBHCBBHBHHBCBBBBBBBHBHBHCH
*D10M14 AHHBHHHAHHABAHBHHBABAA-BHHAHAAHAHHHHHBAHHHAHHBAHBHABBBHAAHHHHAHBHHH--HHHHAHAHAHBHHHAHHABAHHHAHHHAHBHBBHHHAAHAAHHBHHAHAH-HBABAHAHBHHAH
*D10M163 AHBBHHB-HHAB-HBH-BAHBA-BHHAHAAHAAHHAHBAHHHHHHHAHBHABBBHAAHBBHAHBBHHBBHBHHHH-HBHHHHHAHHAHABH-AHHHAHBABBBBAAAHAAHHBHHAHHHBHBAHAHABHHHAH
*D10M20 HCBHAHBAHHAHAHBABAHHBH-HHHABAAHAAABHHBH-HAHBHAAHBCABABHAAABBHAHBHHBBBHBHAHH-HBHHHABAHHHHAHHBAAHHABHABHBHAAHBHAAHBHAAHBHBHBHHHHABAHAAH

D10M106 = a marker on chr 10 defined by MIT
Incompleteness code: C = B or H, D = A or H, - = missing

A small portion of the raw data (end)
*DXM210 --HAAAAHHHAHAAAAAHAH-HAHHAHAHHH
*DXM222 HAAHHAA-HHAAHAAHHAAAHH-HAAHAAHHH
*DXM39 HAAAHAA-HHAH-AAA-HAAHH-HAAAAHHHHH
*trait1 11231123122211113111111    (coat color code)
*trait2 8.90472059883773 8.62455170973674 8.454    (WBC traits, *trait2 to *trait7)
*trait3 16.0508869012649 16.1080453151048 16.16
*trait4 16.0138456295845 16.0907244541622 16.12
*trait5 13.8887610197039 14.1288603771646 13.98
*trait6 7.1066061377273 6.52209279817015 6.6333
*trait7 8.65927129000923 8.41405243249672 8.158

[Figure: snapshot of the genotype data]

Error detection (see later)
Using the LOD_error statistic, which is based on close recombination events that indicate the possible presence of genotyping errors.
calc.genoprob, calc.errorlod, plot.errorlod

2.4 Mendel’s laws for one locus
We can (and should) check Mendel with data from our 133 offspring at each of our 153 loci. For example, at D7Mit126, we have 24 A, 29 B and 67 H genotypes, adding to 120, indicating 13 incomplete or missing genotypes. What do we expect according to Mendel? How would we test whether the data agree with our expectations?

2.5 Mendel’s law for 2 loci
Mendel inferred from his data on peas the independent segregation of different factors. Here we check that this holds for our two coat color loci, although it does not hold in general. We then go on to understand the more general situation.

Mating & coat color outcomes in this cross
Parental lines: C57BL/6 males, black (aaBBCC) × NOD females, albino (AABBcc)
F1: all agouti (aABBCc)
F2: agouti 9 : black 3 : albino 4
We need to check these last proportions following Mendel’s reasoning.
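
The 9 : 3 : 4 proportions can be checked by brute force. The sketch below is an illustration rather than part of the lecture: it assumes the F1 mice are Aa at the agouti locus and Cc at the albino locus (BB is fixed and ignored), that the two loci segregate independently, and it applies the epistasis rules for A and C given earlier. The function names coat_color and f2_phenotype_counts are hypothetical.

```python
from collections import Counter
from itertools import product


def coat_color(agouti_alleles, albino_alleles):
    """Coat color from the two alleles at locus A and the two at locus C."""
    if set(albino_alleles) == {"c"}:
        return "albino"  # cc masks whatever is at the agouti locus
    if "A" in agouti_alleles:
        return "agouti"  # A is dominant over a
    return "black"       # aa with at least one C


def f2_phenotype_counts():
    """Count coat colors over the 16 equally likely F1 gamete pairs."""
    # Assumption: each F1 parent (AaCc) produces four equally likely gametes.
    gametes = list(product("Aa", "Cc"))
    counts = Counter()
    for (a1, c1), (a2, c2) in product(gametes, repeat=2):
        counts[coat_color(a1 + a2, c1 + c2)] += 1
    return counts


counts = f2_phenotype_counts()
print(dict(counts))
```

Of the 16 gamete pairs, 4 are cc (albino); of the remaining 12, 9 carry at least one A (agouti) and 3 are aa (black).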

[Figure: Punnett square depicting F1 parental allele combinations passed on to F2 offspring]

It’s not always like that
2-locus genotypes at D12Mit51 (rows) and D12Mit132 (columns):

          A     H     B   Total
A        26    10     0     36
H        10    46     5     61
B         0     9    23     32
Total    36    65    28    129

If we pool A and H, we do not get 9:3:3:1.
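
One common way to compare such counts with Mendelian proportions is Pearson's chi-square goodness-of-fit statistic. The lecture does not say which test it has in mind, so the choice of test, the function name chi_square and the variable names below are assumptions made for this sketch. It uses the D7Mit126 counts from section 2.4 against 1 : 2 : 1, and the two-locus table for the chromosome 12 markers, pooled over A and H, against 9 : 3 : 3 : 1.

```python
def chi_square(observed, ratio):
    """Pearson goodness-of-fit statistic for counts against an expected ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# One locus: D7Mit126 counts in the order A, H, B, tested against 1:2:1.
one_locus = chi_square([24, 67, 29], [1, 2, 1])

# Two loci: 3 x 3 table from the text, genotype order A, H, B on both axes.
table = [[26, 10, 0], [10, 46, 5], [0, 9, 23]]
pooled = [
    sum(table[i][j] for i in (0, 1) for j in (0, 1)),  # A or H at both markers
    sum(table[i][2] for i in (0, 1)),                  # A or H at one, B at the other
    sum(table[2][j] for j in (0, 1)),                  # B at one, A or H at the other
    table[2][2],                                       # B at both markers
]
two_locus = chi_square(pooled, [9, 3, 3, 1])

# 5% critical values of the chi-square distribution: 5.991 (2 df), 7.815 (3 df).
print(f"one locus: {one_locus:.2f} (compare with 5.991)")
print(f"two loci, pooled counts {pooled}: {two_locus:.2f} (compare with 7.815)")
```

Under these assumptions the one-locus statistic is about 2.05, below the 5% critical value, so the D7Mit126 counts are consistent with 1 : 2 : 1. The pooled two-locus statistic is far above its critical value, which is the quantitative form of "we do not get 9:3:3:1".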