Lecture 7: Linkage Analysis I
Date: 9/16/02
Clean up segregation analysis

Steps of Segregation Analysis
1. Identify mating type(s) where the trait is expected to segregate in the offspring.
2. Sample families with the given mating type from the population.
3. Sample and score the children of sampled families.
4. Test H0: “expected segregation ratio” or estimate the segregation ratio.

But, you knew it wouldn’t always be that easy…
- Appropriate mating types may not be identifiable. Offspring of a Dr x Dr cross segregate, but this mating type is indistinguishable from DD x DD and others.
- An incompletely penetrant trait. Some appropriate mating types fail to be detected because the trait is invisible.
- The trait is rare and you need to enrich your sample for affecteds. You collect a nonrandom sample.

Segregation Analysis for Autosomal Recessive Genes

Mating Type   Genotype                Phenotype
              DD      Dr      rr      Dominant   Recessive
DD x DD       1       0       0       1          0
DD x Dr       0.5     0.5     0       1          0
Dr x Dr       0.25    0.5     0.25    0.75       0.25
DD x rr       0       1       0       1          0
Dr x rr       0       0.5     0.5     0.5        0.5
rr x rr       0       0       1       0          1

Avoid Contaminating Mating Types
- Controlled crosses.
- Select only the appropriate mating types by ascertaining them through their offspring.

Ascertainment Procedure
Definition: An ascertainment procedure is the way in which families come to be included in a study.
The ascertainment procedure may result in incomplete selection or over-selection:
- Incomplete selection: families that happen to have all normal offspring are excluded.
- Over-selection: families which don’t segregate are included.

Ascertainment Terminology I
Definition: Affected individuals that are identified independently of all other individuals are called probands.
Definition: Other affected individuals in a family with a proband are called secondary cases.
Definition: The ascertainment probability ($\pi$) is the probability that an affected individual in the population is identified as a proband.
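
The terminology above can be made concrete with a small simulation. This is only a sketch: it assumes, as in the definition of a proband, that each affected child is identified independently with probability $\pi$, and the function names `draw_probands` and `ascertained_fraction` are hypothetical, not from the lecture. Under that assumption a family with r affected children contains at least one proband with probability 1 - (1 - pi)^r, which the simulation should roughly reproduce.

```python
import random


def draw_probands(r, pi, rng):
    """Return (probands, secondary_cases) for a family with r affected children.

    Assumption: each affected child is independently a proband with probability pi.
    """
    probands = sum(1 for _ in range(r) if rng.random() < pi)
    if probands == 0:
        # No proband: the family never enters the study, so it contributes
        # no secondary cases either.
        return 0, 0
    return probands, r - probands


def ascertained_fraction(r, pi, n_families=100_000, seed=1):
    """Monte Carlo estimate of the chance that a family with r affecteds is ascertained."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_families) if draw_probands(r, pi, rng)[0] > 0)
    return hits / n_families


if __name__ == "__main__":
    pi = 0.3  # illustrative value only
    for r in (1, 2, 3):
        simulated = ascertained_fraction(r, pi)
        exact = 1 - (1 - pi) ** r
        print(f"r={r}: simulated {simulated:.3f}, exact {exact:.3f}")
```

Setting `pi` to 1 in this sketch makes every affected child a proband, so every family with an affected child is ascertained.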

Ascertainment Terminology II
Definition: Complete ascertainment, or more appropriately truncated ascertainment, is the case where $\pi = 1$, so all families with affected children are identified and all their affected children are probands. Note that this does not imply complete selection.
Definition: Single ascertainment is the situation where all ascertained families will have just one proband.

Ascertainment Bias: Sex-Based
Dizygotic twins: each twin is independently assigned sex, with equal probability of being male or female.

$$P(MM) = \left(\tfrac{1}{2}\right)^2 = \tfrac{1}{4}, \qquad P(MF \text{ or } FM) = 2\left(\tfrac{1}{2}\right)^2 = \tfrac{1}{2}, \qquad P(FF) = \left(\tfrac{1}{2}\right)^2 = \tfrac{1}{4}$$

$$P(\text{sister} \mid \geq \text{one male twin}) = \frac{P(FM \text{ or } MF)}{P(FM \text{ or } MF \text{ or } MM)} = \frac{1/2}{3/4} = \frac{2}{3}$$

$$P(\text{brother} \mid \geq \text{one male twin}) = \frac{1}{3}$$

Ascertainment Bias: Sex-Based (contd)

$$P(\text{has sister} \mid \text{sample male}) = \frac{P(\text{has sister AND sample male})}{P(\text{sample male})} = \frac{1/4}{1/2} = \frac{1}{2}$$

By randomly sampling males, we find a sister in only 1/2 of the cases, although 2/3 of twin pairs with at least one male include a sister.

Ascertainment Bias: Size-Based
Suppose a proportion $k_x$ of families have $x$ siblings, with mean and variance

$$\mu = \sum_x x k_x, \qquad \sigma^2 = \sum_x x^2 k_x - \mu^2$$

A random individual derives from a sibship of size $x$ with probability:

$$q_x = \frac{x k_x}{\sum_z z k_z} = \frac{x k_x}{\mu}$$

Ascertainment Bias: Size-Based (contd)
The expected number of sibs of a random individual is:

$$\sum_x q_x (x - 1) = \sum_x \frac{x k_x (x - 1)}{\mu} = \mu - 1 + \frac{\sigma^2}{\mu}$$

Thus, by sampling random individuals, we tend to sample more from large families and we increase the average number of sibs.

Ascertainment Bias: Genetic Samples
- In genetic samples, the ascertainment bias is trait-based.
- Enrich for a rare trait by sampling an individual who is affected and those who are related to him/her.
- Obviously, among those who are related, the probability of being affected is higher than in an unbiased sample from the population.

Family Ascertainment Probability
The probability that a family with $r$ affected offspring is ascertained:

$$p_r = 1 - (1 - \pi)^r$$

Complete ascertainment: $\pi = 1 \Rightarrow p_r = 1$
Small $\pi$: $p_r = 1 - (1 - \pi)^r \approx 1 - (1 - r\pi) = r\pi$

Truncated Ascertainment ($\pi = 1$)
Consider only families of size $s$.
Let random variable $X_i$ be the number of affected offspring in the $i$th family: $X_i \sim \mathrm{Bin}(s, 0.25)$.
In complete ascertainment, all families with $X_i > 0$ are included in the study.
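
As a numerical illustration of this selection rule, the following sketch simulates families of size s with X_i ~ Bin(s, 0.25), discards those with X_i = 0, and compares the proportion affected before and after truncation. The function name `simulate_truncation`, the number of families, and the seed are illustrative choices, not part of the lecture.

```python
import random


def simulate_truncation(s=5, p=0.25, n_families=50_000, seed=42):
    """Simulate complete (truncated) ascertainment for families of size s.

    Returns (proportion affected over all families,
             proportion affected over families with at least one affected child).
    """
    rng = random.Random(seed)
    # X_i ~ Bin(s, p): count affected children among s independent offspring.
    counts = [sum(rng.random() < p for _ in range(s)) for _ in range(n_families)]
    # Only families with X_i > 0 enter the study.
    kept = [x for x in counts if x > 0]
    overall = sum(counts) / (s * len(counts))
    truncated = sum(kept) / (s * len(kept))
    return overall, truncated


if __name__ == "__main__":
    overall, truncated = simulate_truncation()
    print(f"all families: {overall:.3f}")
    print(f"ascertained families only: {truncated:.3f}")
```

The proportion among ascertained families is expected to exceed 0.25, because families with no affected children are dropped from the denominator.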
We seek the distribution of $X_i \mid X_i > 0$.

Distribution of $X_i \mid X_i > 0$ (Truncated Binomial)
Call this random variable, on the new sample space {1, 2, …, s}, $Y_i$.

$$P(Y_i = r) = P(X_i = r \mid X_i > 0) = \frac{P(X_i = r, X_i > 0)}{P(X_i > 0)} = \frac{\binom{s}{r} p^r (1 - p)^{s - r}}{1 - (1 - p)^s}$$

Expected Segregation Ratio with Truncated Ascertainment
Let $p_t$ be the segregation ratio given the truncated ascertainment procedure.

$$E(p_t) = E\left(\frac{Y}{s}\right) = \frac{1}{s} \sum_{r=1}^{s} r \, \frac{\binom{s}{r} p^r (1 - p)^{s - r}}{1 - (1 - p)^s} = \frac{p}{1 - (1 - p)^s}$$

Biased! $E(p_t) > p$ for any finite family size $s$.

Example: Truncated Ascertainment

Number of    Number Affected
Probands     1      2      3      4      5      Total
1            140    80     35     4      0      259
2                   52     12     7      1      72
3                          7      0      0      7
4                                 2      0      2
5                                        0      0
Total        140    132    54     13     1      340

Expected under $p = 0.25$ with $s = 5$: $p_t = 0.33$
Observed: $\hat{p} = 623 / 1700 = 0.3665$

Truncated Ascertainment: Estimating p
- The expression for $p_t$ gives us a means of estimating $p$.
- We observe $p_t$, assume truncated ascertainment, and solve $p_t = p / (1 - (1 - p)^s)$ for $\hat{p}$ to estimate $p$.
- Indeed, since $\hat{p}$ is a function of $p_t$, we can use previous variance formula results to get an approximate variance for $\hat{p}$.
- Unfortunately, the equation is not analytically solvable.

Estimating p: EM Algorithm (E)
Incomplete data: the number of unascertained families with 0 affected offspring. Call this $U_i$ at iteration $i$.
Expectation step: assume $p_i$ and calculate, with $n_s$ the number of ascertained families of size $s$:

$$E(U_i) = P(0 \text{ affected}) \times E(\#\, Dr \times Dr \text{ mating types}) = (1 - p_i)^s \left[ n_s + E(U_i) \right]$$

$$E(U_i) = \frac{n_s (1 - p_i)^s}{1 - (1 - p_i)^s}$$

Estimating p: EM Algorithm (M)
Compute the maximum likelihood estimate $p_{i+1}$, with $a_r$ the number of ascertained families with $r$ affected offspring:

$$\hat{p}_{i+1} = \frac{\text{Observed number of affected offspring}}{\text{Total number of offspring, including those in the } U_i \text{ unascertained families}} = \frac{\sum_{r=1}^{s} r a_r}{s n_s + s U_i}$$

Incomplete Ascertainment: The Norm
Any time there are affecteds in the study that are NOT probands, the assumption of truncated ascertainment does not apply. Instead we have incomplete ascertainment.

The Proband Method
Use the proband to identify the mating type, but then leave it out of subsequent calculations. With $b_i$ probands and $r_i$ affected offspring in the $i$th family of size $s_i$:

$$\tilde{p} = \frac{\sum_{i=1}^{n} b_i (r_i - 1)}{\sum_{i=1}^{n} b_i (s_i - 1)}$$

The Proband Method: Estimating $\pi$
Again, use the proband only to identify the mating type. Count only the other siblings.
$$\tilde{\pi} = \frac{\sum_{i=1}^{n} b_i (b_i - 1)}{\sum_{i=1}^{n} b_i (r_i - 1)}$$

Example: Proband Method

Number     Number      Total         Affected      Proband
Affected   Probands    Siblings      Siblings      Siblings
1          140         (s-1)*140     0             0
2          80+2*52     (s-1)*184     (2-1)*184     2*52
:          :           :             :             :
Total                  1728          430           210

$$\tilde{p} = \frac{430}{1728} = 0.2488, \qquad \tilde{\pi} = \frac{210}{430} = 0.488$$

Singles Method
- A single is a proband who is the only proband in a family.
- Singles are not considered effective observations because they are observed only through their affected status.
- Let $d$ be the number of singles in a sample of $n$ families.

Singles Method Estimates

$$p = \frac{\sum_{i=1}^{n} r_i - d}{\sum_{i=1}^{n} s_i - d}, \qquad \pi = \frac{\sum_{i=1}^{n} b_i - d}{\sum_{i=1}^{n} r_i - d}$$

Singles Method Example
There are 259 singles.

$$p = \frac{623 - 259}{1700 - 259} = 0.2526, \qquad \pi = \frac{434 - 259}{623 - 259} = 0.481$$

Variance for Proband and Singles Method
The proband and singles methods both give estimators that are the quotient of two random variables. Approximate equations for the variance exist:

$$\mathrm{Var}\left(\frac{X}{Y}\right) \approx \left[E\left(\frac{X}{Y}\right)\right]^2 \left[ \frac{\mathrm{Var}(X)}{E(X)^2} + \frac{\mathrm{Var}(Y)}{E(Y)^2} - \frac{2\,\mathrm{Cov}(X, Y)}{E(X)E(Y)} \right]$$

$$E\left(\frac{X}{Y}\right) \approx \frac{E(X)}{E(Y)} - \frac{\mathrm{Cov}(X, Y)}{E(Y)^2} + \frac{E(X)\,\mathrm{Var}(Y)}{E(Y)^3}$$

Likelihood Method
P(offspring is proband) = $\pi p$.
P(family ascertained) = $1 - (1 - \pi p)^s$.
P(r affected and b probands | ascertained) then is:

$$P(X = r, B = b \mid B > 0; s, \pi, p) = \frac{P(B = b \mid X = r; \pi)\, P(X = r; s, p)}{P(B > 0; s, \pi, p)} = \frac{\binom{r}{b} \pi^b (1 - \pi)^{r - b} \binom{s}{r} p^r (1 - p)^{s - r}}{1 - (1 - \pi p)^s}$$

Likelihood Method (contd)
Each family is an independent observation (assuming they are not related).
Newton-Raphson multiple parameter update, with $S$ the score vector and $I$ the information matrix:

$$\begin{pmatrix} p_{m+1} \\ \pi_{m+1} \end{pmatrix} = \begin{pmatrix} p_m \\ \pi_m \end{pmatrix} + I(p_m, \pi_m)^{-1} S(p_m, \pi_m)$$

Likelihood Method: Testing Hypotheses
Nested models can be tested using the log likelihood ratio. Interesting hypotheses include:
- Complete ascertainment: $\pi = 1$
- Recessive inheritance: $p = 0.25$
- Complete ascertainment and recessive inheritance: $\pi = 1$ and $p = 0.25$

Likelihood Method: Sample Results

Model                        p       pi      chi-square   df   p-value
general                      0.25    0.48    0            0    -
complete ascertainment (ca)  0.31    1       11736        1    0
recessive                    0.25    0.47    0.04         1    0.84

Cannot reject the recessive disease hypothesis.
Can reject the complete ascertainment hypothesis.

Rejecting a Null Hypothesis
- Not a single locus.
- Ascertainment procedure.
- Selection.
- Environmental effects mimicking phenotype.
- Incomplete penetrance.
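
The likelihood method can be tried numerically on the 340 families of the truncated-ascertainment example table. The sketch below is not the lecture's procedure: it assumes families of size s = 5 (inferred from 1700 offspring in 340 families) and replaces the Newton-Raphson update with a coarse grid search. The dictionary `COUNTS` and the function names are hypothetical. It evaluates the log of P(X = r, B = b | B > 0) summed over families, then compares the general model with the recessive hypothesis p = 0.25 through the log likelihood ratio.

```python
from math import comb, log

S = 5  # family size; an assumption inferred from 1700 offspring / 340 families

# COUNTS[(r, b)] = number of families with r affected offspring and b probands,
# transcribed from the truncated-ascertainment example table.
COUNTS = {
    (1, 1): 140,
    (2, 1): 80, (2, 2): 52,
    (3, 1): 35, (3, 2): 12, (3, 3): 7,
    (4, 1): 4, (4, 2): 7, (4, 3): 0, (4, 4): 2,
    (5, 1): 0, (5, 2): 1, (5, 3): 0, (5, 4): 0, (5, 5): 0,
}


def log_likelihood(p, pi, counts=COUNTS, s=S):
    """Log likelihood of all families, each conditioned on having >= 1 proband."""
    log_ascertained = log(1.0 - (1.0 - pi * p) ** s)  # log P(B > 0)
    total = 0.0
    for (r, b), n in counts.items():
        if n == 0:
            continue
        # log P(B = b | X = r): binomial in the number of probands
        log_probands = log(comb(r, b)) + b * log(pi) + (r - b) * log(1.0 - pi)
        # log P(X = r): binomial in the number of affected offspring
        log_affected = log(comb(s, r)) + r * log(p) + (s - r) * log(1.0 - p)
        total += n * (log_probands + log_affected - log_ascertained)
    return total


def grid_mle(p_values, pi_values):
    """Return (max log likelihood, p, pi) over the supplied grid."""
    return max((log_likelihood(p, pi), p, pi) for p in p_values for pi in pi_values)


if __name__ == "__main__":
    grid = [i / 100 for i in range(1, 100)]  # 0.01, ..., 0.99
    ll_general, p_hat, pi_hat = grid_mle(grid, grid)
    ll_recessive, _, pi_rec = grid_mle([0.25], grid)  # p fixed at 0.25
    lr_statistic = 2.0 * (ll_general - ll_recessive)
    print("general model:", p_hat, pi_hat)
    print("recessive model: pi =", pi_rec, " LR statistic =", round(lr_statistic, 3))
```

The grid stops at 0.99 because the table contains affected offspring who are not probands, so the likelihood is zero at pi = 1 under this model. A finer grid or a proper optimizer would be needed for reportable estimates.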

More Complex Ascertainment Models
We have considered ascertainment procedures where the probability of ascertainment was of the form:

$$p_r = 1 - (1 - \pi)^r$$

Allowing $\pi$ to vary covers a wide number of cases, but not all. It still imposes a functional relationship.

Summary
- Ascertainment procedure and the impact of sampling.
- Segregation analysis when the ascertainment procedure is nonrandom. Specifically, recessive trait.
- Truncated ascertainment vs. incomplete ascertainment.
- Proband method; singles method; likelihood method.
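
As a companion to the EM slides for truncated ascertainment, the iteration can be sketched in a few lines. The function name `em_truncated`, the starting value, and the stopping rule are illustrative assumptions. The inputs are the totals from the truncated-ascertainment example (623 affected offspring in 340 ascertained families), taking the family size to be s = 5.

```python
def em_truncated(total_affected, n_families, s, p_start=0.5, tol=1e-10, max_iter=1000):
    """EM estimate of the segregation ratio p under truncated ascertainment.

    total_affected: affected offspring summed over the ascertained families
    n_families:     number of ascertained families, all assumed to be of size s
    """
    p = p_start
    for _ in range(max_iter):
        q0 = (1.0 - p) ** s  # P(0 affected) in a family of size s
        # E step: expected number of unascertained families with no affected offspring
        u = n_families * q0 / (1.0 - q0)
        # M step: affected offspring over all offspring, including imputed families
        p_new = total_affected / (s * (n_families + u))
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


if __name__ == "__main__":
    p_hat = em_truncated(total_affected=623, n_families=340, s=5)
    print("EM estimate of p:", round(p_hat, 4))
    # At convergence p_hat should satisfy p / (1 - (1 - p)^s) = 623 / 1700.
    print(round(p_hat / (1 - (1 - p_hat) ** 5), 4), round(623 / 1700, 4))
```

The fixed point of this iteration is the solution of the equation for the expected segregation ratio under truncated ascertainment, which is why EM is useful here even though that equation has no closed-form solution.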