Biomath 207B / Biostat 237 / HG 207B
Lecture 2 - Segregation Analysis
1/15/04

If a trait is inherited in a recessive manner with complete penetrance, then the child of two unaffected carriers has a 25% chance of having the trait. Example: brown and blue eyes. Blue is recessive to brown.
[Pedigree figure; genotypes shown: B-b, B-B, B-b, B-b, B-b, b-b]

If a trait is inherited in a dominant manner with complete penetrance, then the child of an affected heterozygous parent and an unaffected parent has a 50:50 chance of having the trait. Dominant and recessive are relative terms. Note that the brown eye gene is dominant to the blue eye gene.
[Pedigree figure; genotypes shown: B-b, b-b, b-b, b-b, B-b, B-b]

Simple segregation patterns:
(1) recessive pattern of inheritance
(2) disease is fully penetrant
(3) let D denote the disease allele
(4) p(d)=0.7, p(D)=0.3
(5) collect all families with exactly two children
What distribution of affecteds do we expect to see under Hardy-Weinberg equilibrium and random mating?
Percentage of all two-child families:
Unaffected parents: 75.1% (no affected children), 6.6% (one affected child), 1.1% (two affected children)
One affected parent (male or female): 10.7% (no affected children), 3.8% (one affected child), 1.9% (two affected children)
Two affected parents: 0.8% (two affected children)

A disease that is inherited in a dominant manner has a different pattern:
(1) disease is fully penetrant
(2) let D denote the disease allele
(3) p(d)=0.9, p(D)=0.1
(4) collect all families with exactly two children
(5) Hardy-Weinberg equilibrium and random mating
Percentage of all two-child families:
Unaffected parents: 65.6% (no affected children)
One affected parent (male or female): 7.3% (no affected children), 14.6% (one affected child), 8.9% (two affected children)
Two affected parents: 0.2% (no affected children), 1.2% (one affected child), 2.2% (two affected children)

Why is it not always this simple?
- More than one gene can be involved, and environment influences disease risk. That is, there are diseases with reduced penetrance and sporadic cases of disease.
- Can't sample everyone. Complete ascertainment is impractical for rare diseases.
- Family structures will vary. Parents may not be available.

Most common diseases are examples of multi-factorial, or complex, traits. Complex trait: more than one gene, or gene(s) and environment, play a role.
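Returning to the simple, fully penetrant examples above: the two-child family percentages quoted for the recessive and dominant cases can be reproduced by enumerating parental mating types under Hardy-Weinberg equilibrium and random mating. The following is a minimal sketch rather than anything from the lecture itself; the function names (`genotype_freqs`, `family_distribution`, and so on) are hypothetical, and it assumes that children are independent given the parents' genotypes.

```python
from itertools import product
from math import comb


def genotype_freqs(p_D):
    """Hardy-Weinberg genotype frequencies for disease-allele frequency p_D."""
    p_d = 1.0 - p_D
    return {"DD": p_D ** 2, "Dd": 2 * p_D * p_d, "dd": p_d ** 2}


def prob_transmit_D(genotype):
    """Mendelian probability that a parent with this genotype passes on D."""
    return genotype.count("D") / 2


def child_genotype_probs(father, mother):
    """Genotype distribution of one child given both parental genotypes."""
    f, m = prob_transmit_D(father), prob_transmit_D(mother)
    return {"DD": f * m, "Dd": f * (1 - m) + (1 - f) * m, "dd": (1 - f) * (1 - m)}


def family_distribution(p_D, affected_genotypes, n_children=2):
    """Return {(affected parents, affected children): probability}.

    Assumes full penetrance: a person is affected exactly when their
    genotype is in `affected_genotypes`.
    """
    freqs = genotype_freqs(p_D)
    table = {}
    for father, mother in product(freqs, repeat=2):
        mating = freqs[father] * freqs[mother]  # random mating
        n_parents = (father in affected_genotypes) + (mother in affected_genotypes)
        child = child_genotype_probs(father, mother)
        risk = sum(child[g] for g in affected_genotypes)  # per-child risk
        for k in range(n_children + 1):
            p = mating * comb(n_children, k) * risk ** k * (1 - risk) ** (n_children - k)
            table[(n_parents, k)] = table.get((n_parents, k), 0.0) + p
    return table


# Recessive example: p(D)=0.3, only DD affected.
recessive = family_distribution(0.3, {"DD"})
# Dominant example: p(D)=0.1, DD and Dd affected.
dominant = family_distribution(0.1, {"DD", "Dd"})
```

Here `recessive[(0, 1)]`, for instance, is the share of families with unaffected parents and exactly one affected child; multiplying each entry by 100 and rounding to one decimal gives the percentages listed for the two examples.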
[Diagrams: two genes, additive effects; gene-environment additive effects; gene-environment interactions. Each panel shows gene(s) and/or environment acting on TRAIT.]
In a multi-factorial disease, genes that play a role in susceptibility to the disease may be neither necessary nor sufficient for disease expression, so Mendelian inheritance patterns are not observed. Mendelian inheritance patterns are the transmission patterns expected when a single gene obeys Mendel's law of segregation of alleles at a single locus, e.g. dominant or recessive.

Quantifying the Familial Aggregation
The first step of any genetic study is to ask one of the following related questions:
(1) Does the disease aggregate in families (more than expected by chance)?
(2) Are family members' trait values more likely to be similar than the trait values of two randomly selected people from the same population?
One popular method of answering these questions is to calculate the recurrence risk to relatives.
Recurrence risk to relatives of type R:
$$\lambda_R = \frac{\Pr(\text{relative of type } R \text{ affected} \mid \text{subject affected})}{\Pr(\text{random person affected})}$$
The larger $\lambda_R$, the greater the degree of aggregation in families, but a large value of $\lambda_R$ does not prove that the disease has a genetic basis. Aggregation could be due to common environmental factors. Prob(random person affected) = population prevalence. The observation that $\lambda_{\text{offspring}} > \lambda_{\text{siblings}}$ argues against a purely Mendelian trait.

Segregation Analysis
• Goal of segregation analysis: to identify the specific genetic mechanisms that may control traits associated with disease.
• Segregation analysis is used to determine if the observed familial aggregation has a genetic basis. In addition, it is used to estimate the relative effects of genetic and environmental factors shared among family members. It can also be used to test for gene-environment interactions.
• See Jarvik (1998) Complex Segregation Analyses: Uses and Limitations, AJHG 63:942-946, for more information.
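As an illustration of the recurrence risk ratio defined above, the sketch below computes the offspring and sibling ratios implied by a single-locus model, using the fact that the ratio equals Pr(both relatives affected) divided by the squared prevalence. It is only a model-based calculation under stated assumptions (Hardy-Weinberg equilibrium, random mating, Mendelian transmission, phenotypes independent given genotypes); the function names and the `penetrance` dictionary layout are hypothetical, and in practice these ratios are estimated from family data.

```python
from itertools import product


def hwe(p_D):
    """Hardy-Weinberg genotype frequencies for disease-allele frequency p_D."""
    p_d = 1.0 - p_D
    return {"DD": p_D ** 2, "Dd": 2 * p_D * p_d, "dd": p_d ** 2}


def child_probs(father, mother):
    """Mendelian genotype distribution of a child given parental genotypes."""
    f, m = father.count("D") / 2, mother.count("D") / 2
    return {"DD": f * m, "Dd": f * (1 - m) + (1 - f) * m, "dd": (1 - f) * (1 - m)}


def recurrence_risk_ratios(p_D, penetrance):
    """Return (prevalence, lambda_offspring, lambda_sibling).

    penetrance maps genotype -> P(affected | genotype).
    """
    freqs = hwe(p_D)
    prevalence = sum(freqs[g] * penetrance[g] for g in freqs)
    joint_parent_child = 0.0  # P(father affected and child affected)
    joint_sib_pair = 0.0      # P(two sibs both affected)
    for father, mother in product(freqs, repeat=2):
        mating = freqs[father] * freqs[mother]
        child = child_probs(father, mother)
        risk = sum(child[g] * penetrance[g] for g in child)
        joint_parent_child += mating * penetrance[father] * risk
        joint_sib_pair += mating * risk * risk  # sibs independent given parents
    lam_offspring = joint_parent_child / prevalence ** 2
    lam_sibling = joint_sib_pair / prevalence ** 2
    return prevalence, lam_offspring, lam_sibling


# Fully penetrant recessive disease with p(D)=0.3.
K, lam_o, lam_s = recurrence_risk_ratios(0.3, {"DD": 1.0, "Dd": 0.0, "dd": 0.0})
# Fully penetrant dominant disease with p(D)=0.1.
K_dom, lam_o_dom, lam_s_dom = recurrence_risk_ratios(0.1, {"DD": 1.0, "Dd": 1.0, "dd": 0.0})
```

In both single-gene examples the sibling ratio comes out at least as large as the offspring ratio, which is consistent with the remark that observing the opposite ordering argues against a purely Mendelian trait.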
Why go to all the trouble of segregation analysis?
(1) Calculating relative risks isn't good enough. Familial aggregation can be due to shared environment. High sibling relative risk ($\lambda_s$) or heritability does not prove that the disease has a genetic component (see, for example, Guo AJHG 1998). Segregation analysis increases the confidence that genes play a role in the susceptibility to the disease.
(2) The most powerful forms of linkage analysis require accurate knowledge of the inheritance mode and penetrance of the disease. Genetic model based gene mapping (classical linkage analysis) requires that the inheritance mode (dominant, recessive, etc.) for the major gene and the probability of disease given a particular genotype be known. If the genetic model is wrong, the false negative rate is increased (Martinez M. et al., Gen. Epi., 1989, 6:253-8).

Segregation analysis is a more difficult but more informative method of gathering evidence for substantial genetic involvement in susceptibility to the trait.
Familial aggregation can be due to:
(1) Shared genes
    (a) one gene acting in one of the following ways (let D be the disease risk gene):
        (i) dominant manner: P(disease|DD) = P(disease|Dd) > P(disease|dd)
        (ii) recessive manner: P(disease|DD) > P(disease|Dd) = P(disease|dd)
        (iii) additive manner: P(disease|Dd) = 1/2 (P(disease|DD) + P(disease|dd))
        (iv) codominant manner: P(disease|DD) > P(disease|Dd) > P(disease|dd)
    (b) several genes
    (c) many genes (polygene model)
(2) Shared environment
(3) A combination of both genes and environment that can include interactions between the genes and the environment.

Segregation analysis involves:
(1) Specifying a mathematical model (similar to genetic model based linkage analysis).
(2) Computing the likelihood of the observed data under the model.
(3) Comparing various models to find the "best" fitting model.
Note that with segregation analysis, the best model is the best model among those examined.
For example, if a polygene model is not among the choices for a disease caused by many loci, the best fitting model might end up being a major gene model with spurious environmental factors. Environmental factors must be identified and carefully documented for accurate results. The method of finding the families (ascertainment) should be included in the model.

The overall approach to segregation analysis is:
• Step (1): Specify null and alternative hypotheses. For example: no aggregation in families at all (sporadic model) for the null hypothesis and Mendelian inheritance (single gene) as the alternative hypothesis.
• Step (2): Translate into mathematical models.
• Step (3): Compute the maximum likelihood of the data and maximum likelihood estimates for the parameters in the mathematical model for both hypotheses.
• Step (4): If the null model is a special case of the alternative (nested models), then compare the models using likelihood ratio tests (LRT) to find the hypothesis that is best supported by the data (hierarchical testing). If the models are not nested, then use the AIC criterion or simulation to test.
• Repeat these steps for as many hypotheses as you wish to test.

Comparing models:
(1) If the null hypothesis is a special case of the alternative model, then one way to compare them is a likelihood ratio test. For example, a dominant Mendelian model is a restriction of the codominant Mendelian model. Under this null hypothesis, 2 ln(LR), twice the log of the ratio of the maximized likelihoods (alternative over null), has approximately a chi-square distribution. The degrees of freedom are determined by the difference in the number of parameters. When comparing the dominant and codominant Mendelian models, there is one degree of freedom. The chi-square statistic has an associated p-value. If it is less than 0.05, then reject the null hypothesis in favor of the alternative. If it is greater than 0.05, then do not reject the null hypothesis.
(2) If the null hypothesis is not a special case of the alternative, use the AIC criterion to compare.
For example, a dominant Mendelian model under HWE is not a special case of a recessive Mendelian model where we do not assume HWE. The model with the lowest AIC corresponds to the accepted hypothesis.

Converting hypotheses into models:
• The mathematical models have three parts:
• The penetrance - a measure of how likely the trait value is, given that a person is in a particular risk group. In genetics, the most relevant parameters are m=gaa, gAa, and gAA, representing the phenotype value for the aa group and the change in value for the Aa or AA group.
• The prior - the probability that a founder belongs to a particular risk group (under HWE determined by qA).
• The transmission probabilities - the probability that an offspring belongs to a particular risk group given their parents' risk groups. The relevant parameters are taa, tAa, and tAA. For example, taa = P(A transmitted from an aa parent) and taa taa = P(AA offspring from aa and aa parents). Under Mendelian inheritance, taa=0, tAa=1/2, and tAA=1.

With this information, determine the likelihood of the trait gene location given the marker genotypes for the family members. (Sum over the possible genotypes for the trait.)
$$\Pr(\text{family } r) = \sum_{G_1} \cdots \sum_{G_n} \prod_{i} \mathrm{Pen}(X_i \mid G_i) \prod_{j} \mathrm{Prior}(G_j) \prod_{\{k,l,m\}} \mathrm{Trans}(G_k \mid G_l, G_m)$$
Here i runs over all n family members, j over the founders, and {k, l, m} over child-parent triples (child k with parents l and m).
Each family is independent, so the individual family likelihoods multiply.

Ousiotype model: Define tAA, tAa, taa to be the probability of "transmitting" type A to an offspring depending on the parental type. These transmission probabilities determine Pr(gi|gfi,gmi), where gi is person i's ousiotype, gfi is his father's ousiotype, and gmi is his mother's ousiotype.
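To make the family likelihood concrete, here is a minimal sketch for a single nuclear family (two founders and their children), written in terms of the penetrance, prior, and transmission parts described above. It is not the lecture's own implementation. It assumes a binary affected/unaffected trait, Hardy-Weinberg priors determined by an allele frequency `q_A`, and it moves the sum over each child's type inside the product, which is valid because children are independent given their parents' types. All function and parameter names are hypothetical.

```python
from itertools import product

TYPES = ("aa", "Aa", "AA")


def prior(q_A):
    """Founder type probabilities under HWE with allele frequency q_A."""
    return {"aa": (1 - q_A) ** 2, "Aa": 2 * q_A * (1 - q_A), "AA": q_A ** 2}


def transmission(child, father, mother, tau):
    """P(child type | parental types), given tau[type] = P(parent transmits A)."""
    tf, tm = tau[father], tau[mother]
    if child == "AA":
        return tf * tm
    if child == "aa":
        return (1 - tf) * (1 - tm)
    return tf * (1 - tm) + (1 - tf) * tm


def nuclear_family_likelihood(father_aff, mother_aff, children_aff,
                              penetrance, q_A, tau):
    """Likelihood of affection statuses (True/False) in one nuclear family."""
    def pen(affected, g):
        return penetrance[g] if affected else 1 - penetrance[g]

    pri = prior(q_A)
    total = 0.0
    for gf, gm in product(TYPES, repeat=2):  # sum over founder types
        term = pri[gf] * pen(father_aff, gf) * pri[gm] * pen(mother_aff, gm)
        for affected in children_aff:        # sum over each child's type
            term *= sum(transmission(gc, gf, gm, tau) * pen(affected, gc)
                        for gc in TYPES)
        total += term
    return total


mendelian_tau = {"aa": 0.0, "Aa": 0.5, "AA": 1.0}
recessive_pen = {"aa": 0.0, "Aa": 0.0, "AA": 1.0}  # A plays the role of the disease allele

# Unaffected parents with two unaffected children, q_A = 0.3.
lik = nuclear_family_likelihood(False, False, [False, False],
                                recessive_pen, 0.3, mendelian_tau)
```

Because families are independent, a data set's likelihood would be the product of such terms (or the sum of their logarithms) across families, which is the quantity maximized when models are compared.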
P(offspring ousiotype | parents' ousiotypes)

father's    mother's    offspring's ousiotype (gi):
ousiotype   ousiotype
(gfi)       (gmi)       aa                  Aa                            AA
aa          aa          (1-taa)^2           2 taa(1-taa)                  taa^2
aa          Aa          (1-taa)(1-tAa)      taa(1-tAa) + (1-taa)tAa       taa tAa
aa          AA          (1-taa)(1-tAA)      taa(1-tAA) + (1-taa)tAA       taa tAA
Aa          aa          (1-tAa)(1-taa)      tAa(1-taa) + (1-tAa)taa       tAa taa
Aa          Aa          (1-tAa)^2           2 tAa(1-tAa)                  tAa^2
Aa          AA          (1-tAa)(1-tAA)      tAa(1-tAA) + (1-tAa)tAA       tAa tAA
AA          aa          (1-tAA)(1-taa)      tAA(1-taa) + (1-tAA)taa       tAA taa
AA          Aa          (1-tAA)(1-tAa)      tAA(1-tAa) + (1-tAA)tAa       tAA tAa
AA          AA          (1-tAA)^2           2 tAA(1-tAA)                  tAA^2
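The table above can be generated mechanically from the three transmission parameters. The sketch below does so and checks two properties: every row sums to 1, and the Mendelian values (taa=0, tAa=1/2, tAA=1) recover the familiar 1/4 : 1/2 : 1/4 split for an Aa x Aa mating. The names `offspring_distribution` and `transmission_table` are hypothetical, and the non-Mendelian values in `general_tau` are arbitrary numbers chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

TYPES = ("aa", "Aa", "AA")


def offspring_distribution(father, mother, tau):
    """One table row: P(offspring type | father's type, mother's type)."""
    tf, tm = tau[father], tau[mother]
    return {
        "aa": (1 - tf) * (1 - tm),
        "Aa": tf * (1 - tm) + (1 - tf) * tm,
        "AA": tf * tm,
    }


def transmission_table(tau):
    """All nine rows, keyed by (father's type, mother's type)."""
    return {(f, m): offspring_distribution(f, m, tau)
            for f, m in product(TYPES, repeat=2)}


# Mendelian transmission, kept exact with Fraction.
mendelian_tau = {"aa": Fraction(0), "Aa": Fraction(1, 2), "AA": Fraction(1)}
mendelian_table = transmission_table(mendelian_tau)

# Arbitrary non-Mendelian values, for illustration only.
general_tau = {"aa": Fraction(1, 10), "Aa": Fraction(2, 5), "AA": Fraction(9, 10)}
general_table = transmission_table(general_tau)

# Each row is a probability distribution over the offspring's type.
rows_sum_to_one = all(sum(row.values()) == 1 for row in general_table.values())
```

Using `Fraction` keeps the arithmetic exact, so the row-sum check is an equality rather than a floating-point tolerance. When father and mother share a type, the middle entry reduces to 2 t (1 - t), as in the diagonal rows of the table.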