Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
American Journal of Epidemiology Copyright O 1999 by The Johns Hopkins University School of Hygiene and Public Health AH rights reserved Vol. 150, No. 11 Printed In USA. Method to Detect Genotype-Environment Interactions for Quantitative Trait Loci in Association Studies Edwin J. C. G. van den O o r d u Khoury et al. {Am J Hum Genet 1988;42:89-95 and Am J Epidemiol 1993; 137:1241-50) presented an epidemiologic approach to examine genotype-environment interaction in situations where the disease is either present or absent. In this article, the author extends the approach of Khoury et al. to quantitative outcome variables. This extension is relevant for diseases that are extremes on a continuum or when continuous risk factors are studied. To account for a possible admixture of subgroups in the sample, tests for genotypeenvironment interaction are discussed for designs with parents as controls as well as without parents as controls. Assuming two environmental conditions, the author demonstrates how the power of these tests can be calculated and used to estimate the sample sizes needed to detect genotype-environment interaction in a variety of conditions. In addition, he analyzes simulated data to demonstrate the detection of different mechanisms of genotype-environment interaction and to study the effectiveness of this approach to identify the correct mechanism. Finally, extensions to multiple environmental conditions and designs with multiple subjects per family are discussed. Am J Epidemiol 1999; 150:1179-87 epidemiologic methods; genetics; relative risk; statistics Insight into the interplay between genes and environments is important to obtain a better understanding of disease mechanisms (1). A classical example is phenylketonuria (PKU), a metabolic disorder involving a rare gene (approximately 1 in 10,000 newborns) which gives rise to a mental handicap only when phenylaline is present in the diet. Because phenylaline is present in many foods, under normal circumstances all children with the genetic defect will get PKU retardation. For this reason the role of environmental factors was initially even overlooked (2). However, the recognition of the true mechanism brought along the insight that putting the children with the genetic defect on a special diet could prevent the adverse effects of PKU. PKU is an example of a genotype-environment interaction. Genotype-environment interaction refers to the differential effects of the same environment on individuals with different genotypes (or the differential effects of different environments on individuals with the same genotype). The environmental component does not necessarily have to involve harmful events but can, for instance, also be a demographic characteristic such as when the expression of a gene is age dependent. To examine genotype-environment interaction Khoury et al. (3, 4) advocated an epidemiologic approach. By using epidemiologic definitions these authors showed how the risk of disease is a function of the gene effects, the frequency of the exposure to the environmental agent, and the strength of the interaction between the genotype and the agent. In addition, they delineated six biologically plausible patterns of genotype-interactions and proposed empirical tests to discriminate among these patterns. In subsequent work, their approach was extended by demonstrating how to estimate the sample sizes needed to detect the interactions and by developing strategies to avoid spurious findings caused by an admixture of subgroups from the population (5-7). The approach developed by Khoury et al. (3, 4) applies to qualitative dependent variables where the disease is either present or absent. However, sometimes the dependent variable is continuous. Examples are hypertension or the commoner varieties of psychopathology where the disease may be an extreme on a continuum. In other situations, the dependent variable may be a continuous trait associated with a chronic disease, such as cholesterol is a quantitative Received for publication July 16,1998, and accepted for publication March 1, 1999. Abbreviations: AIC, Akaike's Information Criterion; ANOVA, analysis of variance; PKU, phenylketonuria; QTL, quantitative trait locus; SEM, structural equation modeling; TDT, transmission-disequilibrium test; TLI, Tucker-Lewis Index. 1 1nstitute of Psychiatry, London, England. 2 Department of Child and Adolescent Studies, Utrecht University, Utrecht The Netherlands. 1179 1180 vandenOord risk factor for heart disease. To examine genotypeenvironment involving quantitative variables we extended the structural equation modeling approach outlined by Fulker et al. (8) and Van den Oord (Institute of Psychiatry, unpublished manuscript). Some advantages of this approach are: the model parameters have a clear interpretation and known relation to the additive genetic and dominance variance; it is not necessary to assume equal variances in all genotype groups; and structural equation modeling is very flexible so that it can easily be extended to include multiple subjects per family or to study different mechanisms of genotype environment interaction. Traditionally, qualitative traits are studied using casecontrol designs. In this design, it is tested whether subjects with the high-risk allele of genotype are more frequently a case than a control. When the "dependent variable" is a continuous trait instead of a dichotomous (case or control) variable, the corresponding test consists of examining whether groups of subjects with different genotypes have different trait means. In addition to Type I errors, significant differences in means can be caused by: 1) the marker allele plays a causal role in determining the trait, 2) the marker allele is in disequilibrium with an unobserved allele that itself is causal (disequilibrium can occur when the marker and causal gene He very close to each other on the genome so that they are transmitted together for many generations), or 3) population admixture which means that there are subgroups, e.g., ethnic groups, that differ from each other with respect to marker frequencies as well as trait means. Significant differences due to population admixture are a problem because in this case the marker is unrelated to a causal gene and not informative about the differential expression of a gene in different environments. Therefore, if population admixture is present it needs to be taken into account to avoid inaccurate conclusions. In the context of qualitative traits this can be done by using the transmission-disequilibrium test (TDT) (9, 10), which tests whether within parental mating type groups the high risk allele is transmitted more frequently to cases than other parental alleles. Its quantitative counterpart consists of testing whether within parental mating type groups the transmission of the high-risk allele is associated with higher means than the transmission of the other parental alleles. Although population admixture may affect the differences in allele frequencies or means between parental mating types, within parental mating type groups there can be no preferential transmission of an allele or differences in means unless the marker is related to the causal gene. To account for a possible population admixture, we present here tests for genotype-environment interaction with parents as controls as well as without parents as controls. After tests with and without parents are introduced, we show how the power can be calculated and used to estimate the sample sizes needed to detect genotypeenvironment interactions in a variety of conditions. Next, we analyze simulated data to demonstrate the detection of different mechanisms of genotypeenvironment interaction and to study the effectiveness of our approach to identify the correct mechanism. TESTS FOR GENOTYPE-ENVIRONMENT INTERACTION IN SITUATIONS WITH AND WITHOUT PARENTS AS CONTROLS To derive tests for genotype-environment interaction that involve quantitative traits we used structural equation modeling (SEM). To a certain extent, the approach resembles an analysis of variance (ANOVA) where the genotypes and environmental conditions are the factors and the quantitative trait the dependent variable. The analysis consists of modeling the means for each combination of groups as a function of the genetic effects, environmental effects, and interaction effects. Significance tests can be performed by examining the deterioration in the fit when the interaction parameters are fixed to zero. For the test without parents as controls, let o be a constant that is the same for all groups in the analysis, gj the effect of genotype j , ek the effect of environmental condition k, ge-lk the interaction effect between the genotype and environment, and riJk a residual score of subject i with genotype j in environmental condition k consisting of the effects of other unlinked loci and environmental factors. The trait score xijk of subject i with genotype; in environmental condition k can then be written as: = o + j + ek + gejk + rijk (1) We assume two alleles, A, and A2, and two environmental conditions. Such restrictions are not required but are made to simplify the discussion of the tests. For a biallelic locus, there are three possible genotypes. To model the effects of these genotypes, we used the parameterization discussed by Falconer (11) in which a score a is assigned to A,A, subjects, d to A,A2 subjects, and -a to A2A2 subjects. Thus, the genotypic values are expressed on a scale where both homozygotes have additive genetic scores that differ a from the center of the scale. If d is zero, there is no dominance so that the heterozygote is exactly in the middle of the two homozygotes. If d is positive, there is dominance of the A, allele over the A2 allele and the value of the heterozygote is relatively closer to the A, A, genotype. If d is negative, there is dominance of the A2 allele Am J Epidemiol Vol. 150, No. 11, 1999 Genotype-Environment Interactions for Quantitative Trait Loci 1181 over the A, allele and the value of the heterozygote is relatively closer to the A2A2 genotype. The environmental effect was modeled by specifying parameter e in environmental condition 2 that represented the increase or decrease of the mean in condition 2 compared to condition 1. Finally, interaction effects were modeled by assigning an interaction effect / to A,A, subjects in environmental condition 2, an interaction effect n to AjA2 subjects in environmental condition 2, and an interaction effect -i to A2A2 subjects in environmental condition 2. Thus, i represented the difference between the value of a homozygote in condition 2 compared to the value of a homozygote in condition 1 after the environmental effect e has been taken into account. Parameter n represents different amounts of dominance in both conditions. The expected means in the six groups are summarized in table 1. Table 1 shows that the means are modeled with six parameters o, a, d, e, i, and n. With two environmental conditions and three genotypes, there are also six observed means. The model is therefore the saturated model and an unique value can be estimated for each parameter. To test for interaction effects, parameters, i, n, or both, can be fixed to zero. The test statistic equals the difference between the %2 of the restricted model and the x2 of the unrestricted model, and is X2 distributed with the number of fixed interaction parameters as the degrees of freedom. By constraining the variances across the three groups, o^ = a2, such a x2 difference test can also be used to examine whether the group variances have to be modeled by specifying a separate variance parameter <Xj in each group or a single variance parameter a 2 for all groups. In the presence of population admixture, an unbiased test for causal effects of the locus can be obtained TABLE 1. Expected group means for the tests wttti and without parents as controls Mean Without parents as controls Genotype A,A, Genotype A,A, Genotype A^A, With parents as controls Mating type A,A,; A.A, Genotype A,A, Genotype A^A, Mating type A ^ A,A, Genotype A,A, Genotype AjA, Mating type A,A,; A.A, Genotype A,A, Genotype A,Aj Genotype AJA, Environ mental corKfitton 1 Environmental condition 2 o+a o+ d o- a o+d+e+ n o-a+e-l o, + a o, + d o, + a + 0+ i o, + d + o + n ot + d o, + d + 6 + n o,-a ot- a o, + a o, + d o, + a +0+1 o, + d+0+n o,-a + e - i o,-a Am J Epidemiol Vol. 150, No. 11, 1999 o+a+ e+/ +0-1 by examining whether within parental mating type groups subjects with different genotypes have different means. For such TDT-like tests, only subjects with at least one heterozygote parent can be used because otherwise all subjects within that group have the same genotype (e.g., A(AI;A1A1 matings always yield A!A, genotypes). We want to note that this test is not identical to an ANOVA by parental mating type. This is because the test allows subjects in different parental mating type groups to have different means, and constrains only the means of genotypes within the same parental mating type group to be equal. The mean in each mating group equals ot = o + mh in which o is the overall constant and ml the effect of admixture on the mean in mating type group /. The admixture effect m depends on the differences in allele frequencies and trait means among subgroups in the sample. It does not affect the differences between genotype groups within mating type groups because in epidemiologic samples (also in selected samples when there are no causal genetic effects) the probabilities of getting different genotypes of offspring are identical in all subgroups. To account for admixture, formula 1 can therefore be rewritten as: (2) rijk An additional assumption of this model is that the interaction effect gejk is independent of the admixture effect ml and identical in each subgroup in the population. With two alleles, there are three mating types that involve at least one heterozygote parent (/ = 1..3). The expected means of the possible genotypes wihin each of the three mating type groups are shown in the second part of table 1. The whole test specifies eight parameters for the means: ou o2, o3, a, d, e, i, and n. To test for interaction effects, parameters /, n, or both, can again be fixed to zero and the x2 difference test with 1 or 2 degrees of freedom can be used to examine whether the interaction effects are significant. This difference test can also be used to examine how the variances in the seven groups have to be modeled. Example scripts for the tests discussed in this section are available from the author. The scripts involve the commonly used and excellent SEM program Mx (12), which can be downloaded from Internet site http://opal.vcu.edu/html/mx/mxhomepage.html free of charge. gejk POWER TO DETECT GENOTYPE-ENVIRONMENT INTERACTION In this section, we demonstrate how the power of these tests can be calculated and used to estimate the sample sizes needed to detect genotype-environment 1182 van den Oord interactions. The sample sizes needed to reject the incorrect hypothesis assuming no interaction were computed for 80 percent power with alpha (Type 1 error) is 5 percent. The power was calculated using the approach discussed by Satorra and Saris (13). Under the incorrect hypothesis, the test statistic is distributed approximately as a non-central X2 distribution with non-centrality parameter X. To determine X, we first computed the expected means and variances using the chosen values for the parameters a, d, e, i, n, and X2Next, a model was fitted to these input statistics that assumed no interaction (i and n were fixed to zero). The x 2 obtained from this analysis was used as the value of A, and the number of fixed interaction parameters (two) as the degrees of freedom of the noncentral x 2 distribution. Different patterns of genotype-environment interaction For the power calculations, we used the six types of biologically plausible patterns of genotype-environment interaction discussed by Khoury et al. (3, 4). An elaborate discussion and examples can be found in their articles and we will confine ourselves here to a short description. Because these authors only addressed situations without dominance, we added two subtypes that involved parameters d and n. The different types of interaction are depicted in figure 1 and the specification using our model parameters is shown in table 2. The first type is shown in figure la and shows the situation where the trait is increased only in the presence of both the high-risk genotype and high-risk environmental condition. Thus, the corresponding row in table 2 indicates that parameters a and e are zero and that i is larger than zero, implying higher means for genotypes with the A! allele in condition 2. Figure lb is a variation on interaction type 1 in which the interaction effect n for the heterozygote is also larger than zero. In this specific case, we assumed that the interaction effect for the heterozygote equaled the interaction effect for the homozygotes {n = i) so that in condition 2 the mean for A,A2 subjects equaled the mean of the AjA! subjects. The second interaction type is depicted in figure lc and applies when the trait is increased by the high-risk genotype in the high-risk environmental condition, as well as by exposure alone. Thus, parameter e is now larger than zero and figure lc shows that in comparison to figure la the means are higher for all genotypes in condition 2. Type 3 interaction is shown in figure Id and involves the situation where the trait is increased by the genotype alone but not by the exposure alone. In this situation, parameter a is larger than zero so that in com- parison to the previous figures there are now also differences between the genotypes in condition 1. A variation including additional dominance effects is depicted in figure le. Figure le shows that because of the dominance effects the mean of the heterozygotes has shifted d from the middle of the two homozygotes in both environmental conditions. Because we assumed that dominance parameter d equaled additive genetic parameter a, in condition 1 the means of the heterozygotes and A ^ homozygotes are the same. The fourth type of interaction is shown in figure If and assumes an effect of the environment alone, an effect of the genotype alone, and an interaction effect consisting of higher means for the high-risk genotype in the high-risk environment Thus, in terms of the model parameters shown in table 2 this implies that e, a, and i are all larger than zero. A special case of type 4 interaction is the multiplicative model where the interaction effect equals the product of genetic and environmental effect (i = e x a). Type 5 interaction assumes that the genotype is protective or decreases the mean in the absence of the exposure but becomes deleterious or increases the mean in the presence of the exposure. This implies that the value of a is larger than zero and that interaction parameter i is negative. An example is given in figure lg. Because the interaction parameter is larger than the additive genetic values (i = -2 x a) in figure lg, the lines cross and the ranking of the means are reversed in both conditions. Finally, the sixth type is identical to type 5 interaction except that there is also an effect of exposure alone. This situation is depicted in figure lh. A comparison with figure lg shows that due to the environmental effect e all genotypes have higher means in condition 2. Power calculations To compute the input statistics, we assigned a value of five to all non-zero parameters that were specified for a given interaction type (see table 2). For interaction types 5 and 6, we assumed that i was -10. The proportion of subjects in condition 1, b, and the frequency of the A, allele, p, were both assumed to be 0.5. For all interaction types, we fixed in the situation without parents as controls, the proportion of explained variance o^,, to 10 percent of the total variance so that the variance in the genotype groups equaled a 2 = 10 x <£xpl — a2^,,. The variances in the situation with parents as controls were assumed to be equal for all seven groups and identical to the variance in the situation without parents as controls: Strictly speaking, the power to detect interaction depends on the proportion of interaction variance and not the type of interaction. However, given the same amount of explained variance, some types will be easAm J Epidemiol Vol. 150, No. 11, 1999 Genotype-Environment Interactions for Quantitative Trait Loci Figure 1b: Type 1b interaction Figure 1a: Type 1a interaction H H Condition 2 Condition 1 1183 Condition 1 Condition 2 Figure 1d: Type 3a interaction Figure 1c: Type 2 interaction AjA, Condition 1 Condition 2 Condition 2 Figure 1h: Type 6 interaction Condition 2 FIGURE 1. Six types of genotype-environment interaction. Am J Epidemiol Vol. 150, No. 11, 1999 Condition 2 Condition 1 Figure 1g: Type 5 interaction Condition 1 Condition 2 Figure 1f Type 4ab interaction Figure 1e: Type 3b interaction Condition 1 Condition 1 Condition 1 Condition 2 1184 van den Oord TABLE 2. Specification of different kinds of genotype-environment Interactions* Interaction type Hi Type 1a Type 1b Type 2 Type 3a Type 3b Type 4a Type 4b Type 5 Type 6 0 0 x>0 0 0 x>0 x>0 0 x>0 X X X X X X X X X 0 0 0 x>0 x>0 x>0 x>0 x>0 x>0 0 0 0 0 x>0 0 0 0 0 x>0 x>0 x>0 x>0 x>0 x>0 = e*a x<0 x<0 0 x>0 0 0 0 0 0 0 0 • x denotes estimated parameter, 0 denotes a parameter that is fixed to zero. ier to detect because they imply a larger proportion of interaction variance. For instance, in the situation without parents as controls the explained variance o2^ equals: y Ow - U)2 o2 , = y J (3) k = O2 + O2k + O2k with \ijk as given in table 1, and M- = 2 j * \ij = ^ ^*My*» a r | d M* = 2 P/My*- Th e proportion of interaction variance can now be computed with OjiJ(p2 + o^i), where o2 is the variance within the groups. Substituting the chosen parameter values in formula 3 shows that the proportions of interaction variance equal 5.00 for interaction type la, 4.29 for type lb, 2.50 for type 2, 1.00 for type 3a, 0.83 for type 3b, 0.83 for type 4a, 3.29 for type 4b, 10.00 for type 5, and 6.67 for type 6. Formula 3 indicates that the explained variance equals the sum of the variance due to the main effect of the genotypic groups oj, the main effect of the environmental groups of, and the interaction variance ojk. It is important to realize that the effects of the genetic and environmental parameters seen in formula 1 are not equivalent to the main and interaction effects in formula 3. For instance, assume type 1 interaction in which there are no main effects and there is only a contribution of interaction parameter i, that the interaction parameter i equals 5, and that b = p = 0.5. For this sit- uation, the group means \ij for the A,A,, A,A2, and A2A2 genotypes are 2.5, 0, and -2.5, and of equals 3.125. This demonstrates that part of an interaction effect can show up as a main effect in the traditional ANOVA context. Formula 3 shows that the proportion of subjects in condition 1 and the frequency of the A j allele affect the interaction variance. To assess the impact of these factors, the power calculations were repeated with b = 0.1/0.9 and p = 0.1/0.9. In these calculations, the within-group variance for each type of interaction was fixed to the within group variance in the situation with b = p = 0.50. Results of the power calculations are shown in table 3. Forp = b = 0.5, the required sample size to detect interaction with 80 percent power and alpha is 5 percent was on average 450 and ranged from 100 for interaction type 5 (10 percent interaction variance) to 1,055 for interaction types 3b and 4a (0.83 percent interaction variance). Sample sizes increased when p and b deviated from 0.5. The required sample sizes were about the same for conditions with b = 0.1 and b = 0.9. This symmetry implied that it does not matter whether the proportion of subjects deviates from 0.5 in the direction of the high-risk or low-risk environmental condition. Despite small differences in interaction variance, results for p = 0.1 and p = 0.9 were also symmetric and showed that the required sample sizes did not depend on whether the A, or A2 allele was more frequent. An exception was the test for interaction type lb that involved interaction parameter n for the heterozygote and was much more powerful withp = 0.1 than withp = 0.9. With respect to the power for the test with parents as controls, a first remark is that it will always be lower than in the situation without parents as controls because only subjects with at least one heterozygous parent can be used. The proportion of selected subjects increases when the allele frequencies become more equal and the number of alleles become larger. For instance, in a random sample and with two alleles Am J Epidemiol Vol. 150, No. 11, 1999 Genotype-Environment Interactions for Quantitative Trait Loci 1185 TABLE 3. Sample sizes required to detect different types of genotype-environment Interaction for 80% power at alpha Is 5% In situations with and without parents as controls*,! 6 = 0.5 b = 0.Mb = 0.i Interaction type p = 0.1 p = 0.5 p = o.9 p = 0.1 p = 0.5 p = 0.9 Without parents as controls 1a 1b 2 3a 3b 4a 4b 5 6 511 268 994 2,440 2,922 2,922 762 269 390 187 218 361 881 1,055 1,055 278 100 144 511 4,292 994 2,440 2,922 2,922 762 269 390 1,538 836 2,885 6,907 8,246 8,246 2,240 858 1,199 563 666 1,045 2,491 2,973 2,973 814 320 1,538 14,373 2,885 6,907 8,246 8,246 2,240 858 1,199 With parents as controls 1a 1b 2 3a 3b 4a 4b 5 6 297 162 580 1,428 1,711 1,711 444 154 226 187 218 361 881 1,055 1,055 278 100 144 297 1,444 580 1,428 1,711 1,711 444 154 226 859 466 1,646 3,999 4,783 4,783 1,269 460 660 563 441 666 1,045 2,491 2,973 2,973 814 320 441 859 4,755 1,646 3,999 4,783 4,783 1,269 460 660 * For the situation with parents as controls, trie required sample sizes refer to the number of subjects with at least one heterozygote parent. t All power calculations assumed two degrees of freedom. about 75 percent of the subjects can be selected when p = 0.5 and about 32.8 percent can be selected when/? = 0.1 or p — 0.9. To examine the power as such and avoid obvious sample size effects caused by the selection of subjects, the sample sizes reported in table 3 for the situation with parents as controls referred to the selected number of subjects with a heterozygous parent. Thus, to obtain the initial pool parents and subjects that would need to be genotyped, the sample sizes reported in table 3 need to be divided by 0.75 when p = 0.5 and by 0.328 whenp = 0.1 or p = 0.9. The required sample sizes reported in table 3 show that with p = b = 0.5 the tests in the situation with and without parents as controls are equally powerful to detect interaction. With p = b = 0.1 or 0.9 sample sizes were lower in the situation with parents as controls. This indicated the more severe selection of subjects was partly compensated by the fact that more informative subjects were selected. DISCRIMINATIVE POWER The tests discussed in the previous section address the question whether or not interaction exists. Once this has been established, it is of interest to identify the type of interaction. This can be examined by fitting models that represent different types of interactions to Am J Epidemiol Vol. 150, No. 11, 1999 the data and choosing the model that yields the best fit. For this purpose, constraints need to be imposed on parameters a, d, e, i, n. These can be equality constraints in which parameters are fixed to zero, boundary constraints in which parameters are constrained to lie between certain values, and non-linear constraints in which parameters are constrained to be non-linear functions of other parameters. To illustrate the identification of the correct mechanism, we again used the interaction types discussed by Khoury et al. (3, 4). The constraints involved were those reported in table 2. Data were simulated on the basis of each interaction type and analyzed by fitting models representing all interaction types. Ideally, the model used to simulate the data gives the best fit and is selected. To select the best fitting model, we employed three different fit indices. First, we selected the model with the highest p value of the %2 goodness of fit index. The second index was Akaike's Information Criterion (14), AIC = %2 - 2 degrees of freedom (df), which has proven to be a good index in many situations (15). Smaller values of AIC indicate a better fit and we therefore selected the model with the lowest AIC. Finally, we used the Tucker-Lewis Index (TLI), TLI = - X,2/df,)/(}rj/df6 - 1), 1186 van den Oord the true interaction type under which the data were simulated and the columns to the interaction models used to analyze the simulated data. Results showed that there were marked differences. For instance, type 6 interaction was correctly identified in 99.7 percent of the 2,000 simulations, whereas type 4b interaction was correctly identified in 44.95 percent of the 2,000 simulations. Another salient finding was that interaction type 2 was relatively often mistakenly identified as interaction type 4b (in 33.35 percent of the 2,000 simulations), 4a as 4b (37.66 percent), and 4b as la (25.50 percent). A final remark is that the accuracy to detect the correct interaction model will depend on the factors that determine the power of the tests such as the number of observations, the proportion of subjects in both environmental conditions, and the allele frequencies. For instance, with the number of observations equal to 250 instead of 500, the average proportion of correctly identified interaction types using the p value, AIC, and TLI dropped in the situation without parents as controls from 75.8 percent, 78.7 percent, and 75.2 percent to 67.3 percent, 67.3 percent, and 66.8 percent. which reflects the improvement in fit of the target model (subscript t) compared to a baseline model (subscript b). Our baseline model assumed no group differences in means. The index ranges from 0 to about 1. Because larger values imply a better fit, we selected the model with the highest TLI. We included the TLI because research has shown that it possesses some favorable features such as being relatively insensitive to sample size (16). We wrote a Pascal program to simulate for each interaction type 2,000 data sets of 500 subjects. The parameter values used for this simulation were identical to the ones reported above for the power calculations with p = b — 0.5. This condition was chosen because our calculations showed that in this situation there is generally sufficient power to detect genotypeenvironment interaction so that it makes sense to try to discriminate between the various types. Results indicated that the proportion of correctly identified models on the basis of the p value, AIC, and TLI was on average 75.8 percent, 78.7 percent, and 75.2 percent in the situation without parents as controls, and 75.6 percent, 77.4 percent, and 75.2 percent in the situation with parents as controls. This suggested that if genotype-environment interaction can be detected, it will generally be possible to identify the correct interaction type. Compared to the other indices, slightly better results were obtained with the AIC. The differences between the situations with and without parents as controls were small. This indicated that these two designs did not differ substantially in identifying the mechanism of interaction given equal sample sizes. Because of the similarity of the results for the three different indices and the two tests, in table 4 we show only the complete results for the AIC in the situation without parents as controls. The rows in table 4 refer to TABLE 4. DISCUSSION The present paper shows how genotype-environment interaction can be studied for quantitative traits. Although we have confined ourselves to two environmental conditions and three genotypes, the extension to e environmental conditions and g genotypes is straightforward. Analogously, the e x g means can be modeled with one overall mean, e - 1 environmental effects, g - 1 genotype effects, and {e - \){g - 1) interaction parameters. The test can also be extended to designs with multiple siblings. Recently, Fulker et al. Discriminative power to Identify the correct type of genotype-environment Interaction*,t,+ Interaction type Without parents as controls 1a 1b 2 3a 3b 4a 4b 5 6 Interaction type 1a 1b 2 3a 3b 4a 4b 79.41 5.65 97.35 0.60 0.45 1.50 0.60 6.65 6.45 0.90 64.80 0.10 6.50 1.70 1.10 0.25 0.05 1.25 7.60 0.20 58.94 4.45 0.05 0.60 33.35 1.55 3.20 37.66 44.95 2.50 0.25 0.15 25.50 0.70 8.50 79.90 3.45 0.20 7.15 6.90 91.30 0.25 2.80 5 6 0.95 0.05 0.05 0.05 0.05 1.45 92.30 0.30 7.70 99.70 * Results were based on 2,000 replications. t The interaction types in the rows represent the true interaction type under which the data were simulated; the interaction types in the columns represent the type that was used to analyze the data. $ Bold numbers indicate percent of correctly identified models. Am J Epidemiol Vol. 150, No. 11, 1999 Genotype-Environment Interactions for Quantitative Trait Loci (8) showed how SEM can be used with sibling pairs by creating groups consisting of unique combinations of sibling pair genotypes, modeling the means of the siblings in these groups as a function of genetic effects, and estimating the correlations among siblings in a family to model the dependency in the data. To study interaction, this model could be extended by specifying a parameter e for all siblings in environmental condition 2 to estimate the main environmental effect, assign an interaction effect / to siblings with genotype A,A] in environmental condition 2, an interaction effect n to siblings with genotype A[A2 in environmental condition 2, and an interaction effect -i to siblings with genotype A2A2 in environmental condition 2. This sibling design also provides an alternative way to control for population admixture (15). That is, instead of testing whether within parental mating type groups the means of subjects with different genotypes differ, it can be tested whether within sibling pair genotypes groups the means of siblings with different genotypes differ. For all power calculations, we assumed that the noncentral x distribution had two degrees of freedom regardless of whether both interaction parameters were larger than zero. In a sense, this is a somewhat conservative approach. Under the assumption that a specific interaction effect for heterozygotes is unlikely, parameter n could be fixed to zero in advance. Genotypeenvironment interaction could then be tested by fixing parameter i to zero. This test is more powerful because it has only one degree of freedom so that fewer subjects will be required to detect interaction. Instead of assuming that parameter n is zero, it is also possible to test this empirically although it might be useful to determine the power of this test to exclude the possibility that a nonsignificant result may indicate a lack of power. Studying genotype-environment interaction may be important to obtain a better understanding of the disease mechanisms. In this respect, our power calculations and simulation study were encouraging because they suggested that genotype-environment interaction can be detected with realistic sample sizes and that Am J Epidemiol Vol. 150, No. 11, 1999 1187 once it is found it will generally be possible to identify the mechanism of interaction. REFERENCES 1. Khoury MJ. Genetic and epidemiologic approaches to the search for gene-environment interaction: the case of osteoporosis. Am J Epidemiol 1998; 147:1-2 2. McCall R. So many interactions, so little evidence: why? In: Wachs TD, Plomin R, eds. Conceptualization and measurement of organism-environment interaction. Washington, DC: American Psychological Association, 1991:142-61 3. Khoury MJ, Adams MJ, Flanders WD. An epidemiologic approach to ecogenetics. Am J Hum Genet 1988;42:89-95. 4. Khoury MJ, James LM. Population and family relative risks of disease associated with environmental factors in the presence of gene-environment interaction. Am J Epidemiol 1993;137: 1241-50. 5. Hwang SJ, Beaty TH, Liang KY, et al. Minimum sample size estimation to detect gene-environment interaction in case-control designs. Am J Epidemiol 1994; 140:1029-37. 6. Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene-environment interaction: case-control studies with no controls! Am J Epidemiol 1996; 144:207-13. 7. Yang Q, Khoury MJ, Flanders WD. Sample size requirements in case-only designs to detect gene-environment interaction. Am J Epidemiol 1997; 146:713-20. 8. Fulker DW, Cherny SS, Sham PC, et al. Combined linkage and association sib-pair analysis for quantitative traits. Am J Hum Genet 1999;64:259-67. 9. Rubinstein P, Walker M, Carpenter C, et al. Genetics of HLA disease associations: the use of the haplotype relative risk (HRR) and the "haplo=delta" (DH) estimates in juvenile diabetes from three racial groups. (Abstract). Hum Immunol 1981;3:384. 10. Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulindependent diabetes mellitus (IDDM). Am J Hum Genet 1993; 52:506-16. 11. Falconer DS. Introduction to quantitative genetics. Essex, England: Longman, 1989 12. Neale MC. Mx: statistical modeling. Richmond, VA: University of Virginia, Department of Psychiatry, 1994. 13. Satorra A, Saris WE. Power of the likelihood ratio test in covariance structure analysis. Psychomet 1985 ;50:83-90. 14. Akaike H. Factor analysis and AIC. Psychomet 1987;52:317-32. 15. Williams LJ, Holahan PJ. Parsimony-based fit indices for multiple indicator models: do they work? Struct Equat Model 1994;l:161-89. 16. Marsh HW, Balla JR, McDonald RP. Goodness-of-fit indexes in confirmatory factor analysis: the effect of sample size. Psychol Bull 1988; 103:39^10