Download A Hardy-Weinberg Equilibrium Test for Analyzing Population

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Microevolution wikipedia , lookup

DNA paternity testing wikipedia , lookup

Heritability of IQ wikipedia , lookup

Population genetics wikipedia , lookup

Public health genomics wikipedia , lookup

Genealogical DNA test wikipedia , lookup

Genetic drift wikipedia , lookup

Genetic testing wikipedia , lookup

Hardy–Weinberg principle wikipedia , lookup

Transcript
American Journal of Epidemiology
Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health 2010.
Vol. 171, No. 8
DOI: 10.1093/aje/kwq002
Advance Access publication:
March 17, 2010
Practice of Epidemiology
A Hardy-Weinberg Equilibrium Test for Analyzing Population Genetic Surveys
With Complex Sample Designs
Ramal Moonesinghe*, Ajay Yesupriya, Man-huei Chang, Nicole F. Dowling, Muin J. Khoury,
and Alastair J. Scott for the CDC/NCI NHANES III Genomics Working Group
* Correspondence to Dr. Ramal Moonesinghe, Office of Minority Health and Health Disparities, 4770 Buford Highway, Mailstop E-67,
Atlanta, GA 30341 (e-mail: [email protected]).
Initially submitted May 1, 2009; accepted for publication January 5, 2010.
Testing for deviations from Hardy-Weinberg equilibrium is a widely recommended practice for population-based
genetic association studies. However, current methods for this test assume a simple random sample and may not
be appropriate for sample surveys with complex survey designs. In this paper, the authors present a test for HardyWeinberg equilibrium that adjusts for the sample weights and correlation of data collected in complex surveys. The
authors perform this test by using a simple adjustment to procedures developed to analyze data from complex
survey designs available within the SAS statistical software package (SAS Institute, Inc., Cary, North Carolina).
Using 90 genetic markers from the Third National Health and Nutrition Examination Survey, the authors found that
survey-adjusted and -unadjusted estimates of the disequilibrium coefficient were generally similar within selfreported races/ethnicities. However, estimates of the variance of the disequilibrium coefficient were significantly
different between the 2 methods. Because the results of the survey-adjusted tests account for correlation among
participants sampled within the same cluster, and the possibility of having related individuals sampled from the
same household, the authors recommend use of this test when analyzing genetic data originating from sample
surveys with complex survey designs to assess deviations from Hardy-Weinberg equilibrium.
complex survey design; design effect; genetic association studies; Hardy-Weinberg equilibrium
Abbreviations: DA, design adjusted; HWD, Hardy-Weinberg disequilibrium; HWE, Hardy-Weinberg equilibrium; NHANES (III),
(Third) National Health and Nutrition Examination Survey; RS, Rao-Scott; SRS, simple random sample.
Increases in the availability of genetic data have resulted
in a growing number of studies attempting to characterize
genetic susceptibility to common complex diseases. However, few genetic associations reported as significant are
replicated in multiple studies. Departures from HardyWeinberg equilibrium (HWE) in control populations may
indicate systematic genotyping errors and other biases that
could lead to lack of replication (1–3). Therefore, testing
HWE of marker genotype frequencies has been widely recommended as a crucial step in any population-based genetic
association study (4).
Several methods have been developed to test HWE in
simple random samples (SRSs). Emigh (5) compares various tests for the goodness of fit of a population with HWE
with respect to their power and the accuracy of their distri-
butional approximations. Several exact tests for HWE have
been proposed (6–8). Bayesian approaches to HWE tests
have been discussed recently (9–11). Still, the most common way to test for departures from HWE is through a
goodness-of-fit v2 test (12).
When data are expensive to collect for a population-based
genetic association study, it can be cost-efficient to sample
in 2 or more stages. Whittemore and Halpern (13) described
multistage sampling in genetic epidemiology and used genetic studies of US blacks and of prostate cancer to examine
some design issues involved in multistage sampling. Scott
and Wild (14) examined logistic regression models in casecontrol studies in which controls (and possibly cases) were
obtained by using a complex sampling plan involving multistage sampling. It is important to investigate the proper
932
Am J Epidemiol 2010;171:932–941
A HWE Test for Complex Survey Designs 933
analytic method when analyzing data from complex multistage surveys.
Selection of samples in surveys rarely involves just simple random sampling. Instead, more complex sampling
schemes are usually used, involving, for example, stratification and multistage sampling. Point estimates and the variance of population parameters are impacted by the value of
each observation’s analysis weight, adjustments for selection probabilities, and other survey design features such as
stratification and clustering. Hence, the assumption of simple random sampling can yield biased point estimates and/or
an underestimation of the variance of the point estimates if
the complex sampling design is ignored.
In this paper, we derive and present an appropriate test for
deviations in HWE in the analysis of population-based genetic data obtained from sample surveys with complex survey designs. We compare this method with results from
goodness-of-fit v2 tests that assume an SRS by using data
from the National Health and Nutrition Examination Survey
(NHANES).
MATERIALS AND METHODS
Table 1. Observed and Expected Genotypic Counts When HardyWeinberg Equilibrium Holds
Genotype
Observed no.
a
Expected no.
a
P̂A ¼
Wi Yij
i¼1 j¼1
n
P
2
2
¼
n1
P
Wi þ
i¼1
Wi
i¼1
n2
P
i¼1
2
n
P
Wi
:
Wi
i¼1
The impact of departures from simple random sampling
on the precision of the estimated frequency can be measured
by the design effect that represents the combined effect of
stratification, clustering, unequal selection probabilities, and
weighting adjustments for nonresponse and noncoverage.
This statistic can be calculated as
deff ¼ VarDA P̂ VarSRS P̂ ;
where VarDA ðP̂Þ is the variance of P̂ induced by the complex
sample design (design adjusted (DA)) and VarSRS ðP̂Þ is the
variance of P̂ that would have been obtained from an SRS of
the same size. If HWE is assumed, the design-based variance estimator of P̂A , VDA ðP̂A Þ can be calculated by using
the SURVEYFREQ procedure in SAS software (SAS Institute, Inc., Cary, North Carolina). The Taylor series linearization approach is used to approximate the estimated
variance in the SURVEYFREQ procedure (15).
Am J Epidemiol 2010;171:932–941
n2
P
N̂ 1 ¼
2
N̂ P̂A
Wi
N̂ 2 ¼
i¼1
2N̂ P̂A Q̂A
aa
Wi
N̂ 3 ¼
2
N̂ Q̂A
n3
P
Wi
i¼1
N̂ ¼ N̂ 1 þ N̂ 2 þ N̂ 3 .
Let PAA, PAa, and Paa be the population frequencies of
genotypes AA, Aa, and aa, respectively, and PA be the population frequency of the A allele. Let QA ¼ 1 – PA. We want
to test the null hypothesis that the population represented by
our sample is in HWE (i.e., H0: PAA ¼ P2A , PAa ¼ 2PAQA,
and Paa ¼ Q2A ) versus the alternative hypothesis that the
population is in Hardy-Weinberg disequilibrium (HWD).
We test HWE by using a goodness-of-fit v2 test. For samples
large enough, the expected values for the 3 estimated genotypic counts are given in Table 1.
The goodness-of-fit v2 statistic is
QP ¼
n P
2
P
Aa
n1
P
i¼1
Test for HWE
Suppose n individuals are genotyped at a biallelic (Aa)
marker. We are interested in some basic properties of this
marker, such as genotype frequency, allele frequency,
100(1 a) % confidence intervals of these frequencies,
and whether it is in HWE.
Let Yij ¼ 1 if the jth allele of a marker of the ith individual
is A; Yij ¼ 0 otherwise (j ¼ 1, 2, and i ¼ 1, 2, . . ., n). Let Wi
be the sample weight assigned to the ith individual and n1,
n2, and n3 denote the genotype frequencies of AA, Aa, and
aa, respectively. The weighted allele frequency for the allele
A is then given by
AA
n X ðObserved ExpectedÞ2
:
Expected
N̂ genotypes
Rao and Scott (16, 17) studied the impact of design effects on standard v2 and likelihood ratio tests for estimated
proportions in one-way and multiway tables. They showed
that the test statistic is asymptotically distributed as
a weighted sum of independent v21 variables and that the
weights are the eigenvalues of a generalized design-effects
matrix. Their analyses showed that survey design can have
a substantial impact on the type 1 error. For one-way tables,
the SURVEYFREQ procedure in SAS software provides
a design-based goodness-of-fit test when null proportions
are specified using the first-order and second-order corrections given by Rao and Scott. The Rao-Scott (RS) v2 test
statistic is computed as QRS ¼ QP =d, where d is a design
correction that requires knowledge of variance estimates (or
design effects) for only individual cells in the goodness-offit test statistic. For a one-way table with k cells, QRS has
a v2 distribution with (k 1) degrees of freedom. The
computation of QRS using the SURVEYFREQ procedure
in SAS software is given in Appendix 1 (18). We derive
the design correction for the v2 test to test for HWE from
first principles.
Let DA ¼ PAA P2A . DA is defined as the disequilibrium
coefficient (12). Testing for HWE is equivalent to testing
the hypothesis2 H0: DA ¼ 0 versus H1: DA 6¼ 0. With
D̂A ¼ P̂AA P̂A and expected values based on the null hypothesized frequencies, the one-way frequency table is
given in Table 2.
Under this scenario,
QHW
P
2
n X ðObserved ExpectedÞ2
nD̂A
¼ 2
¼
Expected
N̂ genotypes
PA ð1 PA Þ2
HW
HW
d .
and QHW
RS ¼ QP
934 Moonesinghe et al.
Table 2. Difference Between Observed and Expected Genotypic
Counts When Hardy-Weinberg Equilibrium Holds
Genotype
AA
a
Aa
aa
Observed no.
N̂ 1 ¼ N̂ P̂AA
N̂ 2 ¼ 2N̂ P̂Aa
N̂ 3 ¼ N̂ P̂aa
Expected no.a
N̂ P̂A
2N̂ P̂A Q̂A
N̂ Q̂A
Observed – expecteda
N̂ D̂A
2N̂ D̂A
N̂ D̂A
a
2
2
N̂ ¼ N̂ 1 þ N̂ 2 þ N̂ 3 .
2
Under the null hypothesis, QHW
RS has a v distribution with
1 degree of freedom.
The design correction dHW is given by
dHW ¼ 1 þ P̂A 1 2P̂A Deff 0 P̂AA
þ 2 1 2P̂A 1 P̂A Deff 0 P̂Aa
þ 2P̂A 1 2 P̂A Deff 0 P̂aa ;
where Deff0ðP̂AA Þ, Deff0ðP̂Aa Þ, and Deff0ðP̂aa Þ are the design effects for proportions P̂AA , P̂Aa , and P̂aa , respectively,
based on the null hypothesized values (Appendix 2). The
design correction is simply the design effect of the estimate
of the disequilibrium coefficient. For an SRS design,
Deff0ðP̂AA Þ ¼ Deff0ðP̂Aa Þ ¼ Deff0ðP̂aa Þ ¼ 1, which leads
to dHW ¼ 1, as expected.
2
It might be worthwhile to note that QHW
RS is just D̂A =VðD̂A Þ
with VðD̂A Þ given by the last equation in Appendix 2. Therefore, QHW
RS can be calculated even if only the variances of the
cell proportions are known, as is often the case. Since we
used the SURVEYFREQ procedure in SAS and substituted
our design correction for the design correction in the procedure, we expressed the design correction as a function of
design-effects estimates of genotype frequency.
sample persons. Specific populations including young children, older persons, non-Hispanic blacks, and Mexican
Americans were oversampled. Multiple sample persons
could be selected from the same household. Each sampled
person was assigned the inverse of his or her selection probability as the sampling weight. The final analysis weights
incorporated the selection probabilities and include adjustments for nonresponse. The nonresponse-adjusted weights
were further poststratified by age, gender, and race/ethnicity
to account for noncoverage and to bring the final national
estimates in line with population counts (21).
We examined 90 variants in 50 genes that may play a role
in the etiology of several diseases or conditions of public
health significance. In this study, we considered the 3 major
self-reported race/ethnic populations in the United States:
non-Hispanic white, non-Hispanic black, and Mexican
American. To explore the effect of differential sampling
from multiple ethnic populations, we also considered these
3 race/ethnicity groups combined. We removed any variants
with expected frequencies of less than 5 for each genotype
under the null hypothesis to avoid potential problems associated with v2 approximation, yielding 72, 68, 70, and 75
variants for non-Hispanic whites, non-Hispanic blacks,
Mexican Americans, and the combined race/ethnicity
group, respectively. The expected frequency was calculated
by multiplying the effective sample size by the genotype
frequency. The effective sample size is the sample size divided by the design correction for the genotype frequency
under the null hypothesis. To determine the origin and extent of differences between the DA HWE tests that adjusted
for the NHANES III complex survey design with HWE tests
conducted by assuming an SRS, we examined the differences in minor allele frequencies, HWD coefficients, and
variances of the HWD coefficient.
RESULTS
Application to NHANES III: US population-based survey
with complex design
NHANES is a nationally representative survey of the US
population conducted by the National Center for Health
Statistics. Genetic data are available through the National
Center for Health Statistics on 7,159 participants from phase
2 of the Third National Health and Nutrition Examination
Survey (NHANES III). Linkage of the NHANES III phenotype data with this genetic information provides the opportunity to conduct a vast array of outcome studies designed to
investigate the association of a wide variety of health factors
with genetic variation (19, 20).
The initial stage of sampling in the NHANES III design
consisted of stratifying and selecting primary sampling
units, which are counties or groups of counties in the United
States. The primary sampling units were assigned to a particular stratum based on health and demographic information such as age, sex, and ethnicity and are selected with
probability proportional to population size. After the primary sampling units were selected, they were subsampled
successively in terms of census enumeration districts, clusters of households, households, eligible persons, and, finally,
The DA and SRS minor allele frequencies were not so
different in any of the populations. However, the range was
largest for the combined race/ethnicity group (combined—
median: 0.0041, range: 0.152, 0.1021) compared with the
separate races/ethnicities (non-Hispanic whites—median:
0.0006, range: 0.0111, 0.0142; non-Hispanic blacks—
median: 0.0007, range: 0.0062, 0.0073; Mexican
Americans—median: 0.0002, range: 0.0132, 0.0139)
(Figure 1).
The differences between the DA and SRS HWD coefficients followed a pattern similar to that for the differences
between the minor allele frequencies. The combined population had the largest range (non-Hispanic whites—median:
0.0006, range: 0.0046, 0.0079; non-Hispanic blacks—
median: 0.0001, range: 0.0039, 0.0057; Mexican
Americans—median: 0.00003, range: 0.0101, 0.0067;
combined—median: 0.0016, range: 0.034, 0.0106).
The majority of differences between the DA variance
estimates and variance estimates calculated by assuming
an SRS design were greater than zero in all of the populations (Figure 1). We found that the design correction varied
from 0.64 to 3.45 with a median value of 1.44 for
Am J Epidemiol 2010;171:932–941
A HWE Test for Complex Survey Designs 935
Figure 1. Differences in minor allele frequencies (MAF), Hardy-Weinberg disequilibrium (HWD) coefficients, variance of HWD coefficients, and P
values between the design-adjusted (DA) and simple random sample (SRS) goodness-of-fit test. Com, combined; MA, Mexican American; NHB,
non-Hispanic black; NHW, non-Hispanic white.
non-Hispanic whites, from 0.47 to 2.61 with a median value
of 1.18 for non-Hispanic blacks, from 0.35 to 3.30 with
a median value of 1.34 for Mexican Americans, and from
0.72 to 14.45 with a median value of 2.68 for the combined
race/ethnicity group compared with each race/ethnicity
group. The coefficients of variation of weights for nonHispanic whites, non-Hispanic blacks, and Mexican
Americans were 75.7, 45.9, and 60.7, respectively, whereas
the coefficient of variance for the 3 combined race/ethnicity
groups was 132.1.
The discrepancy between P values from DA v2 tests and
from SRS v2 tests was higher for the combined group compared with the separate populations (Figure 1). Figure 2
shows scatter plots of P values obtained from the DA
and SRS v2 tests for HWE for non-Hispanic whites, nonHispanic blacks, Mexican Americans, and all 3 race/
ethnicity groups combined. The Spearman coefficients of
correlation between DA and SRS P values were 0.49,
0.73, 0.74, and 0.69 for non-Hispanic whites, non-Hispanic
blacks, Mexican Americans, and the combined group,
respectively.
We assumed a level of significance of a ¼ 0.05 for the
HWE tests. Genetic variants with significant DA test results
Am J Epidemiol 2010;171:932–941
but nonsignificant SRS results, or vice versa, for nonHispanic whites, non-Hispanic blacks, Mexican Americans,
and the combined groups are presented in Tables 3–6. All 4
markers in Table 3 resulted in HWE for the SRS test and
HWD for the DA test. Fifty percent of the markers tested in
Tables 4 and 5 resulted in HWE for the SRS test and HWD
for the DA test. Of the 22 markers shown in Table 6, 18 were
in HWD for the SRS test and HWE for the DA test.
DISCUSSION
In this paper, we present a method of testing HWE of
genetic data obtained from large-scale population surveys
that incorporate design features such as clustering, stratification, and unequal weights. We found, within ethnic populations, that our DA method provides similar estimates of
HWD coefficients but different estimates of the variance of
the HWD coefficients compared with the unadjusted
method. The test statistics derived from the 2 methods were
not significantly different in general, but, in a few cases,
different conclusions were possible. Choosing the results
of the SRS test would not be appropriate in these situations
because these results could be biased. Regarding the
936 Moonesinghe et al.
Figure 2. Comparison of P values obtained from design-adjusted (DA) and simple random sample (SRS) goodness-of-fit tests for HardyWeinberg equilibrium for A) non-Hispanic whites, B) non-Hispanic blacks, C) Mexican Americans, and D) all 3 race/ethnicity groups combined.
combined population, we found that both the HWD coefficient and variance differed between the methods, yielding
significant differences in P values. Hence, all analyses of
HWE need to incorporate sample weights and the cluster
design to avoid misinterpretation of results.
In a recent paper, Li and Graubard (22) derived a test for
HWE for complex sample designs by extending the standard
Pearson v2 test to a quadratic test statistic that converges in
distribution to a linear combination of independently iden-
tically distributed v2 random variables. The coefficients of
the linear combination are eigenvalues (generalized design
effects) of a generalized design matrix. The first-order RS
adjustment corrects the asymptotic expectation of the Pearson statistic by dividing the test statistic by the mean of the
unknown eigenvalues. As shown in Appendix 1, the SURVEYFREQ procedure estimates this mean as a function of
the design-effect estimates of the cell proportions. In our
method, we directly derived the approximate variance of
Table 3. Genetic Variants That Have a Significant DA Test Result and a Nonsignificant SRS-based Test Result for
non-Hispanic Whites
Design
Correction
SRS x2 Test
Statistic
DA x2 Test
Statistic
SRS
P Value
DA
P Value
0.2849
1.06687
0.09315
4.05995
0.76021
0.043912
0.39554
0.38466
1.11168
3.69649
8.95267
0.05453
0.002771
2,580
0.05543
0.05759
1.48911
3.64068
7.90103
0.05638
0.004941
2,621
0.39031
0.38104
0.97062
2.35951
5.4372
0.12452
0.019712
Sample
Size
SRS
MAF
RS1143623
(IL1B-09)
2,553
0.27419
RS1982073
(TGFB1-01)
2,580
RS361525
(TNF-04)
RS731236
(VDR-01)
Variant
DA
MAF
Abbreviations: DA, design adjusted; MAF, minor allele frequency; SRS, simple random sample.
Am J Epidemiol 2010;171:932–941
A HWE Test for Complex Survey Designs 937
Table 4. Genetic Variants That Have a Significant DA Test Result and a Nonsignificant SRS-based Test Result for
non-Hispanic Blacks
Sample
Size
SRS
MAF
DA
MAF
Design
Correction
SRS x2 Test
Statistic
DA x2 Test
Statistic
SRS
P Value
DA
P Value
RS17033
(ADH1B-05)
1,983
0.06556
0.06886
2.04383
5.63837
0.60586
0.01757
0.43635
RS2740574
(CYP3A4-02)
1,973
0.35656
0.35977
0.78473
4.72747
3.0101
0.02968
0.08275
RS2243250
(IL4-01)
1,988
0.35941
0.36044
1.60461
4.67857
2.96803
0.03054
0.08493
RS1126643
(ITGA2-01)
1,995
0.29073
0.2962
0.86825
2.18732
5.2004
0.13915
0.02258
RS1208
(NAT2-01)
1,976
0.38436
0.38273
1.0353
3.64309
4.57592
0.0563
0.03242
RS1799930
(NAT2-08)
1,987
0.28309
0.27971
1.52494
4.30104
2.25384
0.03809
0.13328
RS1799983
(NOS3-01)
1,979
0.12759
0.13106
0.91475
2.75324
6.72757
0.09706
0.00949
RS2070744
(NOS3-11)
1,997
0.15273
0.15143
0.75106
3.34888
8.7921
0.06725
0.00303
RS1001581
(XRCC1-23)
1,994
0.36785
0.36674
0.99246
1.61095
4.7698
0.20436
0.02896
Variant
Abbreviations: DA, design adjusted; MAF, minor allele frequency; SRS, simple random sample.
the disequilibrium coefficient and then expressed the design
correction as a function of the design-effect estimates of
genotype frequency (Appendix 2). Li and Graubard also
examined the accuracy of type I error for the minor allele
frequencies and the number of degrees of freedom for estimating the covariance matrix that affect the asymptotic dis-
tribution. The degrees of freedom for the variance estimates
can be rather small in some situations and thus using an F
distribution instead of a v2 distribution works much better,
as noted by Thomas and Rao (23).
To our knowledge, NHANES is the first survey with a national probability sample design to make population genetic
Table 5. Genetic Variants That Have a Significant DA Test Result and a Nonsignificant SRS-based Test Result for
Mexican Americans
Sample
Size
SRS
MAF
DA
MAF
Design
Correction
SRS x2 Test
Statistic
DA x2 Test
Statistic
SRS
P Value
DA
P Value
RS1229984
(ADH1B-02)
1,997
0.05734
0.05454
0.34759
2.17854
11.3665
0.13995
0.00075
RS1042713
(ADRB2-01)
2,012
0.41725
0.41754
0.72123
2.22763
15.6747
0.13556
0.00008
RS1800871
(IL10-01)
1,996
0.38853
0.38099
0.86537
5.41428
3.2626
0.01997
0.07088
RS1800872
(IL10-02)
1,989
0.38612
0.37927
1.24439
4.51289
1.5081
0.03364
0.21943
RS2243248
(IL4-02)
1,995
0.11955
0.12042
1.03095
4.97631
3.3562
0.0257
0.06695
RS11003125
(MBL2-11)
1,986
0.48615
0.49039
0.61979
0.59943
4.3774
0.43879
0.03642
RS1801394
(MTRR-01)
2,006
0.26271
0.26077
2.17738
4.0929
0.2536
0.04306
0.61458
RS2070744
(NOS3-11)
2,009
0.24764
0.25134
1.48019
6.81555
3.5963
0.00904
0.05791
RS10517
(NQO1-15)
1,993
0.06999
0.06505
0.50318
1.67786
4.3932
0.19521
0.03608
RS1982073
(TGFB1-01)
2,006
0.48455
0.49734
0.73455
2.59522
0.10719
0.00154
RS731236
(VDR-01)
2,062
0.234
0.23803
0.72136
1.48039
0.22371
0.02888
Variant
10.032
4.7746
Abbreviations: DA, design adjusted; MAF, minor allele frequency; SRS, simple random sample.
Am J Epidemiol 2010;171:932–941
938 Moonesinghe et al.
Table 6. Genetic Variants That Have a Significant DA Test Result and a Nonsignificant SRS-based Test Result for
All 3 Race/Ethnicity Groups Combined
Sample
Size
SRS
MAF
DA
MAF
Design
Correction
SRS x2 Test
Statistic
DA x2 Test
Statistic
SRS
P Value
DA
P Value
RS17033
(ADH1B-05)
6,546
0.14352
0.09619
3.30191
84.0162
3.26314
<0.00001
0.07085
CS3 (CBS-A)
6,634
0.13024
0.10252
1.28429
19.2444
2.5989
0.00001
0.10694
RS2472299
(CYP1A1-81)
6,502
0.30844
0.29408
2.17262
3.552
9.38943
0.05947
0.00218
RS1057910
(CYP2C9-01)
6,587
0.04099
0.0588
1.74484
4.722
3.15768
0.02978
0.07557
RS1800790
(FGB-01)
6,574
0.13637
0.17392
2.47273
8.4412
0.00045
0.00367
0.98304
RS2243248
(IL4-02)
6,544
0.11033
0.08819
1.13597
10.1855
0.01378
0.00142
0.90656
RS1805015
(IL4R-05)
6,528
0.22143
0.18323
1.47335
16.6996
2.17859
0.00004
0.13994
RS11003125
(MBL2-11)
6,489
0.34073
0.34349
5.63858
85.8332
3.03771
<0.00001
0.08135
RS7096206
(MBL2-12)
6,563
0.16502
0.20789
3.20942
11.1589
0.80699
0.00084
0.36901
RS1800469
(MGC4093-03)
6,591
0.33993
0.31326
3.48069
10.6258
2.09685
0.00112
0.1476
RS1801131
(MTHFR-01)
6,572
0.22801
0.28681
3.71405
12.4364
3.50174
0.00042
0.0613
RS1041983
(NAT2-05)
6,571
0.37133
0.36546
2.39437
8.7445
0.10395
0.00311
0.74714
RS1799930
(NAT2-08)
6,572
0.26407
0.30861
2.04206
6.3492
0.00322
0.01174
0.95477
RS1799983
(NOS3-01)
6,518
0.22622
0.29272
3.91872
6.2886
0.62103
0.01215
0.43066
RS2070744
(NOS3-11)
6,585
0.27434
0.35212
2.66737
29.9528
2.10065
<0.00001
0.14724
RS1800566
(NQO1-01)
6,519
0.24643
0.20412
1.89389
6.8019
0.01177
0.00911
0.91359
RS689453
(NQO1-08)
6,489
0.06203
0.07344
2.67823
5.5425
0.14813
0.01856
0.70033
RS361525
(TNF-04)
6,580
0.0519
0.05559
2.67393
0.1022
7.51465
0.74916
0.00612
RS731236
(VDR-01)
6,789
0.30896
0.35933
1.87878
2.7111
5.8414
0.09965
0.01565
RS1799782
(XRCC1-03)
6,531
0.08728
0.0573
2.69923
14.1928
1.82113
0.00016
0.17718
RS25487
(XRCC1-01)
6,564
0.27041
0.32903
5.05789
11.4193
0.13261
0.00073
0.71574
RS25489
(XRCC1-02)
6,527
0.06228
0.0444
0.72137
9.6848
0.07123
0.00186
0.78955
Variant
Abbreviations: DA, design adjusted; MAF, minor allele frequency; SRS, simple random sample.
data available to researchers for analysis. Therefore, allele
frequencies, genotype frequencies, and associations between disease and genetic variants are unbiased estimates
of the respective population parameters. For example, gene
variant databases such as the International HapMap Project
(http://www.hapmap.org) and the SNP500Cancer Database
(http://snp500cancer.nci.nih.gov) are convenience samples
and may not well represent the target populations (24, 25).
Population-based estimates obtained from the NHANES III
survey may help in estimating the US population that ben-
efits from genomic-based tools, such as risk factor reduction; disease screening efforts; or diagnostic tests, drugs, or
other preventive or therapeutic interventions.
There have been discussions about whether the sampling
design must be considered in the analysis of survey data
from large-scale health surveys. Some guidelines are provided for when to use sample weights and the features of the
sample design in the analysis of survey data (26, 27). There
are several reasons to take into account the sampling design
in testing for HWE. First, the HWE test examines whether
Am J Epidemiol 2010;171:932–941
A HWE Test for Complex Survey Designs 939
the population represented by the sample is in HWE. Assuming the sample is selected by using an SRS design does
not result in testing HWE for the target population. Second,
some correlation is likely among participants sampled
within the same cluster, such as the selection of multiple
individuals in a household in the NHANES III sample. The
number of individuals selected in a household in the sample
ranged from 1 to 11 with an average of 1.6, and some of
them may be related. When the v2 goodness-of-fit test for
HWE is used on samples with related individuals, the type I
error could be greatly inflated.
When calculating the variance for DA estimates, the first
2 stages of the sample design—strata and primary sampling
units—are considered. When there are more related people
in households within a primary sampling unit, the intraclass
correlation that measures the internal homogeneity for the
primary sampling units, and consequently the variance of
DA, increases. This increase in variance is taken into account in the design correction. A test for HWE suitable for
any samples with related individuals has been proposed
(28). However, our method does not require prior knowledge of the pedigree structure and can account for the correlation among the sampled members whether they are
related or not.
Tests of HWE may be used to examine the associations
between genetic variants and diseases. For example, Lee
(29) proposed scanning the genome for disease-susceptible
genes by testing for deviations from HWE in a gene bank of
affected individuals under the assumption that the source
population from which the cases arise is in HWE. Chen
and Chatterjee (30) proposed an alternative analytic strategy
to assess the association between a binary disease outcome
and a genetic marker by comparing the observed genotype
frequencies of cases with the expected genotype frequencies
of controls, assuming HWE. Song and Elston (31) derived
an HWD trend test by calculating the difference between the
HWE test statistic for cases and controls to fine-map
a disease-susceptible locus.
Methods to appropriately analyze genetic data from studies involving complex survey designs, such as the proposed
genome-wide association studies of NHANES III and the
National Children’s Study (www.nationalchildrensstudy.
gov), will be needed in the near future. The HWE test proposed in this paper represents an essential step in addressing
that need.
ACKNOWLEDGMENTS
Author affiliations: Office of Minority Health and Health
Disparities, Centers for Disease Control and Prevention,
Atlanta, Georgia (Ramal Moonesinghe); Office of Public
Health Genomics, Centers for Disease Control and Prevention, Atlanta, Georgia (Ajay Yesupriya, Man-huei Chang,
Nicole F. Dowling, Muin J. Khoury); and Department of
Statistics, The University of Auckland, Auckland, New
Zealand (Alastair J. Scott).
The findings and conclusions in this report are those of
the author(s) and do not necessarily represent the views of
Am J Epidemiol 2010;171:932–941
the Centers for Disease Control and Prevention/Agency for
Toxic Substances and Disease Registry.
Conflict of interest: none declared.
REFERENCES
1. Salanti G, Amountza G, Ntzani EE, et al. Hardy-Weinberg
equilibrium in genetic association studies: an empirical evaluation of reporting, deviations, and power. Eur J Hum Genet.
2005;13(7):840–848.
2. Hosking L, Lumsden S, Lewis K, et al. Detection of genotyping errors by Hardy-Weinberg equilibrium testing. Eur
J Hum Genet. 2004;12(5):395–399.
3. Trikalinos TA, Salanti G, Khoury MJ, et al. Impact of violations and deviations in Hardy-Weinberg equilibrium on postulated gene-disease associations. Am J Epidemiol. 2006;
163(4):300–309.
4. Kocsis I, Györffy B, Németh E, et al. Examination of HardyWeinberg equilibrium in papers of Kidney International: an
underused tool. Kidney Int. 2004;65(5):1956–1958.
5. Emigh TH. A comparison of tests for Hardy-Weinberg equilibrium. Biometrics. 1980;36(4):627–642.
6. Chapco W. An exact test of the Hardy-Weinberg law. Biometrics. 1976;32(1):183–189.
7. Guo SW, Thompson EA. Performing the exact test of HardyWeinberg proportion for multiple alleles. Biometrics. 1992;
48(2):361–372.
8. Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests
of Hardy-Weinberg equilibrium. Am J Hum Genet. 2005;
76(5):887–893.
9. Shoemaker J, Painter I, Weir BS. A Bayesian characterization
of Hardy-Weinberg disequilibrium. Genetics. 1998;149(4):
2079–2088.
10. Ayres KL, Balding DJ. Measuring departures from HardyWeinberg: a Markov chain Monte Carlo method for estimating
the inbreeding coefficient. Heredity. 1998;80(pt 6):769–777.
11. Montoya-Delgado LE, Irony TZ, de B Pereira CA, et al. An
unconditional exact test for the Hardy-Weinberg equilibrium
law: sample-space ordering using the Bayes factor. Genetics.
2001;158(2):875–883.
12. Weir BS. Genetic Data Analysis II. Sunderland, MA: Sinauer
Associates, Publishers; 1996.
13. Whittemore AS, Halpern J. Multi-stage sampling in genetic
epidemiology. Stat Med. 1997;16(1-3):153–167.
14. Scott AJ, Wild CJ. The analysis of multi-stage case-control
studies. In: Skinner CJ, Chambers R, eds. Analysis of Complex
Surveys. New York, NY: John Wiley and Sons; 2003:109–122.
15. Wolter KM. Introduction to Variance Estimation. New York,
NY: Springer-Verlag; 1985.
16. Rao JNK, Scott AJ. The analysis of categorical data from
complex sample surveys: chi-squared tests for goodness of fit
and independence in two-way tables. J Am Stat Assoc. 1981;
76(374):221–230.
17. Rao JNK, Scott AJ. On simple adjustments to chi-square tests
with sample survey data. Ann Stat. 1987;15(1):385–397.
18. SAS Institute, Inc. SAS/STAT 9.1 User Guide. Cary, NC: SAS
Institute, Inc; 2004.
19. NHANES III Genetic Data. Hyattsville, MD: National Center
for Health Statistics, Centers for Disease Control and
Prevention; 2007. (http://www.cdc.gov/nchs/nhanes/genetics/
genetic.htm). (Accessed February 19, 2010).
20. Chang MH, Lindegren ML, Butler MA, et al. Prevalence in the
United States of selected candidate gene variants: Third
940 Moonesinghe et al.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
National Health and Nutrition Examination Survey, 1991–
1994. Am J Epidemiol. 2009;169(1):54–66.
Ezzati T, Khare M. Nonresponse adjustments in a national
health survey. Proceedings of the Survey Research Methods
Section of the American Statistical Association. 1992;339–344.
Li Y, Graubard BI. Testing Hardy-Weinberg equilibrium and
homogeneity of Hardy-Weinberg disequilibrium using complex survey data. Biometrics. 2009;65(4):1096–1104.
Thomas DR, Rao JNK. Small-sample comparisons of level
and power for simple goodness-of-fit statistics under cluster
sampling. J Am Stat Assoc. 1987;82(398):630–636.
International HapMap Consortium. A haplotype map of the
human genome. Nature. 2005;437(7063):1299–1320.
Packer BR, Yeager M, Burdett L, et al. SNP500Cancer:
a public resource for sequence validation, assay development,
and frequency analysis for genetic variation in candidate
genes. Nucleic Acids Res. 2006;34(Database issue):D617–
D621.
Lemeshow S, Letenneur L, Dartigues JF, et al. Illustration of
analysis taking into account complex survey considerations:
the association between wine consumption and dementia in the
PAQUID Study. Personnes Ages Quid. Am J Epidemiol. 1998;
148(3):298–306.
Korn EL, Graubard BI. Epidemiologic studies utilizing surveys: accounting for the sampling design. Am J Public Health.
1991;81(9):1166–1173.
Bourgain C, Abney M, Schneider D, et al. Testing for HardyWeinberg Equilibrium in samples with related individuals.
Genetics. 2004;168(4):2349–2361.
Lee WC. Searching for disease-susceptibility loci by testing
for Hardy-Weinberg disequilibrium in a gene bank of affected
individuals. Am J Epidemiol. 2003;158(5):397–400.
Chen J, Chatterjee N. Exploiting Hardy-Weinberg Equilibrium
for efficient screening of single SNP associations from casecontrol studies. Hum Hered. 2007;63(3-4):196–204.
Song K, Elston RC. A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for
fine-mapping in case-control studies. Stat Med. 2006;25(1):
105–126.
The design correction uses the null hypothesis proportions specified with the ‘‘TESTP¼’’ option. The design
correction is computed as
X
1 P0C Deff 0 ðP̂C Þ=ðC 1Þ;
d¼
C
¼ Var P̂
VarSRS P0C ¼ Var P̂C
where Deff
0 P̂C C
n 1 , P̂C is the proportion esti1 f P0C 1 P0C
mate for table level C, f is the overall sampling fraction, n is
the number of observations in the sample, Varð
P̂C Þ is the
design-based variance of P̂C , and VarSRS P0C is the
variance of the hypothesized proportion, P0C , for an SRS
of identical sample size.
For a stratified cluster sample design, define the
following:
h ¼ 1, 2, . . ., H is the stratum number with a total of H strata;
i ¼ 1, 2, . . ., nh is the cluster number within stratum h, with
a total of nh sample clusters in stratum h; j ¼ 1, 2, . . ., mhi is
the unit number within cluster i of stratum h, with a total of
mhi sample units from cluster i of stratum h; and Whij is the
sampling weight for unit j in cluster i of stratum h. Then,
nhi
H P
P
mhi is the total number of observations in the
n¼
h¼1 i¼1
sample.
Let dhij C ¼ 1 if observation (hij) is in column C; 0
otherwise. The estimates N̂ C and N̂ are given below.
N̂ C ¼
nh X
mhi
H X
X
dhij C Whij
h¼1 i¼1 j¼1
N̂ ¼
nh X
mhi
H X
X
Whij
h¼1 i¼1 j¼1
APPENDIX 1
For one-way tables, the RS (16, 17) v2 statistic provides
a design-based goodness-of-fit test for specified null proportions with the ‘‘TESTP¼’’ option in the SURVEYFREQ
procedure in SAS software. Under the null hypothesis, the
RS v2 statistic approximately follows a v2 distribution with
(C 1) degrees of freedom for a table with C levels. The RS
v2 QRS is computed as QRS ¼ QP/d, where d is the design
correction for one-way tables and QP is given by
X
2
QP ¼ ðn=N̂Þ
N̂ C EC =EC ;
C
where n is the total sample size, N̂ is the estimated overall
total, N̂ C is the estimated total for level C, and EC is the
expected total for level C under the null hypothesis. For
specified null proportions, the expected total for level C is
given by EC ¼ N̂P0C, where P0C is the null proportion for
level C.
APPENDIX 2
The null hypothesized proportions for the HWE test
2
are given by PAA ¼ P̂A, PAa ¼ 2P̂A ð1 P̂A Þ, and Paa ¼
2
ð1 P̂A Þ . We express the allele frequency for the
A allele, P̂A , as a function of the genotype frequencies:
P̂A ¼ P̂AA þ 0:5P̂Aa ¼
1 þ P̂AA P̂aa
:
2
2
1 þ P̂AA P̂aa
Because D̂A ¼ P̂AA ¼ P̂AA ,
2
@ D̂A
1 þ P̂AA P̂aa
@ D̂A
¼ 1
¼ 0;
¼ 1 P̂A ;
2
@ P̂AA
@
P̂Aa
@ D̂A
1 þ P̂AA P̂aa
¼
and
¼ P̂A .
2
@ P̂aa
2
P̂A
Let V ¼ (vij/n), i ¼ 1, 2, 3; j ¼ 1, 2, 3 be the variance
covariance matrix of ðP̂AA ; P̂Aa ; P̂aa Þ for a general survey
Am J Epidemiol 2010;171:932–941
A HWE Test for Complex Survey Designs 941
design in which n is the sample
series linearization, n
2
v11
VðD̂A Þ 1 P̂A 0 P̂A 4 v12
v13
size. Then, from Taylor
v12
v22
v23
30
1
v13
1 P̂A
A
v23 5@ 0
v33
P̂A
2
2
¼ 1 P̂A v11 þ 2P̂A 1 P̂A v13 þ P̂A v33 :
Because P̂1 þ P̂3 ¼ 1 P̂2 , v11 þ 2v13 þ v33 ¼ v22
and nVðD̂A Þ ð1 2P̂A Þð1 P̂A Þv11 þ P̂A ð1 P̂A Þv22 þ
P̂A ð2P̂A 1Þv33 .
The design effects for proportions P̂AA , P̂Aa , and P̂aa are
given by
v11
v22
Deff 0 ðP̂AA Þ ¼ , Deff 0 ðP̂Aa Þ ¼ , and Deff 0 ðP̂aa Þ ¼
v01
v02
v33
, where v01/n, v02/n, and v03/n are the variances of P̂AA ,
v03
Am J Epidemiol 2010;171:932–941
P̂Aa , and P̂aa for SRS designs based on the null hypothesized
2
2
proportions. Finally, using the fact that v01 ¼ P̂A 1 P̂A ,
and
v03 ¼
v02 ¼ 2P̂A ð1 P̂A Þð1 2P̂A ð1 P̂A ÞÞ,
2
2
ð1 P̂A Þ ð1 ð1 P̂A Þ Þ, and the variance of D̂A under
2
2
the null hypothesis for an SRS, nVsrs ðD̂A Þ ¼ P̂A 1 P̂A ,
the design correction, dHW, is given by
dHW ¼
VðD̂A Þ
¼ ð1 þ P̂A Þð1 2P̂A ÞDeff 0 ðP̂AA Þ
Vsrs ðD̂A Þ
þ 2ð1 2P̂A ð1 P̂A ÞÞDeff 0 ðP̂Aa Þ
þ ð2P̂A 1Þð2 P̂A ÞDeff 0 ðP̂aa Þ: