SUPPLEMENTARY MATERIALS AND METHODS
GWAS and replication study
The case and control samples from the German GWAS, and the samples used to replicate
significant gene sets, are described in detail elsewhere (Frank et al, 2012, Edenberg et al,
2010, Bierut et al, 2010). To replicate significant gene sets, independent GWAS datasets were
retrieved from the database of Genotypes and Phenotypes (dbGaP,
http://www.ncbi.nlm.nih.gov/gap). These were generated within: ‘CIDR: Collaborative Study
on the Genetics of Alcoholism’ (COGA; European ancestry subsample only); and ‘Study of
Addiction: Genetics and Environment’ (SAGE; European ancestry subsample only). Funding
support for the CIDR-COGA study was provided through the Center for Inherited Disease
Research (CIDR) and the Collaborative Study on the Genetics of Alcoholism (COGA). The
CIDR-COGA study is a genome-wide association study funded as part of the COGA.
Assistance with phenotype harmonization and genotype cleaning, as well as with general
study coordination, was provided by the COGA. Assistance with data cleaning was provided
by the National Center for Biotechnology Information. Support for the collection of datasets
and samples was provided by the COGA (U10 AA008401). Funding support for genotyping,
which was performed at the Johns Hopkins University Center for Inherited Disease Research,
was provided by the NIH GEI (U01HG004438), the National Institute on Alcohol Abuse and
Alcoholism, and the NIH contract ‘High throughput genotyping for studying the genetic
contributions to human disease’ (HHSN268200782096C). The datasets used for the analyses
described in this manuscript were obtained from dbGaP at
http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gap through dbGaP accession number
phs000125.v1.p1. Funding support for the Study of Addiction: Genetics and Environment
(SAGE) was provided through the NIH Genes, Environment, and Health Initiative (GEI)
(U01HG004422). SAGE is one of the genome-wide association studies funded as part of the
Gene Environment Association Studies (GENEVA) under GEI. Assistance with phenotype
harmonization and genotype cleaning, as well as with general study coordination, was
provided by the GENEVA Coordinating Center (U01 HG004446). Assistance with data
cleaning was provided by the National Center for Biotechnology Information. Support for the
collection of datasets and samples was provided by the COGA (U10 AA008401), the
Collaborative Genetic Study of Nicotine Dependence (P01 CA089392) and the Family Study
of Cocaine Dependence (R01 DA013423). Funding support for genotyping, which was
performed at the Johns Hopkins University Center for Inherited Disease Research, was
provided by the NIH GEI (U01HG004438), the National Institute on Alcohol Abuse and
Alcoholism, the National Institute on Drug Abuse and the NIH contract ‘High throughput
genotyping for studying the genetic contributions to human disease’ (HHSN268200782096C).
The datasets used for the analyses described in this manuscript were obtained from dbGaP at
http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1
through dbGaP accession number phs000092.v1.p.
Since an adequate replication sample size is essential, the COGA and SAGE samples were
merged, and a set-based analysis was run for the 19 datasets resulting from the initial set-based
analysis of the German sample.
The COGA and SAGE datasets show partial overlap. Following removal of duplicates (n=773)
and outliers (n=12; individuals deviating > 6 standard deviations from the mean on any
of the first 2 principal coordinates from multidimensional scaling), the merged COGA/SAGE
dataset comprised 1333 cases and 1437 controls.
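The outlier criterion described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name flag_outliers, the input layout (one tuple of multidimensional scaling coordinates per individual), and the choice to compute the mean and standard deviation over all individuals, including the outliers themselves, are assumptions made for illustration only.

```python
from statistics import mean, stdev


def flag_outliers(coords, n_sd=6.0, n_dims=2):
    """Return sorted indices of individuals deviating more than n_sd
    standard deviations from the mean on any of the first n_dims coordinates.

    coords: list of tuples, one per individual (hypothetical input layout).
    """
    flagged = set()
    for d in range(n_dims):
        values = [row[d] for row in coords]
        mu = mean(values)
        sd = stdev(values)  # sample SD over all individuals (assumption)
        if sd == 0:
            continue  # no variation on this coordinate, nothing to flag
        for i, v in enumerate(values):
            if abs(v - mu) > n_sd * sd:
                flagged.add(i)
    return sorted(flagged)


# Toy data: 100 individuals near the origin plus one extreme individual.
coords = [(-1.0 if i % 2 == 0 else 1.0, 0.0) for i in range(100)] + [(100.0, 0.0)]
print(flag_outliers(coords))
```

Because the extreme individual also inflates the standard deviation, a 6 SD threshold can only be exceeded in reasonably large samples, which is the case for the merged dataset described here.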
Global Test
For the gene-set based analysis, the R package globaltest (version 5.12.0) was applied. Details of
the Global Test are provided elsewhere (Goeman et al, 2004; Juraeva et al, 2014). Since the
response variable (AD case-control status) is binary, a logistic regression model was used,
with the number of minor alleles at the considered loci as predictor variables. Calculation of
gene-set association scores took several potential confounding factors into account. Firstly, to
account for possible underlying correlation structure, the Global Test with subject sampling
was applied on the basis of 10,000 permutations of case-control status. Secondly, to account
for potential gender differences, gender was included as a covariate. Thirdly, to account for
differences in pathway size, a SNP label permutation test with 1000 permutations was
performed. Since the main aim of the discovery stage was the identification of associated
pathways, a less conservative correction for multiple testing was applied. Multiplicity
correction was applied separately to each collection of gene-sets. Each gene-set collection,
with the exception of the GO terms, was subjected to Benjamini-Hochberg correction for
multiple testing (Benjamini and Hochberg, 1995). For the GO terms, the Focus-level method
(globaltest R package; Goeman and Mansmann, 2008) was applied. During the replication
stage, the Benjamini-Hochberg method was applied to all tested gene-sets.
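As an illustration of the Benjamini-Hochberg step-up adjustment referred to above, the following is a minimal Python sketch. It is not the globaltest implementation and does not cover the Focus-level method; the function name benjamini_hochberg and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    The adjusted value for the p-value of rank r (1 = smallest) among m tests
    is the minimum of p * m / r' over all ranks r' >= r, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four gene-sets within one collection.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In the discovery stage as described, such an adjustment would be run once per gene-set collection; in the replication stage it would be run once over all tested gene-sets.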
Investigation of association between free-access alcohol self-administration and the XRCC5
marker rs828701 in social drinkers
Subjects
Participants were instructed to induce pleasant alcohol effects, i.e. as if they were drinking
alcohol at a week-end party, but to avoid unpleasant alcohol effects. An intravenous line was
inserted. The participant could then press a button to release bouts of ethanol infusions (6%
v/v in normal saline). Each bout caused the blood alcohol concentration (BAC) to rise by 7.5
mg% (=0.075 g ethanol/kg blood) within a period of 2.5 minutes. Thereafter, the BAC fell by
1 mg% per minute until the subject’s next response. The button was connected to a
computer-assisted infusion system (CAIS). The button was deactivated for the 2.5 minute period in
which the BAC increased, and the CAIS enforced a safety limit of 120 mg%. A
physiologically-based pharmacokinetic model was used to calculate the required infusion
rates. The infusion rates differed between subjects in order to adjust for height, weight, sex,
and age. The main outcome measure was the maximum BAC achieved during
self-administration. The free-access paradigm is influenced by alcohol tolerance, hedonistic
aspects of drinking (‘liking’), and craving for alcohol (‘wanting’). Liking, wanting, and
sedative responses were shown to differ between heavy and light drinkers, and are of
relevance in terms of the later development of alcohol use disorder symptoms (King et al,
2011, King et al, 2014).
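The target BAC time course described above can be sketched in Python as follows. This is a simplified illustration and not the CAIS software or its pharmacokinetic model. The rise per bout, rise duration, decline rate, and safety limit are taken from the text. The linear shape of the rise, the rule that a button press is refused when it would exceed the safety limit, the 0.5 minute time step, and the function name simulate_bac are assumptions.

```python
RISE_MG = 7.5         # mg% increase per bout (from the text)
RISE_MIN = 2.5        # minutes over which the increase occurs (from the text)
FALL_RATE = 1.0       # mg% decline per minute after a bout (from the text)
SAFETY_LIMIT = 120.0  # mg% safety limit (from the text)


def simulate_bac(press_times_min, duration_min, step_min=0.5):
    """Return the target BAC (mg%) at every step_min from 0 to duration_min."""
    rise_per_step = RISE_MG / RISE_MIN * step_min  # linear rise (assumption)
    fall_per_step = FALL_RATE * step_min
    steps_per_bout = round(RISE_MIN / step_min)
    presses = sorted(press_times_min)
    bac, rising_left, next_press = 0.0, 0, 0
    trace = [bac]
    for k in range(round(duration_min / step_min)):
        t = k * step_min
        while next_press < len(presses) and presses[next_press] <= t:
            # Presses during a rise are ignored (button deactivated); presses
            # that would exceed the safety limit are refused (assumption).
            if rising_left == 0 and bac + RISE_MG <= SAFETY_LIMIT:
                rising_left = steps_per_bout
            next_press += 1
        if rising_left > 0:
            bac += rise_per_step
            rising_left -= 1
        else:
            bac = max(0.0, bac - fall_per_step)
        trace.append(bac)
    return trace


# Hypothetical session: presses at 0, 1 (ignored, during the rise) and 5 minutes.
trace = simulate_bac([0, 1, 5], duration_min=10)
print(max(trace))  # maximum BAC, the main outcome measure
```

In this hypothetical session the second press has no effect because it falls within the 2.5 minute deactivation window, so only two bouts contribute to the maximum BAC.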
REFERENCES
Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: a practical and
powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol 57: 289-300.
Bierut LJ, Agrawal A, Bucholz KK, Doheny KF, Laurie C, Pugh E, et al (2010). A
genome-wide association study of alcohol dependence. Proc Natl Acad Sci U S A 107(11): 5082-5087.
Edenberg HJ, Koller DL, Xuei X, Wetherill L, McClintick JN, Almasy L, et al (2010).
Genome-wide association study of alcohol dependence implicates a region on chromosome
11. Alcohol Clin Exp Res 34(5): 840-852.
Frank J, Cichon S, Treutlein J, Ridinger M, Mattheisen M, Hoffmann P, et al (2012).
Genome-wide significant association between alcohol dependence and a variant in the ADH
gene cluster. Addict Biol 17(1): 171-180.
Goeman JJ, van de Geer SA, de Kort F, van Houwelingen HC (2004). A global test for groups
of genes: testing association with a clinical outcome. Bioinformatics 20(1): 93-99.
Goeman JJ, Mansmann U (2008). Multiple testing on the directed acyclic graph of gene
ontology. Bioinformatics 24(4): 537-544.
Juraeva D, Haenisch B, Zapatka M, Frank J, GROUP Investigators, iPSYCH-GEMS SCZ
working group, et al (2014). Integrated pathway-based approach identifies association
between genomic regions at CTCF and CACNB2 and schizophrenia. PLOS Genet (accepted).
King AC, de Wit H, McNamara PJ, Cao D (2011). Rewarding, stimulant, and sedative alcohol
responses and relationship to future binge drinking. Arch Gen Psychiatry 68(4): 389-399.
King AC, McNamara PJ, Hasin DS, Cao D (2014). Alcohol challenge responses predict future
alcohol use disorder symptoms: a 6-year prospective study. Biol Psychiatry 75(10): 798-806.