Mapping of complex traits
Andy Willaert
Center for Medical Genetics Ghent
Complex traits
Complex traits: Diabetes, Crohn, Hypertension, Osteoporosis,...
Complex inheritance patterns



[Figure: Gaussian curve]
Many different gene-variants involved each having a small effect!
Importance of environmental factors!
Importance of gene-gene interactions and gene-environment
interactions!
Traditional linkage analyses are difficult for complex traits
Mapping of complex traits
 Model-based linkage analysis (parametric):
 Depends on knowing that a mutation in a single gene is inherited in a
specific mendelian inheritance pattern
 Powerful method for mapping single-gene disorders
 Not very useful for complex traits
 Model-free linkage analysis (non-parametric):
 Does not assume any particular mode of inheritance to explain the
inheritance pattern
 Depends solely on the assumption that affected relatives will be more
likely to have disease-predisposing alleles in common than is expected
by chance.
Affected sibpair method
 Affected sibpair method
In general:
 Relies on pairs of family members such as siblings, concordant for the
phenotype
 Siblings have on average one allele of two in common at any locus (Full
siblings share on average 50% of their DNA)
 If an allele is shared more frequently than expected (more than 50%) by
sibs concordant for a particular phenotype, then the allele is likely to
predispose to that phenotype
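The 50% expectation above can be checked by enumeration. The sketch below is only an illustration: the labels F1, F2, M1 and M2 are made-up names for the four parental alleles at one locus, and it assumes each child receives one paternal and one maternal allele with equal probability.

```python
from fractions import Fraction
from itertools import product

# Hypothetical labels: the father carries alleles F1/F2, the mother M1/M2.
# Each child inherits one paternal and one maternal allele, all four
# combinations being equally likely.
children = list(product(("F1", "F2"), ("M1", "M2")))

def shared_alleles(sib1, sib2):
    """Number of alleles (0, 1 or 2) two sibs inherited from the same parental copy."""
    return sum(x == y for x, y in zip(sib1, sib2))

# All 16 equally likely sib-sib combinations.
counts = [shared_alleles(s1, s2) for s1 in children for s2 in children]

ibd_distribution = {k: Fraction(counts.count(k), len(counts)) for k in (0, 1, 2)}
mean_shared = Fraction(sum(counts), len(counts))

print(ibd_distribution)  # probability of sharing 0, 1 or 2 alleles
print(mean_shared)       # expected number of shared alleles out of two
```

Sibs share 0, 1 or 2 alleles with probability 1/4, 1/2 and 1/4, which averages to one allele of two. Affected sibpair methods look for markers where concordant sibs exceed this baseline.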
Affected sibpair method
 Affected sibpair method
In practice:
 DNA of a set of affected sibs or affected individuals in families is analysed
by use of hundreds of polymorphic markers throughout the entire genome
(genome scan)
 Elevated degrees of allele sharing (significantly more than 50%) between
affected pairs at a polymorphic marker suggest that a locus involved in
the disease is located close to the marker
 Degree of allele-sharing can be assessed by use of a non-parametric
LOD-score (NPL-score), which is comparable to the parametric LOD-score
 NPL-score >3.6 = evidence for increased allele-sharing
 NPL-score >5.4 = highly significant increased allele-sharing
Affected sibpair method
The affected sibpair method does not require assumptions
about the inheritance pattern, but the method is rather insensitive
and imprecise
 Insensitivity is reflected in the fact that large numbers of sibpairs or
relatives are required to detect a significant deviation from the expected
50% allele-sharing – many hundreds/thousands of sibpairs or families
needed
 Imprecise: Only broad regions of increased allele-sharing can be
identified and not a narrow, critical interval as in model-based linkage
analysis
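Why so many sibpairs are needed can be illustrated with the simple "mean test" for affected sibpairs. This is a swapped-in textbook statistic, not the NPL-score used on these slides. It assumes that, without linkage, the proportion of alleles a pair shares has mean 1/2 and variance 1/8. The sharing values and the z threshold in the example are hypothetical.

```python
import math

def mean_test_z(n_pairs, mean_sharing):
    """z-statistic for excess allele sharing in n_pairs affected sibpairs.

    mean_sharing is the observed average proportion (0..1) of alleles shared.
    Under no linkage the per-pair proportion has mean 1/2 and variance 1/8.
    """
    standard_error = math.sqrt(1.0 / (8.0 * n_pairs))
    return (mean_sharing - 0.5) / standard_error

def pairs_needed(mean_sharing, z_threshold):
    """Smallest number of pairs for which the expected z reaches z_threshold."""
    return math.ceil(z_threshold ** 2 / (8.0 * (mean_sharing - 0.5) ** 2))

# Hypothetical sharing levels and threshold, for illustration only.
for sharing in (0.55, 0.52):
    print(sharing, pairs_needed(sharing, z_threshold=3.0))
```

Under these assumptions, a locus that raises sharing to 55% needs a few hundred pairs, while one that raises it to only 52% needs nearly three thousand.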
Association analysis

Analysis of the DNA of two groups of participants: people with the
disease being studied and similar people without the disease.

If certain genetic variations are found more frequently in people with
the disease compared to people without disease, the variations are
said to be "associated" with the disease.
Association analysis

The strength of an association between disease and genotype is expressed
as an odds ratio

                     Patients       Controls       Totals
Allele A present     a = 23         b = 4          a + b = 27
Allele A absent      c = 97         d = 116        c + d = 213
Totals               a + c = 120    b + d = 120    240

Disease odds ratio for allele A = the odds that an allele A carrier
has the disease divided by the odds that an allele A non-carrier has the disease

Disease odds ratio for allele A = (a/b) / (c/d) = ad / bc = (23 × 116) / (4 × 97) = 6.9

!! The odds of having the disease are about seven times higher if a person
carries allele A than if the person does not carry it
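The odds ratio calculation can be reproduced in a few lines. The counts are those of the table above. The confidence interval uses Woolf's logit method, which is an addition for illustration and is not mentioned on the slides.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table.

    a, b: allele present in patients, controls
    c, d: allele absent in patients, controls
    """
    return (a * d) / (b * c)

def woolf_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% interval on the log-odds scale (Woolf's method, an assumption here)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)

or_allele_a = odds_ratio(23, 4, 97, 116)
ci_low, ci_high = woolf_confidence_interval(23, 4, 97, 116)

print(round(or_allele_a, 1))  # 6.9
print(ci_low, ci_high)
```

With only four controls carrying allele A, the interval is wide, so the point estimate should be read with caution.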
Association analysis
 The significance of an association can be assessed by performing a
χ2 test:

                     Patients       Controls       Totals
Allele A present     a = 23         b = 4          a + b = 27
Allele A absent      c = 97         d = 116        c + d = 213
Totals               a + c = 120    b + d = 120    240

Test whether the values of a, b, c and d differ from what would be expected if
there were no association
χ2 = 15 with 1 df; P ≈ 0.0001. Highly significant association between
allele A and the disease!
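A minimal sketch of the χ2 calculation for the 2x2 table, using only the Python standard library. It computes the Pearson statistic without continuity correction. For one degree of freedom the P-value is obtained from the complementary error function. The function name is a made-up helper.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P-value for a 2x2 table."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    column_totals = [a + c, b + d]
    n = a + b + c + d

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in each cell if allele and disease were independent.
            expected = row_totals[i] * column_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

chi2, p_value = chi_square_2x2(23, 4, 97, 116)
print(round(chi2, 2), p_value)
```

Each cell has an expected count of 13.5 or 106.5 under no association, and the statistic is the summed squared deviation from those expectations.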
Association analysis
 Strengths of association studies:
 Powerful tool for pinpointing precisely the genes and the alleles that
contribute to genetic disease
 No need to carry out laborious family studies and collection of samples
from many members of a pedigree
 Weaknesses of association studies:
 Population stratification:
- A disease that happens to be more common in a certain subpopulation
and any allele that also happens to be more common in that certain
subpopulation can be falsely associated.
- Can be avoided by careful selection of cases and controls (not sampled
from different subpopulations) or by using family-based association study
designs
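Population stratification can be made concrete with invented numbers. In the sketch below all counts are HYPOTHETICAL. Within each subpopulation allele carriers and non-carriers have the same odds of disease, yet pooling the two subpopulations produces an apparent association.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc; a, b = carriers among cases, controls; c, d = non-carriers."""
    return (a * d) / (b * c)

# HYPOTHETICAL counts: (carrier cases, carrier controls,
#                       non-carrier cases, non-carrier controls)
subpopulation_1 = (40, 10, 40, 10)  # disease common, allele common
subpopulation_2 = (2, 8, 18, 72)    # disease rare, allele rare

# Pooling ignores which subpopulation each participant came from.
pooled = tuple(x + y for x, y in zip(subpopulation_1, subpopulation_2))

or_1 = odds_ratio(*subpopulation_1)
or_2 = odds_ratio(*subpopulation_2)
or_pooled = odds_ratio(*pooled)

print(or_1, or_2, round(or_pooled, 2))
```

The odds ratio is 1 in each subpopulation but well above 1 in the pooled sample, which is why cases and controls should be drawn from the same subpopulation.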
Association analysis
 Weaknesses of association studies:
 Linkage disequilibrium (LD):
- All alleles in LD with an allele involved in the disease will show an apparently
positive association whether they have any functional relevance in disease
predisposition or not
- Still useful, since the associated alleles must at least be in loci that are close
enough to the real disease locus to appear associated
[Figure: alleles at neighbouring SNPs within two LD blocks, LD1 and LD2]
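How strongly two alleles are in LD is usually quantified with D, D' and r2, computed from haplotype frequencies. These measures are standard but are not defined on the slides. The haplotype frequencies below are HYPOTHETICAL and the function name is a made-up helper. The sketch assumes both SNPs are polymorphic.

```python
from fractions import Fraction

def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, |D'|, r2) for two biallelic SNPs from their haplotype frequencies."""
    assert p_AB + p_Ab + p_aB + p_ab == 1, "haplotype frequencies must sum to 1"
    p_A = p_AB + p_Ab  # frequency of allele A at the first SNP
    p_B = p_AB + p_aB  # frequency of allele B at the second SNP

    # D: deviation of the observed A-B haplotype frequency from independence.
    D = p_AB - p_A * p_B

    # Largest |D| possible given the allele frequencies.
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))

    d_prime = abs(D) / d_max
    r2 = D ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return D, d_prime, r2

# HYPOTHETICAL haplotype frequencies, for illustration only.
D, d_prime, r2 = ld_measures(Fraction(2, 5), Fraction(1, 10), Fraction(1, 10), Fraction(2, 5))
print(D, d_prime, r2)
```

An r2 close to 1 means one SNP almost perfectly predicts the other, so either of them would show the association with disease.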
Genome-wide Association analysis

Genome-wide association (GWA) studies

Until recently, association studies have been limited to particular sets of
variants in restricted sets of genes

Recently, more powerful genome-wide association studies are being
performed, without any preconception of what genes and genetic variants
might be contributing to the disease
Genome-wide Association analysis

What has made genome-wide association possible?
1) Publication of the sequence of the human genome in 2001. This
sequence has been very informative about the vast majority of bases that
are invariant across individuals.
2) The HapMap project focuses on DNA sequence differences among
individuals → SNPs were characterised in 270 individuals from four
populations (European, African, Chinese and Japanese). A first map of
1.3 million common SNPs was published in 2005 and extended
to 3.1 million SNPs in 2007, revealing the LD patterns between SNPs.
3) Genome-wide association studies require the ability to genotype a
sufficiently large set of variants in a large patient sample at low cost.
High-throughput genotyping platforms are available: Affymetrix/Illumina chips
Genome-wide Association analysis
 Tagging SNPs for genome-wide association
 HapMap provides information about LD between SNPs in the genome
and divides the genome into LD-blocks of about 10 kb in the European
population
[Figure: restricted number of haplotypes within an LD block]
Tagging SNPs capture the most frequent haplotypes
 Genotyping a few hundred thousand tag SNPs in a GWA-study is only
a bit less useful than genotyping all 10 million common SNPs
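The idea of tagging can be sketched with a toy LD-block. The haplotypes below are HYPOTHETICAL. The brute-force search is only an illustration of the principle and is not the procedure used by HapMap or by genotyping chip designers, who typically select tags by pairwise r2 thresholds.

```python
from itertools import combinations

def minimal_tag_snps(haplotypes):
    """Smallest set of SNP positions whose alleles distinguish all haplotypes.

    Brute force over subsets of positions: fine for a toy block,
    far too slow for real data.
    """
    distinct = set(haplotypes)
    n_snps = len(haplotypes[0])
    for size in range(1, n_snps + 1):
        for positions in combinations(range(n_snps), size):
            patterns = {tuple(h[i] for i in positions) for h in distinct}
            if len(patterns) == len(distinct):
                return positions
    return tuple(range(n_snps))

# HYPOTHETICAL LD-block: 4 SNPs, but only 4 of the 16 possible haplotypes occur.
block = ["ACGT", "ACAT", "GTGC", "GTAC"]
tags = minimal_tag_snps(block)
print(tags)
```

In this block the first, second and fourth SNPs always vary together, so genotyping one of them plus the third SNP identifies every haplotype.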
Genome-wide Association analysis
 A Catalog of Published Genome-Wide Association Studies
http://www.genome.gov/gwastudies/
CDCV VERSUS CDRV
Nature of genetic component contributing to complex traits?
• ‘Common Disease, Common Variant (CDCV)’ hypothesis: genetic
variations with relatively high frequency in the population, but
relatively low penetrance, are the major contributors to genetic
susceptibility to common diseases.
But: genetic variants from GWA studies explain only a small fraction
(5%) of the heritable risk for common diseases
• ‘Common Disease, Rare Variant (CDRV)’ hypothesis: multiple rare
DNA sequence variations, each with relatively high penetrance, are
the major contributors to genetic susceptibility to common diseases
Linkage versus Association
Case studies
Positional cloning: the overall strategy of mapping the
location of a disease gene by linkage/association,
followed by attempts to identify the gene on the basis of
its map position.
Case studies
 Positional cloning of a complex disease by genome-wide
association: Age-related Macular Degeneration (AMD)
 Progressive degenerative disease of the portion
of the retina responsible for central vision, causing
blindness in 1.75 million Americans older than 50 years
 Characterised by the accumulation of extracellular
protein behind the retina in the region of the macula
 Ample evidence for a genetic contribution, although most AMD patients
are not in families with a clear mendelian pattern
 Environmental contributions important (increased risk of AMD in
cigarette smokers)
Case studies
 Positional cloning of a complex disease by genome-wide
association: Age-related Macular Degeneration (AMD)
 Case (96)-control (50) genome-wide association study using 116,000
SNPs revealed association of alleles at two common SNPs with AMD.
 For either allele, the odds ratio was 4 in heterozygotes and 7 in
homozygotes.
 Both SNPs were located within an intron of the gene encoding
complement factor H (CFH), important in inflammation
 Examination of HapMap revealed that these two SNPs were in LD
with SNPs across a 41 kb LD-block on chromosome 1
 Search through the SNPs in the 41 kb LD-block revealed a
nonsynonymous SNP (Tyr402His) in the CFH gene, with even stronger
association with AMD
Case studies
 Positional cloning of a complex disease by genome-wide
association: Age-related Macular Degeneration (AMD)
 The association was replicated in other AMD case-control samples, and the
variant is estimated to be responsible for 43% of the genetic contribution to the disease
 CFH protein is found in retinal tissue, protecting against inflammation
and the resulting accumulation of extracellular protein. The Tyr402His
variant of the CFH gene is less protective!
 Consequently, variants in other components of the complement system
have been investigated as candidate loci for AMD: SNPs in factor B and
complement component 2 that alter amino acids are associated with AMD.
 Conclusion: For the complex disorder AMD, a genome-wide association
study finally led to the identification of SNPs at CFH, complement component 2
and factor B, estimated to account for most of the genetic contribution to
AMD.
Case studies
 Positional cloning of a complex disease by model-free
Linkage mapping: Inflammatory Bowel Disease (Crohn)
 Chronic inflammatory disease of the gastrointestinal
tract that primarily affects adolescents and young
adults
 Divided into two major categories: Crohn disease
and ulcerative colitis (UC)
 Family and Twin studies provided ample evidence
for a genetic contribution to Crohn, although
most patients are not in families with a clear
mendelian pattern
Case studies
 Positional cloning of a complex disease by model-free
Linkage mapping: Inflammatory Bowel Disease (Crohn)
 Many genome scans using model-free linkage analysis carried out in
families with two or more IBD affected individuals
 11 genomic regions with positive NPL scores, the one with the highest
score (>5.4) showing linkage to Crohn only and not to UC (most of the
other regions showed linkage to both forms of IBD)
 A locus, termed IBD1, was proposed to reside in this region (16q12) of
the highest LOD-score
 Association study using SNPs in the region of 160 kb around the
marker with the highest NPL score revealed three SNPs with strong
evidence for LD with the disease.
 The three SNPs are located in the coding exons of the gene NOD2 (also
known as CARD15), causing either amino acid substitutions (Arg702Trp,
Gly908Arg) or premature protein termination (Leu1007fsinsC)
Case studies

Positional cloning of a complex disease by model-free
Linkage mapping: Inflammatory Bowel Disease (Crohn)
 NOD2 protein binds to gram-negative bacterial cell walls and
participates in the inflammatory response to bacteria by activating
the NF-kB transcription factor in mononuclear leukocytes
 The three variants reduce the ability of NOD2 to activate NF-kB, altering
the ability of monocytes in intestinal wall to respond to resident bacteria,
predisposing to an abnormal inflammatory response
 Additional association studies in several independent cohorts of
Crohn patients confirmed strong association of the three variants with
Crohn
 Genetic contribution of NOD2 variants is supported by a dosage effect:
- Heterozygotes for NOD2 variants have an odds ratio of 1.5 to 4
- Homozygotes for NOD2 variants have an odds ratio of 15 to 40
Case studies
 Positional cloning of a complex disease by model-free
Linkage mapping: Inflammatory Bowel Disease (Crohn)
 Discovery of NOD2 variants helps explain complex inheritance
pattern in Crohn:
1) The three NOD2 variants are not necessary to cause Crohn
» Half of all white patients with Crohn disease have one or two
copies of a NOD2 variant, half do not.
» The three NOD2 variants are associated with Crohn in Europe, but
are not found in Asian or African populations (NOD2 is not
associated with Crohn in these populations)
2) The three NOD2 variants are not sufficient to cause Crohn
» 20% of the European population is heterozygous for one of the three
variants and shows no signs of Crohn
» Homozygotes and compound heterozygotes for the NOD2
variants show a penetrance of less than 10%
Case studies
 Positional cloning of a complex disease by model-free
Linkage mapping: Inflammatory Bowel Disease (Crohn)
 Conclusions:
1) Other genetic or environmental factors must act on the genotypic
susceptibility at the NOD2 locus
2) The obvious connection between Crohn disease (inflammatory bowel
disease) and structural variants in NOD2 (modulator of antibacterial
inflammatory response) is a strong clue to what some of these other
genetic/environmental factors might be
3) Genetic analysis of Crohn disease exemplifies how to think about
complex traits and how to identify genetic contributions