Gene Hunting: find genes responsible for a given disease
Main idea: If a disease is statistically linked with a marker
on a chromosome, then tentatively infer that a gene causing
the disease is located near that marker.
Some slides were prepared by Ma’ayan Fishelson, some by Nir, and most are
mine. I have slightly edited all slides.
Human Genome
Most human cells contain 46 chromosomes:
2 sex chromosomes (X, Y): XY in males, XX in females.
22 pairs of chromosomes named autosomes.
2
Sexual Reproduction
[Figure: two gametes (sex cells), an egg and a sperm cell, fuse to form a zygote (fertilized egg).]
3
Chromosome Logical Structure
Locus – the location of a gene or other marker on the chromosome.
Allele – one variant form (or state) of a gene/marker at a particular locus.
Locus1: possible alleles A1, A2
Locus2: possible alleles B1, B2, B3
4
Genotypes versus Phenotypes
At each locus (except on the sex chromosomes) an individual carries 2 copies of the gene, one inherited from each parent. These constitute the individual's genotype at the locus.
The expression of a genotype is termed a phenotype, for example hair color, weight, or the presence or absence of a disease.
5
Recombination Phenomenon
A recombination between 2 genes has occurred if the haplotype an individual received contains 2 alleles that resided in different haplotypes in the individual's parent.
(Haplotype – the alleles at different loci that an individual receives from one parent.)
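The definition above can be turned into a small check. This is a minimal sketch, assuming haplotypes are known (phased) and written as tuples of alleles in a fixed locus order; the function name `is_recombinant` and the tuple encoding are illustrative choices, not part of the slides.

```python
def is_recombinant(transmitted, parent_hap_a, parent_hap_b):
    """Return True if the transmitted haplotype mixes alleles from both
    parental haplotypes, i.e. it equals neither of them as a whole.

    Each haplotype is a tuple with one allele per locus, same locus order.
    """
    # Every transmitted allele must be present in the parent at that locus.
    for allele, a, b in zip(transmitted, parent_hap_a, parent_hap_b):
        if allele not in (a, b):
            raise ValueError("allele %r is not carried by the parent" % (allele,))
    # A copy of either parental haplotype shows no detectable recombination.
    return transmitted != parent_hap_a and transmitted != parent_hap_b


# Parent carries haplotypes (A, A1) and (O, A2) at the (ABO, marker) loci.
print(is_recombinant(("A", "A2"), ("A", "A1"), ("O", "A2")))
print(is_recombinant(("O", "A2"), ("A", "A1"), ("O", "A2")))
```

If the parent is homozygous at one of the loci, every transmitted haplotype matches a parental one, so the check reports no recombination; such a transmission is simply uninformative.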
6
An example - the ABO locus.
The ABO locus determines detectable antigens on the surface of red blood cells. The 3 major alleles (A, B, O) interact to determine the various ABO blood types. O is recessive to A and B. Alleles A and B are codominant.

Phenotype    Genotype
A            A/A, A/O
B            B/B, B/O
AB           A/B
O            O/O

Note that the listed genotypes are unordered (we don't know which allele is from the father and which one is from the mother).
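The genotype-to-phenotype rule for ABO can be written as a short function. This is a sketch for illustration; the name `abo_phenotype` and the encoding of a genotype as a pair of one-letter strings are assumptions, not notation from the slides.

```python
def abo_phenotype(genotype):
    """Map an unordered ABO genotype, e.g. ("A", "O"), to its blood type."""
    if len(genotype) != 2 or not set(genotype) <= {"A", "B", "O"}:
        raise ValueError("genotype must be a pair of alleles from A, B, O")
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"      # A and B are codominant
    if "A" in alleles:
        return "A"       # A/A or A/O: O is recessive to A
    if "B" in alleles:
        return "B"       # B/B or B/O: O is recessive to B
    return "O"           # only O/O remains


for g in [("A", "A"), ("O", "A"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print(g, "->", abo_phenotype(g))
```

Using a set makes the function insensitive to allele order, which matches the note that the listed genotypes are unordered.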
7
Example: ABO near AK1 on Chromosome 9
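[Figure: pedigree of five individuals (1-5). Each individual is labeled with an ABO blood type (A or O), an AK1 marker genotype (A1/A1, A1/A2, or A2/A2), and the haplotypes at the two loci. One transmitted haplotype is marked "Recombinant".]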
8
Example for Finding Disease Genes
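[Figure: the same five-person pedigree (1-5), now with a disease status (H or A) in place of the blood type. Each individual is labeled with a marker genotype (A1/A1, A1/A2, or A2/A2) and the haplotypes at the two loci. One transmitted haplotype is marked "Recombinant".]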
We use a marker with codominant alleles A1/A2.
We speculate a locus with alleles H (healthy) / A (affected).
If the expected number of recombinants is low (close to zero), then we tentatively conclude that the speculated locus and the marker are physically close.
9
The method just described is called genetic linkage analysis. It uses the phenomenon of recombination in families of affected individuals to locate the vicinity of a disease gene.
10
Comments about the example
Often:
Pedigrees are larger and more complex.
Not every individual is typed.
There are more markers and they have more than
two alleles.
Recombinants cannot always be determined.
11
Usually recombinations cannot simply be counted
[Figure: the same five-person pedigree (1-5) with ABO blood types and marker genotypes, but with the alleles of one individual unobserved (shown as "? ?"). Recombinant? Sometimes!]
One can compute the likelihood of data given every
location and choose the most likely location.
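To make "choose the most likely location" concrete, here is a deliberately simplified sketch. It assumes the easy case in which recombinants can be counted: `recombinants` out of `meioses` informative transmissions, each recombining independently with probability θ. The function names, the grid search, and the counts in the usage lines are hypothetical; real pedigrees need the probabilistic model developed on the following slides.

```python
def likelihood(theta, recombinants, meioses):
    """Probability of the observed transmissions for a given theta."""
    return theta ** recombinants * (1.0 - theta) ** (meioses - recombinants)


def most_likely_theta(recombinants, meioses, steps=50):
    """Scan theta over [0, 0.5] and return the value with highest likelihood."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda theta: likelihood(theta, recombinants, meioses))


# Hypothetical counts: 1 recombinant among 10 informative meioses.
best = most_likely_theta(recombinants=1, meioses=10)
print("most likely theta:", best)
```

A small θ corresponds to a candidate location close to the marker, and θ = 0.5 corresponds to no linkage, which is why the grid stops at 0.5.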
12
A Bayesian Network Model
[Figure: Bayesian network fragment with nodes L11m, L11f, X11, S13m, and L13m. S13m is the selector of the maternal allele at locus 1 of person 3, with P(s13m) = ½. L13m is the maternal allele at locus 1 of person 3 (the offspring).]
Selector variables Sijm are 0 or 1 depending on which of the mother's two alleles at locus i is transmitted to offspring j.
P(l13m | l11m, l11f, s13m=0) = 1 if l13m = l11m
P(l13m | l11m, l11f, s13m=1) = 1 if l13m = l11f
P(l13m | l11m, l11f, s13m) = 0 otherwise
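The three rules above define a deterministic conditional probability table. A minimal sketch follows, with the hypothetical function names `p_maternal_allele` and `p_maternal_allele_marginal`. The second function sums out the selector using P(s13m) = ½ from the slide.

```python
def p_maternal_allele(l13m, l11m, l11f, s13m):
    """P(l13m | l11m, l11f, s13m): the selector picks which allele is copied."""
    if s13m not in (0, 1):
        raise ValueError("selector must be 0 or 1")
    source = l11m if s13m == 0 else l11f
    return 1.0 if l13m == source else 0.0


def p_maternal_allele_marginal(l13m, l11m, l11f):
    """P(l13m | l11m, l11f), summing over the selector with P(s13m) = 1/2."""
    return sum(0.5 * p_maternal_allele(l13m, l11m, l11f, s) for s in (0, 1))


# Mother (person 1) carries alleles A1 and A2 at locus 1.
print(p_maternal_allele("A1", "A1", "A2", 0))
print(p_maternal_allele_marginal("A1", "A1", "A2"))
```

Keeping the selector as an explicit variable, rather than summing it out immediately, is what later allows recombination to be modeled as a dependency between selectors at neighboring loci.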
13
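Probabilistic model for two loci
[Figure: two copies of the single-locus network, one per locus.
Model for locus 1: nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13.
Model for locus 2: nodes L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23.]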
14
Probabilistic model for Recombination
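[Figure: the two-locus network (locus 1 nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13; locus 2 nodes L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23), with the selectors S13m and S23m, and S13f and S23f, now linked.]

\[
P(s_{23t} \mid s_{13t}, \theta_2) =
\begin{pmatrix}
1-\theta_2 & \theta_2 \\
\theta_2 & 1-\theta_2
\end{pmatrix},
\qquad t \in \{m, f\}
\]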
θ2 is called the recombination fraction between loci 2 & 1.
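The selector dependency can be sketched as a function. This assumes the usual reading of the formula on this slide: the two selectors agree with probability 1 − θ2 (no recombination) and differ with probability θ2 (recombination). The name `selector_transition` and the range check, which follows the standard convention 0 ≤ θ ≤ 0.5, are illustrative assumptions.

```python
def selector_transition(s23, s13, theta2):
    """P(s23t | s13t, theta2) for binary selectors at loci 2 and 1."""
    if not 0.0 <= theta2 <= 0.5:
        raise ValueError("a recombination fraction lies between 0 and 0.5")
    if s23 not in (0, 1) or s13 not in (0, 1):
        raise ValueError("selectors must be 0 or 1")
    # Same value: no recombination between the loci; different: recombination.
    return 1.0 - theta2 if s23 == s13 else theta2


theta2 = 0.1  # hypothetical value for illustration
for s13 in (0, 1):
    print([selector_transition(s23, s13, theta2) for s23 in (0, 1)])
```

At θ2 = 0.5 both entries of each row equal ½, so the selectors at the two loci become independent, which is the unlinked case.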
15
Modeling Phenotypes I
[Figure: Bayesian network fragment with nodes L11m, L11f, X11, S13m, L13m, and the phenotype node Y11.]
Phenotype variables Yij are 0 or 1 depending on whether a phenotypic trait associated with locus i of person j is observed, e.g., sick versus healthy. For example, a model of a perfectly recessive disease yields the penetrance probabilities:
P(y11 = sick | X11= (a,a)) = 1
P(y11 = sick | X11= (A,a)) = 0
P(y11 = sick | X11= (A,A)) = 0
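The penetrance probabilities above amount to a simple lookup. A minimal sketch, assuming genotypes are encoded as pairs of one-letter alleles with "a" as the disease allele; the function name `p_sick` is hypothetical.

```python
def p_sick(genotype, disease_allele="a"):
    """P(y = sick | genotype) for a perfectly recessive disease.

    Only individuals carrying two copies of the disease allele are sick.
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    return 1.0 if all(allele == disease_allele for allele in genotype) else 0.0


for g in [("a", "a"), ("A", "a"), ("a", "A"), ("A", "A")]:
    print(g, "->", p_sick(g))
```

A disease with incomplete penetrance would replace the values 1 and 0 with intermediate probabilities, but the structure of the table stays the same.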
16
Introducing a tentative disease Locus
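[Figure: the two-locus network. Locus 1 is the marker locus (nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13). Locus 2 is the disease locus (nodes L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23); assume sick means xij = (a,a).]

\[
P(s_{23t} \mid s_{13t}, \theta_2) =
\begin{pmatrix}
1-\theta_2 & \theta_2 \\
\theta_2 & 1-\theta_2
\end{pmatrix},
\qquad t \in \{m, f\}
\]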
The recombination fraction θ2 is unknown. Finding it can help determine whether a gene causing the disease lies in the vicinity of the marker locus.
17
SUPERLINK
Stage 1: each pedigree is translated into a Bayesian network.
Stage 2: value elimination is performed on each pedigree (i.e.,
some of the impossible values of the variables of the network are
eliminated).
Stage 3: an elimination order for the variables is determined,
according to some heuristic.
Stage 4: the likelihood of the pedigrees given the θ values is calculated. This is done by performing variable elimination according to the elimination order determined in stage 3.
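As an illustration of the variable elimination idea in stage 4, here is a toy sketch that is not SUPERLINK's implementation. It considers only the selectors of one meiosis along a chain of loci, uses a fixed left-to-right elimination order instead of a heuristic, and takes hypothetical per-locus evidence values. All names are assumptions made for this example.

```python
from itertools import product


def transition(s_next, s_prev, theta):
    """P(selector at next locus | selector at previous locus, theta)."""
    return 1.0 - theta if s_next == s_prev else theta


def likelihood_by_elimination(evidence, thetas):
    """Sum out the selectors one at a time, from the first locus onward.

    evidence[k][s] = P(data observed at locus k | selector at locus k = s)
    thetas[k]      = recombination fraction between locus k and locus k + 1
    """
    if len(thetas) != len(evidence) - 1:
        raise ValueError("need one theta per pair of adjacent loci")
    # Factor over the first selector: prior 1/2 times its evidence.
    message = [0.5 * evidence[0][s] for s in (0, 1)]
    for k, theta in enumerate(thetas):
        # Eliminate selector k; the result is a factor over selector k + 1.
        message = [
            evidence[k + 1][s_next]
            * sum(message[s_prev] * transition(s_next, s_prev, theta)
                  for s_prev in (0, 1))
            for s_next in (0, 1)
        ]
    return sum(message)  # eliminate the last selector


def likelihood_by_enumeration(evidence, thetas):
    """Reference computation: sum the joint over all selector assignments."""
    total = 0.0
    for sel in product((0, 1), repeat=len(evidence)):
        term = 0.5 * evidence[0][sel[0]]
        for k, theta in enumerate(thetas):
            term *= transition(sel[k + 1], sel[k], theta) * evidence[k + 1][sel[k + 1]]
        total += term
    return total


# Hypothetical evidence for three loci and two recombination fractions.
evidence = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
thetas = [0.1, 0.2]
print(likelihood_by_elimination(evidence, thetas))
print(likelihood_by_enumeration(evidence, thetas))
```

Both functions compute the same likelihood, but elimination does work proportional to the number of loci, while enumeration grows exponentially with it. In a full pedigree network the cost of elimination depends strongly on the order chosen, which is why stage 3 exists.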
18