Gene Hunting: find genes responsible for a given disease
Main idea: If a disease is statistically linked with a marker
on a chromosome, then tentatively infer that a gene causing
the disease is located near that marker.
Some slides were prepared by Ma’ayan Fishelson, some by Nir, and most are
mine. I have slightly edited all slides.
Human Genome
Most human cells contain 46 chromosomes:
- 2 sex chromosomes (X, Y): XY in males, XX in females.
- 22 pairs of chromosomes named autosomes.
Sexual Reproduction
The gametes (sex cells), an egg and a sperm cell, fuse to form a zygote (fertilized egg).
Chromosome Logical Structure
- Locus: the location of a gene or other marker on the chromosome.
- Allele: one variant form (or state) of a gene/marker at a particular locus.
Example: Locus 1 has possible alleles A1, A2; Locus 2 has possible alleles B1, B2, B3.
Genotypes versus Phenotypes
- At each locus (except for sex chromosomes) there are 2 genes. These constitute the individual's genotype at the locus.
- The expression of a genotype is termed a phenotype. For example, hair color, weight, or the presence or absence of a disease.
Recombination Phenomenon
A recombination between 2 genes has occurred if the haplotype of the individual contains 2 alleles that resided in different haplotypes in the individual's parent.
(Haplotype: the alleles at different loci that are received by an individual from one parent.)
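The definition above can be turned into a small check. This is only a sketch for two loci: the function name `is_recombinant` and the tuple representation of a haplotype are assumptions made for illustration, and the parent's two haplotypes are assumed to be known (phase-known).

```python
from typing import Optional, Tuple

# Hypothetical representation: (allele at locus 1, allele at locus 2).
Haplotype = Tuple[str, str]


def is_recombinant(child: Haplotype, hap_a: Haplotype,
                   hap_b: Haplotype) -> Optional[bool]:
    """Classify the haplotype a child received from one parent.

    hap_a and hap_b are the parent's two haplotypes.
    Returns True if the child's two alleles came from different parental
    haplotypes, False if they came from the same one, and None if the
    parent is homozygous at either locus (the transmission is then
    uninformative and recombination cannot be determined).
    """
    if hap_a[0] == hap_b[0] or hap_a[1] == hap_b[1]:
        return None
    if child in (hap_a, hap_b):
        return False
    if child in ((hap_a[0], hap_b[1]), (hap_b[0], hap_a[1])):
        return True
    raise ValueError("child haplotype is inconsistent with the parent")


# Parent carries haplotypes (A1, B1) and (A2, B2).
print(is_recombinant(("A1", "B2"), ("A1", "B1"), ("A2", "B2")))
```

The `None` case anticipates a point made later in the slides: recombinants cannot always be determined.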
An example - the ABO locus.
- The ABO locus determines detectable antigens on the surface of red blood cells.
- The 3 major alleles (A, B, O) interact to determine the various ABO blood types.
- O is recessive to A and B. Alleles A and B are codominant.

Phenotype   Genotype
A           A/A, A/O
B           B/B, B/O
AB          A/B
O           O/O

Note that the listed genotypes are unordered (we don't know which allele is from the father and which one is from the mother).
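The genotype-to-phenotype table can be written as a short function. This is an illustrative sketch; the name `abo_phenotype` is an assumption, and the genotype is treated as unordered, as the note above requires.

```python
from typing import Tuple


def abo_phenotype(genotype: Tuple[str, str]) -> str:
    """Map an unordered ABO genotype to its blood-type phenotype."""
    alleles = set(genotype)
    if not alleles <= {"A", "B", "O"}:
        raise ValueError(f"unknown allele in {genotype!r}")
    if alleles == {"A", "B"}:
        return "AB"   # A and B are codominant
    if "A" in alleles:
        return "A"    # A/A or A/O: O is recessive to A
    if "B" in alleles:
        return "B"    # B/B or B/O: O is recessive to B
    return "O"        # only O/O remains


for g in [("A", "A"), ("O", "A"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print(g, abo_phenotype(g))
```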
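Example: ABO near AK1 on Chromosome 9
[Figure: pedigree of five individuals (numbered 1 to 5) showing each individual's ABO phenotype (A or O), alleles at the ABO locus (A, O), and genotype at the AK1 marker (A1/A1, A1/A2, or A2/A2). One transmitted haplotype is labeled "Recombinant".]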
Example for Finding Disease Genes
[Figure: pedigree of five individuals (numbered 1 to 5) showing each individual's alleles at the speculated disease locus (H, A) and genotype at the marker (A1/A1, A1/A2, or A2/A2). One transmitted haplotype is labeled "Recombinant".]
- We use a marker with codominant alleles A1/A2.
- We speculate a locus with alleles H (healthy) / A (affected).
- If the expected number of recombinants is low (close to zero), then the speculated locus and the marker are tentatively inferred to be physically close.
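When recombinants can be counted directly, the idea of "few recombinants means close loci" can be made concrete. The sketch below assumes that the meioses are independent and phase-known, so that the number of recombinants follows a binomial distribution with parameter θ; the function names are hypothetical, and the cap at 0.5 reflects the usual convention that unlinked loci recombine half of the time.

```python
import math


def log_likelihood(theta: float, recombinants: int, meioses: int) -> float:
    """Binomial log-likelihood of theta (constant term omitted)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta is assumed to lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses")
    if theta == 0.0:
        # No recombination is possible, so any observed recombinant
        # has probability zero.
        return 0.0 if recombinants == 0 else float("-inf")
    return (recombinants * math.log(theta)
            + (meioses - recombinants) * math.log(1.0 - theta))


def estimate_theta(recombinants: int, meioses: int) -> float:
    """Maximum-likelihood estimate of theta, capped at 0.5."""
    if meioses <= 0:
        raise ValueError("need at least one informative meiosis")
    return min(recombinants / meioses, 0.5)


print(estimate_theta(1, 10))
print(log_likelihood(0.1, 1, 10) > log_likelihood(0.4, 1, 10))
```

An estimate near zero is what the slide calls tentative evidence that the speculated locus and the marker are physically close.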
The method just described is called genetic linkage analysis. It uses the phenomenon of recombination in families of affected individuals to locate the vicinity of a disease gene.
Comments about the example
Often:
- Pedigrees are larger and more complex.
- Not every individual is typed.
- There are more markers and they have more than two alleles.
- Recombinants cannot always be determined.
Usually recombination cannot be simply counted
[Figure: pedigree of five individuals (numbered 1 to 5) in which the alleles of one individual at the ABO locus are unknown ("? ?"). Whether a transmitted haplotype is a recombinant can be determined only sometimes.]
One can compute the likelihood of the data given every location and choose the most likely location.
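As a small illustration of computing a likelihood instead of counting, consider a single parent who is heterozygous at both loci but whose phase is unknown. The sketch below assumes both phases are equally likely a priori and sums over them; `k` is the number of offspring that would be recombinant under one of the two phases, out of `n` informative offspring. The function names and the grid search over θ are illustrative assumptions, not the method used in the slides.

```python
def phase_unknown_likelihood(theta: float, k: int, n: int) -> float:
    """Likelihood of theta, averaging over the two possible phases."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    # Under phase 1 there are k recombinants, under phase 2 there are n - k.
    phase1 = theta ** k * (1.0 - theta) ** (n - k)
    phase2 = theta ** (n - k) * (1.0 - theta) ** k
    return 0.5 * (phase1 + phase2)


def most_likely_theta(k: int, n: int, steps: int = 50) -> float:
    """Pick the theta on a grid over [0, 0.5] with the highest likelihood."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: phase_unknown_likelihood(t, k, n))


print(most_likely_theta(0, 6))
print(most_likely_theta(3, 6))
```

The likelihood is symmetric in `k` and `n - k`, which reflects the fact that without phase information one cannot tell recombinants from non-recombinants.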
A Bayesian Network Model
[Figure: Bayesian network fragment with nodes L11m, L11f, X11, S13m, and L13m. S13m is the selector of the maternal allele at locus 1 of person 3, with P(s13m) = ½. L13m is the maternal allele at locus 1 of person 3 (offspring).]
Selector variables Sijm are 0 or 1 depending on which of the mother's two alleles is transmitted to offspring j at locus i.
P(l13m | l11m, l11f, S13m=0) = 1 if l13m = l11m
P(l13m | l11m, l11f, S13m=1) = 1 if l13m = l11f
P(l13m | l11m, l11f, s13m) = 0 otherwise
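The deterministic transmission table above can be sketched in a few lines. The function names are hypothetical; the second function marginalizes the selector using the prior P(s13m) = ½ given on the slide.

```python
def transmission_prob(child_allele: str, parent_maternal: str,
                      parent_paternal: str, selector: int) -> float:
    """P(l13m | l11m, l11f, s13m): 1 if the selected allele was passed on."""
    if selector not in (0, 1):
        raise ValueError("selector must be 0 or 1")
    # Selector 0 picks the parent's maternal allele, 1 the paternal allele.
    transmitted = parent_maternal if selector == 0 else parent_paternal
    return 1.0 if child_allele == transmitted else 0.0


def child_allele_prob(child_allele: str, parent_maternal: str,
                      parent_paternal: str) -> float:
    """Marginal probability of the child's allele, with P(selector) = 1/2."""
    return sum(0.5 * transmission_prob(child_allele, parent_maternal,
                                       parent_paternal, s)
               for s in (0, 1))


print(child_allele_prob("A1", "A1", "A2"))
```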
Probabilistic model for two loci
[Figure: two copies of the single-locus network. Model for locus 1: nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13. Model for locus 2: nodes L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23.]
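Probabilistic model for Recombination
[Figure: the two-locus network (nodes as in the model for two loci), in which the selectors at locus 2 (S23m, S23f) depend on the corresponding selectors at locus 1 (S13m, S13f).]
$$P(s_{23t} \mid s_{13t}, \theta_2) = \begin{pmatrix} 1-\theta_2 & \theta_2 \\ \theta_2 & 1-\theta_2 \end{pmatrix}, \quad t \in \{m, f\}$$
θ2 is called the recombination fraction between loci 2 & 1.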
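The selector transition can be written as a 2×2 table indexed by the selector at locus 1 (row) and the selector at locus 2 (column). This is a minimal sketch with hypothetical function names; restricting θ to [0, 0.5] is an assumption, and the ½ prior on the first selector is the one used for S13m earlier.

```python
from typing import List


def selector_transition(theta: float) -> List[List[float]]:
    """P(s2 | s1, theta) as a table: table[s1][s2]."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta is assumed to lie in [0, 0.5]")
    # The selector flips between loci exactly when a recombination occurs.
    return [[1.0 - theta, theta],
            [theta, 1.0 - theta]]


def joint_selector_prob(s1: int, s2: int, theta: float) -> float:
    """P(s1, s2 | theta) with a uniform prior on the first selector."""
    return 0.5 * selector_transition(theta)[s1][s2]


table = selector_transition(0.1)
print(table)
print(sum(joint_selector_prob(a, b, 0.1) for a in (0, 1) for b in (0, 1)))
```

With θ = 0 the two loci always travel together, and with θ = 0.5 the selectors are independent, which corresponds to unlinked loci.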
Modeling Phenotypes I
[Figure: the single-locus network fragment (nodes L11m, L11f, X11, S13m, L13m) with an added phenotype node Y11.]
Phenotype variables Yij are 0 or 1 depending on whether a phenotypic trait associated with locus i of person j is observed, e.g., sick versus healthy. For example, a model of a perfect recessive disease yields the penetrance probabilities:
P(y11 = sick | X11 = (a,a)) = 1
P(y11 = sick | X11 = (A,a)) = 0
P(y11 = sick | X11 = (A,A)) = 0
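The penetrance probabilities of the perfect recessive model can be sketched as a lookup on the number of copies of the disease allele `a`. The function names and the string labels "sick" and "healthy" are assumptions made for illustration.

```python
from typing import Tuple


def penetrance_recessive(genotype: Tuple[str, str]) -> float:
    """P(sick | genotype) for a perfect recessive disease with alleles A, a."""
    if not set(genotype) <= {"A", "a"}:
        raise ValueError(f"unknown allele in {genotype!r}")
    # Only the homozygous (a, a) genotype is affected.
    return 1.0 if genotype.count("a") == 2 else 0.0


def phenotype_prob(phenotype: str, genotype: Tuple[str, str]) -> float:
    """P(phenotype | genotype) for phenotype in {"sick", "healthy"}."""
    if phenotype not in ("sick", "healthy"):
        raise ValueError("phenotype must be 'sick' or 'healthy'")
    p_sick = penetrance_recessive(genotype)
    return p_sick if phenotype == "sick" else 1.0 - p_sick


for g in [("a", "a"), ("A", "a"), ("A", "A")]:
    print(g, phenotype_prob("sick", g))
```

A model with incomplete penetrance would replace the values 1 and 0 with probabilities strictly between them.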
Introducing a tentative disease Locus
[Figure: the two-locus network. Locus 1 (nodes L11m, L11f, X11, L12m, L12f, X12, S13m, S13f, L13m, L13f, X13) is the marker locus. Locus 2 (nodes L21m, L21f, X21, L22m, L22f, X22, S23m, S23f, L23m, L23f, X23) is the disease locus: assume sick means xij = (a,a).]
$$P(s_{23t} \mid s_{13t}, \theta_2) = \begin{pmatrix} 1-\theta_2 & \theta_2 \\ \theta_2 & 1-\theta_2 \end{pmatrix}, \quad t \in \{m, f\}$$
The recombination fraction θ2 is unknown. Finding it can help determine whether a gene causing the disease lies in the vicinity of the marker locus.
SUPERLINK
- Stage 1: each pedigree is translated into a Bayesian network.
- Stage 2: value elimination is performed on each pedigree (i.e., some of the impossible values of the variables of the network are eliminated).
- Stage 3: an elimination order for the variables is determined, according to some heuristic.
- Stage 4: the likelihood of the pedigrees given the θ values is calculated. This is done by performing variable elimination according to the elimination order determined in stage 3.
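To make stage 4 more concrete, here is a minimal sketch of generic sum-product variable elimination over binary variables. It is not SUPERLINK's implementation: the factor representation, the function names, and the tiny two-selector example are all assumptions chosen for illustration, and the elimination order is supplied by hand instead of by a heuristic.

```python
from itertools import product

# A factor is a pair (variables, table). `variables` is a tuple of names and
# `table` maps each tuple of 0/1 values (in that order) to a number.


def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    variables = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for values in product((0, 1), repeat=len(variables)):
        assign = dict(zip(variables, values))
        table[values] = (ft[tuple(assign[v] for v in fv)]
                         * gt[tuple(assign[v] for v in gv)])
    return variables, table


def sum_out(f, var):
    """Sum a factor over all values of one variable."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for values, p in ft.items():
        key = tuple(x for v, x in zip(fv, values) if v != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table


def eliminate(factors, order):
    """Eliminate every variable in `order`; return the resulting number."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = 1.0
    for variables, table in factors:
        if variables:
            raise ValueError(f"variables {variables} were not eliminated")
        result *= table[()]
    return result


def example_likelihood(theta):
    """P(s1 = 0, s2 = 1 | theta) for two linked selectors."""
    prior = (("s1",), {(0,): 0.5, (1,): 0.5})
    trans = (("s1", "s2"), {(0, 0): 1 - theta, (0, 1): theta,
                            (1, 0): theta, (1, 1): 1 - theta})
    evidence1 = (("s1",), {(0,): 1.0, (1,): 0.0})  # s1 observed to be 0
    evidence2 = (("s2",), {(0,): 0.0, (1,): 1.0})  # s2 observed to be 1
    return eliminate([prior, trans, evidence1, evidence2], ["s1", "s2"])


print(example_likelihood(0.1))
```

In a real pedigree network the elimination order matters a great deal, because the size of the intermediate factors grows exponentially with the number of variables they contain; that is why stage 3 chooses the order before stage 4 runs.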