Introduction to Linkage Analysis
March 2002

3 Stages of Genetic Mapping
Are there genes influencing this trait?
  Epidemiological studies
Where are those genes?
  Linkage analysis
What are those genes?
  Association analysis

Where are those genes?

Outline
How is genetic information organized?
  Chromosomes
  Sequence
Examples of genetic variation
  Changes that have observable effects
  Genetic markers
Linkage analysis
  Strategy for surveying variation in families

Genetic Information
Human Genome
  22 autosomes
  X and Y
  Sequence of 3 x 10^9 base-pairs
  ~17-20 bp can identify unique sequence in the genome
Variation
  Most sequence is conserved across individuals
  1 in 10^3 base-pairs differs between chromosomes
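
A rough check of the ~17-20 bp figure above can be sketched as follows. It assumes, unrealistically, that the genome behaves like a random sequence with four equally frequent bases. The function name and the random-sequence model are illustrative assumptions, not part of the original slides.

```python
GENOME_SIZE = 3e9  # base-pairs, as quoted on the slide


def expected_chance_matches(k, genome_size=GENOME_SIZE):
    """Expected number of chance occurrences of one specific k-mer
    in a random genome of the given size (assumption: uniform bases)."""
    return genome_size / 4 ** k


for k in (15, 16, 17, 20):
    print(k, expected_chance_matches(k))
```

Under this simplified model the expected number of chance matches first drops below one at 16 bases and keeps shrinking by a factor of four per added base. Real genomes contain repeats, so somewhat longer sequences are a safer choice.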

DNA
Polymer of 4 bases
  Purines
    (A) – Adenine
    (G) – Guanine
  Pyrimidines
    (C) – Cytosine
    (T) – Thymine
Double Helix
  Complementary Strands
  Hydrogen Bonds
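
Complementary strands mean that one strand determines the other: A pairs with T, and C pairs with G. A minimal sketch of deriving the opposite strand, read in its own 5'-to-3' direction (the function name is an illustrative choice):

```python
# Watson-Crick base pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, reversed so that it also reads 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


print(reverse_complement("ATGC"))  # GCAT
```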

Some Types of DNA Sequence
Genes
  ~30,000 in humans
  Exons, translated into protein
  Introns, transcribed into RNA, but not protein
  Promoters
  Enhancers
Repeat DNA
Pseudogenes

Genetic Code
DNA → RNA → Protein
  DNA: 4 bases (A,T,C,G)
  RNA: 4 bases (A,U,C,G)
  Proteins: 20 amino-acids
Universal Genetic Code
  Translation between DNA/RNA and protein
  Three bases code for one amino-acid
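
Three bases with four possible values each give 4^3 = 64 codons, more than enough for 20 amino acids plus stop signals. The sketch below builds the standard codon table in its usual compact form (bases ordered T, C, A, G) and translates a short coding-strand sequence. The example sequence and function name are illustrative assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, three bases at a time.
    Trailing bases that do not fill a whole codon are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(len(CODON_TABLE))        # 64
print(translate("ATGTTTTAG"))  # MF*
```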

Example of CFTR Variants
Position   Mutation                    Effect
482        G->A                        Arg-117 -> His-117
1609       C->T                        Gln-493 -> STOP
1654       Deletion of 3 nucleotides   Deletion of Phe-508
2566       AT insertion                Frameshift
3659       C deletion                  Frameshift
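
The difference between an in-frame deletion (three nucleotides, one amino acid lost) and a frameshift (every downstream codon changes) can be illustrated on a toy sequence. The sequence below is hypothetical and is not the CFTR sequence. The codon table is the standard genetic code.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate whole codons only; leftover bases at the end are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


toy = "ATGGCCTTTGGTGTTTCC"          # hypothetical coding sequence
in_frame = toy[:6] + toy[9:]        # delete one whole codon (TTT)
frameshift = toy[:3] + toy[4:]      # delete a single nucleotide

print(translate(toy))         # MAFGVS
print(translate(in_frame))    # MAGVS  (only Phe is missing)
print(translate(frameshift))  # MPLVF  (everything after Met changes)
```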

Phenotype vs. Genotype
Genotype
  Underlying genetic constitution
Phenotype
  Observed manifestation of a genotype
Different changes within CFTR all lead to cystic fibrosis phenotype

Common types of DNA variants
Tandem repeats
Microsatellites
Single nucleotide polymorphisms
Insertions
Deletions

Repeat Length Polymorphisms
Variable Number Tandem Repeats
  VNTRs
  Typical repeat units of 10 – 100s bp
  E.g.: ~110 bp repeat in IL1RN gene

Microsatellites
Simple repeat sequences
  Most popular are 2, 3 or 4 bp
  E.g.: ACACACAC …
  D naming scheme (e.g., D2S160)

Microsatellites
Most popular markers for linkage analysis
  Large number of alleles (10 is common)
  Can distinguish and track individual chromosomes in families
Relatively abundant
  ~15,000 mapped loci

SNPs
Single Nucleotide Polymorphisms
Change one nucleotide
  Insert
  Delete
  Replace it with a different nucleotide
Many have no phenotypic effect
Some can disrupt or affect gene function

A little more on SNPs
Most SNPs have only two alleles
  Easy to automate their scoring
  Becoming extremely popular
Typing Methods
  Sequencing
  Restriction Site
  Hybridization

Classifying Genotypes
Each individual carries two alleles
  If there are n alternative alleles …
  … there will be n (n + 1) / 2 possible genotypes
  3 possible genotypes for SNPs, typically more for microsatellites and VNTRs
Homozygotes
  The two alleles are the same
Heterozygotes
  The two alleles are different
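
The n (n + 1) / 2 count can be checked by enumerating unordered allele pairs. This is a small sketch; the function name and the allele labels are illustrative.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles):
    """All unordered pairs of alleles, homozygotes included."""
    return list(combinations_with_replacement(alleles, 2))


snp = possible_genotypes(["A", "C"])
print(snp)                                  # [('A', 'A'), ('A', 'C'), ('C', 'C')]
print(len(possible_genotypes(range(10))))   # 55, i.e. 10 * 11 / 2
```

A two-allele SNP gives 3 genotypes, while a microsatellite with 10 alleles gives 55, which is why microsatellites are better at telling individual chromosomes apart.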

Genes in an individual
Sexual reproduction
  One copy inherited from father
  One copy inherited from mother
Each individual has
  2 copies of each chromosome
  2 copies of each gene
  These copies may be similar or different

Meiosis
Leads to formation of haploid gametes from diploid cells
Assortment of genetic loci
Recombination or crossover

What happens in meiosis…
[Figure: recombination during meiosis, showing non-recombinant and recombinant gametes]

Recombination
Actual
  No. of recombinants between two locations
  An average of one per Morgan
Observed
  Usually, only odd / even number of crossovers between two locations can be established

Recombination and Map Distance
[Figure: observed recombination (vertical axis, 0.00 to 1.00) plotted against distance (horizontal axis, 0.00 to 1.00)]
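
One common way to relate observed recombination to map distance is Haldane's map function. It is used here as an assumption: the slides do not name a map function. It models crossovers as a Poisson process with an average of one per Morgan and no interference, and counts a recombinant whenever the number of crossovers between the two locations is odd.

```python
import math


def haldane_recombination_fraction(distance_morgans):
    """Probability of an odd number of crossovers (an observed recombinant)
    between two loci, under Haldane's no-interference model."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))


for d in (0.01, 0.10, 0.50, 1.00):
    print(d, round(haldane_recombination_fraction(d), 4))
```

Under this model the recombination fraction is close to the map distance for nearby loci and levels off below 0.5 for distant ones.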

Intuition for Linkage Analysis
Millions of variations that could be responsible for disease
  Impractical to investigate individually
Within families, they are organized into a limited number of haplotypes
Sample a modest number of markers to determine whether each stretch of chromosome is shared

Tracing Chromosomes
[Figure: pedigree with marker genotypes labeled for each individual: 1/2, 1/3, 1/4, 3/4, 2/3, 1/3, 5/6, 3/5, 1/5]

IBD (identity by descent)
At each location, try to establish whether siblings (or twins) share 0, 1 or 2 chromosomes identical by descent
Inference may be probabilistic

Example of Scoring IBD
Parental genotypes are available
  Parents: A/C and A/C
  Siblings: A/A and A/A
Siblings are IBD = 2
  Share maternal and paternal chromosomes
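
The scoring in this example can be sketched by enumerating every way the parents could have transmitted chromosomes to the two siblings and keeping only the transmissions that match the observed genotypes. This is a single-marker illustration with a hypothetical function name, not the algorithm used by any particular linkage package.

```python
from itertools import product


def ibd_distribution(father, mother, sib1, sib2):
    """Return [P(IBD=0), P(IBD=1), P(IBD=2)] for a sib pair at one marker.
    Genotypes are 2-tuples of alleles; all 16 transmissions are equally likely a priori."""
    counts = [0, 0, 0]
    for f1, m1, f2, m2 in product((0, 1), repeat=4):
        g1 = sorted((father[f1], mother[m1]))
        g2 = sorted((father[f2], mother[m2]))
        if g1 == sorted(sib1) and g2 == sorted(sib2):
            # One shared chromosome for each parent who passed on the same copy.
            counts[(f1 == f2) + (m1 == m2)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("sibling genotypes are incompatible with the parents")
    return [c / total for c in counts]


# Parents A/C and A/C, siblings A/A and A/A: both must carry each parent's A copy.
print(ibd_distribution(("A", "C"), ("A", "C"), ("A", "A"), ("A", "A")))  # [0.0, 0.0, 1.0]
```

If one parent were A/A instead, that parent's two copies could not be told apart and the result would be split between IBD = 1 and IBD = 2, which is the sense in which inference may be probabilistic.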

Example of Scoring IBD II
Parental genotypes unavailable
  Siblings: A/A and A/A
IBD between siblings may be 0, 1 or 2
Likelihood of each outcome depends on frequency of allele A
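
How the allele frequency enters can be sketched with Bayes' rule. The sketch assumes Hardy-Weinberg proportions, unrelated parents, and no genotyping error, none of which the slides state explicitly. With p the frequency of allele A, two A/A siblings involve 4, 3 or 2 independent copies of A when IBD is 0, 1 or 2, and the prior IBD probabilities for siblings are 1/4, 1/2 and 1/4.

```python
def sib_ibd_posterior_both_AA(p):
    """Posterior [P(IBD=0), P(IBD=1), P(IBD=2)] for two A/A siblings
    without parental genotypes, given allele frequency p of A (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("allele frequency must be in (0, 1]")
    prior = (0.25, 0.5, 0.25)
    likelihood = (p ** 4, p ** 3, p ** 2)  # distinct founder copies of A needed
    joint = [pr * lk for pr, lk in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


for p in (0.01, 0.5, 0.99):
    print(p, [round(x, 3) for x in sib_ibd_posterior_both_AA(p)])
```

When A is rare, sharing it is strong evidence for IBD = 2. When A is very common, the genotypes say little and the posterior stays close to the prior.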

Example of IBD scoring III
Looking at multiple consecutive markers helps infer IBD
  Especially without parental genotypes
IBD = 2 may be quite likely
  Sibling 1: A/A, C/G, A/T, G/G
  Sibling 2: A/A, C/G, A/T, G/G

Notation
$\pi$ - IBD sharing (0, ½ and 1)
$Z_0$ - probability $\pi = 0$
$Z_1$ - probability $\pi = \tfrac{1}{2}$
$Z_2$ - probability $\pi = 1$
$\hat{\pi} = Z_2 + \tfrac{1}{2} Z_1$, estimated IBD sharing

Typical IBD information
Pair        Chr.   Pos (cM)   z0     z1     z2     pi-hat
5378-5479   3      10         0.00   0.01   0.99   0.995
5378-5479   3      20         0.00   0.01   0.99   0.995
5378-5479   3      30         0.00   0.50   0.50   0.750
5378-5479   3      40         0.00   1.00   0.00   0.500
5378-5479   3      50         0.01   0.98   0.01   0.500
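
The pi-hat column follows directly from the notation: estimated sharing is z2 plus half of z1. A small sketch recomputing it from the values listed above (variable and function names are illustrative):

```python
def pi_hat(z1, z2):
    """Estimated proportion of alleles shared IBD: z2 + z1 / 2."""
    return z2 + 0.5 * z1


# (position in cM, z0, z1, z2, reported pi-hat) for pair 5378-5479 on chromosome 3
rows = [
    (10, 0.00, 0.01, 0.99, 0.995),
    (20, 0.00, 0.01, 0.99, 0.995),
    (30, 0.00, 0.50, 0.50, 0.750),
    (40, 0.00, 1.00, 0.00, 0.500),
    (50, 0.01, 0.98, 0.01, 0.500),
]

for pos, z0, z1, z2, reported in rows:
    print(pos, round(pi_hat(z1, z2), 3), reported)
```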
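
Model
[Figure: path diagram linking Twin 1 and Twin 2 through latent factors Q, A, C and E. Correlation between twins: Q, $\hat{\pi}$ = 0.0, 0.5 or 1.0; A, 0.5 [DZ] or 1.0 [MZ]; C, 1.0]
[Figure: panels labeled "No Linkage" and "Linkage"]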

Hypothesis
Test evidence for linked genetic effect
Fit two models
  Full model (Q,A,C,E)
  Restricted model (A,C,E)
Maximum likelihood test
  Compare likelihoods using χ²

Analysis
Estimate $\hat{\pi}$ along chromosome
  For example, using Genehunter or Merlin
Test hypothesis at each location
  Summarize results in linkage curve
  Chi-squared is 50:50 mixture of 1 df and point mass zero

Lod scores
Often, report results as lod scores
  $\mathrm{LOD} = \log_{10} \frac{L(Q, A, C, E)}{L(A, C, E)} \approx \frac{\chi^2}{4.6}$
Genome is large, many locations tested
  Threshold for significance is usually LOD > ~3
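
The factor 4.6 is 2 ln(10), which converts a likelihood-ratio chi-squared into a base-10 log likelihood ratio. The sketch below converts between the two scales and computes a pointwise p-value under the 50:50 mixture of a 1 df chi-squared and a point mass at zero. Function names are illustrative, and the 1 df tail probability is written with the complementary error function so that only the standard library is needed.

```python
import math

TWO_LN_10 = 2.0 * math.log(10.0)  # about 4.6


def chi2_to_lod(chi2):
    return chi2 / TWO_LN_10


def lod_to_chi2(lod):
    return lod * TWO_LN_10


def mixture_p_value(chi2):
    """Pointwise p-value when the test statistic is a 50:50 mixture of
    chi-squared (1 df) and a point mass at zero."""
    if chi2 <= 0:
        return 1.0
    # For 1 df: P(chi2_1 > x) = erfc(sqrt(x / 2))
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


chi2_at_lod3 = lod_to_chi2(3.0)
print(round(chi2_at_lod3, 2), mixture_p_value(chi2_at_lod3))
```

A LOD of 3 corresponds to a chi-squared of about 13.8 and a pointwise p-value on the order of 1 in 10,000, a stringent level chosen because many locations across the genome are tested.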

Sample Linkage Curve
[Figure: linkage curve; vertical axis: LOD]