Introduction to Genetics
Debashis Ghosh
Professor and Chair, Biostatistics and Informatics, ColoradoSPH

Question we tackle today
• What do we mean by a gene?
• Steve Mount (ongenetics.blogspot.com): “A gene is all of the DNA elements required in cis for the properly regulated production of a set of RNAs whose sequences overlap in the genome.”
• Mark Gerstein (2007, Genome Biology): “The gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products”

What is a gene?
• No “one-size-fits-all” definition
• The previous definitions are useful to contextualize data that are generated from experiments
• Thinking carefully about evolution and the constraints it has placed on functions is also important

From Genotype to Phenotype
• Full genotypes (genomes) are coming… But inheritance is complex
• Genetic markers are characters inherited in a way that is simple enough to easily track
• Want to find genetic markers that explain or predict phenotypes
  – e.g., disease, susceptibility
  – Ideally, the marker would be causative
    • But that is rare

Alleles as Genes
• At each gene locus, we have two alleles, one transmitted to us by our father, and one by our mother.
• Usual assumption: Each parent randomly transmits one of his/her alleles to the child
• In real datasets, alleles are typically observed as DNA variants referred to as single-nucleotide polymorphisms (SNPs)

Diploid Inheritance
[Figure labels: From Mom, From Dad (Heterozygote); From Mom, From Dad (Homozygote)]

Phenotypic Dominance
[Figure labels: From Mom, From Dad (Heterozygote); Light blue dominant, dark blue recessive; Mixed dominance; Dark blue dominant, light blue recessive]

Diploid Inheritance
[Figure labels: Heterozygote; Homozygote; Dark blue is dominant; Recessive phenotype only visible in homozygote]

Mendelian Ratios

Recombination
[Figure labels: From Grandma, From Grandpa: chromosomal segment in Mom (she’s a diploid, remember); From Mom, From Dad: chromosomal segment in you (you’re diploid too)]

Crossing Over
[Figure labels: From Grandma, From Grandpa; Sister chromatids recombine (cross over) during meiosis; Products of meiosis: inherited by you, or lost (except in tetrad analysis)]

Recombination: Basic Points
• Recombination switches which chromosome in the parent (i.e., originating from which grandparent) is passed along to the offspring
• Alleles physically adjacent on a chromosome are more likely to be passed on together than alleles far apart
• Alleles very far apart or on different chromosomes are inherited randomly

Finding Disease Genes
• Assemble data set of probands
• Assemble data set of control population
• Might have pedigree if the disease runs in families
• Might have trios to determine linkage
  – Proband plus two parents
• Look for linkage between genetic markers and disease
  – In pedigree
  – In dataset of less related individuals

Genetic Markers
• Polymorphic in population
  – Different variants in different individuals
  – Single Nucleotide Polymorphism (SNP)
  – Variable Number of Tandem Repeats (VNTR)
    • Minisatellites
  – Short Tandem Repeats (STR)
    • Microsatellites
    • Very high mutation rate: strand slippage
• Haplotype
  – A set of closely linked SNPs inherited as a unit

Linkage Analysis
• Set of variable markers distributed throughout genome
• Identify linkage regions (haplotypes) that
cosegregate (are inherited) with disease or trait

Pedigree Analysis
• Tabulate the occurrence of a trait in an extended family
  – Pedigree is family’s mating history

Assumptions and Complications
• Single gene with Mendelian inheritance
  – Best use of extended families
  – Few extended families with trait
• Quantitative traits are multigenic
  – Includes most widespread or “common” inherited diseases
  – Sib pairs are best for complex traits with incomplete penetrance (see next slide)

Incomplete Penetrance
• Not everyone with genotype will have the disease
  – Delayed or adult onset
  – Mild or undetectable symptoms
  – Environmental and developmental factors
  – Unknown genetic factors
• Disease allele = increased probability of disease, relative risk
• We don’t always know in pedigree who has the disease genotype!

Evaluating Linkage
• Remember, an individual is a recombinant with respect to two genes, A and B, if he or she inherits the allele from one parental chromatid at A and the allele from the other parental chromatid at B
• The recombination fraction $\theta_{AB}$ is the probability that a child is recombinant
• If A and B are tightly linked, then $\theta_{AB}$ is small

Simple LOD Scores
• Total number of offspring, $P$
• Number of recombinant offspring, $R$
• Likelihood of the data = $\theta_{AB}^{R}\,(1-\theta_{AB})^{P-R}$
• Maximum likelihood estimate: $\hat{\theta}_{AB} = R/P$
• LOD score for linkage in pedigree is

$$\log_{10}\left[\frac{L(D \mid \theta_{AB} = \hat{\theta}_{AB})}{L(D \mid \theta_{AB} = 1/2)}\right] = \log_{10}\left[\frac{\hat{\theta}_{AB}^{R}\,(1-\hat{\theta}_{AB})^{P-R}}{(1/2)^{P}}\right]$$

Complications
• Need to know phase, genotypes of parents, to identify recombinants
  – Can estimate informativeness of additional data depending on heterozygosity of markers
• Many disease versus marker comparisons are involved
  – Multiple comparisons
  – But, markers are not independent
• Population structure
• LOD scores > 3 (odds of 1000:1) give general sense; > 5 very strong

Population structure
• Genetic markers have different patterns in different populations; this can confound associations between genetic markers and disease phenotypes.

Realistic Complications
• Include Penetrance(X|G)
  – Likelihood of observing trait X given the genotype G
• Prior(G)
  – Likelihood of observing the genotype in an individual
• Transmit(Gm|Gk,Gl, θ)
  – Probability that offspring will have genotype Gm given parental genotypes Gk and Gl, and the recombination parameter θ

LOD Graph
• Can look at LOD score over a range of θ's, not just the MLE
• Usual assumption is LOD > 3 is evidence for linkage, LOD < -2 is evidence for exclusion
Example: 27 recombinants out of 139 gametes (example from S. Purcell)

Recombination Probability and Distance along Chromosome
• Recombination does not increase linearly
  – Multiple recombination events possible over greater distances, but also interference
• Can estimate genetic distance from recombination rates
  – Measure in Morgans, or cM
  – $c_{AB}$, the expected number of crossovers, is additive

Mapping Functions
• Haldane’s mapping function
  – Crossovers are assumed random and independent

$$c_{AB} = -\tfrac{1}{2}\ln(1 - 2\theta_{AB})$$

• Kosambi’s mapping function
  – Models interference: crossovers not too close
  – Most popular

$$c_{AB} = \tfrac{1}{4}\ln\left[\frac{1 + 2\theta_{AB}}{1 - 2\theta_{AB}}\right]$$

Genetic versus Physical
• Mapping is not simple
  – Recombination rate varies along chromosomes
• Male versus Female
  – Men 28.51 M (Morgans) over whole genome
    • 1.05 Mb/cM
  – Women 42.96 M (excluding X)
    • 0.88 Mb/cM
• In Drosophila, about 0.4 Mb/cM

Modeling Penetrance
• Single locus, three genotypes
  $f_{DD} = P(\text{disease} \mid DD)$
  $f_{Dd} = P(\text{disease} \mid Dd)$
  $f_{dd} = P(\text{disease} \mid dd)$
• If $f_{DD} = f_{Dd} = 1$, $f_{dd} = 0$
  – Disease is Mendelian dominant
• If $f_{DD} = 1$, $f_{Dd} = f_{dd} = 0$
  – Disease is Mendelian recessive
• Spontaneous mutations: $f_{dd} > 0$
• Incomplete penetrance: $f_{DD} < 1$

Extending Analysis
• SNPs scattered throughout genome
  – LOD scores for regions, not individual marker
• Multipoint linkage analysis
  – Establish order relationship among 3+ markers
• Non-parametric analysis can be better for complex traits, incomplete penetrance
  – Work with affected siblings
  – Less
statistical power than model-based methods
• Identical by descent (IBD) versus chance

Non-Parametric
• Concerning siblings or other relatives
  – Need “both affected” and “only one affected” pairs
• Correlate shared IBD alleles with affected state, proportion in two classes
  – High correlation means linkage to disease
(Mention T1D)

(Genomewide) Association Studies
• Correlate markers with disease over a large population
• Marker may itself be the disease variant (rare)
• Large regions of chromosome in linkage disequilibrium with disease allele
  – Marker is in disease gene haplotype
• Regions of chromosome tend to be inherited as a unit
  – Tapers off over time due to recombination

Association Studies
• Linkage disequilibrium varies among populations
  – Depends on population structure, age
    • Coalescent
  – Europeans have a lot, African populations only a little
  – Population of human origin is more diverse, older
• Need dense, cheap markers over genome: Genome Wide Association Studies (GWAS)

QTL and GWAS
• Quantitative traits, polygenic traits that are assumed to have additive effects
  – Height, heart disease
  – Quixotic Trait Loci?
• Each gene has a small effect
• Huge genotyping efforts now paying off
• BUT only a small fraction of genetic component is accounted for even in huge studies
  – Tradeoffs of including broader human population

Common Disease versus Rare Variants
• Common disease, common variants: The most frequently occurring alleles/SNPs should explain most of the etiology of a disease.
  – Current studies do NOT show this to be the case.
• Newer paradigm: rare variants
  – Occur less frequently but have larger associations with disease
Sullivan, Daly and O'Donovan, Nature Reviews Genetics, 2012

• Different results in different populations
• Heritability
  – What makes a gene matter to a disease?
  – Take advantage of human phenotyping
  – What genes CAN contribute to disease or modification of disease?
• A golden age of personal genomics?
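
Worked example: Simple LOD Scores and Mapping Functions
Returning to the "Simple LOD Scores", "LOD Graph" and "Mapping Functions" slides, the 27-recombinants-out-of-139-gametes example can be worked through numerically. The sketch below is only an illustration under the simple model on those slides (known phase, full penetrance, a single pair of loci). The recombination fraction is written `theta`, and the function names are made up for this example rather than taken from any linkage package.

```python
import math


def mle_theta(recombinants: int, total: int) -> float:
    """Maximum likelihood estimate of the recombination fraction: R / P."""
    return recombinants / total


def lod_score(recombinants: int, total: int, theta: float) -> float:
    """LOD = log10[ theta^R (1 - theta)^(P - R) / (1/2)^P ].

    Computed on the log scale to avoid underflow for large P.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    log_lik_theta = (recombinants * math.log10(theta)
                     + (total - recombinants) * math.log10(1.0 - theta))
    log_lik_null = total * math.log10(0.5)  # no linkage: theta = 1/2
    return log_lik_theta - log_lik_null


def haldane_morgans(theta: float) -> float:
    """Haldane map distance in Morgans: -(1/2) ln(1 - 2 theta)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * theta)


def kosambi_morgans(theta: float) -> float:
    """Kosambi map distance in Morgans: (1/4) ln[(1 + 2 theta) / (1 - 2 theta)]."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must lie in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


if __name__ == "__main__":
    R, P = 27, 139  # example counts quoted on the LOD Graph slide
    theta_hat = mle_theta(R, P)
    print(f"theta_hat = {theta_hat:.4f}")
    print(f"LOD at theta_hat = {lod_score(R, P, theta_hat):.2f}")

    # LOD over a range of theta values, not just the MLE
    for theta in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
        print(f"theta = {theta:.2f}  LOD = {lod_score(R, P, theta):6.2f}")

    # Map distances in centiMorgans (1 Morgan = 100 cM)
    print(f"Haldane: {100 * haldane_morgans(theta_hat):.1f} cM")
    print(f"Kosambi: {100 * kosambi_morgans(theta_hat):.1f} cM")
```

Under these assumptions the LOD at the estimated recombination fraction is well above the usual threshold of 3, and the Kosambi distance comes out shorter than the Haldane distance because it allows for interference.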

Acknowledgments
• David Pollock, Biochemistry and Molecular Genetics