Mendel-Penetrance Module
  Presenter: Joseph Kim
  Mentors: Dr. Kenneth Lange, Brian Dolan

What is Mendel?
  Software package
  Performs statistical analysis to solve a variety of genetic problems
  http://www.biomath.medsch.ucla.edu/faculty/klange/software.html

Goal
  Beta test Mendel’s new Penetrance module
  Methods:
    Find data pertaining to penetrance
    Plug data into Mendel
    See if results agree with already established results

Penetrance
  Our definition: the statistical relationship between genotype and phenotype; the likelihood of the phenotype given the genotype

Incomplete Penetrance-Example
  Not X-linked (male-to-male transmission)
  Incompletely dominant
  II-1 not affected
  *Color reflects phenotype, not genotype
  http://www.uic.edu/classes/bms/bms655/lesson4.html

Mendel-Penetrance Module
  Statistically models penetrance of alleles using pedigree data
  Outputs parameters of the fitted model, such as μ and σ (normal distribution)

Motivation
  The output of Mendel can be used for finding disease genes by linkage analysis and association analysis
  “Increase power of genetic analysis” (Brian Dolan)
  Mendel can be used to determine who is at risk of being affected by the genetic disease

Why is Mendel Better?
  More versatile statistical models and a better ascertainment correction
  Commercial software assumes that the observations are independent
  Better trait models enable better mapping of disease and trait genes

Background-Likelihood
  $$L = \sum_{G_1} \cdots \sum_{G_n} \prod_{i} \operatorname{Pen}(X_i \mid G_i) \prod_{j} \operatorname{Prior}(G_j) \prod_{\{k,l,m\}} \operatorname{Tran}(G_m \mid G_k, G_l)$$
  L: the likelihood of the pedigree data
  n: number of people
  X_i: phenotype of the i-th person
  G_i: possible genotype of the i-th person
  The product on j is taken over all founders
  The product on {k,l,m} is taken over all parent-offspring triples
  Source: Lange, Kenneth. Mathematical and Statistical Methods for Genetic Analysis

Background-Pen Function
  Contains all parameters to be optimized
  Example: probability density function N(μ, σ)
  http://en.wikipedia.org/wiki/Normal_distribution

Generalized Linear Models (GLM)
  The normal distribution is not sufficient
  Incorporate other GLMs to overcome deficiencies in the normal distribution:
    Binomial
    Poisson
    Exponential
    Gamma
    Inverse Normal
    Lognormal

Background-Prior Function
  The frequencies of genotypes in the population
  Typically incorporates Hardy-Weinberg genotype frequencies
  Assumes different loci are independent
  Ex: for a two-locus trait with alleles A/a and B/b, P(A,b) = P(A)P(b)

Background-Tran Function
  Illustrated with a Punnett square

Optimization
  Maximize L with respect to the parameters
  Only concerned with the parameters in the Penetrance function
  Use Lagrange multipliers to limit the values of the parameters
  Use iterative methods to solve for the parameters
  http://www.ecs.umass.edu/mie/labs/injection/research/process/

Distribution of Phenotypes
  The values in the population fit a continuous distribution
  Courtesy of Dr. Janet Sinsheimer

Distribution of Phenotypes
  Different curves have different parameters
  Mendel will fit the distribution to the given data and report its parameters
  http://en.wikipedia.org/wiki/Normal_distribution

Flowchart of a Mendel run
  1. Read the input files
  2. Initialize the parameters θ0
  3. Calculate L under θm
  4. Find θm+1 that increases L
  5. Repeat steps 3 and 4 until convergence
  6. Write the output files

Mendel Files
  Input files: Control.in, Ped.in, Locus.in, Map.in, Var.in
  Output file: Mendel.out

Mendel.out

What Do the Numbers Mean?
  The parameters define the probability distribution function of the penetrance; they are a property of the penetrance of the trait
  Knowing the parameters allows more accurate results in research that requires knowledge of these properties (i.e. formulas that depend on these values)

Results
  Verified the program using a large pedigree segregating high triglycerides
  Bugs found: 1
    The default scaling factor caused underflow (truncation error), resulting in early termination of the iterations

Acknowledgements
  Dr. Kenneth Lange
  Brian Dolan
  Dr. Janet Sinsheimer
  Lara Bauman
  Dr. Sharp and Dr. Johnston
  Dr. Richard Johnston
  SoCalBSI

Bibliography
  http://www.uic.edu/classes/bms/bms655/lesson4.html
  Sobel E, Papp JC, Lange K. “Detection and integration of genotyping errors in statistical genetics.” Am J Hum Genet. 2002 Feb;70(2):496-508. Epub 2002 Jan 8. PMID: 11791215
  Lange, Kenneth. Optimization. Springer-Verlag NY, LLC. New York: 2004.
  Lange, Kenneth. Mathematical and Statistical Methods for Genetic Analysis. Second Edition. Springer-Verlag New York, Inc. New York: 2002.
  Sinsheimer, Janet. Quantitative Traits slides.
  http://en.wikipedia.org/wiki/Normal_distribution
  http://www.ecs.umass.edu/mie/labs/injection/research/process/
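
Worked sketch of the likelihood and the iteration loop

The likelihood on the Background-Likelihood slide and the "find θm+1 that increases L" loop can be illustrated with a small Python sketch. This is not Mendel's code or interface. It assumes a single biallelic locus with genotypes AA, Aa and aa, a normal penetrance with one mean per genotype, Hardy-Weinberg priors and Mendelian transmission. The names `pen`, `prior`, `tran`, `pedigree_likelihood` and `hill_climb` are hypothetical, the sum over genotypes is done by brute force (only feasible for very small pedigrees), and the simple step-halving search merely stands in for whatever iterative method Mendel actually uses.

```python
import itertools
import math

GENOTYPES = ("AA", "Aa", "aa")  # assumed: one locus with alleles A and a


def pen(x, g, means, sigma):
    """Pen(X_i | G_i): normal density of phenotype x given genotype g."""
    z = (x - means[g]) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def prior(g, p):
    """Prior(G_j): Hardy-Weinberg frequency; p is the frequency of allele A."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}[g]


def tran(child, father, mother):
    """Tran(G_m | G_k, G_l): Mendelian transmission (Punnett square)."""
    fa = father.count("A") / 2.0  # chance the father passes on allele A
    ma = mother.count("A") / 2.0  # chance the mother passes on allele A
    return {
        "AA": fa * ma,
        "Aa": fa * (1.0 - ma) + (1.0 - fa) * ma,
        "aa": (1.0 - fa) * (1.0 - ma),
    }[child]


def pedigree_likelihood(phenotypes, founders, triples, means, sigma, p):
    """Sum over every genotype assignment of Pen * Prior * Tran.

    phenotypes: dict person -> trait value, or None if unobserved
    founders:   list of people without parents in the pedigree
    triples:    list of (father, mother, child) tuples
    """
    people = list(phenotypes)
    total = 0.0
    for assignment in itertools.product(GENOTYPES, repeat=len(people)):
        g = dict(zip(people, assignment))
        term = 1.0
        for person, x in phenotypes.items():
            if x is not None:  # an unobserved phenotype contributes a factor of 1
                term *= pen(x, g[person], means, sigma)
        for person in founders:
            term *= prior(g[person], p)
        for father, mother, child in triples:
            term *= tran(g[child], g[father], g[mother])
        total += term
    return total


def hill_climb(objective, theta, step=1.0, tol=1e-6, max_iter=10_000):
    """Accept any step that increases the objective; halve the step otherwise."""
    best = objective(theta)
    for _ in range(max_iter):
        if step <= tol:
            break
        for candidate in (theta + step, theta - step):
            value = objective(candidate)
            if value > best:
                theta, best = candidate, value
                break
        else:
            step /= 2.0
    return theta, best


# Made-up example data: a father-mother-child trio with a quantitative trait.
trio = {"father": 1.2, "mother": 0.8, "child": 2.9}
means0 = {"AA": 3.0, "Aa": 1.0, "aa": 1.0}


def likelihood_of_aa_mean(mu_aa):
    """L as a function of one penetrance parameter, the mean for genotype aa."""
    means = dict(means0, aa=mu_aa)
    return pedigree_likelihood(trio, ["father", "mother"],
                               [("father", "mother", "child")],
                               means, sigma=0.5, p=0.3)


mu_hat, l_hat = hill_climb(likelihood_of_aa_mean, theta=0.0)
```

Because L is a product of many densities, its raw value shrinks quickly as a pedigree grows. That is the kind of underflow reported on the Results slide, and a common remedy is to rescale the terms or work with log L.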