Download Two Problems in High-Dimensional Statistics: A Specific One on the Analysis of Gene Function, and a General One on Ranking and Selection

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

RNA-Seq wikipedia , lookup

Polymorphism (biology) wikipedia , lookup

Minimal genome wikipedia , lookup

Epigenetics of human development wikipedia , lookup

Ridge (biology) wikipedia , lookup

Biology and consumer behaviour wikipedia , lookup

Gene expression programming wikipedia , lookup

Designer baby wikipedia , lookup

Gene expression profiling wikipedia , lookup

The Selfish Gene wikipedia , lookup

Group selection wikipedia , lookup

Microevolution wikipedia , lookup

Transcript
Department of Statistics
STATISTICS COLLOQUIUM
MICHAEL NEWTON
Departments of Statistics and of Biostatistics and Medical Informatics
University of Wisconsin-Madison
Two Problems in High-Dimensional Statistics: A Specific One on
the Analysis of Gene Function, and a General One on Ranking
and Selection
MONDAY, April 14, 2014, at 4:00 PM
133 Eckhart Hall, 5734 S. University Avenue
Refreshments following the seminar in Eckhart 110
ABSTRACT
I will investigate two inference problems. The data structure in the first case is a list of genes
identified by some genome-wide assay, and the problem is to characterize the functional content
of the list using known functional properties of all genes. By modeling the gene-list in terms of
latent activities of biological functions, one obtains a compelling multivariate procedure to address
the task. The available method ignores what turns out to be an important constraint in the highdimensional discrete parameter space. I will discuss this constraint and the inference improvements
that follow when the constraint is respected. In the second part of the talk I will discuss the
general problem of ranking and selection. Data form a collection of unit-specific point estimates
coupled with unit-specific quantities that inform the uncertainty in the underlying unit-specific
parameters (e.g., standard errors). When uncertainty varies substantially among units, different
ranking/selection schemes can give very different inferences about the true ranking. I introduce
and investigate a generic empirical Bayesian quantity—the r-value—which leads to rankings that
are optimal in the sense of maximizing the joint probability that a unit’s parameter is among
the top alpha-fraction of parameters and the unit is inferred to be in that top fraction. The
gene-set problem is addressed in a paper with Zhishi Wang, Qiuling He, and Bret Larget. The
ranking/selection findings are part of work with Nicholas Henderson.
For further information and inquiries about building access for persons with disabilities, please contact
Kirsten Wellman at 773.702.8333 or send her an email at [email protected]. If you wish to subscribe to
our email list, please visit the following website: https://lists.uchicago.edu/web/arc/statseminars.