So You Want to be a Statistician?
Alex Trindade
Associate Professor of Statistics
Department of Mathematics & Statistics
Texas Tech University
What is Statistics?
The Science of learning from data...
[Diagram labels: Population, Sample, Inference]
• Descriptive Statistics (Easy Statistics)
– Graphical and numerical summaries of data.
– Averages; variances; percentiles; etc.
• Ex: mean high temperature in Lubbock on May 8 is 81 degrees.
– Histograms; pie-charts; bar-charts; scatter-plots; etc.
• Ex: histogram of the high temperature in Lubbock on May 8 over the last 100 years.
[Figure: histogram of the May 8 high temperatures. Horizontal axis: Bin (62.9, 69.8, 76.8, 83.7, 90.6, More); vertical axis: Frequency (0 to 20).]
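The descriptive summaries listed above can be sketched in a few lines of Python. This is only an illustration: the temperatures are simulated (the seed, the mean of 81, and the spread of 7 degrees are assumptions, not real Lubbock records), and `histogram_counts` is a hypothetical helper written for this example.

```python
import random
import statistics

# Hypothetical data: 100 simulated May 8 high temperatures (degrees F).
# These are NOT real Lubbock records; seed, mean, and spread are assumptions.
rng = random.Random(8)
highs = [rng.gauss(81, 7) for _ in range(100)]

# Numerical summaries: average, variance, and percentiles.
mean_high = statistics.mean(highs)
var_high = statistics.variance(highs)          # sample variance (n - 1 divisor)
quartiles = statistics.quantiles(highs, n=4)   # 25th, 50th, 75th percentiles


def histogram_counts(values, num_bins):
    """Return (bin_edges, counts) for equal-width bins spanning the data."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    for v in values:
        # The maximum value sits on the last edge; keep it in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return edges, counts


# Graphical summary: a text histogram with five equal-width bins.
edges, counts = histogram_counts(highs, 5)

print(f"mean = {mean_high:.1f}, variance = {var_high:.1f}")
print("quartiles:", [round(q, 1) for q in quartiles])
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * c}")
```

Each row of `#` characters plays the role of one bar in the histogram.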
• Inferential Statistics (Hard Statistics)
– Estimate/Predict unknown quantities from a sample of data.
• Ex: predict this year’s high temperature in Lubbock on May 8, based on the previous 100 years.
• Ex: estimate the location and expression patterns of the gene(s) that determine adult height on the human genome.
– Quantify the uncertainty in these estimates by giving a measure of their variability (margin of error).
• Ex: assess the variability of the prediction of this year’s high temperature in Lubbock on May 8, based on the previous 100 years.
• Ex: assess the variability in the estimate of the location and expression patterns of the gene(s) determining adult height on the human genome.
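As a rough sketch of what "estimate, then quantify the uncertainty" can look like for the temperature example, the code below computes a margin of error for the long-run mean and a wider one for a single new year. It assumes the yearly highs are independent, roughly Normal, and free of any trend, which real climate data may not satisfy; the data are simulated, not real records.

```python
import math
import random
import statistics

# Hypothetical data: 100 simulated past May 8 highs (degrees F), not real records.
rng = random.Random(8)
highs = [rng.gauss(81, 7) for _ in range(100)]

n = len(highs)
mean_high = statistics.mean(highs)
sd_high = statistics.stdev(highs)   # sample standard deviation

z = 1.96  # approximate 95% point of the standard Normal distribution

# Margin of error for the estimated long-run mean high temperature.
moe_mean = z * sd_high / math.sqrt(n)

# Margin of error for predicting one new year's high: wider, because a single
# new observation varies around the mean on top of the estimation error.
moe_new = z * sd_high * math.sqrt(1 + 1 / n)

print(f"estimated mean high: {mean_high:.1f} +/- {moe_mean:.1f}")
print(f"predicted high this year: {mean_high:.1f} +/- {moe_new:.1f}")
```

The point prediction is the same in both cases; only the stated variability differs.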
Why Does Statistics Work?
Largely because of the Central Limit Theorem...
• Sample averages, and many other variables, tend to follow a Normal (or bell-shaped) Distribution.
• This is a universal Law of Randomness.
• The speed of modern computers is only just now reaching the required levels to make sophisticated statistical analyses feasible.
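A small simulation can make the Central Limit Theorem concrete. The sketch below averages draws from a strongly skewed distribution (exponential with mean 1, chosen here purely as an illustrative assumption) and checks that the averages behave like a bell curve: centered at the true mean, with spread close to 1/sqrt(n), and with roughly 95% of them within two standard deviations of the center.

```python
import math
import random
import statistics

rng = random.Random(2024)


def sample_mean(n):
    """Average of n draws from an exponential distribution with mean 1."""
    return statistics.mean(rng.expovariate(1.0) for _ in range(n))


n = 50                                          # size of each sample
means = [sample_mean(n) for _ in range(2000)]   # 2000 repeated samples

center = statistics.mean(means)
spread = statistics.stdev(means)

# Fraction of sample averages within two standard deviations of the center.
within_2sd = sum(abs(m - center) <= 2 * spread for m in means) / len(means)

print(f"center of the averages: {center:.3f} (true mean is 1)")
print(f"spread of the averages: {spread:.3f} (theory: {1 / math.sqrt(n):.3f})")
print(f"fraction within 2 sd:   {within_2sd:.3f}")
```

Even though individual exponential draws are far from bell-shaped, their averages are not.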
What do Statisticians do?
They are present behind the scenes in every field
of scientific endeavor...
• Agricultural & Animal Sciences
– Design field experiments to determine which of 3 varieties of corn
is the most productive.
– Determine, from the analysis of a few samples of meat, what the E. coli contamination rate is for the entire herd.
– Estimate animal abundance in a particular region using
capture/recapture data.
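For the capture/recapture item, one classical approach is the Lincoln-Petersen estimator: mark the animals caught on a first visit, then see what fraction of a second catch is already marked. The sketch below is a minimal version under the usual assumptions (closed population, marks are not lost, every animal is equally likely to be caught); the function name and the numbers are hypothetical, and this is not necessarily the method used in practice for any particular study.

```python
def lincoln_petersen(marked_first, caught_second, recaptured):
    """Estimate population size from a two-visit capture/recapture study.

    marked_first:  animals caught, marked, and released on the first visit
    caught_second: animals caught on the second visit
    recaptured:    animals in the second catch that already carry a mark
    """
    if recaptured == 0:
        raise ValueError("no marked animals recaptured; estimate is undefined")
    # If marked animals make up recaptured / caught_second of the second catch,
    # they are assumed to make up the same share of the whole population.
    return marked_first * caught_second / recaptured


# Hypothetical numbers: 50 marked, 40 caught later, 10 of those already marked.
print(lincoln_petersen(50, 40, 10))  # prints 200.0
```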
• Meteorology/Climatology
– Help climate scientists forecast the weather; build models for
climate change; etc.
• Medical Sciences (Biostatisticians)
– Design clinical trials for newly developed drugs to ascertain their
efficacy and safety.
• Business, Finance, Economics
– Work in insurance & banking industries to develop models that
quantify the risk involved in various activities (driving, lending
money, etc.). (Called actuaries.)
– Formulate models that describe the behavior of the economy and
associated financial instruments (stocks, mutual funds, etc.). (Called
econometricians.)
• Engineering (Quality Control Professionals)
– Determine the reliability of various products/devices:
• Ex: lower bound on failure load of aircraft parts.
• Ex: average life cycle of medical devices.
• Ex: upper & lower acceptability bounds on size of mechanical parts.
– Six Sigma Process.
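One simple way to illustrate upper and lower bounds on part size is the common "mean plus or minus three standard deviations" convention from quality control. The sketch below uses invented diameters and a hypothetical `is_acceptable` helper; real acceptability bounds usually come from engineering specifications or formal tolerance intervals, so treat this as an assumption-laden illustration only.

```python
import statistics

# Hypothetical diameters (mm) of ten machined parts; invented for illustration.
diameters = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.01, 10.00, 9.99]

center = statistics.mean(diameters)
sd = statistics.stdev(diameters)    # sample standard deviation

# Three-sigma limits around the observed average (a convention, not a spec).
lower = center - 3 * sd
upper = center + 3 * sd


def is_acceptable(diameter):
    """Return True if a measured diameter falls inside the three-sigma limits."""
    return lower <= diameter <= upper


print(f"limits: {lower:.3f} mm to {upper:.3f} mm")
print("10.00 mm acceptable?", is_acceptable(10.00))
print("10.20 mm acceptable?", is_acceptable(10.20))
```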
• Genetics/Bioinformatics
– Work in concert with geneticists and computer scientists to unravel
one of the biggest mysteries of our time:
• Where exactly are the genes located on our chromosomes, and what exactly do they do?
• There are approximately 3 billion base pairs and 21,000 genes in the human genome.
• With each gene occupying only a tiny fraction of the genome, this is like finding a needle in a haystack!
Some Projects I’ve Worked On (or Am Working On)
• The Boeing Company (Seattle, Washington)
– World leader in aircraft manufacture.
– Worked with a team of engineers to build statistical models to
predict lower bounds for the failure loads of certain parts.
• Encision, Inc. (Boulder, Colorado)
– Manufactures medical instruments for laparoscopic surgery.
– Quantified the reliability (life-cycle) of a particular device.
• Risk Management and Financial Engineering Lab
(University of Florida, Gainesville)
– Within the department of Industrial Engineering.
– Developed statistical models for assessing financial risk.
• Advanced Laboratory for Radiation Dosimetry
Studies (University of Florida, Gainesville)
– Within the department of Nuclear & Radiological Engineering.
– Developed statistical models for assessing patient-specific
skeletal dosimetry in radiation therapy.
And now, a recent discovery!
How to Become a Statistician?
It’s a long road...
• Bachelor’s Degree (B.S.) in Mathematics
– or another mathematics-intensive discipline such as engineering, economics, physics, chemistry, etc.
– 4+ years beyond high school.
• Master’s Degree (M.S.) in Statistics/Biostatistics
– or in mathematics (for those headed toward becoming a Ph.D.-level statistician).
– 2+ years beyond B.S.
– Work involves mostly programming under the direction of a Ph.D.-level statistician; not too exciting, but the money is good!
• Doctoral Degree (Ph.D.) in Statistics/Biostatistics
– 3+ years beyond M.S.
– Work is varied; involves real discovery at the frontiers of science;
typically fascinating!
Who Employs Statisticians?
A lot of industries...
• Pharmaceutical Companies
– Typically based in large metro areas (NE seaboard).
– Both M.S. and Ph.D. level.
– Median starting salaries as of 2007:
• $70,000 (M.A.); $95,000 (Ph.D.).
• Large Engineering & Business Companies
– Typically based in large metro areas.
– Both M.S. and Ph.D. level.
– Median starting salaries as of 2007:
• $60,000 (M.A.); $80,000 (Ph.D.).
• The Federal Government
– Primarily based in the Washington D.C. area.
– Both M.S. and Ph.D. level.
– Median starting salaries as of 2007:
• $60,000 (M.A.); $90,000 (Ph.D.).
• Colleges & Universities
– Widely distributed.
– Ph.D. level only.
– Median starting salaries as of 2007:
• $85,000 (Biostatistics, 12 months)
• $72,000 (Statistics, 9 months, University)
• $58,000 (Statistics, 9 months, College)
My Biography
• Originally from Europe (Portugal).
• Grew up in South Africa.
• Bachelor’s in Mathematics, 1988, University of Southampton (England).
• Master’s in Mathematics, 1992, University of Oklahoma.
• Programmer, 1993-1995, IBM (Dallas).
• Ph.D. in Statistics, 2000, Colorado State University.
• Assistant Prof., 2000-2007, Dept. of Statistics, University of Florida.
So, Do You Want to be a Statistician?
(www.amstat.org)