Statistics
January - March 2011
Tim Bunnell, Ph.D., Jobayer Hossain, Ph.D.,
Larry Holmes, Ph.D.
Nemours Bioinformatics Core Facility
Nemours Biomedical Research
Overview
•  Class goals
–  Master basic statistical concepts
–  Learn analytic techniques & when to apply
them
–  Learn how to interpret analysis results
–  Develop familiarity with SPSS & REDCap
–  Gain understanding that will transfer to a
broad range of other statistics tools
Overview
•  Class structure
–  8 sessions
–  1.5 hours per session
–  Several homework assignments
•  Class website
–  http://www.nemoursresearch.org/open/StatClass/January2011
Statistics
Science of collection, presentation,
analysis, and reasonable interpretation
of data.
Types of Data or Variables
•  Variable - any characteristic of an individual or entity.
–  A variable can take different values for different individuals.
–  Variables can be categorical or quantitative. Per S. S. Stevens…
•  Types
–  Nominal - Categorical variables with no inherent order or ranking sequence, such
as names or classes (e.g., gender). Values may be written as numerals, but they
carry no numerical meaning (e.g., I, II, III). The only operation that can be applied
to Nominal variables is enumeration (counting).
–  Ordinal - Variables with an inherent rank or order, e.g. mild, moderate, severe.
Can be compared for equality, or greater or less, but not how much greater or
less.
–  Interval - Values of the variable are ordered as in Ordinal, and additionally,
differences between values are meaningful. However, the scale is not absolutely
anchored (its zero point is arbitrary). Calendar dates and temperatures on the
Fahrenheit scale are examples. Addition and subtraction, but not multiplication
and division, are meaningful operations.
–  Ratio - Variables with all properties of Interval plus an absolute, non-arbitrary
zero point, e.g. age, weight, temperature (Kelvin). Addition, subtraction,
multiplication, and division are all meaningful operations.
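The difference between Interval and Ratio scales can be checked with a little arithmetic. The following is a minimal sketch in Python; the two temperatures and the helper name fahrenheit_to_kelvin are assumptions chosen for illustration and are not part of the original slides.

```python
# Hypothetical temperatures, chosen only to illustrate interval vs. ratio scales.
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert Fahrenheit (interval scale) to Kelvin (ratio scale)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


warm_f, cool_f = 80.0, 40.0

# Differences are meaningful on an interval scale.
gap_f = warm_f - cool_f

# Ratios are not meaningful on an interval scale because its zero is arbitrary:
# 80 F divided by 40 F gives 2.0, yet 80 F is not "twice as hot" as 40 F.
naive_ratio = warm_f / cool_f

# Kelvin has an absolute zero, so the ratio of Kelvin values is meaningful.
kelvin_ratio = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

print(f"Difference in Fahrenheit: {gap_f}")
print(f"Ratio of Fahrenheit values: {naive_ratio:.2f}")
print(f"Ratio of Kelvin values: {kelvin_ratio:.2f}")
```

The Kelvin ratio comes out only slightly above 1, which is the physically meaningful comparison; the factor of 2 is a side effect of where the Fahrenheit zero happens to sit.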
Descriptive & Inferential
Statistics
•  Descriptive Statistics: Summarizing or
characterizing data
–  What is the mean, standard deviation, or distribution (for
quantitative variables such as age, height, or weight).
–  What are the counts, relative frequency, or percentages
(for categorical variables such as gender, eye color, or
disease status).
•  Inferential Statistics: Drawing conclusions about a
population from the immediate (sample) data; predicting.
–  What is the probability that these two groups of subjects
were sampled from the same population?
–  What is the likelihood this child has disease X given the
results of an assay?
–  How large a dose is needed to achieve a specific
response level?
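The descriptive questions above can be answered with a few lines of code. This is a minimal sketch using Python's standard library; the ages and eye colors are invented for illustration and do not come from the class data sets. It covers only the descriptive side, not the inferential questions.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
ages = [4, 7, 7, 9, 13]  # quantitative variable
eye_colors = ["brown", "blue", "brown", "green", "brown"]  # categorical variable

# Quantitative variable: mean and sample standard deviation.
age_mean = mean(ages)
age_sd = stdev(ages)

# Categorical variable: counts and relative frequencies.
counts = Counter(eye_colors)
rel_freq = {color: n / len(eye_colors) for color, n in counts.items()}

print(f"Mean age: {age_mean}, standard deviation: {age_sd:.2f}")
for color, n in counts.items():
    print(f"{color}: count={n}, relative frequency={rel_freq[color]:.0%}")
```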
Population and Sample
•  Population: The entire collection of individuals or
measurements about which information is desired.
•  Sample: A subset of the population selected for study.
–  Primary objective is to create a subset of population
whose center, spread and shape are as close as possible
to that of population.
Parameter vs. Statistic
•  Parameter:
–  Any statistical characteristic of a population.
–  E.g., population mean, median, standard deviation are
parameters.
–  Parameters describe the properties of a population
–  Parameters are usually unknown
–  Parameters are fixed unless the population changes.
Parameter vs. Statistic
•  Statistic:
–  A characteristic of a sample drawn from a population.
–  E.g., sample mean, median, standard deviation are
statistics.
–  Statistics estimate properties (parameters) of a population
–  Statistics are known exactly (because they are calculated
from the sample)
–  Statistics usually vary from one sample to another.
–  Statistics are used for making inferences about parameters
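The contrast between a fixed parameter and a varying statistic can be seen in a small simulation. This is a sketch under stated assumptions: the population is simulated (10,000 hypothetical heights), so its mean can be computed here, whereas in real studies the parameter is usually unknown. The seed, sizes, and variable names are illustrative choices.

```python
import random
from statistics import mean

random.seed(2011)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(120.0, 10.0) for _ in range(10_000)]

# Parameter: fixed as long as the population does not change.
population_mean = mean(population)

# Statistic: computed from each of five simple random samples of size 50.
sample_means = [mean(random.sample(population, 50)) for _ in range(5)]

print(f"Population mean (parameter): {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean (statistic): {m:.2f}")
```

Each sample mean is known exactly once it is calculated, but the five values differ from one another and from the population mean.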
Statistical Inference
[Diagram: a sample drawn from a population]
• Statistical inference is the process by which we acquire
information about populations from samples.
• Two types of estimation for making inferences:
– Point estimation
– Interval estimation
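As an illustration of the two kinds of estimate, the sketch below computes a point estimate of a population mean and an approximate 95% interval around it. The ten measurements are hypothetical, and the interval uses the normal approximation, which is an assumption made here for simplicity; the slides do not specify a method.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of 10 measurements, invented for illustration.
sample = [98.2, 99.1, 97.8, 98.6, 98.9, 98.4, 99.0, 98.1, 98.7, 98.5]

n = len(sample)

# Point estimate: a single best guess for the population mean.
point_estimate = mean(sample)

# Standard error of the mean, based on the sample standard deviation.
standard_error = stdev(sample) / sqrt(n)

# Interval estimate: approximate 95% confidence interval using the normal
# quantile (about 1.96). For a sample this small, a t quantile would give a
# somewhat wider interval.
z = NormalDist().inv_cdf(0.975)
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error

print(f"Point estimate: {point_estimate:.2f}")
print(f"Approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```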
Statistics
Science of collection, presentation,
analysis, and reasonable interpretation
of data.
Science of collection, management,
presentation, analysis, and reasonable
interpretation of data.
REDCap
•  REDCap (Research Electronic Data Capture)
•  A secure, web-based application for building
databases and collecting and managing data
•  Users can create and design their own
databases, enter their data as it is collected,
and export data formatted for import into
common statistical packages (e.g., SPSS,
SAS, STATA, and R)
•  A calendar function allows scheduling of
events and appointments.
REDCap
•  Initially developed and deployed at Vanderbilt
University
•  Now supported by a consortium of a growing
number of partner institutions
•  Citation:
Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde
JG. Research electronic data capture (REDCap)--a metadata-driven
methodology and workflow process for providing translational
research informatics support. J Biomed Inform. 42(2):377-81, 2009.
Why use REDCap?
•  Accessibility
–  Everyone working on the project can get to the
data
–  Data entry can be done from anywhere
•  Security
–  Data is stored on our servers, backed up nightly
–  Users must authenticate to view the data
–  Users can be assigned different levels of
permission such as read-only, add/edit data,
delete records, etc.
–  Users can be restricted from viewing certain forms
and exporting certain fields
Why use REDCap?
•  Versatility
–  Data or data subsets can be easily selected,
sorted and exported in formats directly ready for
analysis in SPSS, SAS, S+/R or even Excel
–  Schedules can be automatically generated from
defined events
•  Validity
–  Using a tool like REDCap to set up a database
and data entry forms for a study is a valuable first
step in being sure that you have covered all the
devilish details that are inherent in designing a
valid research study
REDCap Training
•  Training Videos
–  REDCap overview (watch this before the
next session)
–  Other REDCap basics
–  Types of REDCap databases
–  Special features
•  Website
–  https://www.nemoursresearch.org/redcap/index.php?action=training
REDCap Accounts
•  Accounts
–  Every REDCap user is assigned a unique account
–  Passwords should be kept confidential. Do not
share your password with anyone else.
–  Only sign on and use the system under your own
username and password
–  Always log off the system before leaving the site.
–  You will each receive an email message with your
REDCap username and password
–  There will be two databases that contain data sets
for the class
Statistical Tools & Packages
•  Excel
–  Rudimentary data summarization and analysis.
–  It's everywhere
•  SAS
–  Other extreme from Excel
–  Industrial-strength statistics
–  Programming language + functions
•  STATA
–  More academic background than SAS
–  Still quite powerful and less expensive
•  R
–  Freely available, open source statistical programming environment
–  Excellent graphics capabilities
–  Extensive library of functions and packages with online support community
–  Ghastly learning curve!
•  SPSS
–  Statistical Package for the Social Sciences
–  Grown into a well-supported and fairly user friendly package
–  Our tool of choice for user-generated statistical analyses in Nemours
SPSS
•  Available from Citrix Metaframe Server
–  External: connect via Connect2
–  Internal: use the Metaframe desktop link
•  Everyone has permission to use it
•  Limit of 17 licenses total for all of Nemours
•  Limit of 10 licenses for this class