Intro to Statistics - Phillips Scientific Methods

Statistics
The Baaaasics
“For most biologists, statistics is just a useful tool, like a
microscope, and knowing the detailed mathematical basis of a
statistical test is as unimportant to most biologists as knowing
which kinds of glass were used to make a microscope lens.”
–John McDonald, Biological Data Analysis Professor, University of Delaware
So Why Stats?
• Useful for drawing conclusions about an entire population
by sampling a smaller group of acceptable size.
• Useful for converting raw data into tables or graphs that are
quick and easy to decipher.
• Understanding how data are used can help you determine whether
a source is reliable and trustworthy (e.g., “Studies show...”
Can we trust what this says? And yet we conform so much
of our lives to these “studies.”).
What Do I Need To Know About Stats?
• Descriptive Statistics – “the analysis of data that helps
describe, show or summarize data in a meaningful way such
that, for example, patterns might emerge from the data.”
• It is NOT used to draw conclusions beyond the data
collected (i.e., it cannot be extrapolated to make claims about
the whole population if the whole population was not
measured).
• Inferential Statistics – makes inferences about a population
using data drawn from a sample of that population.
Descriptive Statistics
Mean
Median
Mode
Range
Standard Deviation
Measures of Confidence: Standard Error & 95% Confidence Intervals
Descriptive Statistics in a Nutshell
• Collect Data (e.g., conduct a survey)
• Summarize Data (mean, median, mode, standard deviation, error
bars, etc.)
• Present Data (table or graph)
Raw Data Table
Graphical Representation
Graphs usually plot the mean or
median of the raw data set.
By doing this, it is much easier to
identify patterns or to see if there is a
normal distribution in the population.
3Ms: MEAN, Median, Mode
• Mean (aka average).
• In biology, we want to know the mean of the entire
population (μ), but we use the sample mean (x̄)
as an estimate of μ.
• Most frequently used of the 3Ms.
• Use it when you have a typical data set (no outliers).
• How to calculate x̄:
1. Find the sum of all the data values.
2. Count the number of data values.
3. Divide the sum by the number of data values.
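The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name sample_mean and the height measurements are made up for the example and do not come from the slides.

```python
def sample_mean(values):
    """Return the mean (x-bar) of a non-empty list of numbers."""
    if not values:
        raise ValueError("need at least one data value")
    total = sum(values)   # Step 1: find the sum of all the data values
    count = len(values)   # Step 2: count the number of data values
    return total / count  # Step 3: divide the sum by the count


# Hypothetical measurements, invented for illustration only
heights_cm = [150, 160, 170]
mean_height = sample_mean(heights_cm)
print(mean_height)
```

In practice, Python's built-in statistics.mean does the same job.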
3Ms: Mean, MEDIAN, Mode
• Median – when the data are ordered from smallest to largest,
the median is the midpoint of the data.
• The median is great to use when your data contain significant
outliers that can skew the mean.
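• For example:
27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163
• What would be the average?
x̄ = (27+27+28+28+29+31+32+33+33+33+34+35+163) / 13 = 533 / 13 = 41
• Only one of the 13 values (163) is above this mean; the median, 32,
describes a typical value better.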
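To check this example, here is a minimal sketch using Python's standard statistics module. The variable names are chosen for illustration. The last step, dropping the outlier, is an extra comparison that is not on the slide.

```python
import statistics

# Data set from the slide, already ordered from smallest to largest
data = [27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163]

mean_value = statistics.mean(data)      # pulled upward by the outlier 163
median_value = statistics.median(data)  # middle (7th) of the 13 values

# Extra comparison (not on the slide): the mean with the outlier removed
mean_without_outlier = statistics.mean(data[:-1])

print("mean:", mean_value)
print("median:", median_value)
print("mean without 163:", round(mean_without_outlier, 1))
```

The median barely moves when 163 is removed, while the mean drops by about 10. That is why the median is preferred for data with outliers.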
3Ms: Mean, Median, MODE
• Mode is the value that appears the most often.
• It is not frequently used, but can be useful for categorical data
where the user wants to know which category is the most common.
27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163
Mode = 33
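A short sketch of finding the mode with Python's standard statistics module. The numeric data are from the slide. The hair-color list is a hypothetical categorical example invented here for illustration.

```python
import statistics

# Numeric data from the slide: 33 appears three times, more than any other value
data = [27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163]
mode_value = statistics.mode(data)

# Hypothetical categorical data (not from the slides)
hair_colors = ["brown", "black", "brown", "blond", "brown", "black"]
most_common_color = statistics.mode(hair_colors)

print("mode of data:", mode_value)
print("most common hair color:", most_common_color)
```

Note that the mode works on categories, where a mean or median would have no meaning.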
Range
• Spans from the smallest measurement to the largest (largest minus
smallest).
• The larger the range, the greater the variability.
• However, extremely small or large values (outliers) can make
the variability appear much higher than it is for most of the data.
• The standard deviation is a more accurate representation of
the spread of the data; it will be discussed further when
the semester begins. Make sure you have watched the
Bozeman video on this!
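As a rough sketch, the range and standard deviation of the earlier example data can be computed as below. One assumption to flag: statistics.stdev computes the sample standard deviation (dividing by n - 1). The slides do not say which version the course uses.

```python
import statistics

# Example data set used earlier in these slides
data = [27, 27, 28, 28, 29, 31, 32, 33, 33, 33, 34, 35, 163]
trimmed = data[:-1]  # the same data with the outlier 163 removed

# Range = largest measurement minus smallest measurement
data_range = max(data) - min(data)
trimmed_range = max(trimmed) - min(trimmed)

# Sample standard deviation (n - 1 in the denominator): an assumption here
sample_sd = statistics.stdev(data)
trimmed_sd = statistics.stdev(trimmed)

print("range with outlier:", data_range)
print("range without outlier:", trimmed_range)
print("standard deviation with outlier:", round(sample_sd, 1))
print("standard deviation without outlier:", round(trimmed_sd, 1))
```

A single outlier stretches the range from 8 to 136. The standard deviation also grows when the outlier is included, so it is worth checking for outliers before summarizing spread.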
WARNING!
You will have a short quiz on this information, and/or it will
be on your first exam. Make sure you know the difference between
the 3Ms and the max vs. min (i.e., range).
We will take some measurements on the 1st or 2nd day of the
semester, and you will apply this information, along with graphing.
Additional Slides
(need to know)
Population vs. Sample Size
Qualitative vs. Quantitative
Discrete vs. Continuous
Population vs. Sample Size
Qualitative vs. Quantitative Data
• Qualitative data are measurements that each fall into one of several
categories (hair color, ethnic group, and other attributes of the population).
• Qualitative data are generally described by words or letters.
• They are not as widely used as quantitative data because many numerical
techniques do not apply to qualitative data.
• Quantitative data are observations that are measured on a numerical scale
(distance traveled to college, number of children in a family, etc.).
• Quantitative data are always numbers and are the result of counting or
measuring attributes of a population.
• Quantitative data can be separated into two subgroups: discrete & continuous.
Discrete vs. Continuous Data
o Discrete - if it is the result of counting (e.g., the
number of students of a given ethnic group in a
class, the number of books on a shelf, ...)
o Continuous - if it is the result of measuring
(e.g., distance traveled, weight of luggage, …)
Discrete Data
• This is called discrete data because
the units of measurement (for
example, CDs) cannot be split up;
there is nothing between 1 CD and
2 CDs.
• Shoe sizes are a classic example of
discrete data, because sizes 39 and
40 mean something, but size 39.2,
for example, does not.
Continuous Data
• The data set shows a group of continuous data.
• These data are called continuous because the scale of measurement -
distance - has meaning at all points between the numbers given; e.g., we
can travel a distance of 1.2, 1.85, or even 1.632 miles.