Statistics for Biology
Descriptive Statistics
[Figure: histogram of repeated measurements with a normal distribution curve overlaid; the y-axis shows the number of times each value occurs, the x-axis shows the values, and the 95% CI is marked either side of the mean]

Repeated measurements in biology are rarely identical, due to random errors and natural variation. If enough measurements are repeated they can be plotted on a histogram, like the one shown here. This usually shows a normal distribution, with most of the repeats close to some central value. Many biological phenomena follow this pattern: e.g. people's heights, the number of peas in a pod, the breathing rate of insects, etc.
The central value of the normal distribution
curve is the mean (also known as the arithmetic
mean or average). But how reliable is this
mean? If the data are all close together, then the
mean is probably good, but if they are scattered
widely, then the calculated mean may not be
very reliable. The width of the normal
distribution curve is given by the standard
deviation (SD), and the larger the SD, the less reliable the data. For comparing different sets of data, a better
measure is the 95% confidence interval (CI). This is derived from the SD. The purpose of taking a random
sample from a lot or population and computing a statistic, such as the mean from the data, is to approximate the
true mean of the population. A 95% confidence interval provides a range of values which is likely to contain
the population parameter of interest (in this case, the average) in approximately 95% of the cases. You can be
pretty confident that the real mean lies somewhere in this range. Whenever you calculate a mean you should
also calculate a confidence limit to indicate the quality of your data.
[Figure: two normal curves compared. Narrow curve: small confidence limit, low variability, data close together, mean is reliable. Wide curve: large confidence limit, high variability, data scattered, mean is unreliable]
In Excel the mean is calculated using the formula =AVERAGE(range), the SD is calculated using =STDEV(range), and the 95% CI is calculated using =CONFIDENCE(0.05, STDEV(range), COUNT(range)).
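If you want to check these spreadsheet formulas outside Excel, here is a minimal Python sketch of the same calculations. The data values are invented, and 1.96 is the normal-distribution multiplier that the CONFIDENCE function uses for a 95% interval:

    import math
    from statistics import mean, stdev

    data = [5.1, 4.8, 5.3, 5.0, 4.9]  # hypothetical repeated measurements

    m = mean(data)    # arithmetic mean, like =AVERAGE(range)
    sd = stdev(data)  # sample standard deviation, like =STDEV(range)
    # 95% CI half-width, like =CONFIDENCE(0.05, STDEV(range), COUNT(range))
    ci = 1.96 * sd / math.sqrt(len(data))

    print(f"mean = {m:.2f}, SD = {sd:.2f}, 95% CI = ±{ci:.2f}")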
This spreadsheet shows two sets of data with the
same mean. In group A the confidence interval is
small compared to the mean, so the data are
reliable and you can be confident that the real
mean is close to your calculated mean. But in
group B the confidence interval is large
compared to the mean, so the data are unreliable,
as the real mean could be quite far away from
your calculated mean. Note that Excel displays calculated results to many decimal places by default. This spurious precision is meaningless, and cells with calculated results should always be formatted to a more sensible precision, usually 2 decimal places (Format menu > Cells > Number tab > Number).
see also: https://explorable.com/statistics-tutorial
Plotting Data
Once you have collected data you will want to plot a graph or chart to show trends or relationships clearly. With
a little effort, Excel produces very nice charts. First enter the data you want to plot into two columns (or rows)
and select them.
Drawing the Graph. Click on the chart wizard button. This has four steps:
1. Graph Type. For a bar graph choose Column and for a scatter graph (also known as a line graph) choose
XY(Scatter) then press Next. Do not choose Line.
2. Source Data. If the sample graph looks OK, just hit Next. If it looks wrong you can correct it by clicking on
the Series tab, then the red arrow in the X Values box, then highlight the cells containing the X data on the
spreadsheet. Repeat for the Y Values box.
3. Chart Options. You can do these now or change them later, but you should at least enter suitable titles for the
graph and the axes and probably turn off the gridlines and legend.
4. Graph Location. Just hit Finish. This puts the chart beside the data so you can see both.
Changing the Graph. Once you have drawn the graph, you can now change any aspect of it by double-clicking
(or sometimes right-clicking) on the part you want to change. For example you can:
• move and re-shape the graph
• change the background colour (white is usually best!)
• change the shape and size of the markers (dots)
• change the axes scales and tick marks
• add a trend line or error bars (see below)
Lines. To draw a straight "line of best fit" right click on a point, select Add Trendline, and choose linear. In the
option tab you can force it to go through the origin if you think it should, and you can even have it print the line
equation if you are interested in the slope or intercept of the trend line. If instead you want to "join the dots"
(and you rarely should) double-click on a point and set the line to automatic.
Error bars. These are used to show the confidence intervals on the graph. You must already have entered the
95% confidence limits on the spreadsheet beside the X and Y data columns. Then double-click on the points on
the graph to get the Format Data Series dialog box and choose the Y Error Bars tab. Click on the red arrow in
the Custom + box, and highlight the range of cells containing your confidence limits. Repeat for the Custom − box.
The standard error of the mean (SEM) is calculated by dividing the standard deviation by the square root of the number of measurements that make up the mean (often represented by N). In this case, 5 measurements were made (N = 5), so the standard deviation is divided by the square root of 5. Because the standard deviation is divided by the square root of N, the standard error grows smaller as the number of measurements (N) grows larger. This reflects the greater confidence you have in your mean value as you make more measurements. You can make use of the square root function, SQRT, in calculating this value. As a rule of thumb, 2 × SEM ≈ 95% CI.
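The same arithmetic can be sketched in a few lines of Python (the measurements below are hypothetical):

    import math
    from statistics import mean, stdev

    measurements = [22.1, 23.4, 21.8, 22.9, 23.0]  # hypothetical, N = 5

    sd = stdev(measurements)
    sem = sd / math.sqrt(len(measurements))  # SEM = SD / sqrt(N)

    print(f"mean = {mean(measurements):.2f}, SEM = {sem:.2f}")
    print(f"approximate 95% CI = ±{2 * sem:.2f}")  # rule of thumb: 2 x SEM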
The standard deviation is a measure of the fluctuation of a set of data about the sample mean.
The SEM is an estimate of the fluctuation of a sample mean about the "true" population mean.
The error bars are attempting to give a range of plausible values for the population mean (at
that point in time), given the fact that sample means will fluctuate from sample to sample.
"The standard deviation is a measure of the variability within a sample population, and
that SEM is a measure of how well that sample population represents the whole
population."
Problems
1. Here are the results of an investigation into the rate of photosynthesis in the pond weed Elodea. The number of bubbles given off in one minute was counted under different light intensities, and each measurement was repeated 5 times. Use Excel to calculate the means, standard deviations and 95% confidence limits of these results, then plot a graph of the mean results with error bars and a line of best fit.

light intensity (Lux)   repeat 1   repeat 2   repeat 3   repeat 4   repeat 5
0                           5          2          0          2          1
500                        12          4          5          8          7
1000                        7         20         18         14         24
2000                       42         25         31         14         38
3500                       45         40         36         50         28
5000                       65         54         72         58         36
There is a bewildering variety of statistical tests available, and it is important to choose the right one. The decision guide below will help you to decide which statistical test to use, and the tests are described in detail on the following pages.
Start here: what kind of data?

Measurements: what kind of test?

  Testing for a relation between 2 sets (plot a scatter graph):
    Testing for a correlation:
      • normal data: Pearson correlation coefficient
        =CORREL(range 1, range 2)
        0 = no correlation, 1 = perfect correlation
      • non-normal data: Spearman correlation coefficient
        =CORREL(range 1, range 2) on ranks of data
        0 = no correlation, 1 = perfect correlation
    Finding how one factor affects another:
      • Linear regression. Add Trendline to graph and Display Equation.
        Gives slope and intercept of line.

  Testing for a difference between sets (calculate mean and 95% CI from replicates and plot a bar graph):
    • 2 sets, same individuals: paired t-test
      =TTEST(range1, range2, 2, 1)
      If P < 5% then significant difference; if P > 5% then no significant difference.
    • 2 sets, different individuals: unpaired t-test
      =TTEST(range1, range2, 2, 2)
      If P < 5% then significant difference; if P > 5% then no significant difference.
    • more than 2 sets: ANOVA
      Tools menu > Data analysis > Anova
      If P < 5% then significant difference; if P > 5% then no significant difference.

Frequencies (counts): what kind of test?

  • Comparing observed counts to a theory: χ² test
    =CHITEST(obs range, exp range)
    If P < 5% then disagree with theory; if P > 5% then agree with theory.
  • Testing for a difference between counts: χ² test
    =CHITEST(obs range, exp range)
    If P < 5% then significant difference; if P > 5% then no significant difference.
  • Testing for an association between groups of counts: χ² test for association
    =CHITEST(obs range, exp range)
    If P < 5% then significant association; if P > 5% then no significant association.
Statistics to Test for a Correlation
Correlation statistics are used to investigate an association between two factors such as age and height;
weight and blood pressure; or smoking and lung cancer. After collecting as many pairs of measurements as
possible of the two factors, plot a scatter graph of one against the other. If both factors increase together then
there is a positive correlation, or if one factor decreases when the other increases then there is a negative
correlation. If the scatter graph has apparently random points then there is no correlation.
[Figure: three scatter graphs of variable 2 against variable 1, illustrating Positive Correlation, Negative Correlation and No Correlation]
There are two statistical tests to quantify a correlation: the Pearson correlation coefficient (r), and
Spearman's rank-order correlation coefficient (rs). These both vary from +1 (perfect correlation) through
0 (no correlation) to –1 (perfect negative correlation). If your data are continuous and normally-distributed
use Pearson, otherwise use Spearman. In both cases the larger the absolute value (positive or negative), the stronger, or more significant, the correlation. Values greater than 0.8 are very significant, values between 0.5 and 0.8 are probably significant, and values less than 0.5 are probably insignificant.
In Excel the Pearson coefficient r is calculated using the formula: =CORREL (X range, Y range) . To
calculate the Spearman coefficient rs, first make two new columns showing the ranks (or order) of the two
sets of data, and then calculate the Pearson correlation on the rank data. The highest value is given a rank of
1, the next highest a rank of 2 and so on. Equal values are given the same rank, but the next rank should
allow for this (e.g. if there are two values ranked 3, then the next value is ranked 5).
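If you prefer to check these coefficients outside Excel, here is an illustrative Python sketch using scipy. The paired measurements are invented, and scipy's spearmanr handles the ranking (including tied values) for you:

    from scipy import stats

    # hypothetical paired measurements of two factors
    x = [11.2, 12.5, 10.8, 13.1, 12.0, 11.7, 13.4, 12.8]
    y = [12.0, 13.1, 11.5, 14.0, 12.6, 12.2, 14.3, 13.5]

    r, p_r = stats.pearsonr(x, y)     # Pearson r, like =CORREL(X range, Y range)
    rs, p_rs = stats.spearmanr(x, y)  # Spearman rs, computed on ranks

    print(f"Pearson r = {r:.2f} (P = {p_r:.3f})")
    print(f"Spearman rs = {rs:.2f} (P = {p_rs:.3f})")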
In this example the size of breeding pairs
of penguins was measured to see if there
was a correlation between the sizes of the
two sexes. The scatter graph and both
correlation coefficients clearly indicate a
strong positive correlation. In other words
large females do pair with large males. Of
course this doesn't say why, but it shows
there is a correlation to investigate further.
Linear Regression to Investigate a Causal Relationship.
If you know that one variable causes the changes in the other variable, then there is a causal relationship. In
this case you can use linear regression to investigate the relation in more detail. Regression fits a straight
line to the data, and gives the values of the slope and intercept of that line (m and c in the equation y = mx +
c).
The simplest way to do this in Excel is to plot a scatter graph
of the data and use the trendline feature of the graph. Right-click on a data point on the graph, select Add Trendline, and
choose Linear. Click on the Options tab, and select Display
equation on chart. You can also choose to set the intercept to
be zero (or some other value). The full equation with the slope and intercept values is now shown on the chart.
**You can also do this in Logger Pro (Analyze > Linear Fit) and look at the Correlation value.
In this example the absorption of a yeast cell suspension is plotted against its cell concentration from a cell
counter. The trendline intercept was fixed at zero (because 0 cells have 0 absorbance), and the equation on
the graph shows the slope of the regression line.
The regression line can be used to make quantitative predictions. For example, using the graph above, we could predict that a cell concentration of 9 × 10⁷ cells per cm³ would have an absorbance of 1.37 (9 × 0.152).
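A rough Python equivalent of fixing the trendline intercept at zero is sketched below. The calibration data are invented, but chosen so the fitted slope comes out near the 0.152 quoted above:

    import numpy as np

    # hypothetical calibration: concentration (1e7 cells per cm3) vs absorbance
    conc = np.array([1.0, 2.0, 4.0, 6.0, 8.0])
    absorbance = np.array([0.16, 0.30, 0.62, 0.90, 1.22])

    # least-squares fit of y = m*x with no intercept (line forced through origin)
    m = np.linalg.lstsq(conc.reshape(-1, 1), absorbance, rcond=None)[0][0]

    print(f"slope = {m:.3f}")
    print(f"predicted absorbance at 9e7 cells/cm3 = {9 * m:.2f}")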
Correlation Coefficient, r
The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear
relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product-moment correlation coefficient.
The larger the absolute value (positive or negative), the stronger, or more significant, the correlation. Values greater than 0.8 are very significant, values between 0.5 and 0.8 are probably significant, and values less than 0.5 are probably insignificant.
Coefficient of Determination, r² or R²
The coefficient of determination is such that 0 ≤ r² ≤ 1, and denotes the strength of the linear association between x and y.
The coefficient of determination represents the proportion of the variation in y that is explained by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
The coefficient of determination is a measure of how well the regression line represents the data. If the
regression line passes exactly through every point on the scatter plot, it would be able to explain all of the
variation. The further the line is away from the points, the less it is able to explain.
http://mathbits.com/MathBits/TISection/Statistics2/correlation.htm
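To see r and r² together, one illustrative option is scipy's linregress function (the data below are made up and roughly linear):

    from scipy import stats

    x = [0, 1, 2, 3, 4, 5]
    y = [0.1, 0.9, 2.2, 2.8, 4.1, 4.9]  # hypothetical roughly linear data

    result = stats.linregress(x, y)
    print(f"r = {result.rvalue:.3f}, r^2 = {result.rvalue ** 2:.3f}")
    print(f"slope = {result.slope:.3f}, intercept = {result.intercept:.3f}")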
T-Test to Compare Two Sets of Data
Another common form of data analysis is to compare two sets of measurements to see if they are the same or
different. For example are plants treated with fertiliser taller than those without? If the means of the two sets
are very different, then it is easy to decide, but often the means are quite close and it is difficult to judge
whether the two sets are the same or are significantly different. To compare two sets of data use the t-test,
which tells you the probability (P) that there is no difference between the two sets. This is called the null
hypothesis.
P varies from 0 (impossible) to 1 (certain). The higher the probability, the more likely it is that the two sets
are the same, and that any differences are just due to random chance. The lower the probability, the more
likely it is that the two sets are significantly different, and that any differences are real. Where do you
draw the line between these two conclusions? In biology the critical probability is usually taken as 0.05 (or
5%). This may seem very low, but it reflects the fact that biology experiments are expected to produce quite
varied results. So if P > 5% then the two sets are the same (i.e. fail to reject the null hypothesis), and if
P < 5% then the two sets are different (i.e. reject the null hypothesis). For the t test to work, the number of
repeats should be at least 5.
In Excel the t-test is performed using the formula: =TTEST (range1, range2, tails, type) . For the examples
you'll use in biology, tails is either 1 or 2 (for a "one-tailed" or "two-tailed" test). You need to decide which
of the following types of effect you expect to find:
• the first mean to be larger than the second
• the first mean to be smaller than the second
• the first mean to be different from the second in either direction
The tail is the extreme end of the distribution of the data, and your experiment can be one of two types:
• One-tailed tests expect the effect to be in a certain direction, so the first two points above are examples of one-tailed experiments.
• Two-tailed tests are used when you have no idea which sample will be larger than the other, but you are looking for any difference. The third point above is such a case.
If you have stated your experimental hypothesis with care, it will tell you which type of effect you are looking for. For example, the hypothesis that "Coffee improves memory" is one-tailed because you expect an improvement. Testing whether the pH of a stream changed from the previous year suggests a two-tailed test as no direction is implied. So remember, don't be vague with your hypothesis if you are looking for a specific effect. If unsure, use a two-tailed test so you don't miss something!
Type can be either 1 for a paired test (where the two sets of data are from the same individuals), or 2 for an
unpaired test (where the sets are from different individuals). The cell with the t test P should be formatted as
a percentage (Format menu > cell > number tab > percentage). This automatically multiplies the value by
100 and adds the % sign. This can make P values easier to read and understand. It’s also a good idea to plot
the means as a bar chart with error bars to show the difference graphically.
In the first example the yield of potatoes in 10 plots treated with one fertiliser was compared to that in 10
plots treated with another fertiliser. Fertiliser B delivers a larger mean yield, but the unpaired t-test P shows
that there is an 8% probability that this difference is just due to chance. Since this is >5% we fail to reject the
null hypothesis that there is no significant difference between the two fertilizers.
In the second example the pulse rate of 8 individuals was measured before and after eating a large meal. The
mean pulse rate is certainly higher after eating, and the paired t-test P shows that there is only a tiny 0.005%
probability that this difference is due to chance, so the pulse rate is significantly higher after a meal and we
reject the null hypothesis.
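The same two comparisons can be sketched in Python with scipy. All numbers below are invented stand-ins for the spreadsheet data described above:

    from scipy import stats

    # hypothetical unpaired data: yields under two fertilisers (different plots)
    yield_a = [4.2, 3.9, 4.5, 4.1, 3.8, 4.4, 4.0, 4.3, 3.7, 4.6]
    yield_b = [4.6, 4.4, 4.9, 4.3, 4.1, 4.8, 4.5, 4.7, 4.2, 5.0]
    t, p = stats.ttest_ind(yield_a, yield_b)  # like =TTEST(range1, range2, 2, 2)
    print(f"unpaired: P = {p:.3f}")  # reject the null hypothesis if P < 0.05

    # hypothetical paired data: pulse rate before and after a meal (same people)
    before = [68, 72, 65, 70, 74, 66, 71, 69]
    after = [74, 78, 70, 77, 80, 73, 76, 75]
    t, p = stats.ttest_rel(before, after)  # like =TTEST(range1, range2, 2, 1)
    print(f"paired: P = {p:.5f}")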
http://www.gla.ac.uk/sums/users/jdbmcdonald/PrePost_TTest/chooset1.html
The outcome of a significance test is a probability, referred to as a P value.
First, let’s be clear what the P value means. It will be simpler to do that in the
context of a particular example. Suppose we wish to know whether treatment
A is better (or worse) than treatment B (A might be a new drug, and B a
placebo). We’d take a group of people and allocate each person to take either
A or B and the choice would be random. Each person would have an equal
chance of getting A or B. We’d observe the responses and then take the
average (mean) response for those who had received A and the average for
those who had received B. If the treatment (A) was no better than placebo
(B) (that would be our null hypothesis), the difference between means should
be zero on average. But the variability of the responses means that the
observed difference will never be exactly zero. So how big does it have to be before you discount the possibility that random chance is all you were seeing? You do the test and get a P value.
The P value is the probability that you would find a difference as
big as that observed, or a still bigger value, if in fact A and B
were identical.
If this probability is low enough, the conclusion would be that it’s unlikely that
the observed difference (or a still bigger one) would have occurred if A and B
were identical, so we conclude that they are not identical, i.e. that there is a
genuine difference between treatment and placebo (you reject the null hypothesis and offer an alternative hypothesis: that the drug caused the difference).
http://www.dcscience.net/?p=6518
ANOVA to Compare >2 sets of Data
The t test is limited to comparing two sets of data, so to compare many groups at once you need analysis of
variance (ANOVA). From the Excel Tools menu select Data Analysis then ANOVA Single Factor. This
brings up the ANOVA dialogue box, shown here.
• Enter the Input Range by clicking in the box then selecting the range of cells containing the data, including the headings.
• Check that the columns/rows choice is correct (this example is in three columns), and click Labels in First Row if you have included these. The column headings will appear in the results table.
• Leave Alpha at 0.05 (for the usual 5% significance level).
• Click in the Output Range box and click on a free cell on the worksheet, which will become the top left cell of the 8 x 15-cell results table.
• Finally press OK.
The output is a large data table, and you may
need to adjust the column widths to read it
all. At this point you should plot a bar graph
using the averages column for the bars and
the variance column for the error bars.
The most important cell in the table is the P value, which as usual is the probability that
the null hypothesis (that there is no
difference between any of the data sets) is
true. This is the same as a t-test probability,
and in fact if you try ANOVA with just two
data sets, it returns the same P as a t test. If P
> 5% then there is no significant difference
between any of the data sets (i.e. the null
hypothesis is true), but if P < 5% then at
least one of the groups is significantly
different from the others.
In the example on this page, which concerns the grain yield from three different varieties of wheat, P is
0.14%, so is less than 5%, so there is a significant difference somewhere. The problem now is to identify
where the difference lies. This is done by examining the variance column in the summary table. In this
example, varieties 2 and 3 are very similar, but variety 1 is obviously the different one. So the conclusion
would be that variety 1 has a significantly lower yield than varieties 2 and 3.
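For comparison, a one-way ANOVA outside Excel can be sketched with scipy. The yields below are invented, not the wheat data from this page:

    from scipy import stats

    # hypothetical grain yields for three wheat varieties
    variety1 = [3.1, 2.9, 3.3, 3.0, 2.8]
    variety2 = [4.0, 4.2, 3.9, 4.1, 4.3]
    variety3 = [4.1, 3.9, 4.2, 4.0, 4.4]

    # single-factor (one-way) ANOVA, like Tools menu > Data Analysis > Anova
    f, p = stats.f_oneway(variety1, variety2, variety3)
    print(f"F = {f:.2f}, P = {p:.4f}")  # P < 0.05: at least one group differs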
Download: http://www.analystsoft.com/en/products/statplusmac/
Chi-squared Test for Frequency Data
Sometimes the data from an experiment are not measurements but counts (or frequencies) of things, such as
counts of different phenotypes in a genetics cross, or counts of species in different habitats. With frequency
data you can’t usually calculate averages or do a t test, but instead you do a chi-squared (χ²) test. This
compares observed counts with some expected counts and tells you the probability (P) that there is no
difference between them. In Excel the χ² test is performed using the formula: =CHITEST(observed range, expected range). There are three different uses of the test depending on how the expected data are calculated.
Sometimes the expected data can be calculated from a quantitative theory, in which case you are testing
whether your observed data agree with the theory. If P < 5% then the data do not agree with the theory,
and if P > 5% then the data do agree with the theory. A good example is a genetic cross, where Mendel’s
laws can be used to predict frequencies of different
phenotypes. In this example Excel formulae are used to
calculate the expected values using a 3:1 ratio of the
total number of observations. The χ² P is 53%, which
is much greater than 5%, so the results do indeed
support Mendel’s law. Incidentally a very high P
(>80%) is suspicious, as it means that the results are
just too good to be true.
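A minimal Python sketch of the same kind of test against a 3:1 ratio follows. The phenotype counts are invented, not Mendel's:

    from scipy import stats

    observed = [145, 55]  # hypothetical counts of two phenotypes
    total = sum(observed)
    expected = [total * 3 / 4, total * 1 / 4]  # expected under a 3:1 ratio

    # goodness-of-fit chi-squared test, like =CHITEST(obs range, exp range)
    chi2, p = stats.chisquare(observed, f_exp=expected)
    print(f"chi2 = {chi2:.2f}, P = {p:.1%}")  # P > 5%: data agree with theory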
Other times the expected data are calculated by assuming that the counts in all the categories should be the same, in which case you are testing whether there is a difference between the counts. If P < 5% then the counts are significantly different from each other, and if P > 5% then there is no significant difference between the counts. In the example above the sex of children born in a hospital over a period of time is compared. The expected values are calculated by assuming there should be equal numbers of boys and girls, and the χ² P of 6.4% is greater than 5%, so there is no significant difference between the sexes.
If the count data are for categories in two groups, then the expected data can be calculated by assuming that the two groups are independent. If P < 5% then there is a significant association between the two groups, and if P > 5% then the two groups are independent. Each group can have counts in two or more categories, and the observed frequency data are set out in a table, called a contingency table. A copy of this table is then made for the expected data, which are calculated for each cell from the corresponding totals of the observed data, using the formula E = column total × row total / grand total. In this example the flow rate of a stream (the two categories fast / slow) is compared to the type of stream bed (the four categories weed-choked / some weeds / shingle / silt) at 50 different sites to see if there is an association between them. The χ² P of 1.1% is less than 5%, so there is an association between flow rate and stream bed.
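A sketch of the association test in Python is below. The 50 invented sites mimic the fast/slow versus stream-bed layout, and scipy derives the expected table from the row and column totals exactly as described above:

    import numpy as np
    from scipy import stats

    # hypothetical contingency table: rows = fast/slow flow,
    # columns = weed-choked / some weeds / shingle / silt
    observed = np.array([
        [ 3,  8,  9,  5],   # fast flow
        [12,  7,  2,  4],   # slow flow
    ])

    # E = column total x row total / grand total is computed internally
    chi2, p, dof, expected = stats.chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, P = {p:.1%}")  # P < 5%: significant association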
Problems
1. In a test of two drugs 8 patients were given one drug and 8 patients another drug. The number of hours of relief from symptoms
was measured with the following results:
Drug A: 3.2, 1.6, 5.7, 2.8, 5.5, 1.2, 6.1, 2.9
Drug B: 3.8, 1.0, 8.4, 3.6, 5.0, 3.5, 7.3, 4.8
Find out which drug is better by calculating the mean, SD and 95% confidence limit for each drug, then use an appropriate
statistical test to find if it is significantly better than the other drug.
2. In one of Mendel's dihybrid crosses, the following types and numbers of pea plants were recorded in the F2 generation:
Yellow round seeds: 289
Yellow wrinkled seeds: 122
Green round seeds: 96
Green wrinkled seeds: 39
According to theory these should be in the ratio of 9:3:3:1. Do these observed results agree with the expected ratio?
3. The areas of moss growing on the north and south sides of a group of trees were compared.

North side of tree: 20, 43, 53, 86, 70, 54
South side of tree: 63, 11, 21, 54, 9, 74

Is there a significant difference between the north and south sides?
4. Five mammal traps were placed in randomly-selected positions in a deciduous wood. The numbers of field mice captured in
each trap in one day were recorded. The results were:
Trap:         A    B    C    D    E
no. of mice: 22   26   21    8   23
Trap D caught far fewer mice than the others. Did this happen by chance or is the result significant?
5. In an investigation into pollution in a stream, the concentration of nitrates was measured at six different sites, and a diversity
index was calculated for the species present.
Site:                  1      2      3      4      5      6
Conductivity (µS):   413.3  439.7  726    850    567.3  766.7
Diversity index:       7.51   5.17   4.49   3.82   5.88   3.74
Is there a correlation between conductivity and diversity, and how strong is it? (The diversity index is calculated from biotic
data, so is not normally distributed.)
6. The blood groups of 400 individuals, from 4 different ethnic groups were recorded with the following results:
Ethnic group   Blood Group O   Blood Group A   Blood Group B   Blood Group AB
1                    46              40               7               3
2                    48              39              12               2
3                    53              33              12               4
4                    55              30              13               3
Is there an association between blood group and ethnic group?
7. The effect of enzyme concentration on rate of a reaction was investigated with the following results.
Enzyme concentration (mM):   0     0.1   0.2   0.5   0.8   1.0
Rate (arbitrary units):      0     0.8   1.1   3.2   6.6   7.2
Plot a graph of these results, fit a straight line to the data, and find the slope of this line. Use the slope to predict the rate at an enzyme concentration of 0.7 mM.
More Practice:
1. Go to: http://mathbench.umd.edu/modules/prob-stat_normal-distribution/page01.htm
Complete the activity: Normal Distributions and the Scientific Method
2. Go to: http://mathbench.umd.edu/modules/prob-stat_bargraph/page01.htm
Complete the activity: Bar Graphs and Standard Error
3. Go to: http://mathbench.umd.edu/homepage/statistical_tests.htm
Complete the t-test activity (pages 1-5, 13) and the chi-square activity.
Let’s Talk About Stats: Understanding the Lingo
by Laura Fulford on 26th of February, 2014 in Lab Statistics & Math
http://bitesizebio.com/19291/a-basic-guide-to-stats-understanding-the-lingo/
The type of data you have, the number of measurements, the range of your data values
and how your data cluster are all described using statistical terms. To determine which
type of statistical test is the best fit for analyzing your data, you first need to learn
some statistics lingo.
Variables
Variables are anything that can be measured; they are your data points, and the type
you have affects the statistical test you use. Measurement or numerical variables are
the main type of variables that are obtained in biological research, so I’ll focus on
these.
Measurement variables can either be continuous, which means they can take any value between two points (for example half and quarter measurements), or discrete, which means they take whole-number values (such as rankings 1-5). As I will show you later, some tests can only be used with continuous variables, while others can accommodate discrete values.
Sample size (n)
Sample size refers to the number of data points in your set of data. In general, the
larger your sample size, the better. However, factors such as time, cost and practicality
limit the sample size you use. As an absolute minimum you need an n of 3 to perform
a statistical test, but look at publications with similar experiments to determine what is
considered acceptable.
The size of your sample will affect the variance of your data
(see below).
Data Spread
How spread out your data is gives you an idea of how reliable it is – data with low
variance is more reliable than data with high variance. It is therefore useful to know
how variable your data is, and there are several simple measures for determining this.
Variance
Variance is the simplest measure of the spread of the data and is the average of the squared differences from the mean. It tells us how spread out the data is from the mean; the larger the number, the more spread out the data (higher variance). To calculate variance, first find the mean of the data points. Next, find the difference between each sample and the mean and square the result. Finally, average the squared differences.
Note: if you are calculating sample variance (which you most likely are, since this
means you are measuring just a sample of a population rather than the entire
population) then you divide by n-1 when finding the average of the squared differences
rather than n. This is to correct for the fact that you are only estimating the variance
(since you are not measuring the entire population) rather than accurately computing it.
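Those steps translate directly into a few lines of Python (the numbers are arbitrary):

    def sample_variance(data):
        """Sample variance: divide by n - 1 because this is only a sample."""
        n = len(data)
        m = sum(data) / n  # mean of the data points
        squared_diffs = [(x - m) ** 2 for x in data]
        return sum(squared_diffs) / (n - 1)  # n - 1 corrects the estimate

    print(f"{sample_variance([2.0, 4.0, 4.0, 5.0, 7.0]):.2f}")  # prints 3.30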
Standard Deviation
The standard deviation (SD) is the most widely used method for measuring the spread of the data. SD is simply the square root of the variance and similarly tells us how much the samples deviate from the mean. The standard deviation is often preferred to the variance because it is on the same scale as the data itself, making the results easier to display.
Standard Error of the mean
While SD tells you about the variability of your data, SE provides information on the precision of the sample
mean. SE is calculated by dividing the SD by the square root of the sample size. When you take a
sample of observations from a population, the mean of the sample is an estimate of the mean of
all of the observations in the population. If your sample size is small, your estimate of the mean
won't be as good as an estimate based on a larger sample size. You'd often like to give some
indication of how close your sample mean is likely to be to the parametric mean. One way to do
this is with the standard error of the mean. If you take many random samples from a population,
the standard error of the mean is the standard deviation of the different sample means. About
two-thirds (68.3%) of the sample means would be within one standard error of the parametric
mean, 95.4% would be within two standard errors, and almost all (99.7%) would be within three
standard errors.
As you increase your sample size, sample standard deviation will fluctuate, but it will not
consistently increase or decrease. It will become a more accurate estimate of the parametric
standard deviation of the population. In contrast, the standard error of the means will become
smaller as the sample size increases. With bigger sample sizes, the sample mean becomes a
more accurate estimate of the parametric mean, so the standard error of the mean becomes
smaller.
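A quick simulation illustrates this. Everything here is invented: a synthetic population with SD 10, resampled at several sample sizes, so the spread of the sample means can be compared with the predicted SEM:

    import random
    import statistics

    random.seed(1)
    population = [random.gauss(50, 10) for _ in range(100_000)]

    for n in (5, 20, 100):
        # SD of many sample means ~ the standard error for that sample size
        means = [statistics.mean(random.sample(population, n))
                 for _ in range(1000)]
        print(f"n = {n:3d}: SD of sample means = {statistics.stdev(means):.2f}, "
              f"predicted SEM = {10 / n ** 0.5:.2f}")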
Confidence intervals
Confidence limits and standard error of the mean serve the same purpose, to express the
reliability of an estimate of the mean. In some publications, vertical error bars on data points
represent the standard error of the mean, while in other publications they represent 95%
confidence intervals. I prefer 95% confidence intervals. When I see a graph with a bunch of points
and vertical bars representing means and confidence intervals, I know that most (95%) of the
vertical bars include the parametric means. When the vertical bars are standard errors of the
mean, only about two-thirds of the bars are expected to include the parametric means; I have to
mentally double the bars to get the approximate size of the 95% confidence interval.
Distribution
Distribution, as the name implies, describes how your data is distributed. There are
many ways your data can be distributed and this can affect the statistical test you use.
The most well-known distribution is of course the normal distribution, which has a bell
shape. A normal distribution means the data is symmetrical, with values higher and lower than the mean equally likely, and the frequency of values drops off quickly the further they are from the mean.
Non-normal distributions are skewed; the mean is usually not in the middle. Most
statistical tests assume that the distribution is normal, but beware – many common
statistical tests are not valid for highly skewed data.
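The text does not prescribe a particular normality check, but as one illustrative option, scipy's Shapiro-Wilk test can flag skewed data before you reach for a parametric test (the sample below is invented):

    from scipy import stats

    data = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 4.7, 5.1, 5.4]  # hypothetical

    # Shapiro-Wilk: the null hypothesis is that the data are normally distributed
    stat, p = stats.shapiro(data)
    print(f"P = {p:.3f}")  # P > 0.05: no evidence of non-normality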
p value
The p value is what you are searching for – the number that will tell you whether you
have achieved the holy grail of science: statistical significance! It is generally
considered that a result with a p value less than 0.05 is unlikely to have occurred by
random chance and is therefore statistically significant. In contrast, results with a p value greater than 0.05 are not considered significant, as it cannot be ruled out that they occurred by random chance.
The p value is affected by sample size and if your sample size is too small you will not
obtain a significant result even if the observed effect is real. Therefore you need to
ensure you have a suitable sample size.
Paired or unpaired data
One factor that will be important in determining which type of test to use is whether or
not your data is paired. Paired data is derived from equivalent and matched
populations. For example, if you are comparing two drugs and you give drug A to 10
people of a certain age and population one day and 10 people of the same age and
population drug B another day, your data is matched and you can use a paired test. If
10 people are given drug A but 15 people are given drug B, then your data is unpaired.
Parametric vs non-parametric test
A parametric test is used when the data is assumed to be of normal distribution and
equal variance. In contrast, non-parametric tests make no assumptions about
distribution or variance. In general, non-parametric tests are less powerful, but more conservative: any significance you find with them is more likely to be real.
Type 1 and Type 2 errors
A statistical test can give a false result – often when the wrong test is used or a test is
used incorrectly. Two types of errors can be encountered.
A Type 1 error is a false positive: you conclude that a result is statistically significant when in fact it isn’t. A Type 2 error is a false negative: it occurs when actual significance is missed.
Resources:
General Overview of Statistical Terms:
**http://click4biology.info/c4b/1/stat1.htm
**http://www.drexel.edu/dvsf/statistics_help.htm
**http://www.statsoft.com/textbook/stathome.html
**http://udel.edu/~mcdonald/statintro.html
Overview of which test to use:
http://www.le.ac.uk/bl/gat/virtualfc/Stats/introst.html
http://www.graphpad.com/guides/prism/6/statistics/
http://onlinestatbook.com/rvls.html
http://www.psychstat.missouristate.edu/introbook/sbk00.htm