Lecture 5 - Quantification and Statistics

Measurement, Quantification and Analysis
Some Basic Principles
Three Major Issues
• 1) Biological and especially ecological data show high variability in quantitative traits
• 2) We almost never measure everything in field research; rather, we sample from larger populations or data sets
• 3) Sampling leads to uncertainty about conclusions, so we must always estimate our uncertainty
Variability
All natural processes are variable, whether continuous or discrete
Continuous data
Discrete data
Plus, greater sampling effort better describes distributions
In many processes, we observe characteristic distributions
Binomial – Few interacting factors
Normal – Many interacting factors
2 factors: One way to get AA or aa, 2 ways to get Aa
4 factors: One way to get AAAA or aaaa, 4 ways to get AAAa or aaaA, and 6 ways to get AAaa
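The counts above are binomial coefficients: the number of ways to choose which of the n factors carry the "A" form. A minimal sketch, in which the function name `allele_combinations` is only illustrative and not part of the lecture:

```python
from math import comb


def allele_combinations(n_factors):
    """Number of ways to obtain 0, 1, ..., n_factors copies of 'A'."""
    return [comb(n_factors, k) for k in range(n_factors + 1)]


for n in (2, 4, 10):
    # n = 2 gives [1, 2, 1]; n = 4 gives [1, 4, 6, 4, 1]
    print(n, allele_combinations(n))
```

As the number of factors grows, the counts pile up in the middle and thin out at the extremes, which is why many interacting factors tend toward a bell-shaped (normal) distribution.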
Sampling and Estimation
A characteristic of field biology is the attempt to estimate parameters
from highly variable populations of uncertain “true” value.
To calculate the average in a sample:
Mean = Sum of all observations / Number of observations
To estimate the variability of the observations:
Variance = Sum of (individual observation – Mean of observations)^2 / (Number of individual observations – 1)
Or to express this in the same units as the Mean:
Standard deviation = Square root of the Variance
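The three formulas above can be written out directly. This is a minimal sketch; the forearm measurements are made-up values for illustration and do not come from the lecture's data sets.

```python
import math
import statistics


def sample_mean(observations):
    """Sum of all observations divided by the number of observations."""
    return sum(observations) / len(observations)


def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    mean = sample_mean(observations)
    squared_deviations = [(x - mean) ** 2 for x in observations]
    return sum(squared_deviations) / (len(observations) - 1)


def sample_sd(observations):
    """Square root of the variance, in the same units as the mean."""
    return math.sqrt(sample_variance(observations))


forearm_mm = [40.0, 42.0, 44.0, 46.0, 48.0]  # hypothetical measurements
print(sample_mean(forearm_mm))      # 44.0
print(sample_variance(forearm_mm))  # 10.0
print(sample_sd(forearm_mm))        # about 3.16

# The standard library gives the same variance (it also divides by n - 1).
assert math.isclose(sample_variance(forearm_mm), statistics.variance(forearm_mm))
```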
Probability
All natural processes are variable, whether continuous or discrete
What happens when we estimate means? Select 5 observations at random. Then 10. Then 25.
1) Better-sampled populations yield better distributions
2) Larger sample sizes yield better estimates
3) Means will also be variable, and will have a characteristic distribution
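The exercise of drawing 5, then 10, then 25 observations can be simulated. The sketch below assumes a made-up, normally distributed population (mean 50, standard deviation 10) and a fixed random seed; neither comes from the lecture.

```python
import random
import statistics


def spread_of_sample_means(population, n, rng, n_repeats=2000):
    """Draw n_repeats random samples of size n; return the SD of their means."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(n_repeats)]
    return statistics.stdev(means)


rng = random.Random(42)  # fixed seed so the run is repeatable
# Hypothetical population of 10,000 individuals.
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

spreads = {n: spread_of_sample_means(population, n, rng) for n in (5, 10, 25)}
for n, spread in spreads.items():
    print(f"sample size {n:>2}: SD of the sample means = {spread:.2f}")
```

The means from samples of 25 cluster much more tightly around the population mean than the means from samples of 5, which is point 2 in the list above.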
To estimate the variability of the means:
Divide the standard deviation (the square root of the variance) by the square root of the sample size. (Why? The variability of the means depends on sample size: the bigger the sample size, the less variable the means.)
Recall, to estimate the variability of the observations:
Variance = Sum of (individual observation – Mean of observations)^2 / (Number of individual observations – 1)
So, to estimate the variability of the means:
Standard Error = Standard deviation / Square root of the sample size
This is the Standard Error, which is used to calculate a Confidence Interval
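A minimal sketch of the standard error and an approximate 95 % confidence interval built from it. The multiplier 1.96 comes from the normal distribution and is an assumption added here, not something stated in the lecture; for small samples a larger multiplier from the t distribution is normally used. The data are made-up values.

```python
import math
import statistics


def standard_error(observations):
    """Standard deviation divided by the square root of the sample size."""
    return statistics.stdev(observations) / math.sqrt(len(observations))


def approx_95_ci(observations, multiplier=1.96):
    """Mean plus/minus multiplier * standard error (normal approximation)."""
    mean = statistics.mean(observations)
    half_width = multiplier * standard_error(observations)
    return mean - half_width, mean + half_width


forearm_mm = [40.0, 42.0, 44.0, 46.0, 48.0]  # hypothetical measurements
print(standard_error(forearm_mm))  # sqrt(10) / sqrt(5), about 1.41
print(approx_95_ci(forearm_mm))
```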
Uncertainty
Confidence intervals represent a level of confidence about the true value of the mean.
In other words, if you sample repeatedly with a given sample size, a 95 % CI means that in 95 % of the samples you collect, the interval you calculate will contain the true mean.
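This repeated-sampling interpretation can be checked by simulation. The sketch below assumes a made-up normal population whose true mean is known, a normal-approximation multiplier of 1.96, and a fixed seed; all of these are illustrative choices, not values from the lecture.

```python
import math
import random
import statistics


def ci_covers_true_mean(sample, true_mean, multiplier=1.96):
    """True if mean +/- multiplier * standard error contains true_mean."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - multiplier * se <= true_mean <= mean + multiplier * se


rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population parameters
SAMPLE_SIZE, REPEATS = 50, 2000

hits = 0
for _ in range(REPEATS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    if ci_covers_true_mean(sample, TRUE_MEAN):
        hits += 1

coverage = hits / REPEATS
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The proportion should land close to 95 %; the remaining intervals are the ones that miss the true mean.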
•No matter how well we sample, we will mis-estimate the population parameter a certain percentage of the time.
•What level of error are we willing to accept?
•With a 95 % limit, we are wrong 5 % of the time.
•In theory, the tails are limitless, so we must set a criterion.
•Decision rule – 5 % error.
•Minimize this with replication
Importance of Replication?
One sample: Wrong 5 % or 1/20 of the times you sample
Two replicated samples: Both wrong 1/20 x 1/20 or 1/400 of the time
Three replicated samples: All wrong 1/20 x 1/20 x 1/20 or 1/8,000 of the time
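The arithmetic above multiplies the error rates, which assumes the replicated samples are independent of one another. A short sketch using exact fractions (the function name is illustrative):

```python
from fractions import Fraction


def chance_all_wrong(error_rate, n_replicates):
    """Probability that every one of n independent replicates is wrong."""
    return error_rate ** n_replicates


alpha = Fraction(1, 20)  # the 5 % error rate of a single sample
for k in (1, 2, 3):
    print(k, chance_all_wrong(alpha, k))  # 1/20, then 1/400, then 1/8000
```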
One thing we do frequently in science is compare things.
For example, if one population is bigger than another, which population are we sampling from?
What kinds of errors can we make?
What confidence do we want?
What error will we accept?
[Figure: populations A and B]
Fundamental Principles
• Have clearly defined hypotheses
• Measure carefully
• Sample intensively – large sample sizes reduce Beta-Error (failing to detect a difference that is real)
• Replicate – Replication reduces Alpha-Error (detecting a difference that is not real)
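The link between sample size and Beta-Error can be illustrated by simulation. This sketch makes several assumptions that are not in the lecture: two made-up normal populations whose means really differ (50 and 55), a known common standard deviation of 10, and a simple two-sided z-test at the 5 % level standing in for whatever test a real project would use.

```python
import math
import random
import statistics


def z_test_rejects(sample_a, sample_b, sigma, z_crit=1.96):
    """Two-sided z-test for a difference in means with known common sigma."""
    se_diff = sigma * math.sqrt(1 / len(sample_a) + 1 / len(sample_b))
    z = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se_diff
    return abs(z) > z_crit


def beta_error_rate(n, mean_a=50.0, mean_b=55.0, sigma=10.0,
                    repeats=2000, seed=0):
    """Fraction of simulated studies that miss the real difference."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    misses = 0
    for _ in range(repeats):
        a = [rng.gauss(mean_a, sigma) for _ in range(n)]
        b = [rng.gauss(mean_b, sigma) for _ in range(n)]
        if not z_test_rejects(a, b, sigma):
            misses += 1
    return misses / repeats


beta_by_n = {n: beta_error_rate(n) for n in (5, 20, 80)}
for n, beta in beta_by_n.items():
    print(f"n = {n:>2} per group: difference missed in {beta:.0%} of studies")
```

With only 5 observations per group the real difference is missed most of the time; with 80 per group it is missed far less often.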
Samples of Data Sets from Previous Projects that Required Quantification and Statistical Analysis
ANOVA

                               Sum of Squares   df   Mean Square         F    Sig.
Forearm (mm)  Between Groups         4053.985    2      2026.993   569.784    .000
              Within Groups           152.971   43         3.557
              Total                  4206.957   45
Foot (mm)     Between Groups          254.274    2       127.137    55.693    .000
              Within Groups            98.161   43         2.283
              Total                   352.435   45

Principal Component Plot
[Figure: scatter plot of REGR factor score 1 for analysis 1 (horizontal axis) against REGR factor score 2 for analysis 1 (vertical axis), both running from -2.0 to 2.0, with points labeled 1, 2 or 3]
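The Mean Square and F values reported for forearm and foot length follow from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its df, and F is the between-groups mean square divided by the within-groups mean square. A sketch of that arithmetic using the reported numbers (the function name is illustrative; small differences in the last digit come from rounding in the printed output):

```python
def anova_from_sums_of_squares(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and df as reported for the two measurements.
forearm = anova_from_sums_of_squares(4053.985, 2, 152.971, 43)
foot = anova_from_sums_of_squares(254.274, 2, 98.161, 43)

for name, (ms_b, ms_w, f_ratio) in (("Forearm", forearm), ("Foot", foot)):
    print(f"{name}: MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f_ratio:.3f}")
```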