Sampling and statistics

Sampling
When we want to study populations, we don’t
need to count the whole population.
We take a sample that will REPRESENT the
whole population.
How do we know when our
sample is representative?
• We can use a RUNNING MEAN.
• Work out the mean of your data as you
collect it.
• When the mean stops changing noticeably as
you add more data, your sample is likely to
be representative.
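A running mean can be sketched in a few lines of Python. This is only an illustration: the quadrat counts are made-up data, and the `tolerance` and `window` settings used to decide when the mean has "stopped changing" are assumptions, not values from the slides.

```python
from itertools import accumulate


def running_means(values):
    """Return the mean of the first 1, 2, 3, ... values."""
    totals = accumulate(values)
    return [total / count for count, total in enumerate(totals, start=1)]


def first_stable_index(means, tolerance=0.5, window=3):
    """Index at which the running mean has changed by less than
    `tolerance` for `window` additions in a row, or None if it never does.
    Both thresholds are assumptions chosen for this illustration."""
    run = 0
    for i in range(1, len(means)):
        if abs(means[i] - means[i - 1]) < tolerance:
            run += 1
            if run >= window:
                return i
        else:
            run = 0
    return None


# Hypothetical counts of one plant species in successive quadrats.
counts = [4, 9, 6, 7, 5, 8, 6, 7, 6, 7]
means = running_means(counts)
stable_at = first_stable_index(means)

print([round(m, 2) for m in means])
if stable_at is not None:
    print("Running mean settles by sample number", stable_at + 1)
```

With these made-up counts the running mean swings a lot over the first two samples and then stays close to 6.5.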
Sampling works best
when…
• You take lots of samples
• The samples are RANDOM
• The sample sizes are large
• The samples are unbiased
Reliable data…
 Has been repeated many times
 We can then see any anomalies
 We can see any variation in our results
Accurate data…
 When the method has been followed very
closely
 No errors in the process
 All results should be similar (close to the
mean)
Precise data
 Has been carried out using equipment that
has good precision i.e. many decimal places
or measured to the smallest increment
possible.
How to take random
samples.
LEARN THIS!
• Divide the area into a grid (e.g. place a
grid underneath a Petri dish).
• Choose coordinates at random using a
random number generator (e.g. on a
calculator) or a table of random numbers
(even a phone book will do!).
• Sample the chosen squares using the relevant method.
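The grid method above could be sketched like this. The function name, the 10 x 10 grid and the choice to never pick the same square twice are assumptions made for the example.

```python
import random


def random_grid_coordinates(n_points, n_columns, n_rows, seed=None):
    """Pick `n_points` different grid squares at random.

    Coordinates are (column, row), both counted from 1.
    Sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)  # a seed makes the choice repeatable
    all_squares = [
        (x, y)
        for x in range(1, n_columns + 1)
        for y in range(1, n_rows + 1)
    ]
    return rng.sample(all_squares, n_points)


# Hypothetical 10 x 10 grid laid over the study area.
coords = random_grid_coordinates(n_points=5, n_columns=10, n_rows=10, seed=1)
print(coords)
```

Letting the computer choose the squares removes the temptation to put the quadrat where the plants look most interesting, which would bias the sample.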
Methods of sampling
 For immobile organisms - QUADRATS
Look at
•% cover
•Density of species
•Frequency of species.
Types of Quadrat
•These allow RAPID collection of
data
•We do not need to define individual
plants when collecting data about %
cover
Frame Quadrat
Point Quadrat
Sampling mobile species
 Mark – release – recapture.
 To estimate a population size use this
equation…
$$\text{Estimated population size} = \frac{\text{Total of 1st capture} \times \text{Total of 2nd capture}}{\text{Total marked in 2nd capture}}$$
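As a quick check of the equation, here is a small sketch. The capture numbers are invented for the example and the function name is hypothetical.

```python
def estimate_population(first_capture, second_capture, marked_in_second):
    """Mark-release-recapture estimate of population size.

    first_capture    -- number caught, marked and released the first time
    second_capture   -- total number caught the second time
    marked_in_second -- how many of the second catch carried a mark
    """
    if marked_in_second <= 0:
        raise ValueError("No marked animals were recaptured, so no estimate can be made.")
    if marked_in_second > min(first_capture, second_capture):
        raise ValueError("Cannot recapture more marked animals than were marked or caught.")
    return first_capture * second_capture / marked_in_second


# Made-up example: 40 woodlice marked, 50 caught later, 10 of them marked.
estimate = estimate_population(40, 50, 10)
print(estimate)  # 200.0
```

The fewer marked animals turn up in the second capture, the larger the estimated population.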
How to catch mobile
organisms…
 Use a beating tray!
 Trapping
Take care with marking
the organisms!
• Always “Mark” the organisms on a part of the
body that is not easily visible. This will reduce
the chance of attracting predators
Things to consider…
 Always allow your 1st capture to re-integrate
into the environment before you carry out
your 2nd capture. This will give representative
data.
 This process does not consider migration
 This process does not consider breeding
seasons.
To look at the distribution
of species in a habitat
 We can use a TRANSECT
Transects…
 Allow us to take a line through an area
 This can give us a guide to follow when
looking systematically at the distribution of
organisms.
 Particularly good for looking at zones like
sea shores.
What do you do with your
data?
Test a Null Hypothesis
Your Null Hypothesis will be that the
Independent Variable will have NO EFFECT
on the Dependent Variable
We use statistical analysis to decide whether to
accept or reject the Null Hypothesis.
Which statistical tests do
we carry out?
 Standard deviation – Year 12 work
 Standard error and 95% confidence limits
 Chi Square
 Spearman Rank
Standard Error and 95%
Confidence Limits
• S.E. tells us how close our sample mean is
likely to be to the true mean of the whole
population.
• The sample mean plus or minus 2 x S.E. gives
us our 95% confidence limits: we can be 95%
confident that the true population mean lies
between them.
• If the limits for two samples do not overlap,
a difference that large would happen due to
chance less than 5% of the time.
How to work out the
standard error.
• $$\text{S.E.} = \frac{\text{Standard deviation}}{\sqrt{n}}$$
n = the number of measurements in the sample
We plot standard error as
bars on a graph.
Confidence limits
• Draw your axis
• Plot your means (marked X) for Sample 1 and Sample 2
• Plot TWO standard errors either side of each
mean and draw a line.
What do these bars tell
us?
Look to see whether the bars overlap.
These do not overlap so we can say that the 2 sets
of data are significantly different at the 95%
confidence limits.
This means that we can REJECT the Null Hypothesis.
These 2 sets of data do overlap so we say that they
are not significantly different at the 95%
confidence limits.
This means that we can ACCEPT the Null Hypothesis.
Crabs
• Crabs were found on 2 different beaches; one
sandy and one rocky.
• On the sandy beach the crabs measured: 5, 7, 8, 8,
7, 10, 14, 3, 6, 7, 11, 20, 21, 3, 17 cm
• On the rocky beach the crabs measured: 10, 12, 15,
18, 19, 22, 14, 23, 23, 29, 11, 12, 22, 18, 17 cm
 What is your Null Hypothesis?
 We can use standard error to test the Null
Hypothesis.
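One way to work through the crab data is sketched below. It assumes the sample standard deviation (dividing by n - 1, which is what `statistics.stdev` does) and takes the 95% confidence limits as the mean plus or minus two standard errors, as on the earlier slides. The function name is made up for the example.

```python
from math import sqrt
from statistics import mean, stdev

sandy = [5, 7, 8, 8, 7, 10, 14, 3, 6, 7, 11, 20, 21, 3, 17]
rocky = [10, 12, 15, 18, 19, 22, 14, 23, 23, 29, 11, 12, 22, 18, 17]


def confidence_limits(sample):
    """Return (mean, standard error, lower limit, upper limit)."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # S.E. = standard deviation / sqrt(n)
    return m, se, m - 2 * se, m + 2 * se


sandy_mean, sandy_se, sandy_low, sandy_high = confidence_limits(sandy)
rocky_mean, rocky_se, rocky_low, rocky_high = confidence_limits(rocky)

# The bars overlap if each range starts before the other one ends.
overlap = sandy_low <= rocky_high and rocky_low <= sandy_high

print(f"Sandy: mean {sandy_mean:.2f}, limits {sandy_low:.2f} to {sandy_high:.2f}")
print(f"Rocky: mean {rocky_mean:.2f}, limits {rocky_low:.2f} to {rocky_high:.2f}")
print("Bars overlap?", overlap)
```

Under these assumptions the two ranges do not overlap (roughly 6.8 to 12.8 cm for the sandy beach and 14.9 to 20.5 cm for the rocky beach), so a Null Hypothesis of "no difference in crab size between the beaches" would be rejected.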
Chi-Square 2
Chi-square (2) is used to decide if
differences between sets of data are
significant.
 It compares your Observed data with the
Expected data and tells you the probability
(P) of your Observed results being due to
chance.
Null Hypothesis
 Before we start an investigation we write
a null hypothesis.
 This tells us that we think there will NOT
be any relationship in our results.
 We accept or reject this hypothesis at the
end of the analysis.
How to do Chi-square…
Look at this example
• Suppose you flip a coin 100 times. You know that
if the coin is fair (unbiased) there should be
50% heads and 50% tails.
 How do you know though that the coin really is fair
and not biased in some way?
 We’re going to test this. What is your null
hypothesis?
My results…
Outcome   Observed Number, O   Expected Number, E
Heads     60
Tails     40
Total     100
Work out the Chi Square!
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
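The coin example can be checked with a short sketch. The function name is made up, and the expected numbers are worked out on the assumption that a fair coin gives equal numbers of heads and tails.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every category."""
    if len(observed) != len(expected):
        raise ValueError("Need one expected value for each observed value.")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = {"Heads": 60, "Tails": 40}
total = sum(observed.values())

# Fair coin assumption: the total is shared equally between the outcomes.
expected = {outcome: total / len(observed) for outcome in observed}

chi2 = chi_square(list(observed.values()), list(expected.values()))
print(expected)  # {'Heads': 50.0, 'Tails': 50.0}
print(chi2)      # 4.0
```

Each outcome contributes (10)² / 50 = 2, so the total is 4.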
Try this one…
         Observed   Expected
red      34         40
pink     84         80
white    42         40
Total    160        160
Work out Chi-Square!
Method…
                   red flowers   pink flowers   white flowers   Total
Actual numbers     34            84             42              160
Expected numbers   40            80             40              160
(O-E)²             36            16             4
(O-E)²/E           0.9           0.2            0.1             1.2
What does this all mean?
 The Chi-Square value will help us to find the
probability of our results being due to
chance, or whether something is significantly
influencing them.
 In Biology we say that if results occur due to
chance more than 5% of the time then we
cannot say that they are significant.
• Our Chi-Square value can help us find out
the probability that our results are due to chance.
 We have to use a probability table to find this
out…
Before we look at the
probability table…
• The number of categories you have, minus
1 (N-1), gives you your DEGREES OF
FREEDOM (dF)
• In the row for your dF, follow the numbers across
until you find the one that is closest to, but not
higher than, your Chi-Square result.
• Read up to the top of that column.
• Look at the probability.
• If it is 0.05 (5%) or less then it means that there
is a 5% (or smaller) probability that your results
are due to chance – these would be significant results.
• You would reject the Null hypothesis
Our CRITICAL VALUES
1- I have 4 dF and a Chi-Square value of 16.45. What is my conclusion?
2- I have 3 dF and a Chi-Square value of 4.25. What is my conclusion?
3- I have 5 dF and a Chi-Square value of 3.27. What is my conclusion?
4- I have 6 dF and a Chi-Square value of 13.98. What is my conclusion?
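The four questions can be checked against the standard chi-square critical values for a probability of 0.05. The table in the sketch holds the usual published values for 1 to 6 degrees of freedom, typed in by hand; the function name and the wording of the conclusions are assumptions. Treating a value exactly equal to the critical value as significant is also an assumption.

```python
# Standard chi-square critical values at p = 0.05, keyed by degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def conclusion(chi_square_value, degrees_of_freedom):
    """Compare a chi-square value with the p = 0.05 critical value."""
    critical = CRITICAL_0_05[degrees_of_freedom]
    if chi_square_value >= critical:
        return "significant: REJECT the Null Hypothesis"
    return "not significant: ACCEPT the Null Hypothesis"


questions = [(4, 16.45), (3, 4.25), (5, 3.27), (6, 13.98)]
answers = [conclusion(value, df) for df, value in questions]

for (df, value), answer in zip(questions, answers):
    print(f"dF = {df}, Chi-Square = {value}: {answer}")
```

The same function can be used on the earlier examples: the coin (1 dF, Chi-Square of 4) comes out as significant, while the flowers (2 dF, Chi-Square of 1.2) do not.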
Spearman Rank… rs
 This statistical test tells us whether there is a
significant association between two sets of
data.
• E.g. you could carry out Spearman Rank to
test for a significant association between
temperature and Enzyme Activity.
 You MUST have at least 7 measurements
Is there a significant association
between wing length in seeds and
the distance they fall from the
parent tree?
Seed Number   Length of wing/mm   Distance from tree/m
1             34                  21
2             28                  19
3             40                  17
4             33                  15
5             42                  30
6             35                  22
7             23                  17
8             27                  20
9             20                  15
First, rank each column
from lowest to highest
Notice that for 2 results that are
the same we rank them between
two levels e.g. 1.5 and 1.5 instead
of 1 and 2.
Seed Number   Length of wing/mm   Distance from tree/m   Rank 1 – wing length   Rank 2 – Distance from tree
1             34                  21                     6                      7
2             28                  19                     4                      5
3             40                  17                     8                      3.5
4             33                  15                     5                      1.5
5             42                  30                     9                      9
6             35                  22                     7                      8
7             23                  17                     2                      3.5
8             27                  20                     3                      6
9             20                  15                     1                      1.5
Next, find the difference
between the ranks.
Seed Number   Length of wing/mm   Distance from tree/m   Rank 1 – wing length   Rank 2 – Distance from tree   Difference between ranks, D
1             34                  21                     6                      7                             1
2             28                  19                     4                      5                             1
3             40                  17                     8                      3.5                           4.5
4             33                  15                     5                      1.5                           3.5
5             42                  30                     9                      9                             0
6             35                  22                     7                      8                             1
7             23                  17                     2                      3.5                           1.5
8             27                  20                     3                      6                             3
9             20                  15                     1                      1.5                           0.5
Next, Square the difference
Seed Number   Length of wing/mm   Distance from tree/m   Rank 1 – wing length   Rank 2 – Distance from tree   Difference between ranks, D   Difference Squared, D²
1             34                  21                     6                      7                             1                             1
2             28                  19                     4                      5                             1                             1
3             40                  17                     8                      3.5                           4.5                           20.25
4             33                  15                     5                      1.5                           3.5                           12.25
5             42                  30                     9                      9                             0                             0
6             35                  22                     7                      8                             1                             1
7             23                  17                     2                      3.5                           1.5                           2.25
8             27                  20                     3                      6                             3                             9
9             20                  15                     1                      1.5                           0.5                           0.25
How to work it out…
$$r_s = 1 - \frac{6 \times \sum D^2}{n^3 - n}$$
n = the number of pairs of items in the
sample.
D² = the difference between ranks,
squared
So in our example…
ΣD² = 1 + 1 + 20.25 + 12.25 + 0 + 1 + 2.25 + 9 + 0.25 = 47
$$r_s = 1 - \frac{6 \times \sum D^2}{n^3 - n} = 1 - \frac{6 \times 47}{729 - 9} = 1 - \frac{282}{720} = 0.61$$
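The whole calculation (rank, find the differences, square them, apply the formula) can be sketched as follows. The function names are made up. Tied values share the average of their ranks, as in the tables above, and the simple formula is then only an approximation.

```python
def average_ranks(values):
    """Rank from lowest (1) to highest; tied values share the mean rank."""
    ordered = sorted(values)
    ranks = []
    for value in values:
        first_position = ordered.index(value) + 1  # rank of the first tied value
        how_many = ordered.count(value)            # number of tied values
        ranks.append(first_position + (how_many - 1) / 2)
    return ranks


def spearman_rank(xs, ys):
    """Return (r_s, sum of D squared) using r_s = 1 - 6 * sum(D^2) / (n^3 - n)."""
    if len(xs) != len(ys):
        raise ValueError("Both lists must contain the same number of measurements.")
    n = len(xs)
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n), sum_d2


wing_length_mm = [34, 28, 40, 33, 42, 35, 23, 27, 20]
distance_m = [21, 19, 17, 15, 30, 22, 17, 20, 15]

r_s, sum_d2 = spearman_rank(wing_length_mm, distance_m)
print(average_ranks(distance_m))  # [7.0, 5.0, 3.5, 1.5, 9.0, 8.0, 3.5, 6.0, 1.5]
print(sum_d2, round(r_s, 2))      # 47.0 0.61
```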
So, what do we do with the 0.61?
Use the Critical Value Table
Number of pairs of measurements   Critical Value
5                                 1.00
6                                 0.89
7                                 0.79
8                                 0.74
9                                 0.68
10                                0.65
12                                0.59
14                                0.54
16                                0.51
18                                0.48
If your Spearman Rank Value is less than the Critical Value then
you ACCEPT the Null Hypothesis
If your Spearman Rank Value is more than the Critical Value
then you REJECT the Null Hypothesis
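Applying the rule to the seed example could look like the sketch below. The critical values are copied from the table above. Ignoring the sign of the Spearman value, and treating a value exactly equal to the critical value as not significant, are assumptions of this sketch.

```python
# Critical values copied from the table above, keyed by number of pairs.
CRITICAL_VALUES = {
    5: 1.00, 6: 0.89, 7: 0.79, 8: 0.74, 9: 0.68,
    10: 0.65, 12: 0.59, 14: 0.54, 16: 0.51, 18: 0.48,
}


def spearman_conclusion(r_s, n_pairs):
    """Return 'REJECT' or 'ACCEPT' for the Null Hypothesis."""
    if n_pairs not in CRITICAL_VALUES:
        raise ValueError(f"No critical value listed for {n_pairs} pairs.")
    critical = CRITICAL_VALUES[n_pairs]
    # abs() so that a strong negative association also counts (assumption).
    return "REJECT" if abs(r_s) > critical else "ACCEPT"


# Seed example: 9 pairs of measurements and a Spearman Rank value of 0.61.
decision = spearman_conclusion(0.61, 9)
print(decision)  # ACCEPT
```

With 9 pairs the critical value is 0.68. Since 0.61 is less than 0.68, the Null Hypothesis is accepted: these data do not show a significant association between wing length and distance from the parent tree.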