L2 Sampling Exercise
A possible solution
Selecting a sample


I will select my sample using a systematic sampling technique.
For this I will select every 4th student (136/30 = 4.53, rounded down to 4), starting at student number 71, which I selected at random using the random number function on my calculator.
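The selection described above can be sketched in Python. This is only an illustration: the function name `systematic_sample` is hypothetical, and wrapping back to the top of the list after student 136 is an assumption, since the original does not say how the end of the list was handled.

```python
def systematic_sample(population_size, sample_size, start, step=None):
    """Return the 1-based list positions picked by systematic sampling.

    Assumption: counting wraps back to the top of the list once it
    passes the last student.
    """
    if step is None:
        step = population_size // sample_size  # 136 // 30 gives 4
    positions = []
    position = start
    for _ in range(sample_size):
        positions.append(position)
        position += step
        if position > population_size:
            position -= population_size  # wrap around to the top
    return positions


# 136 students, a sample of 30, starting at the randomly chosen student 71.
chosen = systematic_sample(population_size=136, sample_size=30, start=71)
print(chosen)
```

Under the wrap-around assumption the count runs 71, 75, ... up to 135, then continues from 3 and stops at 51, so no student is picked twice.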
Sample Data
year  sex  distance (km)   year  sex  distance (km)   year  sex  distance (km)
12    m    0.1             13    f    2.5             11    f    0.8
12    f    0.5             13    f    1.9             11    f    1.6
12    m    0.2             13    m    0.1             11    m    1.7
12    m    1.6             13    f    3.2             11    f    3.2
12    f    1.2             13    f    2.5             11    f    1.6
12    f    2.3             13    f    0.8             11    f    1.2
12    f    4.7             13    f    3.9             11    f    1.9
12    f    1.5             13    m    0.5             11    m    1.6
12    f    3.2             11    f    2.5             11    m    0.1
11    m    3.0             11    f    2.1             11    f    1.9
Justify choice of method


I chose a systematic method because it was quick and easy to use.
I know that it will give me a good spread of year groups because the data has been sorted into year groups, so stepping through the list at a fixed interval picks students from every year.
Representative


Because I used a systematic method, I have close to proportional representation of year levels (see later).
I seem to have selected more females than males (21 females and 9 males), so males are perhaps under-represented.
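The counts quoted above can be checked with a short tally. This is a sketch only: the year and sex values are transcribed from the Sample Data table, and the variable names are illustrative. The year-level proportions of the full list of 136 students are not given, so only the sample shares are shown.

```python
from collections import Counter

# Sex of each sampled student, grouped by year level (from the Sample Data table).
sex_by_year = {
    12: "mfmmfffff",
    13: "ffmffffm",
    11: "ffmffffmfmmff",
}

by_year = Counter({year: len(sexes) for year, sexes in sex_by_year.items()})
by_sex = Counter("".join(sex_by_year.values()))
total = sum(by_year.values())

for year, count in sorted(by_year.items()):
    print(f"year {year}: {count} students ({count / total:.0%} of sample)")
print(f"females: {by_sex['f']}, males: {by_sex['m']}")
```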
Statistics


For my sample I got a mean of 1.7967 km and a sample standard deviation of 1.168 km.
Using these values I would estimate a population mean of 1.8 km and a population standard deviation of 1.2 km.
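These two statistics can be reproduced from the Sample Data table with Python's standard `statistics` module. This is a minimal sketch; the list name `distances` is illustrative, and `statistics.stdev` is the sample standard deviation (divisor n - 1), which is what the figures above appear to use.

```python
import statistics

# Distances in km for the 30 sampled students (years 12, 13, then 11).
distances = [
    0.1, 0.5, 0.2, 1.6, 1.2, 2.3, 4.7, 1.5, 3.2,
    2.5, 1.9, 0.1, 3.2, 2.5, 0.8, 3.9, 0.5,
    0.8, 1.6, 1.7, 3.2, 1.6, 1.2, 1.9, 1.6, 2.5, 0.1, 3.0, 2.1, 1.9,
]

sample_mean = statistics.mean(distances)
sample_sd = statistics.stdev(distances)  # divides by n - 1

print(f"mean = {sample_mean:.4f} km, sd = {sample_sd:.3f} km")
```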
Box Plot
[Figure: box plot of distance (km), axis from 0 to 5]
From my box plot I can see that the data is fairly evenly spread.
It is roughly symmetrical about the median, but a bit more spread out between the upper quartile and the largest value.
The median distance is 1.65 km, which is just lower than the mean of 1.80 km. This indicates a slight skew: most distances are at the lower end, with a tail toward the larger distances.
The upper quartile is 2.5 km, so about 75% of students in the sample live within 2.5 km of school.
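The values read off the box plot can be checked with a five-number summary. This is a sketch under one assumption: quartile conventions differ between calculators and software, and this uses the default "exclusive" method of `statistics.quantiles`, which may not be the rule used for the original plot.

```python
import statistics

# Distances in km for the 30 sampled students (from the Sample Data table).
distances = [
    0.1, 0.5, 0.2, 1.6, 1.2, 2.3, 4.7, 1.5, 3.2,
    2.5, 1.9, 0.1, 3.2, 2.5, 0.8, 3.9, 0.5,
    0.8, 1.6, 1.7, 3.2, 1.6, 1.2, 1.9, 1.6, 2.5, 0.1, 3.0, 2.1, 1.9,
]

# n=4 returns the three cut points: lower quartile, median, upper quartile.
lower_q, median, upper_q = statistics.quantiles(distances, n=4)
smallest, largest = min(distances), max(distances)

print(f"min {smallest}, LQ {lower_q:.2f}, median {median:.2f}, "
      f"UQ {upper_q:.2f}, max {largest}")
```

With this convention the median and upper quartile agree with the 1.65 km and 2.5 km quoted above.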
Evaluation



The sampling process is an appropriate one given the way the original data was presented. Because it was ordered by year level, a systematic sampling method gave me the same sort of result that a stratified method would have given me, with a lot less work.
I noticed more females than males in my sample. However, this should not affect the estimate, assuming a student who lives any given distance from school is just as likely to be male as female.
Dividing 136 by 30 gives 4.53. I chose to sample every 4th person rather than every 5th. This should not have biased my result, as there is still a representative number chosen from each year level.
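The choice between an interval of 4 and an interval of 5 can be looked at with a little arithmetic. This sketch is one possible way to compare them, not reasoning given in the original: it counts how many students a single pass through the list of 136 can reach at each interval.

```python
import math

population_size = 136
sample_size = 30

exact_interval = population_size / sample_size  # about 4.53

# For each candidate interval, the most students one pass over the list can reach.
reachable = {step: math.ceil(population_size / step) for step in (4, 5)}

for step, count in reachable.items():
    verdict = "enough" if count >= sample_size else "too few"
    print(f"every {step}th student: up to {count} in one pass ({verdict} for 30)")
```

On this count, every 4th student gives 34 candidates, while every 5th gives at most 28, which falls short of a sample of 30.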
Evaluation continued





The mean for the 9 Year 12 students is 1.7 km, sd = 1.5 km.
The mean for the 8 Year 13 students is 1.925 km, sd = 1.35 km.
The mean for the 13 Year 11 students is 1.78 km, sd = 0.84 km.
From the statistics above it would appear that there is little difference between the distances traveled for the different year levels.
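The year-level figures above can be reproduced by grouping the Sample Data by year. This is a minimal sketch; the dictionary name `distances_by_year` is illustrative, and the standard deviation is again the sample version (divisor n - 1).

```python
import statistics

# Distances in km grouped by year level (from the Sample Data table).
distances_by_year = {
    12: [0.1, 0.5, 0.2, 1.6, 1.2, 2.3, 4.7, 1.5, 3.2],
    13: [2.5, 1.9, 0.1, 3.2, 2.5, 0.8, 3.9, 0.5],
    11: [0.8, 1.6, 1.7, 3.2, 1.6, 1.2, 1.9, 1.6, 2.5, 0.1, 3.0, 2.1, 1.9],
}

# For each year: (number of students, mean, sample standard deviation).
summary = {
    year: (len(values), statistics.mean(values), statistics.stdev(values))
    for year, values in distances_by_year.items()
}

for year, (count, mean, sd) in summary.items():
    print(f"year {year}: n = {count}, mean = {mean:.3f} km, sd = {sd:.2f} km")
```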
The median being a bit lower than the mean suggests that the mean has been pulled up by the largest values, 4.7 km and 3.9 km, which are noticeably larger than most of the rest.
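One way to gauge how much the two largest values pull the mean is to recompute it without them and compare with the median. This is purely an illustration: dropping values is not something the original solution does, and the variable names are hypothetical.

```python
import statistics

# Distances in km for the 30 sampled students (from the Sample Data table).
distances = [
    0.1, 0.5, 0.2, 1.6, 1.2, 2.3, 4.7, 1.5, 3.2,
    2.5, 1.9, 0.1, 3.2, 2.5, 0.8, 3.9, 0.5,
    0.8, 1.6, 1.7, 3.2, 1.6, 1.2, 1.9, 1.6, 2.5, 0.1, 3.0, 2.1, 1.9,
]

# Illustration only: leave out the two largest values (3.9 and 4.7).
without_largest = sorted(distances)[:-2]

mean_all = statistics.mean(distances)
mean_without = statistics.mean(without_largest)
median_all = statistics.median(distances)

print(f"mean of all 30 values:     {mean_all:.2f} km")
print(f"mean without 3.9 and 4.7:  {mean_without:.2f} km")
print(f"median of all 30 values:   {median_all:.2f} km")
```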
Conclusion

For this reason I would suggest that
a better estimate for the average
distance is closer to 1.7 km.