LEVEL 3
WJEC Level 3 Certificate in
STATISTICAL PROBLEM SOLVING USING SOFTWARE
SPECIMEN ASSESSMENT MATERIALS - External
Teaching from 2015
Level 3 Certificate in Statistical Problem Solving using software ESAMs 1
Candidate Name
Centre Number
Candidate Number
LEVEL 3 CERTIFICATE IN STATISTICAL PROBLEM SOLVING USING SOFTWARE
SPECIMEN PAPER
1 hour
ADDITIONAL MATERIALS
The use of a calculator is permitted in this examination.
INSTRUCTIONS TO CANDIDATES
Write your name, centre number and candidate number in
the spaces at the top of this page.
Answer all questions.
Write your answers in the spaces provided in this booklet.
For Examiner’s use only
Question    Maximum Mark    Mark Awarded
1.          2
2.          2
3.          5
4.          14
5.          4
6.          8
Total       35
INFORMATION FOR CANDIDATES
The number of marks is given in brackets at the end of each question or part-question.
You are reminded of the need for good English and orderly, clear presentation in your
answers.
No certificate will be awarded to a candidate detected in any unfair practice during the
examination.
1.
The data below are taken from the Global Slavery Index which states that:
'29·8 million people are in modern slavery globally'.
The three countries with the highest number of people in modern slavery are listed in
table 1 below.
Country Name    Rank for Number of Slaves    Country Population    Number of Slaves
India           1                            1 236 686 732         13 956 010
China           2                            1 350 695 000         2 949 243
Pakistan        3                            179 160 111           2 127 132
Table 1: Countries with the top 3 highest number of slaves (Global Slavery Index)
From the table above, India, China and Pakistan have the highest number of people
in modern slavery.
Explain why these three countries may not have the highest proportions of their
populations in modern slavery.
[2]
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
2.
The mean mark and the standard deviation for an end of term examination were
calculated for two classes, Class A and Class B.
Class A had a mean mark of 35·2 marks and a standard deviation of 2·8 marks.
Class B had a mean mark of 48·8 marks and a standard deviation of 5·1 marks.
Considering the summary statistics above, explain which class you would prefer to
teach.
Give a reason for your answer.
[2]
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
3.
House prices are always in the news.
For example:
The average asking price of a home has risen above the one per cent stamp duty
threshold of £250,000, according to new figures.
Sunday Telegraph, 23 March 2014
London's house prices soar to hits new record average of £409,000.
London Evening Standard, 28 February 2014
Figure 1 represents the prices of houses sold in 2013 for an area in the South West
of England.
Table 2 shows the descriptive statistics for the prices of houses sold in 2013 for an
area in the South West of England.
[Chart not reproduced. Vertical axis: Frequency (0 to 15). Horizontal axis: House Price, £s in 1000s (150 to 600).]
Figure 1: Prices of houses sold in 2013 for an area in the South West of England
Descriptives for house price
Minimum               £215 000
Mean                  £318 179
Median                £288 500
Mode                  £250 000
Maximum               £480 000
Standard Deviation    £77 678
Range                 £265 000
Count                 28
Table 2: Descriptive statistics for prices of houses sold in 2013 for a certain area in the South West of
England.
(a)
Which is the most appropriate average to use for this sample of house
prices?
[1]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b)
Explain why you consider the average you chose in part (a) to be the most
appropriate.
[2]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(c)
Referring to the information on the previous page, state two possible ways in
which reports on the average cost of houses can be misleading.
[2]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
4.
“Smoking cigarettes is probably the No. 1 cause of adverse outcomes for
babies,” says Robert Welch, MD, chairman of the Department of Obstetrics and
Gynecology at Providence Hospital in Southfield, Michigan.
Table 3 shows 4 responses from a total of 1132 obtained from mothers with newborn
babies.
Mother's Weight (lbs)    Mother's Height (inches)    Baby's Birth Weight (lbs)    Mother Smokes
90                       60                          6·88                         NO
135                      67                          9·31                         NO
107                      66                          7·69                         YES
250                      66                          7·88                         NO
Table 3: Extract from 1132 responses from mothers with newborn babies.
(a)
A statistician wishes to compare the average weight of babies born to
mothers who smoke with those who do not smoke.
How could the statistician present the data from the 1132 mothers graphically
to make the comparison?
State the variables the statistician would use.
[2]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b)
Give a reason for your choice of graphical display in part (a).
[1]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(c)
The weights of newborn babies for mothers who smoke and the weights of
newborn babies for mothers who do not smoke may each be considered to be
Normally distributed, with equal variances.
Give two reasons why a two sample t-test is an appropriate test to compare
the average birth weight of babies born to mothers who smoke with those
who do not smoke.
[2]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(d)
State clearly the null and the alternative hypothesis for a two sample t-test to
compare the average birth weight of babies born to mothers who smoke with
those who do not smoke.
[4]
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
…………………………………………………………………………………………
A two sample t-test at the 0·05 level is carried out to compare the average
birth weight of babies born to mothers who smoke with those who do not
smoke using statistical software.
The output for this test is in table 4.
                          Birth Weight (lb) Non Smoker    Birth Weight (lb) Smoker
Mean                      7·690                           7·089
Variance                  1·187                           1·289
Number of Observations    688                             444
Pooled Variance           1·227
Degrees of freedom        1130
t Stat                    8·914
P(T≤t) one-tail           0·000
t Critical one-tail       1·646
P(T≤t) two-tail           0·000
t Critical two-tail       1·962
Table 4: Statistical software output for the two sample t-test for birth weight of babies born to mothers
who smoke with those who do not smoke
(e)
Interpret this output and clearly state your conclusions.
[3]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(f)
If the statistician wanted to carry out further studies on the weights of
newborn babies what other factors concerning the mothers might they wish to
investigate?
[2]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
5.
A researcher wanted to investigate whether UK secondary school pupils learn how to
carry out a statistical investigation more effectively by using statistical software or by
plotting graphs by hand and carrying out calculations using a calculator.
The study involved 52 pupils in year 11 at a secondary school in the UK.
The researcher decided that one class (26 pupils) would be taught in a classroom
using calculators for the statistical analysis and plotting graphs by hand.
The other class (26 pupils) would carry out the analysis in a computer room using
statistical software.
(a)
State the target population for this investigation.
[1]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b)
Write down three weaknesses in this experimental design.
[3]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
6.
“Brush your child’s teeth twice a day for two minutes and pay special attention
to the back molars, where cavities tend to develop” is the advice from
Live Science 21 March 2014.
A researcher wishes to investigate whether the frequency of teeth brushing by
children in the UK is related to age.
The researcher was only able to collect data from 12-year-olds and 15-year-olds at
one large UK secondary school.
The results are shown in table 5.
                               Number of children
Frequency of teeth brushing    Age 12 years    Age 15 years    Total
Less than once a day           24              10              34
Once a day                     149             84              233
Twice a day                    434             339             773
Three or more times a day      45              65              110
Total                          652             498             1150
Table 5: The number of times children clean their teeth a day at a UK secondary school.
The researcher decided to carry out a Chi-Squared test on these data.
(a)
Give a reason why the Chi-Squared test is appropriate.
[1]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(b)
State clearly the null and the alternative hypothesis that the researcher is
testing.
[4]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
(c)
The p-value for the Chi-Squared test in this case is less than 0·001.
Interpret the p-value at the 0·05 level of significance.
[2]
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
……………………………………………………………………………………………………………
(d)
Write the conclusion in relation to the original problem.
[1]
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
…………………………………………………………………………………………………
Specimen Assessment Materials
Statistical Problem Solving
Page 1

1. Full explanation referring to the population not being taken into account,
e.g. “in order to compare we should take into account the population and use the
percentages/proportions or fractions. The table only shows the counts”.
Mark: 2    AC: 3.3
Comments: 1 mark for sight of percentage/fraction but no reference to the population
OR
1 mark for a partial explanation.
[2]

2. Choice of class with clear reasoning for the choice, referring to the means and
standard deviations,
e.g. “Prefer to teach Class A. Although the students do not have as high a mean/average
mark as Class B, the spread is less and therefore the ability of the children is similar” or
“Prefer to teach Class B. The students in Class B have a higher mark and perhaps are
more able but would be more challenging to teach as the class would be of mixed ability”.
Mark: 2    AC: 3.3
Comments: No marks if Class A or B stated with no reason.
1 mark if Class A or B is stated referring to either the mean or the spread
OR
1 mark if discussion of mean and spread but no choice of class.
[2]

3. (a) Median or modal group/class.
Mark: 1    AC: 1.4
Comments: No marks for mode.
(1)

3. (b) Explanation that the distribution is positively skewed (or not symmetrical) so
the median is the best measure of location as it is not distorted by extreme values
OR
Explanation that the distribution is positively skewed (or not symmetrical) so the modal
group/class is not distorted by extreme values.
Mark: 2    AC: 1.5
Comments: 1 mark for mention of shape of the distribution,
e.g. most values around £250,000 with a few much more expensive, not symmetrical.
1 mark for stating the median is not affected by extreme values as it is the middle value
OR
1 mark for stating the modal value or group is not affected by extreme values as it is the
most frequent value or group.
(2)

3. (c) Two statements relating to the possible misuse of the mean instead of the median
or mode,
e.g. “The statements from the Sunday Telegraph and the London Evening Standard
could be related to selling prices not the sold price”
“The mean can be used to exaggerate the sold prices of houses”
“The way the average is calculated is not given; not sure which average is being used so
very difficult to compare”.
Mark: 2    AC: 4.1
Comments: 1 mark for each appropriate statement.
(2)
[5]
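
The reasoning expected in question 1 can be checked numerically. The sketch below is illustrative only and is not part of the mark scheme: it takes the counts and populations from Table 1 and converts them to percentages of each population. The variable and function names are assumptions made for this example.

```python
# Figures copied from Table 1 (Global Slavery Index extract).
table_1 = {
    "India": {"population": 1_236_686_732, "slaves": 13_956_010},
    "China": {"population": 1_350_695_000, "slaves": 2_949_243},
    "Pakistan": {"population": 179_160_111, "slaves": 2_127_132},
}


def percent_in_slavery(row):
    """Return the percentage of a country's population in modern slavery."""
    return 100 * row["slaves"] / row["population"]


percentages = {name: percent_in_slavery(row) for name, row in table_1.items()}

# Rank by proportion of the population rather than by raw count.
ranked = sorted(percentages, key=percentages.get, reverse=True)

for name in ranked:
    print(f"{name}: {percentages[name]:.2f}%")
```

Ranked by percentage, Pakistan (about 1·19%) comes above India (about 1·13%), with China at about 0·22%, so the order by count and the order by proportion differ even among these three countries.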
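
The positive skew referred to in the answer to 3 (b) can also be seen in the Table 2 summary statistics. The sketch below is an illustration, not a method the mark scheme requires: it checks that mode < median < mean and computes Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation, which is an assumed choice of skewness measure for this example.

```python
# Summary statistics copied from Table 2 (house prices in pounds).
mean_price = 318_179
median_price = 288_500
mode_price = 250_000
std_dev = 77_678

# Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation.
# A positive value suggests a tail of unusually expensive houses.
skewness = 3 * (mean_price - median_price) / std_dev

# For a positively skewed distribution we typically see mode < median < mean.
positively_skewed = mode_price < median_price < mean_price and skewness > 0

print(f"Pearson's second skewness coefficient: {skewness:.2f}")
print("Positively skewed:", positively_skewed)
```

A coefficient of roughly 1·15 is consistent with the mean being pulled upwards by a few high prices, which is why the median is preferred here.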
Page 2

4. (a) Box plots or histograms plus variables used: Baby's Birth Weight (lbs) and
Mother Smokes.
Mark: 2    AC: 1.4
Comments: 1 mark for box plot or histogram with no mention of variables used
OR
1 mark for variables used, i.e. Baby's Birth Weight (lbs) and Mother Smokes.
(2)

4. (b) Box plots to compare medians and spread, or histograms to compare the shape of
the distribution and the spread.
Mark: 1    AC: 1.5
(1)

4. (c) Two clear statements on why the two sample t-test is appropriate in this case,
e.g. the data are not paired;
different sample sizes;
the assumptions of normally distributed populations and equal variances hold (given in
question); parametric test.
Mark: 2    AC: 4.2
Comments: 1 mark for each correct statement.
(2)

4. (d) Clearly expressed null hypothesis including correct notation H0 or Null Hypothesis
and relating to the context of the problem,
e.g. H0 or Null Hypothesis: The population mean birth weight of babies born to mothers
who smoke is the same as the population mean birth weight of babies born to mothers
who do not smoke.
Mark: 2    AC: 1.1
Comments: 1 mark for partial explanation for null hypothesis,
e.g. H0 or Null Hypothesis: Means are equal.
H0 or Null Hypothesis: Mean baby weights are equal.
1 mark for correct hypothesis but H0 or Null Hypothesis missing.

Clearly expressed alternative hypothesis including correct notation H1 or Alternative
Hypothesis and relating to the context of the problem,
e.g. H1 or Alternative Hypothesis: The population mean birth weight of babies born to
mothers who smoke is different to the population mean birth weight of babies born to
mothers who do not smoke.
Mark: 2    AC: 1.1
Comments: 1 mark for partial explanation for alternative hypothesis,
e.g. H1 or Alternative Hypothesis: Means are not equal.
H1 or Alternative Hypothesis: Mean baby weights are not equal.
1 mark for correct hypothesis but H1 or Alternative Hypothesis missing.
Full marks cannot be awarded if population is not mentioned or implied.
(4)

4. (e) Correct interpretation of the p-value at the 0·05 significance level,
e.g. since the p-value is less than 0·05 we reject the null hypothesis (H0) and conclude
there is evidence to suggest there is a difference in the means for the weights of
babies born to mothers who smoke and those who do not smoke.
Mark: 3    AC: 3.3
Comments: 1 mark for stating the p-value is less than 0·05. (No marks for saying the
p-value is equal to zero.)
1 mark for rejecting the null hypothesis (H0) or accepting the alternative hypothesis (H1).
1 mark for concluding there is evidence to suggest there is a difference in the means
for the weights of babies born to mothers who smoke and those who do not smoke.
(3)

4. (f) Two examples of what other factors/variables could be considered,
e.g. Mother's weight, age, height, life style.
Mark: 2    AC: 4.2
Comments: 1 mark for each appropriate example.
(2)
[14]
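
The figures in Table 4 can be reproduced from the summary statistics alone. The sketch below is a hedged illustration of the standard pooled (equal variance) two sample t-test, not a reconstruction of the software used for the paper. It relies only on the means, variances and sample sizes printed in Table 4, takes the two-tail critical value from the same table, and uses variable names that are assumptions for this example.

```python
import math

# Summary statistics copied from Table 4.
mean_non, var_non, n_non = 7.690, 1.187, 688   # babies of non-smokers
mean_smk, var_smk, n_smk = 7.089, 1.289, 444   # babies of smokers
t_critical_two_tail = 1.962                    # from Table 4, 0.05 level

# Degrees of freedom for the pooled test: n1 + n2 - 2.
df = n_non + n_smk - 2

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_non - 1) * var_non + (n_smk - 1) * var_smk) / df

# Standard error of the difference between the two sample means.
std_error = math.sqrt(pooled_var * (1 / n_non + 1 / n_smk))

t_stat = (mean_non - mean_smk) / std_error

# Two-tailed decision at the 0.05 level.
reject_h0 = abs(t_stat) > t_critical_two_tail

print(f"df = {df}, pooled variance = {pooled_var:.3f}, t = {t_stat:.3f}")
print("Reject H0:", reject_h0)
```

Because the inputs are the rounded values printed in Table 4, the computed t statistic agrees with the tabulated 8·914 only to about two decimal places. It is far above the critical value of 1·962, which matches the conclusion in 4 (e).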
Page 3

5. (a) All children in the UK.
Mark: 1    AC: 1.3
(1)

5. (b) Three weaknesses stated,
e.g. No randomisation within the classes;
teacher has to decide which class received each treatment;
the abilities of the classes may vary;
the classes may have different teachers;
only one school.
Mark: 3    AC: 2.2
Comments: 1 mark for each weakness stated.
(3)
[4]

6. (a) Explanation of the appropriateness of the Chi-Squared test,
e.g. “it is appropriate because counts are being used”.
Mark: 1    AC: 1.5
(1)

6. (b) Clearly expressed null hypothesis including correct notation H0 or Null Hypothesis
and relating to the context of the problem,
e.g. H0 or Null Hypothesis: the number of times a day children brush their teeth is
independent of age
OR
H0 or Null Hypothesis: there is no association between the number of times a day
children brush their teeth and age.
Mark: 2    AC: 1.1
Comments: 1 mark for partial explanation for null hypothesis,
e.g. reference to independence or categories of number of times children clean
their teeth but not both,
e.g. “H0 or Null Hypothesis: the variables are independent”
OR
1 mark for correct hypothesis but H0 or Null Hypothesis missing.

Clearly expressed alternative hypothesis including correct notation H1 or Alternative
Hypothesis and relating to the context of the problem,
e.g. H1 or Alternative Hypothesis: the number of times a day children brush their teeth
is not independent of age
OR
H1 or Alternative Hypothesis: there is an association between the number of times a day
children brush their teeth and age.
Mark: 2    AC: 1.1
Comments: 1 mark for partial explanation for alternative hypothesis,
e.g. reference to association/non-independence or categories of number of times
children clean their teeth but not both,
e.g. “H1 or Alternative Hypothesis: the variables are associated”
OR
1 mark for correct hypothesis but H1 or Alternative Hypothesis missing.
Full marks cannot be awarded if population is not mentioned or implied.
(4)

6. (c) Correct interpretation of the p-value at the 0·05 significance level,
e.g. “since the p-value is less than 0·05, there is evidence to suggest we reject the null
hypothesis (H0) (or accept the alternative hypothesis (H1))”.
Mark: 2    AC: 3.3
Comments: Must include “reject H0” or equivalent for 2 marks.
1 mark for comparison of p-value with 0·05 but no conclusion
OR
1 mark for “reject H0” or equivalent with no reference to 0·05.
(2)

6. (d) For relating the conclusion to the original problem,
e.g. “there is evidence to suggest that age does have an effect on how often children
brush their teeth a day”.
Mark: 1    AC: 4.1
(1)
[8]
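
The Chi-Squared test in question 6 can be worked through from the counts in Table 5. The sketch below is illustrative and assumes the usual test of independence: expected count = row total × column total / grand total, summed over (observed − expected)² / expected. The 0·05 critical value of 7·815 for 3 degrees of freedom comes from standard Chi-Squared tables, not from the paper, and the variable names are assumptions for this example.

```python
# Observed counts copied from Table 5: (12 years, 15 years) per brushing frequency.
observed = {
    "Less than once a day": (24, 10),
    "Once a day": (149, 84),
    "Twice a day": (434, 339),
    "Three or more times a day": (45, 65),
}

row_totals = {label: sum(counts) for label, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

# Sum (observed - expected)^2 / expected over every cell of the table.
chi_squared = 0.0
for label, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[label] * col_totals[j] / grand_total
        chi_squared += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)

# Assumption: critical value for df = 3 at the 0.05 level, from standard tables.
critical_value_005 = 7.815
reject_h0 = chi_squared > critical_value_005

print(f"chi-squared = {chi_squared:.2f}, df = {df}")
print("Reject H0:", reject_h0)
```

The statistic comes out at roughly 18·9, well above the critical value, which is consistent with the stated p-value of less than 0·001 and with rejecting H0 in 6 (c).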
Mark and Assessment Criteria grid
ESAMs LEVEL 3 Certificate In Statistical Problem Solving Using Software/HT/04 11 2014