Download Stats Practical 3 2006

GY2170
Statistics Practical 3
STATISTICS PRACTICAL 3
The Normal Distribution, Student’s t Test & Confidence Levels
1. If the population of shell length-to-width ratios of a species of bivalve is normally distributed with a
mean of 1.65 and a standard deviation of 0.05, what is the probability that any one shell picked at
random has a length-to-width ratio: (i) less than 1.65; (ii) greater than 1.75; (iii) within two standard
deviations of the mean; and (iv) between 1.58 and 1.69?
2. Assess whether a correlation coefficient value r = 0.9308 (N = 18) represents a statistically significant
relationship between two variables at the 99% confidence level (i.e. $\alpha$ = 0.01) by testing your value
against the hypotheses: Ho (null hypothesis) $\rho$ = 0 and Ha (alternate hypothesis) $\rho$ > 0 (N.B. your lecture
handout will help you with this question).
3. This question follows on from Q1 of the Statistics 1 practical.
Maximum pebble lengths were also measured at a second point-bar locality 5 km upstream of the
first. A sample of 31 pebbles yielded the following data (lengths in cm):
 9.7   12.6   14.8
10.6   12.6   14.9
10.9   12.8   15.1
11.0   12.9   15.6
11.5   12.9   16.2
11.8   13.0   17.4
11.9   13.1   17.9
12.2   13.1   18.2
12.2   13.4   19.6
12.3   13.8
12.4   14.3
Your task is to establish whether or not the pebble lengths at this locality are significantly larger than
those at the first locality, i.e. whether or not the two samples are likely to have been taken from different
populations. If the means of the two samples are $\bar{x}_1$ and $\bar{x}_2$, the null hypothesis (Ho) may be stated
thus: “$\bar{x}_2$ is not significantly larger than $\bar{x}_1$”. The alternative hypothesis (Ha) is: “$\bar{x}_2$ is significantly
larger than $\bar{x}_1$”.
Student’s t test may be applied to this problem. When comparing two sample means, the test statistic, t,
must be calculated using the expression:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}, \qquad S = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}}$$
where $N_1$ and $N_2$ are the sizes of samples 1 and 2, respectively, $\bar{x}_1$ and $\bar{x}_2$ are their means, $s_1$ and $s_2$
are their standard deviations, and the number of degrees of freedom is given by $\nu = N_1 + N_2 - 2$.
(a) Calculate $\bar{x}_2$ and $s_2$ ($\bar{x}_1$ = 10.51 cm; $s_1$ = 1.91 cm).
(b) Supposing you wish to be 99% confident in your final decision, fix $\alpha$, the size of the critical region.
(c) Decide whether you need to use the one-tailed or two-tailed test.
(d) Refer to the table listing critical values of t as a function of $\nu$ and $\alpha$, and define the critical value of t
in this case.
(e) Calculate the t-statistic as given by the formula above.
(f) Do you accept Ho or Ha? State your conclusion (hint: ignore the sign of the t value that you calculate;
a negative sign just means that $\bar{x}_1$ is smaller than $\bar{x}_2$).
4. Volcanologists have measured the hydrogen content (in % of total number of atoms) of samples of gases
collected from the 1970 and 1971 Mount Etna volcanic eruptions. Values are given in the following table:
1970                    1971
Hydrogen content (%)    Hydrogen content (%)
35.8                    42.0
45.5                    57.0
35.5                    42.0
32.0                    54.5
50.0                    35.0
39.0                    52.5
37.0                    43.5
47.0                    48.0
38.5                    45.0
36.0                    44.6
40.5                    48.5
35.5                    63.0
45.5                    55.0
37.0                    40.0
36.0                    37.5
53.0                    53.7
(a) Calculate a mean hydrogen value for the 1970 eruption and use Student’s t-distribution to find the 95%
confidence limits for the true value.
(b) Use the Student's t-test for comparing means to determine whether there is a difference in the hydrogen
content of the gases between the two eruptions at the 99% confidence level.
Statistics Practical 3: Answers
1. This question is best answered using z-values and the standard normal curve:
(i) $z = \frac{1.65 - 1.65}{0.05} = 0$, i.e. 50% of the normal curve lies below this value. The probability is 0.5 (i.e. 50% chance) that a shell will have a L:W ratio < 1.65.
(ii) $z = \frac{1.75 - 1.65}{0.05} = 2$, i.e. twice the standard deviation. 97.73% of the normal curve is under z = 2, so
the probability of picking a shell at random with L:W > 1.75 is 1 - 0.9773 = 0.0227.
(iii) The probability that L:W is within 2 standard deviations of the mean = 0.9545.
(iv) Probability of the L:W ratio being between 1.58 and 1.69:
$$z = \frac{1.58 - 1.65}{0.05} = -1.4 \qquad z = \frac{1.69 - 1.65}{0.05} = 0.8$$
Cumulative probability at z = +0.8 = 0.7881
Cumulative probability at z = -1.4 = 0.0808
Pr = 0.7881 - 0.0808 = 0.7073
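The table look-ups in answer 1 can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only, and because the cumulative probabilities are computed exactly rather than read from a rounded table, results may differ from the figures above in the fourth decimal place.

```python
from statistics import NormalDist

# Population of shell length-to-width ratios given in Question 1.
shells = NormalDist(mu=1.65, sigma=0.05)

# (i) P(ratio < 1.65): the area below the mean.
p_below_mean = shells.cdf(1.65)

# (ii) P(ratio > 1.75): the upper tail beyond z = +2.
p_above_175 = 1 - shells.cdf(1.75)

# (iii) P(within two standard deviations): area between z = -2 and z = +2.
p_within_2sd = shells.cdf(1.65 + 2 * 0.05) - shells.cdf(1.65 - 2 * 0.05)

# (iv) P(1.58 < ratio < 1.69): difference of two cumulative probabilities.
p_between = shells.cdf(1.69) - shells.cdf(1.58)

for label, p in [("(i)", p_below_mean), ("(ii)", p_above_175),
                 ("(iii)", p_within_2sd), ("(iv)", p_between)]:
    print(f"{label} {p:.4f}")
```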
2. The correlation coefficient is r = 0.9308, with N = 18.
The test statistic is Student's t, given as:
$$t = r\sqrt{\frac{N - 2}{1 - r^2}}$$
with (N - 2) degrees of freedom, so
t = 10.19 with 16 degrees of freedom.
The level of significance $\alpha$ = 0.01 and, from the table, the critical value of t with $\nu$ = 16 is 2.583. We are
dealing with a one-tailed test so we reject Ho if t > critical value of t. Since this is true, we accept Ha and
say that there is less than a 1 in 100 chance of such an extreme correlation coefficient coming from a
population with coefficient $\rho$ = 0.
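The arithmetic in answer 2 can be reproduced with a few lines of Python. This is only a sketch: the function name `t_from_r` is hypothetical, and the critical value is the table figure quoted above rather than something computed here.

```python
import math

def t_from_r(r, n):
    """Student's t for a correlation coefficient r from n paired observations."""
    return r * math.sqrt((n - 2) / (1 - r * r))

r, n = 0.9308, 18
t = t_from_r(r, n)
dof = n - 2

# One-tailed critical value for alpha = 0.01 and 16 degrees of freedom,
# copied from the table value quoted in the answer.
T_CRITICAL = 2.583

reject_h0 = t > T_CRITICAL
print(f"t = {t:.2f} with {dof} degrees of freedom; reject Ho: {reject_h0}")
```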
3. (a) For the additional pebble data $\bar{x}_2$ = 13.57 cm and $s_2$ = 2.32 cm (31 observations).
(b) The critical region $\alpha$ = (100-99)/100 = 1%.
(c) We are testing to see if $\bar{x}_2$ is significantly larger than $\bar{x}_1$ so a one-tailed test is used. (If we were
testing to see if $\bar{x}_2$ is significantly different from $\bar{x}_1$ then we would use a two-tailed test).
(d) The number of degrees of freedom, $\nu$ = N1 + N2 - 2 = 51 + 31 - 2 = 80. Reading from the table of t
values for 80 degrees of freedom and $\alpha$ = 0.01, the critical value of t is 2.37.
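(e)
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}, \qquad S = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}}$$
$$S = \sqrt{\frac{(51 - 1) \times 1.89^2 + (31 - 1) \times 2.32^2}{51 + 31 - 2}} = \sqrt{\frac{340.077}{80}} = 2.062$$
$$t = \frac{10.51 - 13.57}{2.062\sqrt{\dfrac{1}{51} + \dfrac{1}{31}}} = \frac{-3.06}{0.4696} = -6.52$$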
(f) The calculated value of t is negative, but all this means is that $\bar{x}_1$ is less than $\bar{x}_2$ (which we know
already); when comparing t values we use the absolute value (i.e. 6.52). The calculated t value
is greater than the critical value (2.37), so Ho is rejected and Ha is accepted. We can state with 99%
confidence that $\bar{x}_2$ is greater than $\bar{x}_1$ and that the pebbles from the second locality are significantly larger,
on balance, than those at the first locality.
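Steps (a) to (f) of answer 3 can be checked end to end from the raw pebble data. The sketch below makes two assumptions worth stating: the quoted $s_2$ = 2.32 cm is reproduced by the standard deviation with divisor N (`statistics.pstdev`), not N - 1, and the locality 1 values are the ones substituted in the worked answer, which uses 1.89 where the question sheet quotes 1.91. The helper name `pooled_t` is hypothetical, and the critical value is the table figure from step (d).

```python
import math
from statistics import fmean, pstdev

# Pebble lengths (cm) at the second locality, as listed in Question 3.
pebbles_2 = [
    9.7, 12.6, 14.8, 10.6, 12.6, 14.9, 10.9, 12.8, 15.1, 11.0, 12.9,
    15.6, 11.5, 12.9, 16.2, 11.8, 13.0, 17.4, 11.9, 13.1, 17.9, 12.2,
    13.1, 18.2, 12.2, 13.4, 19.6, 12.3, 13.8, 12.4, 14.3,
]

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, pooled S, degrees of freedom) for comparing two sample means."""
    dof = n1 + n2 - 2
    pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / dof)
    t = (mean1 - mean2) / (pooled * math.sqrt(1 / n1 + 1 / n2))
    return t, pooled, dof

# (a) Summary statistics for locality 2.
n2 = len(pebbles_2)
mean2 = fmean(pebbles_2)
sd2 = pstdev(pebbles_2)  # divisor N; this matches the quoted 2.32 cm

# Locality 1 summary values as substituted in the worked answer.
mean1, sd1, n1 = 10.51, 1.89, 51

# (e) Test statistic.
t, pooled, dof = pooled_t(mean1, sd1, n1, mean2, sd2, n2)

# (d) One-tailed critical value for alpha = 0.01, 80 degrees of freedom (from the table).
T_CRITICAL = 2.37

# (f) Compare the absolute value of t with the critical value.
reject_h0 = abs(t) > T_CRITICAL
print(f"mean2 = {mean2:.2f}, s2 = {sd2:.2f}, S = {pooled:.3f}, t = {t:.2f}, reject Ho: {reject_h0}")
```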
4. (a) For the 1970 sample of gas:
$\bar{x}_1 = 40.24$, $s_1 = 6.07$, $N_1 = 16$
The interval containing the true hydrogen content is given by:
$$1 - \alpha = \Pr\left[\bar{x} - t_{\alpha/2}\,SE(\bar{x}) \le \mu \le \bar{x} + t_{\alpha/2}\,SE(\bar{x})\right]$$
The significance level $\alpha$ = 5% for a two-tail distribution and the number of degrees of freedom, $\nu$ = N-1 =
15. From the table the critical value of t = 2.131.
The standard error, $SE(\bar{x})$, of the mean is:
$$SE(\bar{x}) = \frac{s}{\sqrt{N}} = \frac{6.07}{\sqrt{16}} = 1.518$$
The 95% confidence interval for the true value of the hydrogen content is therefore:
$$\mu = 40.24 \pm 2.131 \times 1.518$$
$$\mu = 40.24 \pm 3.23\%$$
(b) For the 1971 eruption:
$\bar{x}_2 = 47.61$, $s_2 = 7.76$, $N_2 = 16$
For comparing means the test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}}}, \qquad S = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}}$$
Substituting values gives
$$S = \sqrt{\frac{(16 - 1) \times 6.07^2 + (16 - 1) \times 7.76^2}{16 + 16 - 2}} = 6.97$$
$$t = \frac{40.24 - 47.61}{6.97\sqrt{\dfrac{1}{16} + \dfrac{1}{16}}} = -2.99$$
The calculated value of t is negative, but all this means is that $\bar{x}_1$ is less than $\bar{x}_2$ (which we know already);
when comparing t values we use the absolute value (i.e. 2.99).
The critical value of t for $\alpha$ = 1%, $\nu$ = N1+N2-2 = 30 and a two-tailed distribution is 2.75. The
calculated t value is therefore greater than the critical value, so Ho is rejected and Ha is accepted. We can state
with 99% confidence that the two means differ: $\bar{x}_2$ is greater than $\bar{x}_1$, so the hydrogen content of the gases from the 1971
eruption was significantly higher, on balance, than that of the gases released during 1970.