Some solutions to the trial exam in HMM4101, fall 2005
Part A
1. See pages 66-70 in Kumar.
2. See Chapter 8 of Kumar.
3. See Chapter 9 of Kumar.
4. See page 147 of Newbold.
5. See Section 9.1 of Newbold.
6. See page 62 and page 166 of Newbold.
7. See lecture notes from 2005-11-14.
8. See Section 12.6 in Newbold.
Part B
1.
a. By using the definition of sample correlation, it can be computed as
$$r_{xy} = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = \frac{8609.9}{\sqrt{241.73 \cdot 1463600}} = 0.45774 \approx 0.46$$
b. According to standard formulas for the regression coefficients, we get
$$b_1 = r_{xy} \frac{s_y}{s_x} = 0.45774 \cdot \frac{\sqrt{1463600}}{\sqrt{241.73}} = 35.618 \approx 35.6$$
$$b_0 = \bar{y} - b_1 \bar{x} = 1621.8 - 35.618 \cdot 46.6 = -37.9988 \approx -38.0$$
That $b_1$ has the value 35.6 means that, under the fitted regression model, expected
dental expenses increase by 35.6 kroner for each year the person gets older.
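The arithmetic in parts a and b can be checked with a short sketch. It uses only the summary statistics quoted above; the variable names are illustrative assumptions, and the raw data are not available here.

```python
import math

# Summary statistics quoted in the solution (the raw data are not reproduced).
s_xy = 8609.9               # sample covariance of age and dental expenses
s_x2 = 241.73               # sample variance of age
s_y2 = 1463600.0            # sample variance of dental expenses
x_bar, y_bar = 46.6, 1621.8

# Sample correlation: covariance divided by the product of standard deviations.
r_xy = s_xy / math.sqrt(s_x2 * s_y2)

# Slope and intercept of the least-squares line.
b1 = r_xy * math.sqrt(s_y2) / math.sqrt(s_x2)   # algebraically equal to s_xy / s_x2
b0 = y_bar - b1 * x_bar

print(round(r_xy, 5), round(b1, 3), round(b0, 1))
```

Because the sketch carries unrounded intermediate values, the intercept can differ from the hand calculation in the second decimal.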
c. As the coefficient of determination for simple regression is equal to the square of
the correlation coefficient, we get $R^2 = r_{xy}^2 = 0.45774^2 = 0.20953 \approx 0.21$. This
number tells us something about the amount of variance “explained” by the
regression; it is equal to the regression sum of squares divided by the total sum of
squares. Only 21 percent of the variance is explained by the regression, which
means that the data do not follow a straight line very closely. This is also apparent
from the scatter plot.
d. As $R^2 = 1 - \frac{SSE}{SST}$, we get $SSE = SST \cdot (1 - R^2)$. We can compute SST from the
variance of y, as $s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n-1} SST$. We get
$$SSE = SST \cdot (1 - R^2) = (n-1) s_y^2 (1 - R^2) = 49 \cdot 1463600 \cdot (1 - 0.20953) = 56689663 \approx 56689765$$
We can now compute the error variance estimate as
$$s_e^2 = \frac{SSE}{n-2} = \frac{56689765}{48} = 1181037.$$
This means that the estimated model error standard deviation becomes
$\sqrt{1181037} \approx 1087$, which seems reasonable when looking at the scatter plot.
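As a rough check of parts c and d, the chain from the correlation to the error standard deviation can be sketched as follows. The sample size n = 50 is inferred from the factor n - 1 = 49 used above, and the variable names are assumptions made for illustration.

```python
import math

n = 50               # inferred from n - 1 = 49 in the computation above
s_y2 = 1463600.0     # sample variance of dental expenses
r_xy = 0.45774       # sample correlation from part a

r2 = r_xy ** 2                   # coefficient of determination
sst = (n - 1) * s_y2             # total sum of squares
sse = sst * (1 - r2)             # error sum of squares
s_e2 = sse / (n - 2)             # error variance estimate
s_e = math.sqrt(s_e2)            # error standard deviation

print(round(r2, 5), round(sse), round(s_e2), round(s_e))
```

Small differences in the trailing digits relative to the figures in the text come from rounding of intermediate values; the standard deviation still rounds to 1087.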
e. The null hypothesis is that the slope $\beta_1$ of the population regression
model $Y = \beta_0 + \beta_1 x + \varepsilon$ is equal to zero. What the natural alternative
hypothesis is could depend on your previous knowledge. But in general, it seems at
least possible that dental expenses could either increase or decrease with age.
Thus, the alternative hypothesis should be that $\beta_1 > 0$ or $\beta_1 < 0$, and we will use a
two-sided test, so that we divide the significance level of 0.05 by two and use the
number 0.05/2 = 0.025 below. The test statistic is given by $t = \frac{b_1 - 0}{s_{b_1}}$, where we
can compute $s_{b_1}$ by
$$s_{b_1} = \sqrt{\frac{s_e^2}{(n-1) s_x^2}} = \sqrt{\frac{1181037}{49 \cdot 241.73}} = 9.98547.$$
Thus, we get
$$t = \frac{b_1}{s_{b_1}} = \frac{35.618}{9.98547} = 3.567.$$
Under the null hypothesis, the test statistic has a
Student’s t distribution with 48 degrees of freedom. We search in the table on
page 811 of Newbold to find a number x such that a variable with a Student’s t
distribution with 48 degrees of freedom has a probability 0.025 of being larger
than x. We can see from the table that this number is somewhere between 2.000
and 2.021. In any case, it is smaller than our computed value of 3.567. Thus, we
reject the null hypothesis at the 5% significance level.
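The test in part e can be sketched numerically as below. The critical value is not computed; the sketch simply uses the larger of the two table values quoted above (2.021) as a conservative bound, and the variable names are illustrative assumptions.

```python
import math

n = 50               # inferred from n - 1 = 49
s_e2 = 1181037.0     # error variance estimate from part d
s_x2 = 241.73        # sample variance of age
b1 = 35.618          # estimated slope from part b

# Standard error of the slope and the t statistic for H0: beta_1 = 0.
s_b1 = math.sqrt(s_e2 / ((n - 1) * s_x2))
t_stat = (b1 - 0.0) / s_b1

# The critical value for 48 degrees of freedom lies between the two table
# entries 2.000 and 2.021; using the larger one is the conservative choice.
t_crit_bound = 2.021
reject_h0 = abs(t_stat) > t_crit_bound

print(round(s_b1, 5), round(t_stat, 3), reject_h0)
```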
f. A 95% confidence interval for the slope $\beta_1$ can be computed as
$$b_1 \pm t_{n-2,\alpha/2} \, s_{b_1} = 35.618 \pm t_{48,0.025} \cdot 9.98547 = 35.618 \pm 2.01 \cdot 9.98547,$$
which gives the interval [15.5, 55.7].
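The interval in part f can be reproduced with a few lines. The value 2.01 is the approximate table value used in the text, not a computed quantile, and the names are assumptions for illustration.

```python
b1 = 35.618          # estimated slope
s_b1 = 9.98547       # standard error of the slope
t_crit = 2.01        # approximate t value for 48 df and upper tail 0.025 (from the text)

half_width = t_crit * s_b1
lower, upper = b1 - half_width, b1 + half_width

print(round(lower, 2), round(upper, 2))
```

That the interval excludes zero agrees with the rejection of the null hypothesis in part e.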
g. It seems from the scatter plot that dental expenses are fairly independent of age
until people reach their 50’s, and then they increase rapidly. This could be a
reasonable hypothesis. If it is true, a model where the expected rise in dental
expenses is the same for every year the age increases (i.e., a linear regression
model) does not fit the data very well. One possibility could then be to split
the data into two groups, one consisting of elderly, and one of other adults, and
compare dental expenses in the two groups. One could also investigate how dental
expenses vary with age within each such group. More complex answers to this
question are of course possible. Another problem with using linear regression
for these data is that the variation around a fitted line is clearly not normally
distributed in one respect: a number of people have zero dental expenses, and
no one can have negative expenses. However, this effect seems to be limited to only
a few people, and so is unlikely to influence our conclusions too much.
h. By calling random phone numbers, you end up with data from people who
i. Have a phone
ii. Are willing to answer your questions
This might bias your selection somewhat. For example, among the very old, you
might get hold of only the healthiest ones. Another serious problem with the
experiment is that most people will have problems remembering, or estimating,
their dental expenses during the last three years when asked over the phone.
2.
a. Of the 26-8=18 cases where there is a difference in the cost, one would expect
that half were more expensive at A, and half less expensive at A. Thus, we
would expect 9 cases to be less expensive at A compared to B.
b. It seems reasonable to assume that the 18 cases where there is a difference in
cost represent 18 independent trials, each with a probability of “success” (i.e.,
A is less expensive than B) of 0.5. Thus, we would expect the actual number
of cases where A is less expensive to have a Binomial distribution with
parameters n=18 and p=0.5.
c. If the null hypothesis is that A is equally expensive as B, then the reasonable
alternative hypothesis would be that there is a difference in costs, in general.
Thus, we should do a two-sided test, so that we have to look at the probability
of observing 14 or more successes, plus the probability of observing 4 or
fewer successes. The probability of observing 14 or more successes in a
Binomial distribution with parameters n=18 and p=0.5 can be found in the
table on page 790 of Newbold as 1-0.985=0.015 (the table gives the
probability of observing 13 or fewer successes). So the probability of
observing 14 or “something at least as extreme” is $0.015 \cdot 2 = 0.03$. This is the
p-value of the test. As it is smaller than 0.05, we can reasonably reject the
null-hypothesis, and conclude that A administers significantly less expensive
treatments than B.
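The table lookup in part c can be checked by summing the binomial probabilities directly. This is a minimal sketch; the helper name `binom_upper_tail` is hypothetical and not taken from any library.

```python
from math import comb

def binom_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_diff = 26 - 8      # cases where the costs differ
observed = 14        # cases where A is less expensive than B

upper_tail = binom_upper_tail(observed, n_diff)
# With p = 0.5 the distribution is symmetric, so the two-sided p-value
# is twice the upper tail.
p_value = 2 * upper_tail

print(round(upper_tail, 4), round(p_value, 4))
```

The exact tail probability is slightly above the rounded table value 0.015, so the exact two-sided p-value is slightly above 0.03, still below 0.05.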
d. The sign test.
3.
a. A control group is essential in this situation, as men with beginning hair loss
are likely to lose hair at some rate over the coming years. What determines the
efficacy of the treatment is whether it slows down the hair loss compared to
what it would have been without treatment. The control group measures the
expected hair loss without treatment.
b. The answer is ii: The indicator variable can then be used to define the two
groups in the statistical tests.
c. The purpose of a normal q-q plot is to visualize the degree to which the data is
normally distributed. To obtain the plot, the observed values are ranked, and
the corresponding quantiles of the normal distribution are computed. Each
observation is then plotted as a point, with the actual observed value on the
x-axis, and the quantile on the y-axis. The mean and the variance of the data are
computed, and the line in the plot indicates the relationship between the
quantiles of the normal distribution with this mean and variance (x-axis), and
the quantiles of the standard normal distribution (y-axis). The closer the points
are to the line, the more closely the data follows a normal distribution. In our
case, we see that there may be some deviation from a normal distribution for
low values. Indeed, the histogram shows a large number of low values. When
we consider how the data has been obtained, we realize that a number of men
have lost all their hair, so that the value -100 percent is relatively frequent.
This observation, together with the histogram and the normal q-q plot, casts
doubt on whether we get valid results with a test based on a normal
assumption on these data.
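The construction of the q-q plot points described in part c can be sketched as follows. The data values are hypothetical, the function name is an assumption, and the plotting positions (i - 0.5)/n are one common convention; statistical packages may use a slightly different formula.

```python
from statistics import NormalDist

def normal_qq_points(data):
    """Pair each ordered observation with a standard normal quantile."""
    n = len(data)
    std_normal = NormalDist()        # mean 0, standard deviation 1
    ordered = sorted(data)
    # Plotting position (i - 0.5) / n for the i-th smallest value (assumed convention).
    return [(x, std_normal.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(ordered, start=1)]

# Hypothetical hair-change percentages, with -100 occurring more than once.
sample = [-100, -100, -60, -35, -20, -5, 10]
points = normal_qq_points(sample)

for observed, quantile in points:
    print(observed, round(quantile, 3))
```

Each pair gives the observed value (x-axis) and the corresponding standard normal quantile (y-axis); repeated values such as -100 show up as points stacked at the same x position.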
d. If we decide to use a T test, we must decide on whether to assume that the
variances in the two groups are equal, or that they may be unequal. One way
to decide this is to compare the sample variances in the two groups. The
“Group Statistics” table shows that their standard deviations, and thus their
variances, are almost identical. And indeed, in the “Test for Equality of
Variances” in the “Independent Samples Test” table, we see that the p-value
for this test is 0.513, indicating that we have no reason at all to doubt that the
groups have the same variances. Thus, we should use the first line of this table
(although the second line gives almost identical results). The test statistic
$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$ is computed to be 2.700. It is then compared with a Student’s t
distribution with 38 degrees of freedom, and a two-sided p-value of 0.010 is
found. (This means that the probability of observing 2.700 or a larger value, or
-2.700 or a smaller value, in a Student’s t distribution with 38 degrees of
freedom, is 0.010). So, assuming that we accept the assumptions of the T test,
we can reject the null hypothesis that the groups have the same mean, and we
can consider that the test has shown a significant effect of the hair treatment.
The table also shows that the mean difference between the groups is 33.35,
and that the standard error of this difference is 12.3539. The standard error of
the difference is the denominator of the test statistic above, so it is in our case
$s_p \sqrt{\frac{1}{20} + \frac{1}{20}} = s_p / \sqrt{10}$. Comparing with the standard deviations from the
“Group Statistics” table, we know that the pooled standard deviation, $s_p$, is a
value between these two, so roughly 39. We see that the standard error of the
difference given in the “Independent Samples Test” table is roughly a third of
this, which seems OK. Finally, a 95% confidence interval for the difference is
computed. It is derived from the mean difference and the standard error for the
difference, together with the value $t_{38,0.025}$ from the Student’s t-distribution,
i.e., the value such that a variable with a Student’s t-distribution with 38
degrees of freedom has probability 0.025 of being larger than this value. We
see that the confidence interval does not contain zero. In fact, as the
relationships between the numbers above show, it will contain zero if and only
if the p-value computed in the table is above 0.05.
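The relationships between the numbers in the “Independent Samples Test” table can be checked with a short sketch. It uses only the values quoted above; the variable names are assumptions for illustration.

```python
import math

mean_diff = 33.35    # mean difference between the groups (from the table)
se_diff = 12.3539    # standard error of the difference (from the table)
n_x = n_y = 20       # group sizes

# Test statistic and degrees of freedom for the equal-variance t test.
t_stat = mean_diff / se_diff
df = n_x + n_y - 2

# Pooled standard deviation implied by the standard error:
# se_diff = s_p * sqrt(1/n_x + 1/n_y) = s_p / sqrt(10).
s_p = se_diff / math.sqrt(1 / n_x + 1 / n_y)

print(round(t_stat, 3), df, round(s_p, 1))
```

The implied pooled standard deviation is close to 39, consistent with the rough estimate read off the “Group Statistics” table.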
e. The Mann-Whitney U Test is a non-parametric test, with the null hypothesis
that the two groups of numbers come from exactly the same distribution. It
gives a p-value of 0.024, showing that we can reject the null hypothesis, and
conclude that the hair treatment has an effect. The test is based on comparing
the sum of ranks in the two groups: All observations are listed together,
ordered according to size, and the ranks in each group are summed. The sums
of ranks are not always whole numbers, because of ties: When two or more
observations are equal, an “average ranking” is used for them.
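The ranking step with ties can be illustrated with a small sketch. The data are hypothetical and the helper `average_ranks` is written here for illustration; it is not the routine used by any particular statistics package.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1        # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical percentage changes in hair amount for two small groups.
treated = [-10, -20, 5, -40]
control = [-100, -50, -20, -30]

ranks = average_ranks(treated + control)
rank_sum_treated = sum(ranks[:len(treated)])
rank_sum_control = sum(ranks[len(treated):])

print(rank_sum_treated, rank_sum_control)   # 23.5 12.5
```

The two observations equal to -20 share ranks 5 and 6 and each receive 5.5, which is why the rank sums are not whole numbers.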
f. The hair loss treatment has been tested on 40 men with beginning hair loss,
randomly divided into 20 men who received the treatment, and 20 men who
did not. The hair loss after three years has been measured for each person, and
the data analyzed with both a T-test and a Mann-Whitney U Test. The T-test
resulted in a p-value of 0.010, and the Mann-Whitney U Test in a p-value of 0.024,
indicating that there is significant evidence that the treatment works.