Some solutions to the trial exam in HMM4101, fall 2005

Part A

1. See pages 66-70 in Kumar.
2. See Chapter 8 of Kumar.
3. See Chapter 9 of Kumar.
4. See page 147 of Newbold.
5. See Section 9.1 of Newbold.
6. See page 62 and page 166 of Newbold.
7. See lecture notes from 2005-11-14.
8. See Section 12.6 in Newbold.

Part B

1.
a. Using the definition of the sample correlation, we get
$$r_{xy} = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = \frac{8609.9}{\sqrt{241.73 \cdot 1463600}} = 0.45774 \approx 0.46$$

b. According to the standard formulas for the regression coefficients, we get
$$b_1 = r_{xy} \frac{s_y}{s_x} = 0.45774 \cdot \frac{\sqrt{1463600}}{\sqrt{241.73}} = 35.618 \approx 35.6$$
$$b_0 = \bar{y} - b_1 \bar{x} = 1621.8 - 35.618 \cdot 46.6 = -37.9988 \approx -38.0$$
That $b_1$ has the value 35.6 means that, under the fitted regression model, expected dental expenses increase by 35.6 kroner for each year the person gets older.

c. As the coefficient of determination for simple regression is equal to the square of the correlation coefficient, we get
$$R^2 = r_{xy}^2 = 0.45774^2 = 0.20953 \approx 0.21$$
This number measures the share of the variance “explained” by the regression; it is equal to the regression sum of squares divided by the total sum of squares. Only 21 percent of the variance is explained by the regression, which means that the data do not follow a straight line very closely. This is also apparent from the scatter plot.

d. As $R^2 = 1 - \frac{SSE}{SST}$, we get $SSE = SST (1 - R^2)$. We can compute SST from the variance of y, as
$$s_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{SST}{n-1}$$
We get
$$SSE = SST (1 - R^2) = (n-1) s_y^2 (1 - R^2) = 49 \cdot 1463600 \cdot (1 - 0.20953) = 56689663$$
with the rounded value of $R^2$, or 56689765 when $R^2$ is carried at full precision. Using the latter, we can now compute the error variance estimate as
$$s_e^2 = \frac{SSE}{n-2} = \frac{56689765}{48} = 1181037$$
This means that the estimated model error standard deviation becomes $\sqrt{1181037} \approx 1087$, which seems reasonable when looking at the scatter plot.

e. The null hypothesis would be that the slope $\beta_1$ of the population regression model $Y = \beta_0 + \beta_1 x + \varepsilon$ is equal to zero. What the natural alternative hypothesis is may depend on your previous knowledge.
But in general, it seems at least possible that dental expenses could either increase or decrease with age. Thus, the alternative hypothesis should be that $\beta_1 > 0$ or $\beta_1 < 0$, and we will use a two-sided test, so that we divide the significance level of 0.05 by two and use the number 0.05/2 = 0.025 below. The test statistic is given by
$$t = \frac{b_1 - 0}{s_{b_1}}$$
where we can compute $s_{b_1}$ by
$$s_{b_1} = \sqrt{\frac{s_e^2}{(n-1) s_x^2}} = \sqrt{\frac{1181037}{49 \cdot 241.73}} = 9.98547$$
Thus, we get
$$t = \frac{b_1}{s_{b_1}} = \frac{35.618}{9.98547} = 3.567$$
Under the null hypothesis, the test statistic has a Student’s t distribution with 48 degrees of freedom. We search in the table on page 811 of Newbold to find a number x such that a variable with a Student’s t distribution with 48 degrees of freedom has a probability 0.025 of being larger than x. We can see from the table that this number is somewhere between 2.000 and 2.021. In any case, it is smaller than our computed value of 3.567. Thus, we reject the null hypothesis at the 5% significance level.

f. A 95% confidence interval for the slope $\beta_1$ can be computed as
$$b_1 \pm t_{n-2, \alpha/2} \, s_{b_1} = 35.618 \pm t_{48, 0.025} \cdot 9.98547 \approx 35.618 \pm 2.01 \cdot 9.98547$$
which gives the interval [15.5, 55.7].

g. It seems from the scatter plot that dental expenses are fairly independent of age until people reach their 50’s, and then they increase rapidly. This could be a reasonable hypothesis. If it is true, a model where the expected rise in dental expenses is the same for every year the age increases (i.e., a linear regression model) does not fit the data very well. One possibility could then be to split the data into two groups, one consisting of the elderly and one of other adults, and compare dental expenses in the two groups. One could also investigate how dental expenses vary with age within each such group. More complex answers to this question are of course possible.
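The arithmetic in parts a to f can be reproduced from the quoted summary statistics alone. The following is a minimal sketch in Python, not part of the original solution; the variable names are illustrative, n = 50 is inferred from the values n - 1 = 49 and n - 2 = 48 used in the text, and the critical value 2.01 is the approximation the text reads off the t table.

```python
import math

# Summary statistics quoted in the solution (Part B, question 1).
n = 50             # assumption: inferred from n - 1 = 49 and n - 2 = 48
s_xy = 8609.9      # sample covariance of age and dental expenses
s2_x = 241.73      # sample variance of age
s2_y = 1463600.0   # sample variance of dental expenses
x_bar = 46.6       # mean age
y_bar = 1621.8     # mean dental expenses
t_crit = 2.01      # approximate t_{48, 0.025}, as used in the text

# a. Sample correlation.
r_xy = s_xy / math.sqrt(s2_x * s2_y)

# b. Regression coefficients.
b1 = r_xy * math.sqrt(s2_y / s2_x)
b0 = y_bar - b1 * x_bar

# c. Coefficient of determination.
r2 = r_xy ** 2

# d. Error sum of squares and error variance estimate (full precision).
sse = (n - 1) * s2_y * (1 - r2)
s2_e = sse / (n - 2)
s_e = math.sqrt(s2_e)

# e. Standard error of the slope and the test statistic for H0: beta_1 = 0.
s_b1 = math.sqrt(s2_e / ((n - 1) * s2_x))
t_stat = b1 / s_b1

# f. 95% confidence interval for the slope.
ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)

print(f"r_xy={r_xy:.5f}  b1={b1:.3f}  b0={b0:.2f}  R^2={r2:.5f}")
print(f"SSE={sse:.0f}  s_e={s_e:.0f}  s_b1={s_b1:.5f}  t={t_stat:.3f}")
print(f"95% CI for the slope: [{ci[0]:.1f}, {ci[1]:.1f}]")
```

Because the script carries full precision throughout, the intercept comes out a few thousandths away from the hand-computed -37.9988, which was obtained from the rounded slope 35.618; both round to -38.0.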
Another problem with using linear regression for these data is that the variation around a fitted line is clearly not normally distributed in one respect: a number of people have zero dental expenses, and no one can have negative expenses. However, this effect seems to be limited to only a few people, and so is unlikely to influence our conclusions too much.

h. By calling random phone numbers, you end up with data from people who
i. Have a phone
ii. Are willing to answer your questions
This might bias your selection somewhat. For example, among the very old, you might get hold of only the healthiest ones. Another serious problem with the experiment is that most people will have problems remembering, or estimating, their dental expenses during the last three years when asked over the phone.

2.
a. Of the 26 - 8 = 18 cases where there is a difference in the cost, one would expect half to be more expensive at A and half to be less expensive at A. Thus, we would expect 9 cases to be less expensive at A compared to B.

b. It seems reasonable to assume that the 18 cases where there is a difference in cost represent 18 independent trials, each with a probability of “success” (i.e., A is less expensive than B) of 0.5. Thus, we would expect the actual number of cases where A is less expensive to have a Binomial distribution with parameters n = 18 and p = 0.5.

c. If the null hypothesis is that A is equally expensive as B, then the reasonable alternative hypothesis would be that there is a difference in costs, in general. Thus, we should do a two-sided test, so that we have to look at the probability of observing 14 or more successes, plus the probability of observing 4 or fewer successes. The probability of observing 14 or more successes in a Binomial distribution with parameters n = 18 and p = 0.5 can be found from the table on page 790 of Newbold as 1 - 0.985 = 0.015 (the table gives the probability of observing 13 or fewer successes).
So the probability of observing 14 or “something at least as extreme” is $0.015 \cdot 2 = 0.03$. This is the p-value of the test. As it is smaller than 0.05, we can reasonably reject the null hypothesis, and conclude that A administers significantly less expensive treatments than B.

d. The sign test.

3.
a. A control group is essential in this situation, as men with beginning hair loss are likely to lose hair at some rate over the coming years. What determines the efficacy of the treatment is whether it slows down the hair loss compared to what it would have been without treatment. The control group measures the expected hair loss without treatment.

b. The answer is ii: The indicator variable can then be used to define the two groups in the statistical tests.

c. The purpose of a normal q-q plot is to visualize the degree to which the data are normally distributed. To obtain the plot, the observed values are ranked, and the corresponding quantiles of the normal distribution are computed. Each observation is then plotted as a point, with the actual observed value on the x-axis, and the quantile on the y-axis. The mean and the variance of the data are computed, and the line in the plot indicates the relationship between the quantiles of the normal distribution with this mean and variance (x-axis), and the quantiles of the standard normal distribution (y-axis). The closer the points are to the line, the more closely the data follow a normal distribution. In our case, we see that there may be some deviation from a normal distribution for low values. Indeed, the histogram shows a large number of low values. When we consider how the data have been obtained, we realize that a number of men have lost all their hair, so that the value -100 percent is relatively frequent. This observation, together with the histogram and the normal q-q plot, casts doubt on whether we get valid results with a test based on a normal assumption on these data.

d.
If we decide to use a T test, we must decide whether to assume that the variances in the two groups are equal, or that they may be unequal. One way to decide this is to compare the sample variances in the two groups. The “Group Statistics” table shows that their standard deviations, and thus their variances, are almost identical. And indeed, in the “Test for Equality of Variances” in the “Independent Samples Test” table, we see that the p-value for this test is 0.513, indicating that we have no reason at all to doubt that the groups have the same variances. Thus, we should use the first line of this table (although the second line gives almost identical results). The test statistic
$$t = \frac{\bar{x} - \bar{y}}{s_p \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}$$
is computed to be 2.700. It is then compared with a Student’s t distribution with 38 degrees of freedom, and a two-sided p-value of 0.010 is found. (This means that the probability of observing 2.700 or a larger value, or -2.700 or a smaller value, in a Student’s t distribution with 38 degrees of freedom, is 0.010.) So, assuming that we accept the assumptions of the T test, we can reject the null hypothesis that the groups have the same mean, and we can consider that the test has shown a significant effect of the hair treatment.

The table also shows that the mean difference between the groups is 33.35, and that the standard error of this difference is 12.3539. The standard error of the difference is the denominator of the test statistic above, so it is in our case
$$s_p \sqrt{\frac{1}{20} + \frac{1}{20}} = \frac{s_p}{\sqrt{10}}$$
Comparing with the standard deviations from the “Group Statistics” table, we know that the pooled standard deviation, $s_p$, is a value between these two, so roughly 39. We see that the standard error of the difference given in the “Independent Samples Test” table is roughly a third of this, which seems OK, as $1/\sqrt{10} \approx 0.32$.

Finally, a 95% confidence interval for the difference is computed.
It is derived from the mean difference and the standard error of the difference, together with the value $t_{38, 0.025}$ from the Student’s t distribution, i.e., the value such that a variable with a Student’s t distribution with 38 degrees of freedom has probability 0.025 of being larger than this value. We see that the confidence interval does not contain zero. In fact, as the relationships between the numbers above show, it will contain zero if and only if the p-value computed in the table is above 0.05.

e. The Mann-Whitney U Test is a non-parametric test, with the null hypothesis that the two groups of numbers come from exactly the same distribution. It gives a p-value of 0.024, showing that we can reject the null hypothesis, and conclude that the hair treatment has an effect. The test is based on comparing the sum of ranks in the two groups: all observations are listed together, ordered according to size, and the ranks in each group are summed. The sums of ranks are not always whole numbers, because of ties: when two or more observations are equal, an “average ranking” is used for them.

f. The hair loss treatment has been tested on 40 men with beginning hair loss, randomly divided into 20 men who received the treatment and 20 men who did not. The hair loss after three years has been measured for each person, and the data analyzed with both a T-test and a Mann-Whitney U Test. The T-test resulted in a p-value of 0.010, and the Mann-Whitney U Test in a p-value of 0.024, indicating that there is significant evidence that the treatment works.
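The table-based numbers in questions 2c and 3d of Part B can be cross-checked with a few lines of code. The following is a minimal sketch, not part of the original solution, using only the Python standard library (`math.comb` requires Python 3.8 or later). The variable names are illustrative, and the inputs are the figures quoted in the text: 14 successes among 18 untied cases, a mean difference of 33.35, a standard error of 12.3539, and 20 men per group.

```python
import math

# Question 2c: sign test with 18 untied cases, 14 of them cheaper at A.
n_diff = 18
k_obs = 14

# Exact upper tail P(X >= 14) for X ~ Binomial(18, 0.5).
upper_tail = sum(math.comb(n_diff, j) for j in range(k_obs, n_diff + 1)) / 2 ** n_diff

# With p = 0.5 the distribution is symmetric, so the two-sided p-value
# is twice the upper tail.
p_two_sided = 2 * upper_tail

# Question 3d: t statistic from the quoted mean difference and standard error.
mean_diff = 33.35
se_diff = 12.3539
n_x = n_y = 20
t_stat = mean_diff / se_diff

# Pooled standard deviation implied by se_diff = s_p * sqrt(1/n_x + 1/n_y).
s_p_implied = se_diff / math.sqrt(1 / n_x + 1 / n_y)

print(f"P(X >= 14) = {upper_tail:.4f}, two-sided p = {p_two_sided:.4f}")
print(f"t = {t_stat:.3f}, implied s_p = {s_p_implied:.2f}")
```

The exact tail probability agrees with the tabulated 0.015 after rounding, and the implied pooled standard deviation agrees with the "roughly 39" read off the “Group Statistics” table.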