HSA 523 Health Data Analysis
Dr. Robert Jantzen
Answer key for Homework 8
1. A. Compute the mean MAAC score (and its standard deviation) for the sample of men
and the sample of women. Are they the same?
Report: maac scores (hw8)
gender (hw8)    Mean      N     Std. Deviation
male            6.0000    10    1.15470
female          8.0000    10    2.74874
Total           7.0000    20    2.29416
Separate Variance T Test for 2 Population Means
B. Using the data above, what null and alternative (research) hypotheses could a Separate
Variance t-test for the difference between 2 population means test? (Use a two-tailed test)
Null Ho: population mean of men – population mean of women =0
Alternative Ha: population mean of men – population mean of women not equal 0
C. What assumptions about the data must hold true if the t test for the difference between 2
population means is conducted?
Random samples of two independent groups and each group’s numbers must be normally
distributed or the sample size is >= 30 and the #s aren’t highly skewed.
D. What would it mean if a statistician says that the average MAAC score of men differs
significantly from the average MAAC score of women (at the 5% level).
It means that if the population mean scores were the same, there’d be less than a 5% chance of
observing a difference between the sample means as large as the one found.
E. Conduct the appropriate t test to assess whether the population mean MAAC scores of
men and women differ (2 tailed test). Show all work, including the hypotheses, sample
statistic w/ degrees of freedom, critical statistic and decision rule. Interpret the results of the
test.
Ho: pop.mean men – pop.mean women =0
Ha: pop.mean men – pop.mean women not =0
Sample t = -2.12 with 12.08 degrees of freedom
Critical t (2 tailed) = 2.18
Since the |sample t| is < |critical t|, we can’t reject the null hypothesis. There’s insufficient
evidence to conclude that the difference in the two population means is not zero, i.e., that the
means of all men and women differ.
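The sample t and its degrees of freedom can be reproduced from the summary statistics alone. What follows is a minimal Python sketch, assuming the usual separate-variance (Welch) formulas; the function name welch_t is only illustrative and is not part of the course spreadsheet.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1   # squared standard error of the group 1 mean
    v2 = sd2 ** 2 / n2   # squared standard error of the group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Men vs. women, using the means and standard deviations from part A
t, df = welch_t(6.0, 1.15470, 10, 8.0, 2.74874, 10)

critical_t = 2.18  # two-tailed 5% critical value quoted in the answer above
print(f"sample t = {t:.2f}, df = {df:.2f}")
print("reject Ho" if abs(t) > critical_t else "cannot reject Ho")
```

The critical value is typed in rather than computed, so the sketch only checks the arithmetic and the decision rule.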
F. What does the estimated p value (also called the sig. level) of the test show?
The p-value of .055 is the lowest significance level at which the null hypothesis could be
rejected with this sample. It is the probability of observing a difference between two groups
that is at least as large as the one in the data analyzed, when the null is true.
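The two-tailed p-value is the area in both tails of the t distribution beyond the sample t. As a rough check, it can be approximated without statistical software by numerically integrating the t density. The sketch below assumes Simpson's rule with 2000 steps is accurate enough; the helper names t_pdf and two_tailed_p are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) when the null is true, via Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and +|t|
    return 1 - central_area

# Sample t and degrees of freedom from the separate variance t test
p = two_tailed_p(-2.1213186, 12.0805281)
print(f"two-tailed p = {p:.3f}, one-tailed p = {p / 2:.3f}")
```

Halving the two-tailed value gives the one-tailed p-value, which applies only when the sample means fall on the side named by the alternative hypothesis.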
Programming Notes:
G. If the sample means, standard deviations and sample sizes are known for two independent
groups, you can use the 2samplettest.xls Excel spreadsheet to find the sample and critical t
values. Just edit the given values for the means, variances (the squared standard deviations)
and sample sizes for the data that you want to analyze.
H. Bonus SPSS Programming Note: if you want to use SPSS to conduct the separate
variance t-test for two population means with the above data, you must first create a data file
that contains two columns (gender and score) and twenty observations. That is, for the first
ten cases, the gender variable would be coded as “male” and the last ten would be coded as
“female.” Remember to actually code the genders as numbers (e.g., 1=male, 2=female) and
assign appropriate value labels. The corresponding anxiety scores would be typed in the
score variable column.
To generate the appropriate statistics, click on Analyze, then Compare Means, and then
Independent Samples T-Test. Then click on the score variable name and move it to the
Test Variables box. Then click on gender and move it to the Grouping Variables box.
Then click on Define Groups and type in a 1 for Group 1 and a 2 for Group 2, if males
and females are coded as ones and twos. Then click on Continue and OK. Print out the
results and do the appropriate t-tests. Show all work and hypotheses.
X1 Mean: 6
X1 Variance: 1.33333209
X1 sample size: 10
X2 Mean: 8
X2 Variance: 7.5555716
X2 sample size: 10
Hypothesized difference between the population means (1-2): 0
Desired significance level of the test: 0.05
sample t: -2.1213186
degrees of freedom: 12.0805281
critical t: (two-tail) 2.17881279
p-value: 0.0554048
Don't Reject the Null
critical t: (one-tail) 1.78228674
p-value: 0.0277024
Reject the Null if the Sample Means Conform to the Alternative Hypothesis
SPSS Independent Samples t test
t-test for Equality of Means
maac scores (hw8)              t         df        Sig. (2-tailed)
Equal variances assumed        -2.121    18        .048
Equal variances not assumed    -2.121    12.081    .055
One Way Analysis of Variance (ANOVA)
B. Using the data above, what null and alternative (research) hypotheses could an ANOVA
test for the difference between 2 population means test?
Null Ho: population mean of men = population mean of women
Alternative Ha: population mean of men not equal population mean of women
C. What assumptions about the data must hold true if the ANOVA test for the difference
between 2 population means is conducted?
Random samples of two independent groups and each group’s numbers must be normally
distributed and their standard deviations must be the same.
D. What would it mean if a statistician says that the average MAAC score of men differs
significantly from the average MAAC score of women (at the 5% level).
It means that if the population mean scores were the same, there’d be less than a 5% chance of
observing a difference between the sample means as large as the one found.
E. Conduct the appropriate ANOVA F test to assess whether the population mean MAAC
scores of men and women differ. Show all work, including the hypotheses, sample statistic
w/ degrees of freedom, critical statistic and decision rule. Interpret the results of the test.
Ho: pop.mean men = pop.mean women
Ha: pop.mean men not equal pop.mean women
Sample F = 4.5
Critical F = 4.41
Since the sample F is > critical F, we can reject the null hypothesis. There’s sufficient
evidence to conclude that the two population means are not the same, i.e., that the means of
all men and women differ.
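The sample F can be rebuilt from the group means, standard deviations and sample sizes. The sketch below is one way to do it in Python, assuming the standard between-groups and within-groups sums of squares; the variable names are illustrative. It also shows that with two groups the F statistic equals the square of the pooled-variance (equal variances assumed) t statistic.

```python
import math

# Summary statistics from part A: (mean, std. deviation, n)
men = (6.0, 1.15470, 10)
women = (8.0, 2.74874, 10)
groups = [men, women]

n_total = sum(n for _, _, n in groups)
grand_mean = sum(m * n for m, _, n in groups) / n_total

# Between-groups and within-groups sums of squares
ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
ss_within = sum((n - 1) * s ** 2 for _, s, n in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)
F = (ss_between / df_between) / (ss_within / df_within)

# Pooled-variance t for the same two groups; its square equals F
pooled_var = ss_within / df_within
t_pooled = (men[0] - women[0]) / math.sqrt(pooled_var * (1 / men[2] + 1 / women[2]))

print(f"F = {F:.2f} with ({df_between}, {df_within}) df")
print(f"pooled t squared = {t_pooled ** 2:.2f}")
```

This is why the ANOVA p-value matches the "Equal variances assumed" t test rather than the separate variance test.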
F. What does the estimated p value (also called the sig. level) of the test show?
The p-value of .048 is the lowest significance level at which the null hypothesis could be
rejected with this sample. It is the probability of observing a difference between two groups
that is at least as large as the one in the data analyzed, when the null is true.
Programming Notes:
G. If the sample means, standard deviations and sample sizes are known for two or more
independent groups, you can use the anovatest.xls Excel spreadsheet to find the sample and
critical F values. Just edit the given values for the means, standard deviations
and sample sizes for the data that you want to analyze.
Significance level of test: 0.05
Number of groups: 2
Group    Sample Mean    Sample Std.Dev.    Sample size
1        6              1.1547             10
2        8              2.74874            10
Calculations:
Grand mean: 7
Grand sample size: 20
Ho: All population group means are equal
Ha: Ho is False
sample F statistic = 4.499992513
critical F statistic = 4.413863053
Decision Rule: Reject the Null
p-value of sample F: 0.048037692
2. Assume that a clinical drug trial of a cholesterol-reducing medication randomly assigns 25
people to receive the drug, and 25 to receive a placebo. Assume that the average cholesterol
count of the treated group is 205 with a standard deviation of 15, while the placebo group has
an average of 235 with a standard deviation of 20.
A. Does the above data satisfy the requirements for conducting a separate variance t test for
the difference between 2 population means?
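Only if each group’s cholesterol counts are normally distributed. The people were randomly
assigned to two independent groups, but with 25 per group the sample size is < 30, so the
large-sample alternative (sample size >= 30 and the #s aren’t highly skewed) doesn’t apply.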
B. Test whether the medication works, i.e., the population mean cholesterol level of the
treated group is less than the population mean level of the placebo group. Show all work,
including the hypotheses, sample statistic w/ degrees of freedom, critical statistic and
decision rule. Interpret the results of the test.
Ho: pop.mean treated – pop.mean placebos >= 0
Ha: pop.mean treated – pop.mean placebos < 0
Sample t = -6 with 44.51 degrees of freedom
Critical t (1 tailed) = 1.68
Since the |sample t| is > |critical t| and the sample means agree with the alternative
hypothesis (the sample t is negative), we can reject the null hypothesis. There’s sufficient
evidence to conclude that the mean cholesterol level of the population treated is less than
that of the population untreated. We’re 95% confident because w/ a 5% significance level
there’s only a 5% chance of rejecting the null hypothesis when it’s true.
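For a one-tailed test the direction of the sample t matters as well as its size. The sketch below works through the drug trial numbers in Python, assuming the separate-variance formulas and a lower-tail rejection region; the variable names are illustrative.

```python
import math

# Summary statistics from the problem statement
treated_mean, treated_sd, treated_n = 205, 15, 25
placebo_mean, placebo_sd, placebo_n = 235, 20, 25

var1 = treated_sd ** 2 / treated_n   # 225 / 25, squared std. error of treated mean
var2 = placebo_sd ** 2 / placebo_n   # 400 / 25, squared std. error of placebo mean
t = (treated_mean - placebo_mean) / math.sqrt(var1 + var2)
df = (var1 + var2) ** 2 / (var1 ** 2 / (treated_n - 1) + var2 ** 2 / (placebo_n - 1))

critical_t = 1.68  # one-tailed 5% critical value quoted in the answer above
# Ha says treated - placebo < 0, so reject Ho only if t falls below -critical_t
reject = t < -critical_t
print(f"sample t = {t:.2f}, df = {df:.2f}, reject Ho: {reject}")
```

Had the treated group's sample mean been higher than the placebo group's, t would be positive and the null could not be rejected no matter how large |t| was.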
C. What does the estimated p value (also called the sig. level) of the test show?
The p-value of .00000017 is the lowest significance level at which the null hypothesis could be
rejected with this sample. It is the probability of observing a difference between two sampled
groups that is at least as large as the one in the data analyzed, when the null is true.
X1 Mean: 205
X1 Variance: 225
X1 sample size: 25
X2 Mean: 235
X2 Variance: 400
X2 sample size: 25
Hypothesized difference between the population means (1-2): 0
Desired significance level of the test: 0.05
sample t: -6
degrees of freedom: 44.5103858
critical t: (two-tail) 2.0153675
p-value: 3.3747E-07
Reject the Null
critical t: (one-tail) 1.68023007
p-value: 1.6873E-07
Reject the Null if the Sample Means Conform to the Alternative Hypothesis
4. Assume that you have the following length of stay information for random samples of
patients of three physicians:
Doctor 1    Doctor 2    Doctor 3
0 days      1 day       2 days
1 day       2 days      3 days
1 day       2 days      3 days
1 day       2 days      3 days
2 days      3 days      4 days
A. Compute the mean length of stay (and its standard deviation) for each physician and the
grand mean length of stay. Is the length of stay the same across physicians?
Report: length of stay (hw8)
doctor# (hw8)    Mean      N     Std. Deviation
1.00             1.0000    5     .70711
2.00             2.0000    5     .70711
3.00             3.0000    5     .70711
Total            2.0000    15    1.06904
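The figures in this report can be checked with Python's standard statistics module. The sketch below assumes the length of stay data listed in the problem (five patients per doctor) and uses the sample standard deviation (n - 1 in the denominator), as SPSS does.

```python
from statistics import mean, stdev

# Length of stay in days for each physician's sampled patients
los = {
    "Doctor 1": [0, 1, 1, 1, 2],
    "Doctor 2": [1, 2, 2, 2, 3],
    "Doctor 3": [2, 3, 3, 3, 4],
}

# Per-physician mean, sample standard deviation and sample size
summary = {doc: (mean(days), stdev(days), len(days)) for doc, days in los.items()}

# Grand mean and overall standard deviation across all 15 patients
all_days = [d for days in los.values() for d in days]
grand_mean = mean(all_days)
total_sd = stdev(all_days)

for doc, (m, s, n) in summary.items():
    print(f"{doc}: mean = {m:.4f}, N = {n}, sd = {s:.5f}")
print(f"Total: mean = {grand_mean:.4f}, N = {len(all_days)}, sd = {total_sd:.5f}")
```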
B. Using the data above, what null and alternative (research) hypotheses could a one-way
ANOVA test?
Ho: population mean #1 = population mean #2 = population mean #3
Ha: Ho is False
C. What assumptions about the data must hold true if the ANOVA test is conducted?
Random samples of each independent group, normally distributed numbers in each group
and equal variances across groups.
D. What would it mean if a statistician says that there is a statistically significant difference
in the length of stay across physicians (at the 5% level).
There’s a 5% chance that an error has been made in rejecting the null hypothesis that the
population mean LOS’s are all the same. We’re 95% sure that they differ, with a 5% chance
that we’re wrong.
E. Conduct the appropriate ANOVA test to assess whether the population mean lengths of
stay for the three doctors differ. Show all work, including the hypotheses, sample statistic w/
degrees of freedom, critical statistic and decision rule. Interpret the results of the test.
Ho: population mean #1 = population mean #2 = population mean #3
Ha: Ho is False
Sample F = 10
Critical F = 3.89
Since the sample F is >= critical F, we reject the null hypothesis. We’re 95% sure that the
population mean LOSs differ between the three physicians (w/ a 5% chance we erred in
rejecting the null).
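The sample F can also be worked out directly from the raw lengths of stay. The sketch below is a minimal Python version, assuming the standard one-way ANOVA decomposition into between-groups and within-groups sums of squares; the critical F is the value quoted above rather than something the code computes.

```python
from statistics import mean

groups = [
    [0, 1, 1, 1, 2],  # Doctor 1
    [1, 2, 2, 2, 3],  # Doctor 2
    [2, 3, 3, 3, 4],  # Doctor 3
]
all_days = [d for g in groups for d in g]
grand_mean = mean(all_days)

# Variation of the group means around the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Variation of each patient around his or her own doctor's mean
ss_within = sum((d - mean(g)) ** 2 for g in groups for d in g)

df_between = len(groups) - 1
df_within = len(all_days) - len(groups)
F = (ss_between / df_between) / (ss_within / df_within)

critical_F = 3.89  # 5% critical value for (2, 12) df, as quoted above
print(f"F = {F:.2f} with ({df_between}, {df_within}) df")
print("reject Ho" if F >= critical_F else "cannot reject Ho")
```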
F. What does the estimated p value (also called the sig. level) of the test show?
The p-value of .0028 is the lowest significance level at which the null hypothesis could be
rejected with this sample. It is the probability of observing differences between three sampled
groups that are at least as large as this sample’s, when the null is true.
Significance level of test: 0.05
Number of groups: 3
Group    Sample Mean    Sample Std.Dev.    Sample size
1        1              0.70711            5
2        2              0.70711            5
3        3              0.70711            5
Calculations:
Grand mean: 2
Grand sample size: 15
Ho: All population group means are equal
Ha: Ho is False
sample F statistic = 9.999908959
critical F statistic = 3.885290312
Decision Rule: Reject the Null
p-value of sample F: 0.002781009
5. Conduct the appropriate ANOVA test to assess whether average hospital size (size),
occupancy (occup) and managed care share (mcpercen) differed significantly between
hospitals with differing risk-sharing activities (ids).
Data Needs: Random samples of numerical variable of >= 2 independent groups; normally
distributed numbers; same variances in each group.
Hypotheses about population mean size:
Ho: population mean size is the same for hospitals that aren’t networked, partially
networked and fully networked
Ha: Ho is false
Sample F = 10.76 (from calculator)
Critical F = 3.03 at 5% significance (from calculator)
Decision: Reject Ho. We can conclude w/ 95% confidence that the population mean size
isn’t the same for all three groups of hospitals.
Hypotheses about population mean occupancy:
Ho: population mean occupancy is the same for hospitals that aren’t networked, partially
networked and fully networked
Ha: Ho is false
Sample F = 8.02 (from calculator)
Critical F = 3.03 at 5 % significance (from calculator)
Decision: Reject Ho. We can conclude w/ 95% confidence that the population mean
occupancy rate isn’t the same for all three groups of hospitals.
Hypotheses about population mean managed care percentages:
Ho: population mean managed care percentage is the same for hospitals that aren’t
networked, partially networked and fully networked
Ha: Ho is false
Sample F = 11.28 (from calculator)
Critical F = 3.03 at 5% significance (from calculator)
Decision: Reject Ho. We can conclude w/ 95% confidence that the population mean
managed care percentage isn’t the same for all three groups of hospitals.
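The three sample F values can be reproduced from the SPSS means and standard deviations report in this answer key. The sketch below stands in for the ANOVA calculator; the function name anova_f is hypothetical, and the small differences from the reported F values are assumed to come from rounding in the report.

```python
def anova_f(groups):
    """One-way ANOVA F from (mean, std. deviation, n) triples, one per group."""
    n_total = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    ss_within = sum((n - 1) * s ** 2 for _, s, n in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# (mean, sd, n) for the None, Partial and Full ids groups, from the SPSS report
report = {
    "size": [(140.52, 159.684, 62), (219.09, 202.039, 80), (308.02, 270.850, 93)],
    "occup": [(46.12, 20.061, 60), (55.72, 17.100, 79), (57.69, 17.372, 92)],
    "mcpercen": [(10.61, 11.472, 60), (18.79, 15.096, 77), (23.55, 19.596, 85)],
}
results = {name: anova_f(stats) for name, stats in report.items()}

critical_F = 3.03  # 5% critical value quoted above
for name, (f_stat, df1, df2) in results.items():
    decision = "reject Ho" if f_stat >= critical_F else "cannot reject Ho"
    print(f"{name}: F = {f_stat:.2f} with ({df1}, {df2}) df, {decision}")
```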
SPSS Means & Std Deviations Report (put these #s in the ANOVA calculator)
ids-Integrated                      size-No. of      occup-Occupancy    mcpercen-Percent
Delivery System                     licensed beds    Rate               Managed Care Business
None       Mean                     140.52           46.12              10.61
           N                        62               60                 60
           Std. Deviation           159.684          20.061             11.472
Partial    Mean                     219.09           55.72              18.79
           N                        80               79                 77
           Std. Deviation           202.039          17.100             15.096
Full       Mean                     308.02           57.69              23.55
           N                        93               92                 85
           Std. Deviation           270.850          17.372             19.596
Total      Mean                     233.55           54.01              18.40
           N                        235              231                222
           Std. Deviation           232.033          18.560             16.917