Chavez, Eric
ClinRes 490 Midterm Exam

Instructions for Midterm Exam:
1. Please enter your last name and first name in the header above.
2. Save your document using the following naming convention: lastname_firstname_clinres409_midtermexam.doc (or .docx)
3. For SPSS output, please paste the appropriate tables in the Word document and make reference to each table and why you are including it. If a table is associated with one of the hypothesis testing steps, it must be clear which step and why you are including it there. Please also discuss and interpret as appropriate.
4. Please submit the exam through Blackboard by the date and time listed on your syllabus.
Thank you – I can only grade exams I can read and interpret clearly!

1. You have just received an Excel dataset called “CardioStudy.xls”. This dataset contains data from a cardiovascular study, which looked at the relationship between cardiovascular disease and a set of risk factors. The data set contains one record per subject and has the following variables:

Variable   Description                          Data Type   Range of Values
ID         5-Digit Unique Subject Identifier    Numeric     101-96017
CVD        Cardiovascular Disease               Numeric     No = 0, Yes = 1
TRT        Treatment Group                      Numeric     0 = Placebo, 1 = Drug
AGE        Subject’s Age                        Numeric     34-60
HDL        HDL Cholesterol                      Numeric     18-98
LDL        LDL Cholesterol                      Numeric     155-428
SMOKER     Smoking Status                       Numeric     0 = No, 1 = Yes

Using the dataset CardioStudy, run the appropriate analysis to answer each of the following questions. Paste the appropriate tables or graphs into your response as appropriate.

a. What percent of the subjects are in the PLACEBO treatment group? (1 pt)

50% of subjects are in the placebo group.

TRT
                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   placebo   10          50.0      50.0            50.0
        drug      10          50.0      50.0            100.0
        Total     20          100.0     100.0

b. How many subjects have cardiovascular disease?
(1 pt)

12 subjects have cardiovascular disease.

CVD
                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   no CVD    8           40.0      40.0            40.0
        yes CVD   12          60.0      60.0            100.0
        Total     20          100.0     100.0

c. Select the best way to describe HDL cholesterol by smoking status graphically. Paste the graphic descriptions here. Interpret the graphs (4 pts).

This box-and-whisker plot describes the HDL values of non-smokers and of smokers. The bottom of each box represents the first quartile of the HDL values: for the non-smokers this is approximately 20 and for the smokers it is 56. The band inside the box represents the second quartile, or the median: for the non-smokers this is approximately 38 and for the smokers it is 73. The top of each box represents the third quartile: for the non-smokers this is 60 and for the smokers it is 87. The whiskers, which extend above and below the boxes, represent the range of HDL values, with the minimum at the bottom and the maximum at the top: for the non-smokers the range is 62, with a minimum of 18 and a maximum of 80; for the smokers the range is 62, with a minimum of 33 and a maximum of 95. From these descriptive statistics it appears that non-smokers have lower HDL values than smokers.

d. Select the best way to describe LDL by smoking status numerically. Paste the SPSS tables with the numeric descriptions here. Interpret the output (3 pts).

Descriptives: LDL by SMOKING_STATUS
                                                NON-SMOKER               SMOKER
                                                Statistic   Std. Error   Statistic   Std. Error
Mean                                            226.70      27.140       284.30      30.368
95% Confidence Interval for Mean, Lower Bound   165.31                   215.60
95% Confidence Interval for Mean, Upper Bound   288.09                   353.00
5% Trimmed Mean                                 222.44                   283.28
Median                                          168.00                   291.50
Variance                                        7365.789                 9222.456
Std. Deviation                                  85.824                   96.034
Minimum                                         155                      159
Maximum                                         375                      428
Range                                           220                      269
Interquartile Range                             150                      202
Skewness                                        .693        .687         .061        .687
Kurtosis                                        -1.455      1.334        -1.216      1.334

This table shows descriptive statistics for LDL values for the subjects divided by smoking status. The mean LDL for non-smokers is 226.7 with a standard deviation of 85.8. For the non-smokers the minimum LDL is 155 and the maximum is 375, which gives a range of 220. The 95% confidence interval for the mean LDL of non-smokers is (165.3, 288.1).

The mean LDL for smokers is 284.3 with a standard deviation of 96.0. For the smokers, the minimum LDL is 159 and the maximum is 428, which gives a range of 269. The 95% confidence interval for the mean LDL of smokers is (215.6, 353.0).

This table shows the quartiles for LDL values divided by smoking status. For the non-smokers the first quartile is 158, the second quartile is 168, and the third quartile is 305. For the smokers the first quartile is 178, the second quartile is 291, and the third quartile is 370. From these descriptive statistics it appears that non-smokers have lower LDL values than smokers.

d. What is the mean age in the data set for subjects who are smokers? What is the mean age for non-smokers (2 pts)?

The mean age for smokers is 49.1. The mean age for non-smokers is 48.6.

Descriptives: AGE by SMOKING_STATUS
                                                NON-SMOKER               SMOKER
                                                Statistic   Std. Error   Statistic   Std. Error
Mean                                            48.60       3.052        49.10       3.096
95% Confidence Interval for Mean, Lower Bound   41.70                    42.10
95% Confidence Interval for Mean, Upper Bound   55.50                    56.10
5% Trimmed Mean                                 48.78                    49.33
Median                                          51.50                    50.00
Variance                                        93.156                   95.878
Std. Deviation                                  9.652                    9.792
Minimum                                         35                       34
Maximum                                         59                       60
Range                                           24                       26
Interquartile Range                             19                       21
Skewness                                        -.378       .687         -.296       .687
Kurtosis                                        -1.716      1.334        -1.537      1.334

e.
Test the hypothesis that there is a difference in HDL cholesterol between the treatment and control groups at the 5% significance level (5 pts).

1. Set up the hypothesis.
H0: μ1 = μ2 (mean HDL in the treatment group = mean HDL in the control group)
H1: μ1 ≠ μ2 (mean HDL in the treatment group is not equal to the mean HDL in the control group)
Set the level of significance, α = 0.05.

2. Select the appropriate test statistic.
We will use the independent-samples t test, since HDL is measured on two independent samples, the population σ is unknown, and n < 30. We will use SPSS to calculate the test statistic t, since the population variances are unknown and may not be equal.

3. Generate the decision rule.
Assuming equal variances, the degrees of freedom are 18 (n - 2 = 20 - 2 = 18), so for a two-sided test at α = 0.05 the critical value of t is 2.101. Reject H0 if t ≤ -2.101 or if t ≥ 2.101. Do not reject H0 if -2.101 < t < 2.101.

4. Compute the value of the test statistic.
The output from SPSS is pasted below.

Group Statistics
      TRT       N    Mean    Std. Deviation   Std. Error Mean
HDL   placebo   10   58.60   29.530           9.338
      drug      10   55.20   23.470           7.422

In Levene’s test, the F value is not significant. Therefore we do not reject the null hypothesis that the population variances are equal, and we use the top line (equal variances assumed) of the SPSS output. The test statistic is t = 0.285.

5. Draw a conclusion about H0 by comparing the test statistic to the decision rule.
Because the test statistic lies between the critical t values, -2.101 < 0.285 < 2.101, we do not reject the null hypothesis. We do not have significant evidence at α = 0.05 that there is a difference in mean HDL between the treatment (drug) and control (placebo) groups. The data are consistent with equal mean HDL in the two groups; in this sample the treatment shows no detectable effect on HDL compared with placebo.

2. A healthy eating program was held in a college dorm.
When freshmen students moved into the dorm, they completed a survey that included the number of servings of vegetables typically eaten. Following the program, they repeated the survey. Test the hypothesis that the program improved vegetable consumption at a 5% level of significance (5 pts).

Subject ID   # servings before program   # servings after program
1            5                           6
2            4                           6
3            7                           4
4            3                           4
5            9                           5
6            6                           7
7            3                           6
8            8                           6
9            5                           7
10           6                           8

1. Set up the hypothesis.
H0: μbefore = μafter, or mean difference = 0 (mean #servings_before = mean #servings_after)
H1: μbefore < μafter, or mean difference > 0, with the difference taken as after minus before (mean #servings_before < mean #servings_after)
Set the level of significance, α = 0.05.

2. Select the appropriate test statistic.
We will use the dependent-samples (paired) t test, since we are comparing two paired measures taken on the same subjects. We will use SPSS to calculate the test statistic t.

3. Generate the decision rule.
The degrees of freedom are 9 (n - 1 = 10 - 1 = 9), so for a one-sided test at α = 0.05 the critical value of t is 1.833. Reject H0 if t ≥ 1.833. Do not reject H0 if t < 1.833.

4. Compute the value of the test statistic.
The output from SPSS is pasted below.

Paired Samples Statistics
                           Mean   N    Std. Deviation   Std. Error Mean
Pair 1   servings_before   5.60   10   2.011            .636
         servings_after    5.90   10   1.287            .407

Paired Samples Correlations
                                            N    Correlation   Sig.
Pair 1   servings_before & servings_after   10   -.017         .962

SPSS reports the test statistic as t = -0.394 because it takes the difference as servings_before minus servings_after. With the difference taken as after minus before, as in the hypotheses above, the test statistic is t = 0.394.

5. Draw a conclusion about H0 by comparing the test statistic to the decision rule.
Since the calculated t of 0.394 is less than the critical value of 1.833, we do not reject the null hypothesis. We do not have significant evidence at α = 0.05 that the mean number of servings after the program is greater than the mean number before. These data do not show that the healthy eating program improved vegetable consumption.
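
The two test statistics reported in questions 1e and 2 can be cross-checked without SPSS. The following is a minimal Python sketch using only the standard library; it is not part of the SPSS workflow the exam asks for. The function names pooled_t_from_summary and paired_t are hypothetical helpers introduced here for illustration. It assumes equal variances for question 1e (as the Levene's test result suggests) and takes the paired differences in question 2 as after minus before, so the paired statistic comes out positive, the mirror image of the value obtained from before minus after.

```python
import math
import statistics


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and degrees of freedom,
    computed from group means, standard deviations, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for the
    differences after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Question 1e: values taken from the Group Statistics table (placebo vs. drug).
t_hdl, df_hdl = pooled_t_from_summary(58.60, 29.530, 10, 55.20, 23.470, 10)

# Question 2: raw survey data from the servings table.
before = [5, 4, 7, 3, 9, 6, 3, 8, 5, 6]
after = [6, 6, 4, 4, 5, 7, 6, 6, 7, 8]
t_veg, df_veg = paired_t(before, after)

print(f"HDL: t = {t_hdl:.3f}, df = {df_hdl}")         # HDL: t = 0.285, df = 18
print(f"Vegetables: t = {t_veg:.3f}, df = {df_veg}")  # Vegetables: t = 0.394, df = 9
```

Both values fall short of their critical values (2.101 two-sided with 18 degrees of freedom, 1.833 one-sided with 9 degrees of freedom), so neither null hypothesis is rejected.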