Bio 286: Worksheet 3-Replication, transformations and power – answer key

Worksheet 3: Replication, transformations and power

Replication

1ai: The number of replicates per county is 12, because sites are the independent replicates for each county. Degrees of freedom are 11 (12 - 1) for each county.

1aii: There are 25 replicates for the comparison of sites within Monterey county. Here each plot is a replicate for the site.

1bi: Ho: there is no difference in diving depths for otters before and after eating.

1bii: Paired t-test.

1biii: The replicates are the 50 otters. The degrees of freedom are 49.

Transformations

2ai: A two-sample t-test.

2aii: Use DISTRIBUTION and put ‘mil’ in the Y box. Then look at the box plot and frequency distribution. Click on the Mil red triangle, then on CONTINUOUS FIT, then on NORMAL. Now click on NORMAL and then on DIAGNOSTIC FIT. Look at the plot. Does it look normal? If not, what type of transformation might help? Try the following: click on Mil, then on CONTINUOUS FIT, then on LOGNORMAL. Now click on LOGNORMAL and then on DIAGNOSTIC FIT. Look at the plot. Do the data now look like they fit the distribution?

Now let's look at homogeneity of variance. Here you want to use TABLES, SUMMARY and click on ‘mil’, then on STATISTICS, VARIANCE. Now put ‘Urban 2’ in the GROUP box and click OK. Check the VARIANCE term, then the MEAN term. Remember that one of the assumptions of the two-sample t-test is homogeneity of variance. Are the two variance terms similar? If not, perhaps try a log transformation.

2aiiia: Log transform.

2aiiib: The box plot and probability plots suggest a lognormal distribution; the variance terms also differ in a way suggestive of lognormal data (the variance scales with the mean). Try making a new variable called ‘Logmil’. First, go to the data window. Click on COLS, then on NEW COLUMN, and enter ‘Logmil’. Now go to that column, right-click the variable name, then click on FORMULA. A window will open. On the right side is the FUNCTIONS window.
Click on TRANSCENDENTAL, then on LOG10. That function will show up in the open window at the bottom. Now click on ‘mil’ in the TABLE COLUMNS window. This will insert ‘mil’ in the LOG10 function. Now click OK and you will have transformed ‘mil’ to log base 10 of ‘mil’.

2aiiic: Look at the tables and graphs below. Look at the probability graphs and compare the variance terms for the untransformed and transformed variables. ‘Logmil’ meets the assumptions of normality and homogeneity of variance (with ‘Urban 2’ as the grouping variable) better than ‘Mil’ does.

2aiiid: To conduct a t-test: ANALYZE > FIT Y BY X > (add the variables) > OK > [red triangle] > MEANS/ANOVA/POOLED T (or T TEST). Based on the results below we can reject the null hypothesis. Urban countries spend more on military than do rural ones.

t Test (pooled)
Difference      -0.8421     t Ratio       -5.19043
Std Err Dif      0.1622     DF            54
Upper CL Dif    -0.5168     Prob > |t|    <.0001*
Lower CL Dif    -1.1674     Prob > t      1.0000
Confidence       0.95       Prob < t      <.0001*

[Figure: Mean(Logmil) with 95% CI (pooled), by Urban 2 (city, rural)]

2bi: These data are more complicated than the earlier ones. The box plot looks OK, but the probability plot is very strange. It has the characteristic look of data that need an ARCSIN transformation (ask why this is the case if you are uncertain).

[Figure: Prop]

2bii:

[Figure: aprop]

2biii: They are much more appropriate for a t-test. The data are now much more normally distributed.

2biv: There is more free space low in the tide zone. Here I am showing both the raw data (prop) and the transformed data (aprop).
I am also showing the confidence interval for aprop, as this is the variable used in the analysis, and am using the within-group standard error for prop to show the variability in the data.

t Test: High-Low, assuming equal variances
Difference      -0.37650    t Ratio       -4.48208
Std Err Dif      0.08400    DF            48
Upper CL Dif    -0.20761    Prob > |t|    <.0001*
Lower CL Dif    -0.54540    Prob > t      1.0000
Confidence       0.95       Prob < t      <.0001*

[Figure: Mean(PROP) & Mean(aprop) vs. TIDEHEIGHT (Low, High)]

3b: The power is very low (0.1236). Hence we had a very low likelihood of getting a significant result. Now you need to ask yourself the single most important question: “What size effect did I want to be able to detect?”

3c: Now ask yourself: was the power of the test high or low?

4. The graphs below show a cumulative distribution of means and the frequency distribution of means. The cumulative distribution is usually easier to understand. Here it shows that 95% of the means (from resampling) occur between 66 and 68.136. This is the two-tailed confidence interval, and it contains (just barely) the null value, 68.

[Figure: cumulative distribution and frequency distribution of the resampled means, with the 95% interval from 66 to 68.136 marked]
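
If you want to see what the resampling in question 4 is doing outside of JMP, here is a minimal Python sketch. It assumes the interval was built as a percentile bootstrap (resampling the data with replacement and taking the middle 95% of the resampled means); the answer key does not state the exact method, so treat that as an assumption. The function name `bootstrap_mean_ci`, the number of resamples, and the simulated `values` are all hypothetical and are not the worksheet data.

```python
import numpy as np


def bootstrap_mean_ci(sample, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    # Each row of idx picks one resample, drawn with replacement,
    # of the same size as the original sample.
    idx = rng.integers(0, sample.size, size=(n_resamples, sample.size))
    means = sample[idx].mean(axis=1)
    # Two-tailed interval: cut off alpha/2 of the resampled means in each tail.
    alpha = 1.0 - confidence
    lower, upper = np.quantile(means, [alpha / 2, 1.0 - alpha / 2])
    return lower, upper, means


# Simulated stand-in data (an assumption, NOT the worksheet data).
values = np.random.default_rng(1).normal(loc=67.0, scale=3.0, size=30)

lower, upper, means = bootstrap_mean_ci(values)
null_value = 68.0  # the null value named in answer 4
contains_null = lower <= null_value <= upper

print(f"95% interval for the mean: {lower:.3f} to {upper:.3f}")
print(f"Interval contains the null value {null_value}: {contains_null}")
```

The logic matches the answer above: if the null value falls inside the two-tailed interval, you cannot reject the null hypothesis at that confidence level. Sorting `means` and plotting them against their rank gives the cumulative distribution described in answer 4.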