Guide on Writing Methods and Results Sections

Method Sections

The Method section is a detailed breakdown of the experiment, including your subjects, research design, stimuli, equipment used, and what the subjects actually did (the procedure). The idea is to give the reader enough information to be able to replicate the experiment.

Requirements

The Method section is often divided into subsections, such as Subjects, Design, Stimuli, Equipment, and Procedure. Each subsection should provide only the essential information needed to understand and reasonably replicate the experiment. Very short subsections can be combined (e.g., Stimuli and Equipment).

Subjects/Participants. State the number of participants (if human) or subjects (if animals), who they were, and how they were selected.

Examples:

Participants
We randomly selected 16 UAB students from a Master’s level biostatistics course to participate in exchange for extra credit.

Subjects
Subjects were 30 male pigtailed macaques (Macaca nemestrina) bred at the Wisconsin National Primate Research Center (WNPRC) Breeding Colony, Madison, Wisconsin. All animals were bred specifically for this project and were shipped to the laboratory at 3-5 days of age. We randomly assigned subjects to each condition.

Materials. This subsection may also be called Stimuli, Equipment, or Apparatus. It briefly describes the equipment/materials used in the experiment.

Example:

Eye movements were recorded using an NEC model 120 Eyetracker.

Procedure. Describe in sequence the procedures used.

Example:

Subjects were seated at a computer work station. After completing a demographic questionnaire, they received written instructions that differed by condition. All subjects were instructed to read a business letter and write a reply. Subjects in the multiple draft condition were told to write an outline of a reply letter before writing a final draft.

Design and Analysis.
Identify and explain variables and their levels, and state whether the variables are between-groups factors, within-subjects factors, continuous predictors, or covariates.

Examples:

A 2 x 3 (Sex by Treatment) factorial design with Age as a covariate was used. Both Sex and Treatment were between-subjects factors.

We used a 2 x 4 repeated measures design with Sex as a between-subjects factor and time of measurement as the within-subjects factor.

Multiple linear regression analysis was conducted to assess the influence of each individual VKORC1 SNP on log-transformed maintenance dose after adjustment for age, gender, BMI, clinic, income, education, health insurance, smoking status, level of physical activity, alcohol intake, vitamin K intake, comorbid conditions (e.g., CHF, renal failure, and cancer), and drug interactions (e.g., amiodarone, statins, NSAIDs, antiplatelet agents).

We performed logistic regression for data at the 10-week, 6-month, and 12-month time points separately. We controlled for gender, ethnicity, prior smoking (three ordinal categories), and baseline levels of motivation and expectancy to quit smoking.

The cognitive readiness factor was regressed on all sociodemographic risk factors, the parental involvement factor, the cumulative number of culture-related activities, center-based program attendance, and several higher order interactions. All variables were centered to reduce multicollinearity among the predictors and their interactions and to obtain more interpretable standardized regression coefficients (Aiken & West, 1991). Because of the large sample size, very small partial relations could be statistically significant. Therefore, only extremely small p values were considered statistically significant (i.e., α = 0.001). Based on an ordinary least squares (OLS) regression analysis of this model, nonsignificant interactions and predictors were removed.
However, nonsignificant predictors remained in the model if their interactions with other predictors were statistically significant (Aiken & West, 1991).

Results

This section presents the statistical analysis of the data collected. It is often less than a page long.

Requirements

Condensed format. The Results section is the most condensed and standardized of all the sections in the text of a lab report.

No data interpretation. Statistical results are presented but are usually not discussed in this section. Discuss results in the Discussion section.

• Keep your hypotheses in mind while you write. Each result must refer to a stated hypothesis.
• Describe all results that are directly related to your research questions or hypotheses. Start with hypotheses you were able to support with significant statistics before reporting nonsignificant trends. Then describe any additional results that are more indirectly relevant to your questions.
• If you present many results (i.e., many variables or variables with many levels), write a brief summary, then discuss each variable in separate subsections.
• Report main effects before reporting contrasts or interactions.

Briefly mention problems such as reasons for missing data, but save discussion of the problems for the Discussion section.

Use tables and figures to summarize data. Include descriptive statistics (such as means and standard deviations or standard errors), and give significance levels of any inferential statistics. The goal is to make your results section succinct and quantitatively informative, with no extra words.

• For each test used, provide the degrees of freedom, the obtained value of the test statistic, and the p-value (the probability of a result at least this extreme occurring by chance alone).
Here are examples of the results of a t-test and an F-test, respectively: t(23) = 101.2, p < .001; F(1, 3489) = 7.943, p < .001

Reporting Results of Common Statistical Tests

The goal of the results section in an empirical paper is to report the results of the data analysis used to test a hypothesis. The results section should be in condensed format and free of interpretation. Avoid discussing why or how the experiment was performed or alluding to whether your results are good or bad, expected or unexpected, interesting or uninteresting. This document is specifically about how to report statistical results.

Every statistical test that you report should relate directly to a hypothesis. Begin the results section by restating each hypothesis, then state whether your results supported it, then give the data and statistics that allowed you to draw this conclusion.

If you have multiple numerical results to report, it’s often a good idea to present them in a figure (graph) or a table.

In reporting the results of statistical tests, report the descriptive statistics, such as means and standard deviations, as well as the test statistic, degrees of freedom, and p-value. Test statistics and p-values should be rounded to two decimal places. All statistical symbols that are not Greek letters should be italicized (M, SD, t, p, etc.).

When reporting a significant difference between two conditions, indicate the direction of this difference (i.e., which condition was more/less/higher/lower than the other condition(s)).

Assume that your audience has a professional knowledge of statistics. Don’t explain how or why you used a certain test unless it is unusual.

p-values

There are two ways to report p-values. One way is to use the alpha level (the a priori criterion for the probability of falsely rejecting your null hypothesis), which is typically 0.05 or 0.01. Example: F(1, 24) = 44.4, p < 0.01.
You may also report the exact p-value (the a posteriori probability of obtaining your result, or one more extreme, by chance alone). Example: t(33) = 2.10, p = 0.03. If your exact p-value is less than .001, it is conventional to state merely p < .001. If you report exact p-values, state early in the results section the alpha level used as a significance criterion for your tests. Example: “We used an alpha level of 0.05 for all statistical tests.”

EXAMPLES

Reporting a significant single sample t-test (μ ≠ μ0):
Students taking statistics courses in Public Health at the University of Alabama at Birmingham reported studying significantly more hours for tests (M = 121, SD = 14.2) than did UAB graduate students in general, t(33) = 2.10, p = 0.034.

Reporting a significant t-test for dependent groups (μ1 ≠ μ2):
Results indicate a significant preference for pecan pie (M = 3.45, SD = 1.11) over cherry pie (M = 3.00, SD = 0.80), t(15) = 4.00, p = 0.001.

Reporting a non-significant t-test for independent groups (μ1 ≠ μ2):
UAB students taking statistics courses in the School of Public Health had higher Anxiety scores (M = 121, SD = 14.2) than did those taking statistics courses in other graduate majors (M = 117, SD = 10.3); however, this difference was not statistically significant, t(44) = 1.23, p = 0.09.

Reporting a significant t-test for independent groups (μ1 ≠ μ2):
Over a two-day period, participants in the experimental group drank significantly fewer drinks (M = 0.667, SD = 1.15) than did those in the wait-list control group (M = 8.00, SD = 2.00), t(4) = -5.51, p = 0.005.

Reporting a significant omnibus F-test for a one-way ANOVA:
An analysis of variance showed that the effect of noise was significant, F(3, 27) = 5.94, p = 0.007.
Post hoc analyses using Scheffé’s criterion for significance indicated that the average number of errors was significantly lower in the white noise condition (M = 12.4, SD = 2.26) than in the other two noise conditions (traffic and industrial) combined (M = 13.62, SD = 5.56), F(3, 27) = 7.77, p = 0.042.

A one-way analysis of variance (ANOVA) showed that the gene/status groups differed significantly in their total scores on the 5-item “Attitudes toward genetic research scale” [F(3, 178) = 3.57, p = 0.0153]. Table 3 shows the means and standard deviations (SD) for the responses to the questions and the total score for each gene/status group. Tukey’s HSD was used to make post hoc pairwise comparisons. These follow-up analyses showed that the NEG and SYMPT groups responded in a very similar fashion (p > 0.05), while the ASYMPT (p = 0.014) and AR (p = 0.008) groups had lower total score means. The AR group responded lower than the ASYMPT group (p = 0.024), indicating that they would be less likely than all other groups to allow their own children to participate in observational HD research that included a yearly neurological examination.

Reporting tests of a priori hypotheses in a multi-group study:
Tests of the four a priori hypotheses were conducted using Bonferroni adjusted alpha levels of 0.0125 per test (0.05/4). Results indicated that the average number of errors was significantly lower in the silence condition (M = 8.11, SD = 4.32) than in both the white noise condition (M = 12.4, SD = 2.26), F(1, 27) = 8.90, p = 0.011, and the industrial noise condition (M = 15.28, SD = 3.30), F(1, 27) = 10.22, p = 0.007. The pairwise comparison of the traffic noise condition with the silence condition was non-significant. The average number of errors in all noise conditions combined (M = 15.2, SD = 6.32) was significantly higher than that in the silence condition (M = 8.11, SD = 3.30), F(1, 27) = 8.66, p = 0.009.
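The Bonferroni adjustment in the example above divides the family-wise alpha by the number of planned tests. The following is a minimal Python sketch of that arithmetic; Python is our choice of language, the function name and dictionary labels are hypothetical, and the p-values are simply the ones quoted in the example.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test alpha under a Bonferroni adjustment."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Four a priori tests at a family-wise alpha of 0.05, as in the example.
alpha_per_test = bonferroni_alpha(0.05, 4)

# p-values quoted in the example; the labels are illustrative only.
reported_p = {
    "silence vs. white noise": 0.011,
    "silence vs. industrial noise": 0.007,
    "silence vs. all noise combined": 0.009,
}

# A test counts as significant only if p falls below the adjusted alpha.
significant = {label: p < alpha_per_test for label, p in reported_p.items()}

print(f"adjusted alpha = {alpha_per_test:.4f}")
for label, flag in significant.items():
    print(f"{label}: {'significant' if flag else 'not significant'}")
```

Each of the three quoted p-values falls below the adjusted criterion of 0.0125, which is why the example can describe those comparisons as significant even under the stricter per-test alpha.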
Reporting results of major tests in factorial ANOVA; non-significant interaction:
Attitude change scores were subjected to a two-way analysis of variance having two levels of message discrepancy (small, large) and two levels of source expertise (high, low). All effects were tested at the 0.05 significance level. The main effect of message discrepancy was statistically significant, F(1, 24) = 44.4, p < 0.001, indicating that the mean change score was significantly greater for large-discrepancy messages (M = 4.78, SD = 1.99) than for small-discrepancy messages (M = 2.17, SD = 1.25). The main effect of source expertise was also significant, F(1, 24) = 25.4, p < 0.01, indicating that the mean change score was significantly higher for the high-expertise message source (M = 5.49, SD = 2.25) than for the low-expertise message source (M = 0.88, SD = 1.21). The interaction effect was non-significant, F(1, 24) = 1.22, p > 0.05.

Reporting results of major tests in factorial ANOVA; significant interaction:
A two-way analysis of variance yielded a main effect for the diner’s gender, F(1, 108) = 3.93, p < 0.05, such that the average tip was significantly higher for men (M = 15.3%, SD = 4.44) than for women (M = 12.6%, SD = 6.18). The main effect of touch was non-significant, F(1, 108) = 2.24, p > 0.05. However, the interaction effect was significant, F(1, 108) = 5.55, p < 0.05, indicating that the gender effect was greater in the touch condition than in the non-touch condition.

Reporting the results of a chi-square test of independence:
A chi-square test of independence was performed to examine the relation between religion and college interest. The relation between these variables was significant, χ²(df = 2, N = 170) = 14.14, p < 0.01. Catholic teens were less likely to show an interest in attending college than were Protestant teens.
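To show where the pieces of a chi-square report come from, here is a small standard-library Python sketch that computes the statistic, degrees of freedom, and N from a contingency table and then assembles a report string. The counts are hypothetical and do not reproduce the study above, all function names are illustrative, and the closed-form p-value used here holds only when df = 2; other degrees of freedom would need a statistics library.

```python
import math


def chi_square_independence(table):
    """Return (statistic, df, n) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, n


def p_value_two_df(statistic):
    """Upper-tail p-value for a chi-square statistic with exactly 2 df."""
    return math.exp(-statistic / 2)


def format_p(p):
    """Report 'p < .001' for very small values, otherwise the exact value."""
    return "p < .001" if p < 0.001 else f"p = {p:.3f}"


# Hypothetical counts (2 groups x 3 levels of interest), N = 170.
observed = [
    [30, 40, 15],
    [10, 35, 40],
]

statistic, df, n = chi_square_independence(observed)
p = p_value_two_df(statistic)  # valid only because df == 2 here
report = f"χ²(df = {df}, N = {n}) = {statistic:.2f}, {format_p(p)}"
print(report)
```

The report string follows the same order as the example: degrees of freedom and sample size in parentheses, then the obtained statistic, then the p-value.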
Reporting the results of a chi-square test of goodness of fit:
A chi-square test of goodness-of-fit was performed to determine whether the three sodas were equally preferred. Preference for the three sodas was not equally distributed in the population, χ²(df = 2, N = 55) = 4.53, p < 0.05.

Reporting the results of a Multiple Regression Analysis:
A multiple regression analysis yielded a statistically significant model [F(11, 2148) = 30.48, p < 0.0001; R² = 0.52]. Of the sociodemographic risk factors, the parents’ language and education–income factors were statistically significant, accounting for 14.7% and 7.2% of unique variance, respectively. Also, center-based preschool program attendance and the parental involvement factor were significantly related to cognitive readiness and accounted for 11.8% and 5.2% of unique variance, respectively. A statistically significant two-way interaction between the education–income factor and number of culture-related activities accounted for 0.3% of unique variance (see Table 1).

Logistic Regression

The results of the logistic regression analysis showed that, after controlling for other confounding factors, males were twice as likely as females to be unavailable for a 10-week follow-up interview (OR = 2.01; 95% CI, 1.37-2.94; p = 0.002).

The Completers Only results (Table 4) show that the Treatment led to significantly more reports of smoking cessation at 10 weeks. At 10 weeks, students who completed the N-O-T program were approximately 5.7 times more likely to report that they had quit smoking than counterparts in the comparison condition, not including those who dropped out. These results translated into predicted quit rates of 11.0% and 5.0% for the N-O-T and comparison groups, respectively.

Power Analysis

Based on preliminary data, we assume the standard deviation (SD) of the DAS is 1.1 at both baseline and follow-up and the base rate change in DAS is from 5.0 to 4.5.
If there were zero correlation between the baseline and the follow-up DAS, then the SD of the change score would be 1.56. Assuming a correlation of 0.5, the change score SD would be 1.10. For a stronger correlation of 0.7, the change score SD would be 0.85. For a weaker correlation of 0.3, the change score SD would be 1.30.

Assuming that the genetic variant will lead to a change in DAS from 5.0 to 4.0 and the genetic effect is additive, the wild type genotype would be expected to have the base rate change from 5.0 to 4.5, the homozygote variant genotype would be expected to change from 5.0 to 4.0, and the heterozygote genotype would be expected to change from 5.0 to 4.25. The 1 degree-of-freedom (df) test for the additive effect would have statistical power of 79.6% assuming a change score correlation of 0.5. For the most conservative situation (zero correlation), the power could be as low as 50.2%. For a stronger change score correlation of 0.7, the power would be 95.0%. For a weaker change score correlation of 0.3, the power would be 65.5%.

The loading dose, maintenance dose, and stability (INR > 4) will be compared across the 3 established CYP2C9 genotype groups (i.e., Extensive, Intermediate, and Poor metabolizers) using traditional analysis of variance (ANOVA) procedures. Based on the maintenance dosage and allele frequency data reported by Higashi et al. (2002), a power analysis was conducted. The results of the power analysis indicate that in order to detect differences among these three groups at a significance level of α = 0.05 with 0.80 statistical power, sample sizes of 171 Caucasian and 200 African-American patients will be necessary. It should be noted that these power analyses were based on parametric (normal theory) statistics. If there are substantial departures from normality, non-parametric procedures, which are often more powerful with non-normal data, may be used.
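The change score SDs quoted in the power analysis follow from the usual formula for the standard deviation of a difference between two correlated measurements, SD_change = sqrt(SD1² + SD2² - 2 r SD1 SD2). The Python sketch below reproduces those four values under the stated assumption of an SD of 1.1 at both time points; the function and variable names are illustrative, and the sketch does not attempt to reproduce the power figures, which depend on sample sizes not given here.

```python
import math


def change_score_sd(sd_baseline, sd_followup, correlation):
    """SD of (follow-up - baseline) for two correlated measurements."""
    variance = (
        sd_baseline ** 2
        + sd_followup ** 2
        - 2 * correlation * sd_baseline * sd_followup
    )
    return math.sqrt(variance)


# SD of 1.1 at both time points, as assumed in the power analysis above.
correlations = [0.0, 0.3, 0.5, 0.7]
change_sds = {r: change_score_sd(1.1, 1.1, r) for r in correlations}

for r, sd in change_sds.items():
    print(f"correlation = {r:.1f}: change-score SD = {sd:.2f}")
```

Because a higher baseline-to-follow-up correlation shrinks the change score SD, the same mean difference corresponds to a larger standardized effect, which is why the quoted power rises with the correlation.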
