Sample size and MCA

Always use the largest samples possible, regardless of the research design or statistical analyses to be applied. Large samples yield statistics that are closer in value to the true, though unknown, population values, a reflection of the Law of Large Numbers. Large samples contain more members of the population, and therefore have the potential to harbor better approximations to the population values of the quantities under study than do small samples.

Sample size for MCA in particular can be considered from several perspectives. MCA results (the values of Beta and R) are generally considered more trustworthy when n = 200 or more. It is said that "the Betas bounce", that is, they are unstable from sample to sample, when n is smaller than 200, and especially so when n is trivially small. When true R is known (as it can be when a simulation is conducted or a population has been surveyed), small samples will yield values of R that fluctuate farther from true R than do samples of, say, 200 and larger.

The number of people (n) must exceed the number of independent variables (k) in the analysis. As k approaches n, the value of R becomes artificially large, reaching 1.0 when k = n. Plan to have no fewer than three people for every variable in the analysis, or n ≥ k × 3. This number may not provide adequate statistical "power", however.

Another consideration pertains to "power", the so-called probability of rejecting a false null hypothesis. The topics of Type-II Error (accepting a false null hypothesis when R ≠ 0) and Type-I Error (rejecting a true null hypothesis when R = 0; see footnote 1) can be seen as controversial, because (a) lacking a population census the value of R is unknowable, and (b) p-levels do not indicate probabilities that null hypotheses are either true or false.
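The claim that small samples fluctuate farther from the true value can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a single predictor (so R reduces to the absolute value of a Pearson r), a bivariate normal population with a true correlation of .30, and function names (`pearson_r`, `sample_r`, `spread_of_r`) that are ours, not part of any MCA program.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def sample_r(n, rho, rng):
    """Draw n pairs from a bivariate normal population whose true
    correlation is rho (an assumed population), and return the sample r."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
    return pearson_r(xs, ys)


def spread_of_r(n, rho=0.30, replications=1000, seed=1):
    """Standard deviation of sample r across repeated samples of size n."""
    rng = random.Random(seed)
    rs = [sample_r(n, rho, rng) for _ in range(replications)]
    return statistics.pstdev(rs)


if __name__ == "__main__":
    for n in (20, 200):
        print(f"n = {n:3d}: spread of sample r = {spread_of_r(n):.3f}")
```

Run with n = 20 and n = 200, the spread of the sample values around the true correlation should come out clearly larger for the smaller sample, which is the "bouncing" described above.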
Assigning mathematical probabilities to these two Errors may not be as important as using a large sample to provide as high a likelihood as possible of obtaining a sample that yields an R value close to that in the population. Nevertheless, we present tables of the n required to achieve statistical power to detect R of varying sizes.

Each of the tables below presents the minimum n required to achieve adequate (.70) to excellent (.90) statistical power for varying numbers of variables (k). Select the table for the minimum value of R to detect; that is, decide upon the smallest R that would be of practical interest to discover. The indicated sample size will provide statistical power to detect R of the minimum size or larger. For example, to detect R no smaller than .20, with power of .70 and 2 variables in the analysis, n must be no less than 187.

Footnote 1: Actually, the null hypothesis value of R² is equal to (k-1)/(n-1), not zero, unless there is only one predictor variable, i.e., a simple Pearson r between two variables.
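The table lookup just described can be sketched in a few lines. This is a minimal illustration only: the dictionary holds just the R = .20 rows from this document's tables, and the names `POWERS`, `N_FOR_R_20`, and `required_n` are hypothetical conveniences introduced here.

```python
# Power levels tabulated in this document, in column order.
POWERS = (0.70, 0.80, 0.90)

# Minimum n for detecting R = .20, keyed by the number of variables k.
# Values are copied from the R = .20 table in this document.
N_FOR_R_20 = {
    1: (150, 190, 254),
    2: (187, 234, 306),
    3: (214, 265, 344),
    4: (237, 291, 374),
    5: (256, 313, 401),
    10: (332, 400, 503),
    15: (391, 467, 581),
}


def required_n(k, power, table=N_FOR_R_20):
    """Return the tabulated minimum n for k variables at the given power.

    Raises KeyError for an untabulated k and ValueError for an untabulated
    power; no interpolation is attempted.
    """
    column = POWERS.index(power)
    return table[k][column]


if __name__ == "__main__":
    # The worked example from the text: R = .20, power = .70, k = 2.
    print(required_n(2, 0.70))
```

Refusing to interpolate is a deliberate choice in this sketch: the tables give no rule for values of k or power between the tabulated ones.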
Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .20    1     150    190    254
 .20    2     187    234    306
 .20    3     214    265    344
 .20    4     237    291    374
 .20    5     256    313    401
 .20   10     332    400    503
 .20   15     391    467    581

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .30    1      64     81    108
 .30    2      80    100    130
 .30    3      92    114    147
 .30    4     102    125    160
 .30    5     111    135    172
 .30   10     146    175    218
 .30   15     174    206    254

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .40    1      34     43     57
 .40    2      43     53     69
 .40    3      50     61     78
 .40    4      55     67     85
 .40    5      60     73     92
 .40   10      81     96    118
 .40   15      98    114    139

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .50    1      20     25     33
 .50    2      26     31     40
 .50    3      30     36     46
 .50    4      34     40     51
 .50    5      37     44     55
 .50   10      51     59     72
 .50   15      62     72     86

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .60    1      12     15     20
 .60    2      16     20     25
 .60    3      19     23     29
 .60    4      22     26     32
 .60    5      24     28     35
 .60   10      34     39     47
 .60   15      43     49     57

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .70    1       8     10     12
 .70    2      11     13     16
 .70    3      13     15     18
 .70    4      15     17     21
 .70    5      16     19     23
 .70   10      24     27     32
 .70   15      32     35     40

Sample size table for R, k, and power

                  Power
  R     k     .70    .80    .90
----------------------------------------
 .80    1       5      6      7
 .80    2       7      8     10
 .80    3       8     10     11
 .80    4      10     11     13
 .80    5      11     13     15
 .80   10      18     20     22
 .80   15      24     26     29

About statistical significance and statistical power

The descriptive statistic that we have been considering is the group mean. It is used to characterize a group or sample overall as having a small to large amount of a characteristic. Other descriptive statistics are frequencies or proportions, standard deviations, and correlation coefficients. ANOVA and the t-test are vehicles for assessing the statistical significance of differences between two or more means.
Statistical significance testing (SST) is a mathematical simulation that we use as an aid in deciding whether to report that observed mean differences are or are not representative of "real" differences. Upon obtaining a set of group means, the researcher will observe either manifestly trivial differences or those that she "feels" and believes are substantial. SST offers the researcher one vehicle for modifying or solidifying her beliefs about observed mean differences.

The enterprise of statistics is firstly about organizing and summarizing data so that states, trends, and relationships can be revealed. The second concern of statistics is testing mathematical hypotheses about what we see in the data (doing SST) so that we can form beliefs that we report to the public: beliefs that what we see is fluky, or beliefs that what we see is "real". SST, like all other simulations, yields information that in this case is used by the researcher in deciding to report either that she has found "something" or that "nothing" has been found. SST does not determine or define this decision.

SST asserts that the observed differences are due to sampling error, while the true mean difference is precisely zero. A mathematical model is used as a hypothetical population in which the true mean difference is zero, though from which a virtual infinitude of nonzero differences can and will be sampled if the sample size is less than extremely large (infinite!). When using SST, the researcher temporarily asserts that a no-difference universe is the origin of her set of observed differences; that her differences are fluky, random noise, insubstantial. SST reveals the probability of sampling a set of differences of the size one has observed, or larger, when (not if) the true difference is zero; or, as it is said, when the null hypothesis is true.
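The idea of a no-difference universe from which nonzero differences are nevertheless sampled can be made concrete with a small Monte Carlo sketch. This is not the t-test or ANOVA itself. It assumes, purely for illustration, two groups drawn from the same normal population with a known standard deviation, and the function name `null_probability` is ours.

```python
import random
import statistics


def null_probability(observed_diff, n_per_group, sd=1.0,
                     replications=4000, seed=1):
    """Estimate how often a null population (true mean difference = 0)
    yields a mean difference at least as large, in absolute value, as the
    one observed. Both groups are drawn from the same assumed normal
    population, so every nonzero difference here is sampling error."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        if abs(diff) >= abs(observed_diff):
            hits += 1
    return hits / replications


if __name__ == "__main__":
    for diff in (0.2, 0.8):
        p = null_probability(diff, n_per_group=25)
        print(f"observed difference {diff}: estimated p = {p:.3f}")
```

With 25 people per group, a difference of 0.2 standard deviations turns up often in the null universe, whereas a difference of 0.8 is rare. Note that the estimate is a probability about samples from the null population, not a probability that the null population was sampled.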
When an observed set of differences has a small probability of being sampled from a truly null population, the researcher concludes that such a population is a poor model for the origin of the sampled differences. She rejects the null hypothesis. When the observed set of differences has a relatively large probability of being sampled from the null population, the researcher will accept the null model as a tenable origin of the sampled differences; she accepts the null hypothesis. SST deals with probabilities for samples from the null population, not with probabilities that the null population was sampled!

SST is never an attempt to prove any hypothesis whatsoever. SST is an attempt to obtain information that can assist in making a decision about the origin of one's sample findings. Is it fluky or is it not? In finally deciding whether to report "a significant difference", the researcher must weigh the quality of the research design, the size of a given difference, and the probability of sampling such differences when (not if) the true difference is zero.

The coefficient of statistical significance is the probability or p-level. The most commonly used value is p ≤ .05: decide to declare a statistically significant (that is, real) difference if the probability of sampling one's difference(s) when the universe is null is no greater than five percent.

Large samples will yield "significant" findings more often than small samples. This is so because sample size appears in the mathematical equations used in calculating F- and t-ratios in a manner that tends to make these ratios large when n is large, and small when n is small. The larger the F- or t-ratio, the more likely it will have a small associated value of p, perhaps as small as .05 or .01.
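The mechanical role of n can be seen directly in the t-ratio for two equal-sized independent groups, t = (M1 - M2) / (s × sqrt(2/n)), where s is the common standard deviation and n is the size of each group. The sketch below holds the mean difference and s fixed and varies only n. The function name `t_ratio` and the example values (a 2-point difference, s = 10) are illustrative assumptions.

```python
import math


def t_ratio(mean_diff, sd, n_per_group):
    """t-ratio for two independent groups of equal size n_per_group that
    share a common standard deviation sd."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    return mean_diff / standard_error


if __name__ == "__main__":
    # Same 2-point mean difference and same SD of 10; only n changes.
    for n in (10, 100, 1000):
        print(f"n per group = {n:4d}: t = {t_ratio(2.0, 10.0, n):.2f}")
```

Because the standard error shrinks with the square root of n, the same observed difference produces an ever larger t-ratio, and so an ever smaller p, as the groups grow.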
By contrast to the mechanical impact of large n on the value of p, there is a theory that large samples tend to give studies greater statistical power than do small samples. This is the theory that if the origin of one's findings was not a null universe, but instead was a universe in which the true difference is not zero, then a large sample provides a better chance of sampling sufficient data points from the region of that universe, namely its center, in which the population means actually differ. This is an expression of the Law of Large Numbers, which states that the larger the sample, the more closely the sample statistics will resemble the population statistics.

Researchers wish to maximize this "power" by using a sample of a certain minimum size or larger. "Power" is said to be the probability of rejecting a false null hypothesis, whereas α (alpha) is said to be the probability of rejecting a true null hypothesis. The arithmetic of SST can guarantee p ≤ .05 with a certain value for n, when a specific minimum "effect size" is being sought. This is not to say that a certain sample size will cause a desired minimum effect size to emerge from the data. If the data do contain this minimum set of differences (or larger), then the p-level will be .05 or smaller if and only if the sample size is no less than a predetermined number.

Below are abbreviated sample size tables for t- and F-tests. Find the minimum sample size per group that will provide a certain level of power for detecting an effect of a certain minimum size or larger by locating n at the intersection of a given power row and effect size column.
Sample size for t-tests according to desired effect size and power

                 Effect Size
Power     small    medium    large
 .70       233       38        16
 .80       313       50        20
 .90       425       68        28

Sample size for F-tests relative to numerator Degrees of Freedom (see footnote 2), effect size, and power

DF = 2
                 Effect Size
Power     small    medium    large
 .70       258       42        17
 .80       322       52        21
 .90       421       68        27

DF = 3
                 Effect Size
Power     small    medium    large
 .70       221       36        15
 .80       274       45        18
 .90       354       58        23

DF = 4
                 Effect Size
Power     small    medium    large
 .70       195       32        13
 .80       240       39        16
 .90       309       50        20

DF = 5
                 Effect Size
Power     small    medium    large
 .70       175       29        12
 .80       215       35        14
 .90       275       45        18

DF = 10
                 Effect Size
Power     small    medium    large
 .70       123       20         8
 .80       148       24        10
 .90       187       31        13

DF = 15
                 Effect Size
Power     small    medium    large
 .70        98       16         7
 .80       118       20         8
 .90       148       24        10

Footnote 2: DF are generally equal to the number of means being compared minus one. DF = 2 implies that three means are being compared.
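The t-test table can be roughly cross-checked with a normal approximation to power. The sketch below rests on assumptions that this document does not state: that "small", "medium", and "large" mean standardized mean differences of .2, .5, and .8 (the widely used conventions), that the test is one-tailed at α = .05, and that n is large enough for the normal curve to stand in for the t distribution. The names `T_TEST_N` and `approx_power` are ours.

```python
import math
from statistics import NormalDist

# Per-group n from the t-test table in this document, keyed by an ASSUMED
# standardized effect size d (.2 small, .5 medium, .8 large), then by power.
T_TEST_N = {
    0.2: {0.70: 233, 0.80: 313, 0.90: 425},
    0.5: {0.70: 38, 0.80: 50, 0.90: 68},
    0.8: {0.70: 16, 0.80: 20, 0.90: 28},
}


def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power of a one-tailed, two-group comparison
    when the true standardized mean difference is d."""
    z = NormalDist()  # standard normal distribution
    critical = z.inv_cdf(1.0 - alpha)  # one-tailed cutoff (assumption)
    return z.cdf(d * math.sqrt(n_per_group / 2.0) - critical)


if __name__ == "__main__":
    for d, row in T_TEST_N.items():
        for nominal, n in row.items():
            print(f"d = {d}, n = {n:3d}: tabled power {nominal:.2f}, "
                  f"approximate power {approx_power(d, n):.2f}")
```

Under these assumptions the approximate power for each tabulated n should land close to the tabled level, though not exactly on it, since the tables were presumably built from exact t distributions. The same function also shows the main point of this section: for a fixed nonzero effect, power rises as n rises.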