Running Head: SIGNIFICANCE TESTING AND CORRELATIONS
Testing the Statistical Significance of a Correlation
R. Michael Furr
Wake Forest University
Address correspondence to:
Mike Furr
Department of Psychology
Wake Forest University
Winston-Salem, NC 2706
[email protected]
336-758-5024
Testing the Statistical Significance of a Correlation
Researchers from Psychology, Education, and other social and behavioral sciences are very
concerned with statistical significance. If a researcher conducts a study and finds that the results are
“statistically significant,” then he or she has greater confidence in the effects revealed by the study.
When results are statistically significant, researchers are more likely to believe that the effects are “real”
and not likely to have occurred by chance. The goal of science is to understand our physical or social
world, and this occurs in part by being able to judge which research findings are real and which are flukes
and red herrings. This paper describes the procedures through which researchers determine whether the
results of a study are statistically significant. It presents the logic, technical steps, and interpretation of a
test of statistical significance, specifically for researchers examining a correlation between two variables.
Many textbooks provide in-depth introductions to statistical significance, but there appear to be
no sources that provide such an introduction in the context of correlations. Most introductory statistics
textbooks in Psychology provide concepts and procedures for significance testing in the context of means,
but the extension of significance testing to correlations is usually very slight. Typically, the coverage of
significance testing for correlations, if it is discussed at all, focuses on computational procedures,
bypassing the conceptual foundations and interpretations. In fact, the organization of some introductory
statistics textbooks implies that correlations and significance testing are completely separate issues. For
example, a chapter on correlation might be in a section labeled “Descriptive Statistics” and the chapters
related to significance testing might be included in a section labeled “Inferential Statistics.”
Although general statistics textbooks omit deep coverage of the conceptual and practical
foundations of significance testing for correlations, one might suspect that such coverage could be found
in sources that focus on correlational procedures specifically (e.g., Archdeacon, 1994; Bobko, 2001; Chen
& Popovich, 2002; Cohen & Cohen, 1983; Edwards, 1984; Ezekiel, 1941; Miles & Shevlin, 2001;
Pedhazur, 1997). Unfortunately, these sources also omit in-depth discussions of basic concepts in
significance testing. The more advanced sources naturally assume that readers already have a solid grasp
of basic concepts in significance testing. Unfortunately, even the more introductory sources provide little
background in basic concepts in statistical significance as related to correlations.
A number of potential problems arise from the fact that no sources provide in-depth discussions
of significance testing as related to correlations. First, some budding researchers might be left with the
mistaken and potentially confusing belief that correlations and significance tests are unrelated issues.
Although the computation and interpretation of a correlation can proceed without reference to a
significance test, correlations are rarely reported without an accompanying significance test. Second,
even if researchers are aware that correlations can be tested for statistical significance, they might have
difficulty connecting fundamental concepts in “significance testing” (e.g., parameters, confidence
intervals, distributions of inferential statistics) to correlations. The existing sources make little effort to
generalize concepts articulated in the context of means or frequencies to correlational analyses. Third, the
existing sources create difficulty for course instructors who cover correlational analyses before other
kinds of analyses. For example, some Psychology Departments divide their “Research Methods and
Statistics” courses into a “correlational” semester and an “experimental” semester. If the correlational
course is taken before the experimental course, then instructors who teach the correlational course face a
dilemma. They can ignore significance testing of correlations, they can provide a cursory coverage of
significance testing of correlations, or they can assign readings that present significance testing in the
context of means or frequencies.
A solid understanding of significance testing as related to correlations may be particularly
important as the field evolves in two ways. First, researchers are increasingly aware of the importance of
effect sizes, such as correlations (American Psychological Association, 2001; Capraro & Capraro, 2003;
Furr, 2004; Heldref Foundation, 1997; Kendall, 1997; Murphy, 1997; Rosenthal, Rosnow, & Rubin,
2000; Thompson, 1994, 1999; Wilkinson & APA Task Force on Statistical Inference, 1999). Second,
many in the field recognize that regression, based on a correlational foundation, is a general approach to
data analysis that can incorporate much that is typically conceptualized as Analysis of Variance. As the
awareness and use of effect sizes and correlational analytic procedures continue to grow, and as advanced
correlational procedures continue to emerge, researchers should have a solid understanding of the
connections between correlations and significance testing.
The current paper is intended to partially fill this hole in the basic statistical literature. It
describes what statistical significance is about, presents fundamental concepts in evaluating statistical
significance, and details the procedures for testing the statistical significance of a correlation.
Samples and Populations: Inferential Statistics
Imagine that Dr. Cartman wants to know whether the Scholastic Aptitude Test (SAT) is a valid
predictor of college freshman performance at the local university. To address this issue, he recruits a
sample of 200 freshmen from the university. The students give their consent for Dr. Cartman to have
access to their academic records, from which he records their SAT scores and their first-year college
Grade Point Average. Based on these data, Dr. Cartman finds that the correlation between SAT scores
and GPA is .40, which is a positive correlation of moderate size. This correlation tells him that, within
the sample, the students with relatively high SAT scores tend to have relatively high GPAs (and that
students with relatively low SAT scores tend to have relatively low GPAs). Based on this finding in his
sample, Dr. Cartman is tempted to conclude that the SAT is indeed a useful predictor of freshman GPA at
the University. But how much confidence should Dr. Cartman have in this conclusion? He might be
hesitant to use the results found in a sample of 200 students to make an inference about whether SAT
scores are correlated with GPAs in the entire freshman student body.
The question of statistical significance arises from the fact that scientists would like to make
conclusions about psychological phenomena, effects, differences, or relationships between variables as
they exist in a large population (or populations) of people (or rats, monkeys, etc., depending on the
scientist’s area of expertise). For example, Dr. Cartman would like to make conclusions about whether
SAT scores are correlated with GPAs in the entire freshman student body. Similarly, a clinical
psychologist might be interested in whether a new drug generally helps to alleviate depression, within the
“population” of all people who might take the drug. Or a social psychologist might hypothesize that romantic
couples in which the partners have similar profiles of personality traits tend to be happier than couples in
which the partners have dissimilar personalities. This researcher would be interested in concluding
whether similarity and romantic happiness are generally correlated with each other within the
“population” of all couples in romantic relationships.
Despite their desire to make conclusions about large populations, researchers generally study only
samples of people recruited from the larger population of interest. In our example, Dr. Cartman would
like to make conclusions about the entire freshman class at the university, but he is able to recruit a
sample of only 200 students from the student body. Similarly, a clinical psychologist cannot study all
people who might ever take a drug, and a social psychologist cannot study all romantic couples.
Researchers such as Dr. Cartman gather data from relatively small samples of people, and they use the
sample data to make inferences about the existence and size of psychological phenomena in the larger
population from which the sample was drawn.
Researchers must be concerned about the accuracy with which they can use data from samples to
make inferences about the population from which the sample was recruited. Dr. Cartman recognizes that
the 200 students who happened to be in his study might not be perfectly representative of the entire
freshman class. It is possible that, in the student body as a whole, there is no association between SAT
scores and GPA. That is, in the population from which the sample of 200 students was drawn, the
correlation between SAT and GPA could be zero. Even if the correlation in the population is exactly zero,
Dr. Cartman could potentially obtain a random sample of students in which the correlation between SAT
and GPA is not zero. His particular sample of 200 students might be unusual in some subtle way. Just by
chance, Dr. Cartman might have recruited a sample in which people who scored relatively high on the
SAT also tend to have relatively high GPAs. Thus researchers must be concerned about sampling error –
the fact that a particular sample might not be perfectly representative of the population from which they
were randomly drawn. The potential presence of sampling error means that researchers can never be
totally confident that results found in a sample are perfectly representative of what is really “going on” in
the population.
The procedures for evaluating statistical significance help researchers determine how well their
sample’s results represent the population from which the sample was drawn. Roughly speaking, when we
find that results are statistically significant, we have confidence in inferring that the effects observed in
the sample represent the effects in the population as a whole. In our example, Dr. Cartman would want to
know whether the correlation he found in the sample’s data is statistically significant. If it is, then he will
feel fairly confident in concluding that SAT scores are correlated with GPA in the student body as a
whole. If his correlation is not found to be statistically significant, then he would not feel confident
concluding that SAT scores are correlated with GPA in the student body as a whole. Because we use this
process to help us make inferences about populations, the statistical terms and procedures involved in this
process are called inferential statistics.
There are standard terminologies describing inferential statistics and the connections between
samples and populations. We use the sample data to calculate a correlation between two variables, such
as SAT and GPA. This correlation is labeled with an r, and it is called a descriptive statistic because it
describes some aspect of the sample that is actually observed in our study. We then might use the
correlation observed in the sample to estimate the correlation in the population from which the sample is
drawn. Because we cannot study the entire population, we can only make an informed guess about the
correlation as it exists in the population. The correlation in the population is labeled with the Greek letter
rho (ρ), and it is called a parameter.
Statistical Hypotheses: Null and Alternative
Consider again Dr. Cartman, who wishes to know if the SAT is correlated with GPA in the entire
freshman class. If Dr. Cartman is conducting a traditional significance test of a correlation, then he will
consider two possibilities.
One possibility, called the null hypothesis, is that the SAT is not correlated with GPA within the
Freshman student body. More technically, the null hypothesis states that the population correlation
parameter (ρ) is zero. The null hypothesis is often labeled as H0, written as:
H0: ρ = 0
This expresses exactly what Dr. Cartman is doing – he will be testing the null hypothesis that the
correlation in the population is equal to zero.
The second possibility, called the alternative hypothesis, is that the SAT is correlated with GPA
within the freshman student body. More technically, the alternative hypothesis states that the population
correlation parameter (ρ) is not zero. The alternative hypothesis is often labeled as H1 or HA, written as:
H1: ρ ≠ 0
The Decision About the Statistical Hypotheses
For traditional significance testing of a correlation, Dr. Cartman faces a decision between two
competing hypotheses. In making this decision, Dr. Cartman has two options. First, he might reject the
null hypothesis, thereby concluding that, in the population, the two variables are correlated with each
other. In other words, the results of the analysis of his sample’s data make him feel confident enough to
conclude that the population correlation is some value other than zero. Second, he might fail to reject the
null hypothesis, thereby concluding that, in the population, the two variables are not correlated with each
other. In other words, the results of the analysis of his sample’s data do not make him feel confident
enough to conclude that the population correlation is a value other than zero.
Note that, strictly speaking, both options are phrased in terms of rejecting the null hypothesis – Dr.
Cartman can either reject the null or he can fail to reject the null. Researchers generally do not phrase the
decisions in terms of “accepting the null” or in terms of the alternative hypothesis. For these reasons,
the traditional procedures are called “null hypothesis significance testing.”
The decision regarding the null hypothesis is tied to the notion of statistical significance. If a
researcher rejects the null hypothesis, then the result is said to be “statistically significant.” If a
researcher fails to reject the null hypothesis, then the result is said to be not statistically significant.
Practically speaking, the default decision is to fail to reject the null hypothesis – to conclude that
the variables are uncorrelated in the population. Researchers reject the null hypothesis only when their
sample data make them confident enough to override the default decision and guess that the null
hypothesis is incorrect. In his sample of 200 freshmen, Dr. Cartman found a correlation of .40 between
SAT and GPA. The question that Dr. Cartman faces is, do his sample findings make him confident
enough to reject the null hypothesis that the correlation in the entire freshman student body is zero?
Two issues arise when determining whether the sample findings make Dr. Cartman confident
enough to reject the null hypothesis. First, how confident is Dr. Cartman that the null hypothesis is false?
In other words, how confident should he be that, in the entire freshman class, SAT truly is correlated with
GPA? The second issue is how confident he needs to be in order to actually reject the null hypothesis.
Psychology and related sciences have reached a consensus regarding the degree of confidence that a
researcher should have before rejecting a null hypothesis. These two issues are considered in turn, as part
of a process called a t-test.
Testing the Null Hypothesis: What Affects Our Confidence?
Two main factors make Dr. Cartman more or less willing to conclude that there is a non-zero
correlation between SAT and GPA in the entire Freshman student body. One factor affecting his
confidence is the size of the correlation in his sample. In his sample’s data, Dr. Cartman found a
correlation of r = .40, which represents a positive correlation of moderate size. But what if he had found
that the correlation in his sample was much weaker, say only r = .12? Dr. Cartman recognizes that a
correlation of only r = .12 is not very different from a correlation of zero. Therefore, he probably would
not be very confident in concluding that the population correlation was anything but zero. In other words,
if the correlation in the population is indeed zero (i.e., if ρ = 0), then it would not be very surprising to
randomly draw a sample in which the observed correlation is small – only slightly different from zero.
But what if Dr. Cartman had found that the correlation in his sample was very strong, say r = .80? A
correlation of r = .80 is very far from zero – it expresses a very strong association between two variables.
Therefore, he probably would be much more confident in concluding that the population correlation was
not zero. In other words, if the correlation in the population is indeed zero (i.e., if ρ = 0), then it would be
very surprising to randomly draw a sample in which the correlation is so far away from zero. In sum, the
size of the correlation in the sample will affect Dr. Cartman’s confidence in concluding that the
population correlation is anything but zero – larger sample correlations will increase his confidence that
the population correlation is not zero.
The second factor affecting his confidence in rejecting the null is the size of the sample itself. In
his study, Dr. Cartman was able to recruit 200 participants. But what if he had been able to recruit a small
sample of only 15 participants? Dr. Cartman probably would not be very confident in making inferences
about the entire freshman student body based on a study of only 15 participants. On the other hand, if he
had been able to recruit a sample of 500 students (a much larger proportion of the population), then Dr.
Cartman would be more comfortable in making inferences about the entire student body. Therefore,
larger samples increase his confidence in making inferences about the population, and smaller samples
decrease his confidence.
We can quantify the amount of confidence that a researcher should have in rejecting the null
hypothesis that the correlation in the population is zero (i.e., H0: ρ = 0). We compute a t value, which is
an inferential statistic that can be conceptualized roughly as an index of “degree of confidence in rejecting
the null hypothesis.” The formula for computing the t value reflects the two factors discussed above – the
size of the correlation and the size of the sample:
$$ t_{\mathrm{OBSERVED}} = \frac{r}{\sqrt{1 - r^2}} \times \sqrt{N - 2} \qquad \text{(Equation 1)} $$
The tOBSERVED term is the t value derived from the data observed in the actual sample of participants in
the study, r is the correlation in the sample, and N is the number of participants in the sample. In Dr.
Cartman’s data:
$$ t_{\mathrm{OBSERVED}} = \frac{.40}{\sqrt{1 - .40^2}} \times \sqrt{200 - 2} = .436 \times 14.071 = 6.135 $$
Large t values reflect more confidence in rejecting the null hypothesis. Consider the t value that would be
obtained for a sample in which the correlation is only .12 and the sample size is only 15:
$$ t_{\mathrm{OBSERVED}} = \frac{.12}{\sqrt{1 - .12^2}} \times \sqrt{15 - 2} = .121 \times 3.606 = .436 $$
This t value is noticeably lower than the t value found in the larger sample with the larger correlation, and
the lower t value reflects the lower confidence that we would have in rejecting the null hypothesis.
In sum, effect size (i.e., the size of the correlation) and sample size are the two key factors
affecting a researcher’s confidence in rejecting the null hypothesis. These two factors are part of what
researchers call the “power” of a significance test (Cohen, 1988). Larger effect sizes (correlations farther
away from zero) and larger sample sizes increase our confidence in rejecting a null hypothesis – reflecting
a “powerful” significance test.
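
To make Equation 1 concrete, here is a minimal sketch of the computation, assuming Python with only the standard library (the helper name t_observed is ours for illustration):

```python
import math

def t_observed(r: float, n: int) -> float:
    """Observed t value for a sample correlation r and sample size n (Equation 1)."""
    return (r / math.sqrt(1 - r ** 2)) * math.sqrt(n - 2)

# Dr. Cartman's data: r = .40, N = 200 -> t of about 6.14
# (the paper reports 6.135 because it rounds the intermediate values)
print(round(t_observed(0.40, 200), 3))

# The weaker example: r = .12, N = 15 -> t of about .436
print(round(t_observed(0.12, 15), 3))
```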
Testing the Null Hypothesis: How Confident Do We Need to Be?
Once we know the factors influencing Dr. Cartman’s confidence in rejecting the null hypothesis,
we can consider the question of how confident he needs to be in order to reject the null. We have seen
that larger correlations and larger samples produce greater confidence, as reflected in larger t values. But
how large does a t value need to be in order for Dr. Cartman to decide to reject the null hypothesis that the
population correlation between SAT and GPA is zero?
Confidence can be framed in terms of the probability that we would be making an error if we
rejected the null hypothesis. Recall that a researcher never really knows if the null hypothesis is true or if
it is false (because researchers typically cannot include entire populations in their studies). Researchers
collect data on a sample that is drawn from the population of interest, and then they use the sample’s data
to make educated guesses about the population. But even the most well-educated guess could be
incorrect. Significance testing is most directly concerned with what is called a “Type I Error.” A Type I
error is made when a researcher rejects the null hypothesis when in fact the null hypothesis is true. That
is, a researcher makes a Type I error when he or she concludes that two variables are correlated with each
other in the population, when in reality the two variables are not correlated with each other in the
population. If Dr. Cartman rejects the null hypothesis in his study, then he is saying that there is a very
low probability that he will be making an incorrect rejection.
The probability of an event occurring (i.e., the probability that a mistake will be made) ranges
from 0 to 1.0, with probabilities near zero meaning that the event is very unlikely to occur. Thus, a
probability of 0 means that there is absolutely no chance that a mistake will be made, and a probability of
1.0 means that a mistake will definitely be made. Values between these two extremes reflect differing
likelihoods of the event. A probability of .50 means that there is a 50% chance that a mistake will be
made, and a probability of .05 means that there is only a 5% chance (a pretty remote chance) that a
mistake will be made.
By convention, psychologists have adopted the probability of .05 as the criterion for determining
how confident a researcher needs to be before rejecting the null hypothesis. Put another way, if Dr.
Cartman finds that his study gives him a level of confidence associated with less than a 5% chance of an
incorrect rejection of the null hypothesis, then he is “allowed” to reject the null hypothesis. Traditionally,
psychologists have assumed that, if researchers are so confident in their results that they have such a small
chance of making a Type I Error, then they are allowed to reject the null hypothesis. Researchers often
use the term “alpha level” when referring to the degree of confidence required to reject the null
hypothesis. By convention, most significance tests in psychology are conducted with an alpha level
of .05.
Statisticians have made connections between the observed t values computed earlier and the p
value (alpha level) associated with incorrectly rejecting the null hypothesis. How can Dr. Cartman
determine if his observed t value allows him to be confident enough to assert that he has less than a 5%
chance of making a Type I Error? To do this, Dr. Cartman must identify the appropriate “critical” t value,
which will tell him how large his observed t value must be in order for him to reject the null hypothesis in
his study. The critical t value that Dr. Cartman will use reflects a .05 probability of incorrectly rejecting
the null hypothesis. It is the t value that is exactly associated with a 5% chance of making a Type I Error.
To identify the appropriate critical t value, Dr. Cartman can refer to a table of critical t values.
Many basic statistics textbooks and research method textbooks include tables of critical t values, such as
that presented in Table 1 (see the end of this paper). Dr. Cartman must consider only two issues when
identifying the critical t value for his study. These two issues are reflected in the columns and rows of the
Table.
Table 1 presents several columns of t values. These columns represent different degrees of
confidence required, in terms of the probability of making a Type I Error. Because psychology and
similar sciences have traditionally adopted a probability level of .05 as the criterion for rejecting a null
hypothesis, Dr. Cartman will typically only be concerned about the values in the column labeled “.05.”
Table 1 also presents several rows, and each row represents a different sized study. The rows are
labeled “df,” which stands for degrees of freedom. “Degrees of freedom” is linked to the number of
participants in the sample. Specifically, df = N – 2. Dr. Cartman determines that the degrees of freedom
for his study are df = 198 (200 – 2 = 198).
Referring to a table of critical t values, Dr. Cartman pinpoints the intersection of the .05 column
and the appropriate row. The entry at this point in the table is 1.972. He will use this critical t value to
help decide whether to reject the null hypothesis.
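
Readers with statistical software can compute this critical value directly rather than consulting a printed table. A small sketch, assuming Python with SciPy installed:

```python
from scipy import stats  # assumes SciPy is installed

alpha = 0.05
df = 200 - 2  # df = N - 2 for a correlation

# Two-tailed test: alpha is split across both tails, so look up the
# quantile that leaves alpha/2 in the upper tail.
t_critical = stats.t.ppf(1 - alpha / 2, df)
print(round(t_critical, 3))  # about 1.972, matching Table 1 for df = 198
```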
Testing the Null Hypothesis: Making the Decision
The decision about the null hypothesis is made by comparing an observed t value to the
appropriate critical t value. If Dr. Cartman finds that the absolute value of his observed t value is larger
than the critical t value, then he will decide to reject the null hypothesis. If Dr. Cartman finds that the
absolute value of his observed t value is not larger than the critical t value, then he fails to reject the null
hypothesis. In shorthand terms,
If |tOBSERVED| > tCRITICAL then reject H0
If |tOBSERVED| < tCRITICAL then fail to reject H0
In his case, Dr. Cartman rejects the null hypothesis, because the absolute value of his observed t value is
larger than the critical t value (|6.135| > 1.972). These “statistically significant” results tell Dr. Cartman
that a correlation of .40 (a moderate effect size) would be highly unlikely in a sample of 200 participants
(a fairly large sample), if the correlation in the population is zero. Therefore, he rejects the null hypothesis
and concludes with confidence that the correlation in the population is probably not zero. That is, he
concludes that in the entire freshman student body at his University, SAT scores are indeed correlated
with GPA.
For a more general perspective, it might be worth considering other patterns of results. As a
second example, imagine that a second researcher, Dr. Marsh, had obtained a correlation of .12 from a
sample of 15 participants. In this case, the observed t value would be tOBSERVED = .436. Looking at Table
1, in the .05 column and the row for df = 13 (df = 15 - 2), he finds that the critical t value is tCRITICAL =
2.160. Here, the absolute value of tOBSERVED is less than tCRITICAL, so Dr. Marsh would fail to reject the
null hypothesis. The effect size is small (i.e., it is not very different from zero), and the sample is small
(only 15 people). With such weak results and such a small sample, Dr. Marsh is not confident enough to
reject the idea that the correlation in the population is zero. Thus, his correlation is not statistically
significant.
As a third example, imagine that Dr. Broflovski had obtained a negative correlation (say, r = -.40)
from a sample of 200 participants. In this case, the observed t value would be tOBSERVED = -6.135 (note
that this is a negative observed t value). Looking at Table 1, in the .05 column and the row for df = 198,
he finds that the critical t value is tCRITICAL = 1.972. Here, the absolute value of tOBSERVED is greater than
tCRITICAL (|-6.135| > 1.972), so Dr. Broflovski would reject the null hypothesis in this case. Dr. Broflovski
has found a moderately-sized correlation (i.e., it is fairly different from zero) in a fairly large sample of
participants. Dr. Broflovski reasons that he would be highly unlikely to find a moderate effect size in a fairly large
sample, if the correlation in the population is zero. Note that the direction of the correlation (positive or
negative) does not make a difference in this example. Dr. Broflovski has conducted what is known as a
two-tailed test or a non-directional test. This means that he is testing the null hypothesis that the
correlation in the population is zero. This hypothesis can be rejected if the correlation in the sample is
positive or if it is negative – either way could convince him that the correlation in the population is not
likely to be zero.
Table 2 presents a summary of the steps in conducting a typical null hypothesis significance test
of a correlation.
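
The five steps of Table 2 can also be expressed compactly in code. A sketch, assuming Python with SciPy, with Step 1 (computing r from the raw data) already done; the helper name significance_test is ours for illustration:

```python
import math
from scipy import stats

def significance_test(r: float, n: int, alpha: float = 0.05) -> bool:
    """Two-tailed null hypothesis significance test of a correlation.

    Returns True if H0: rho = 0 is rejected.
    """
    t_observed = (r / math.sqrt(1 - r ** 2)) * math.sqrt(n - 2)  # Step 2
    t_critical = stats.t.ppf(1 - alpha / 2, n - 2)               # Step 3
    return abs(t_observed) > t_critical                          # Steps 4-5

print(significance_test(0.40, 200))   # True: Dr. Cartman rejects H0
print(significance_test(0.12, 15))    # False: Dr. Marsh fails to reject H0
print(significance_test(-0.40, 200))  # True: Dr. Broflovski rejects H0 (two-tailed)
```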
Interpreting the Decision
A significance test comes down to a decision between two choices. Usually this decision
concerns whether or not the correlation is zero in the population from which the sample has been drawn.
Inferential statistics help us determine the likelihood that a given sample’s results might have occurred
either: a) because the sample is drawn from a population in which the correlation is not zero, or b)
purely by chance, with the sample being drawn from a population in which the correlation is zero.
We reject the null hypothesis when the probability level associated with our results suggests that
our results are unlikely to have occurred if the null hypothesis were true. Again, the primary example of
Dr. Cartman shows this situation – he obtained a moderate correlation in a large sample. His significance
test tells him that this result is unlikely to have occurred in this sample, if indeed the correlation in the
population is zero. He therefore concludes that the null hypothesis is false (i.e., he concludes that the
population correlation is not zero), and decides to reject it.
We fail to reject the null hypothesis when the probability level associated with our results
suggests that our results are not unlikely to have occurred if the null hypothesis were true. In the second
example, Dr. Marsh found a weak correlation in a small sample. The significance test indicates that the
results might very well occur even if the correlation in the population is zero. Dr. Marsh therefore
concludes that the null hypothesis might not be false (i.e., the population correlation might well be zero)
and so he decides not to reject it.
You are likely to hear a variety of different interpretations of a correlation that is statistically
significant. For example, Dr. Cartman’s results (r = .40, p < .05) from his sample of N = 200 might lead
him to make statements such as:
• The correlation is “significantly different from zero.”
• It’s unlikely that the sample came from a population in which the correlation is zero.
• In the population from which the sample was drawn, the two variables are probably associated with each other.
• The observed data are unlikely to have occurred by random chance.
• There is less than a .05 probability (i.e., a very small chance) that the results could have been obtained if the null hypothesis is true.
• He is 95% confident that the population correlation is not zero.
• If this study were done 100 times (each with a random sample of N = 200, drawn from a population in which the correlation is zero), we would get a correlation of magnitude .40 or stronger (i.e., |r| ≥ .40) fewer than 5 times.
• Given that the results are unlikely to have occurred if the null were true, the null is probably not true.
You are also likely to hear a variety of different interpretations of a correlation that is not
statistically significant. For example, Dr. Marsh’s results (r = .12, p > .05) from his sample of N = 15
might lead to statements such as:
• The sample’s correlation is “not significantly different from zero.”
• It’s not unlikely that the sample came from a population in which the correlation is zero.
• In the population from which the sample was drawn, the variables are likely to be uncorrelated with each other.
• The observed data might very well have occurred by random chance.
• There is more than a .05 probability (i.e., not a small chance) that the results could have been obtained even if the null hypothesis is true.
• He cannot be 95% confident that the population correlation is not zero.
• If this study were done 100 times (each with a random sample of N = 15, drawn from a population in which the correlation is zero), we would get a correlation of magnitude .12 or stronger (i.e., |r| ≥ .12) more than 5 times.
• Given that the results are not unlikely to have occurred if the null were true, the null might very well be true.
Experts in probability might take issue with some of the above interpretations, depending on their
perspective on probability and logic. Nevertheless, many of the interpretations above or close variations
are often used.
While considering the appropriate interpretations of significance tests, we should also consider at
least two potential confusions. One point of confusion might concern a rejection of the null hypothesis.
Dr. Cartman’s sample correlation was r = .40, which was statistically significant. By rejecting the null
hypothesis that ρ = 0, Dr. Cartman can conclude that the sample is probably not drawn from a population
with a correlation of 0. But he should not conclude that the sample was drawn from a population with a
correlation of ρ = .40. The sample might come from a population with a correlation of ρ = .40, but it also
might come from a population with a correlation of ρ = .35 or ρ = .50 and so on. So, rejecting the null
hypothesis means that the correlation in the population is probably not zero, but it does not indicate what
the correlation in the population is likely to be.
A second potential point of confusion concerns the failure to reject the null hypothesis. In the
second example, Dr. Marsh’s sample correlation was r = .12, which was not statistically significant.
Recall that the failure to reject the null hypothesis tells Dr. Marsh that the sample’s results might very
well have occurred if ρ = 0. So, Dr. Marsh can assume that the population correlation might be zero. In
this case, Dr. Marsh should not conclude that the correlation in the population is zero. The sample’s
results (r = .12) might also have occurred if ρ = .02, ρ = -.07, or ρ = .20. So, just because the population
correlation might be zero, that does not mean that it is zero or that all other possibilities are less likely.
Confidence Intervals
As outlined above, a null hypothesis test is a very specific test. The results of the typical test
allow us to make one inference about the population, specifically that the population correlation is either
unlikely to be zero or it might well be zero. That is, are two variables likely to be associated with each
other in the population or not?
Although it is useful to evaluate the likelihood that the population correlation is zero, we can ask
many other questions about the correlation in the population from which a sample was drawn. For
example, what is our best guess about the actual correlation in the population? If the correlation in Dr.
Cartman’s sample is r = .40, then what is Dr. Cartman’s best guess about the size of the correlation
among the entire freshman student body? All that Dr. Cartman knows is that the sample correlation is .40;
therefore, his most reasonable guess about the student body correlation is that it is ρ = .40. This guess
about the specific value of the population correlation is called a point estimate because he is estimating a
single, specific point at which the population correlation lies.
Although the point estimate of the population correlation is an “educated guess,” Dr. Cartman is
not sure that the population correlation is .40. He recognizes that his particular random sample of
students might be different from the entire freshman student body in some ways, and these differences
might mean that the correlation he finds in his sample is different from the correlation in the entire student
body. Dr. Cartman might say that, although he is not sure that the population correlation is .40, he is
fairly confident that the population correlation lies somewhere between .28 and .51.
A confidence interval (CI) for a correlation is the range in which the population correlation is
likely to lie, and it is estimated with a particular degree of confidence. For example, Dr. Cartman’s range
(.28 ≤ ρ ≤ .51) is a 95% CI. That is, he is 95% confident that the population correlation (ρ) is between .28
and .51.
Although a discussion of the calculation of a CI is beyond the scope of this paper, three important
issues must be considered in interpreting a CI. First, the “width” of the CI reflects the precision of the
estimate. Dr. Cartman’s 95% CI ranges from .28 to .51, which is a span of 23 “points” on the
correlational metric. But consider the second example, in which Dr. Marsh found a correlation of r = .12
in a sample of 15 participants. Dr. Marsh’s 95% CI ranges from -.42 to .60 (i.e., -.42 ≤ ρ ≤ .60), which is a
span of 102 “points” on the correlational metric. Note the difference between the two examples,
illustrated in Figure 1. Dr. Cartman’s estimate of the population correlation is a narrower range than is Dr.
Marsh’s estimate. A narrower range reflects a more precise and informative estimate. For a more
familiar example, consider two weather predictions. One meteorologist predicts that the high temperature
tomorrow will be somewhere between 60 and 70 degrees – a range of 10 degrees. Another meteorologist
predicts that the high temperature tomorrow will be between 30 and 100 degrees – a much wider range of
70 degrees. Obviously, the first meteorologist’s narrower range is a much more precise and useful
prediction. Narrow CIs are more precise and informative than are wide CIs.
A second important point regarding CIs is the effect of sample size on a CI. In computing a CI
based on a sample’s data, the size of the sample is directly related to the precision (i.e., width) of the CI.
Large samples allow researchers to make relatively precise CI estimates. Consider again the fact that Dr.
Cartman’s CI is much more precise than Dr. Marsh’s CI. The difference in precision is primarily due to
the difference in the sample sizes from the two studies. Dr. Cartman’s CI is based on 200 participants,
but Dr. Marsh’s CI is based on only 15 participants. The link between sample size and the width of a CI
is conceptually related to the link between sample size and our confidence in rejecting a null hypothesis,
discussed earlier. A relatively large sample includes a relatively large proportion of the population.
Therefore, an estimate about the population is more precise when based on large samples than when
based on smaller samples.
A third point regarding CIs is the link between a CI and the typical null hypothesis test. Recall
that the typical hypothesis test of a correlation is the test of the null hypothesis that the correlation in the
population is zero (H0: ρ = 0). We reject the null hypothesis when we are confident that there is less than
a 5% chance that the correlation in the population is indeed zero. A CI might be seen as the flip side of
the significance test. Dr. Cartman’s CI tells us to be 95% confident that the population correlation is
within the range of .28 to .51. Put another way, Dr. Cartman’s CI tells us that there is only a 5% chance
that the population correlation is outside of the range of .28 to .51. Note that Dr. Cartman’s CI does not
include zero. The interval includes only positive values (i.e., it is entirely above zero), as illustrated in
Figure 1. Therefore, the CI tells us to be 95% confident that the correlation in the population is not zero.
In other words, it tells us that there is less than a 5% chance that the population correlation is zero. This
parallels the outcome of Dr. Cartman’s null hypothesis test, in which he rejected the hypothesis that the
correlation in the population is zero. In contrast, consider Dr. Marsh’s CI, also illustrated in Figure 1. Dr.
Marsh’s CI does include zero – the CI ranges from a negative value at one end to a positive value at the
other end. The fact that zero is within Dr. Marsh’s 95% CI indicates that the correlation in the population
from which his sample was drawn might very well be zero. Although Dr. Marsh is 95% confident that
the population correlation is not -.50, -.85, .62 and so on (because these values are outside of his CI), he
cannot be confident that the population correlation is not zero. In sum, the traditional null hypothesis
significance test is directly related to CIs, and this relationship hinges on whether a CI includes zero.
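
Although the calculation of a CI is beyond the scope of this paper, one common approach uses the Fisher r-to-z transformation. A minimal sketch in Python (our illustration, not a procedure given in the paper) that reproduces the two intervals quoted above:

```python
import math

def correlation_ci_95(r: float, n: int) -> tuple[float, float]:
    """Approximate 95% CI for a correlation, via the Fisher z transformation."""
    z = math.atanh(r)              # map r onto an approximately normal scale
    se = 1 / math.sqrt(n - 3)      # standard error of z
    lo = z - 1.96 * se             # 1.96 is the two-tailed .05 normal critical value
    hi = z + 1.96 * se
    return math.tanh(lo), math.tanh(hi)  # map the bounds back to the r scale

print(correlation_ci_95(0.40, 200))  # about (.28, .51): Dr. Cartman's interval
print(correlation_ci_95(0.12, 15))   # about (-.42, .60): Dr. Marsh's interval
```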
Advanced Issues
The concepts, procedures, and examples in this paper reflect the most typical kind of significance
test of a correlation, in which a researcher tests the null hypothesis that the population correlation is zero,
at an alpha level of .05. Although this is the most typical kind of significance test, options exist for
conducting other kinds of statistical tests. Details of such options and advanced issues are beyond the
scope of this paper, but an overview might be useful.
Using Statistical Software
Statistical packages such as SPSS usually provide exact probability values associated with each
significance test. Figure 2 presents SPSS output for Example 2 (Dr. Marsh’s results), with labels
provided to aid interpretation. Note that SPSS labels the p value as “Sig. (2-tailed)”. As shown in Figure
2, the exact p value is .67.
The p values reported by the statistical software are used to make decisions about the null
hypothesis. If the p value is larger than .05 (as in Figure 2), then we would fail to reject the null
hypothesis. If the p value is smaller than .05, then we would reject the null hypothesis. Therefore, if you
use statistical software for correlational analysis, then you will not need to refer to a table of t values.
Instead, you simply examine the exact p value and gauge whether it is greater than or less than .05.
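
The same workflow can be reproduced outside SPSS. A brief sketch, assuming Python with NumPy and SciPy; scipy.stats.pearsonr returns the correlation together with its two-tailed p value, analogous to SPSS’s “Sig. (2-tailed)” column (the data below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical data standing in for 15 students' SAT scores and GPAs.
rng = np.random.default_rng(0)
sat = rng.normal(1000, 150, size=15)
gpa = 2.8 + 0.001 * (sat - 1000) + rng.normal(0, 0.4, size=15)

r, p = stats.pearsonr(sat, gpa)  # two-tailed p value by default
print(f"r = {r:.3f}, p = {p:.3f}")
if p < 0.05:
    print("Reject H0: the correlation is statistically significant.")
else:
    print("Fail to reject H0: the correlation is not statistically significant.")
```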
Additional Significance Tests for a Correlation
By far, the most typical significance test of a correlation is a test of the null hypothesis that the
population correlation is zero (H0: ρ = 0). This is most commonly reported in the psychological literature,
and it is the “default” test, as reflected in the p values reported by statistical software packages such as
SPSS or SAS. Despite this, we could test other null hypotheses involving correlations.
We could test a null hypothesis that the population correlation is a specific value other than zero.
For example, previous research might indicate that the correlation between Conscientiousness and Work
Performance is .30 in the population, but we might hypothesize that some professions have an
even stronger correlation between Conscientiousness and Work Performance. We could recruit a sample
of accountants, measure their Conscientiousness and their Work Performance, and test
the null hypothesis that the correlation in the population of accountants (i.e., the population from which our
sample is drawn) is .30 (i.e., H0: ρ = .30). In this case, we believe that the correlation among accountants
is not .30, which is reflected in the alternative hypothesis (H1: ρ ≠ .30). The significance test for this
example would be conducted somewhat differently than the much more typical test outlined earlier, and
many statistics textbooks describe the procedures.
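
One standard procedure for such a test, left to other textbooks by the paper, again uses the Fisher z transformation. A rough sketch in Python; the sample values here are hypothetical:

```python
import math

def z_test_rho(r: float, n: int, rho0: float) -> float:
    """z statistic for testing H0: rho = rho0, via the Fisher z transformation."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

# Hypothetical result: r = .45 in a sample of 100 accountants, testing H0: rho = .30.
z = z_test_rho(0.45, 100, 0.30)
print(round(z, 2))  # about 1.73; compare |z| to 1.96 for a two-tailed test at .05
```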
Other p Values Besides .05
As described above, researchers have traditionally allowed themselves to reject null hypotheses
when their analyses suggest that there is less than a 5% chance of making a Type I Error (i.e., incorrectly
rejecting a null hypothesis). Although the p value (alpha level) of .05 is the conventional point at which
researchers reject a null hypothesis, researchers could consider using different p values.
Researchers sometimes use an even stricter criterion, such as an alpha level of .01. Researchers
who decide to use a p value of .01 would reject the null hypothesis only when their analyses suggest that
there is less than a 1% chance of making a Type I Error. Using a different p value changes only Step 3 in
the process of statistical significance tests, as illustrated in Table 2. In Step 3, the researcher would select
a critical t value associated with a .01 alpha level. To identify the appropriate critical value, the
researcher would refer to a table such as Table 1, and examine the column labeled “.01” instead of the
column labeled “.05.” The researcher would then proceed to Step 4 and Step 5, comparing their observed
t value to the critical t value associated with the .01 alpha level. As shown in Table 1, the critical t value
for a study conducted with an alpha of .01 is larger than the critical t value for a study conducted with an
alpha of .05. In terms of the significance test, this difference means that researchers must be even more
confident that the null hypothesis is incorrect. That is, a larger observed t value is required in order to
reject the null hypothesis when using an alpha of .01.
Two-tailed vs One-tailed Tests
The examples described in this paper are based on “two-tailed” significance tests. The tests are
designed to evaluate the null hypothesis that the population correlation is zero (H0: ρ = 0), in comparison
to the alternative hypothesis that the population correlation is not zero (H1: ρ ≠ 0). For such two-tailed
tests, the null hypothesis could be rejected if the sample correlation is positive or negative – if the
correlation is on either of the two sides of zero. These hypotheses are non-directional – they do not
reflect any kind of expectation that the population correlation is positive, for example.
But researchers might have strong reasons to suspect that the correlation is in a specific direction.
For example, Dr. Cartman might suspect that the correlation between SAT and GPA is positive. In such
cases, researchers could consider using a “one-tailed” significance test. For one-tailed tests, the
hypotheses are framed differently. If Dr. Cartman hypothesized that the population correlation is positive,
then he might conduct a one-tailed test in which he tests the null hypothesis that the population
correlation is less than or equal to zero (H0: ρ ≤ 0), in comparison to the alternative hypothesis that the
population correlation is greater than zero (H1: ρ > 0). These are known as directional hypotheses.
Conducting a one-tailed test changes Step 3 in the process of statistical significance tests, as
illustrated in Table 2. In Step 3, the researcher would select a critical t value associated with a one-tailed
test (at the alpha level that he or she has chosen, usually .05). To identify the appropriate critical value, the
researcher would refer to a table of critical t values. For the sake of simplifying the earlier discussion of
critical values, Table 1 does not include information for one-tailed tests; however, many textbooks
include tables with columns that guide researchers to the appropriate critical t values for one-tailed tests.
The researcher would then proceed to Step 4 and Step 5, comparing their observed t value to the critical t
value associated with the one-tailed test.
Although researchers might use one-tailed tests, two-tailed tests are probably more common.
One-tailed tests are often perceived as more liberal than two-tailed tests (e.g., Gravetter & Wallnau, 2004),
allowing researchers to reject the null hypothesis more easily (although, in fact, the two approaches have
equal probabilities of producing a Type I error). This perception arises from the fact that the critical t
values used in one-tailed tests are smaller than are the critical t values used in two-tailed tests.
Consequently, researchers must meet a lower degree of confidence before rejecting the null hypothesis in
a one-tailed test. Researchers tend to shy away from procedures that make it easier to reject a null
hypothesis, preferring to take a more conservative approach. Put another way, researchers are reluctant to
adopt procedures that might increase the probability of making a Type I error, and the use of one-tailed
tests is often perceived as potentially increasing such errors. Therefore, despite the logic of one-tailed
tests and directional hypotheses, two-tailed tests and non-directional hypotheses are used more frequently.
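
The difference between the two critical values can be seen directly. A small sketch, again assuming Python with SciPy:

```python
from scipy import stats  # assumes SciPy is installed

df, alpha = 198, 0.05
two_tailed = stats.t.ppf(1 - alpha / 2, df)  # alpha split across both tails
one_tailed = stats.t.ppf(1 - alpha, df)      # all of alpha in one tail

print(round(two_tailed, 3))  # about 1.972
print(round(one_tailed, 3))  # about 1.653: a lower bar, hence the
                             # perception that one-tailed tests are liberal
```

Note that the one-tailed .05 critical value equals the two-tailed .10 value in Table 1.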
An Alternative Conceptualization of an Inferential Statistic
The conceptual approach to significance testing that is adopted in the current paper emphasizes
the importance of effect size and sample size in determining statistical significance (see Equation 1,
above). Textbooks usually present an alternative approach to significance testing. The alternative
approach is very similar to the one outlined in the current paper, in that it proceeds through the same
Steps listed in Table 2 and produces the same result. However, the alternative approach uses a slightly
different conceptual framework. Again, a full description of the alternative approach is beyond the scope
of this paper, but a general familiarity could be useful.
The difference between the two approaches lies in the conceptualization of Step 2 (computing the
observed t value). The alternative approach includes two components. First, we have greater confidence
that the null hypothesis is incorrect when our sample’s statistic (i.e., our observed correlation) is far away
from what is predicted by the null hypothesis. Second, we have greater confidence in making inferences
about the population from which the sample was drawn when our sample’s statistic is a precise estimate
of the population parameter. As outlined in many textbooks, the alternative approach conceptualizes an
inferential statistic in the following way:
$$ t_{\mathrm{OBSERVED}} = \frac{\text{observed value of the correlation (in the sample)} - \text{expected value of the correlation under the null hypothesis}}{\text{standard error of the correlation}} \qquad \text{(Equation 2)} $$

or

$$ t_{\mathrm{OBSERVED}} = \frac{r - \rho}{s_r} $$
In this approach, the observed t value again reflects our confidence that the null hypothesis is
incorrect – larger observed t values make us more likely to reject the null hypothesis. We will assume
that we are conducting a test of the typical null hypothesis (H0: ρ = 0) – that the correlation is zero in the
population from which the sample was drawn. As in the approach described earlier, we are more likely to
reject the null hypothesis when the observed correlation is far away from zero than when the observed
correlation is close to zero. This is reflected in the numerator of the equation above – the difference
between the observed correlation and the correlation that is proposed by the null hypothesis.
Equation 2 differs from Equation 1 in the concept of the “standard error” of the correlation.
Although there are very technical and highly abstract ways of defining the standard error of the
correlation, it can generally be interpreted as indicating how imprecise the sample correlation is as an
estimate of the correlation in the population from which it was drawn (Gravetter & Wallnau, 2004). A
large standard error indicates that the correlation found in the sample is a poor estimate of the correlation
in the population. A small standard error indicates that the correlation found in the sample is a good
estimate of the correlation in the population. As discussed earlier, sample size is a key factor affecting
one’s confidence in using the sample results to make inferences about the population. Therefore, it is not
surprising to find that the sample size is a component of the standard error:
$$ s_r = \sqrt{\frac{1 - r^2}{N - 2}} $$
With this equation representing the standard error of the correlation, and with the understanding
that the null hypothesis specifies a correlation of zero (H0: ρ = 0), then the equation for the observed t
value is:
$$ t_{\mathrm{OBSERVED}} = \frac{r - \rho}{s_r} = \frac{r}{\sqrt{\dfrac{1 - r^2}{N - 2}}} \qquad \text{(Equation 3)} $$
Equation 3 is often found in textbooks that explain the computations for the significance test of a
correlation coefficient. Notice that this equation includes exactly the same components as does Equation
1. The two approaches are mathematically identical – leading to the same observed t value. In addition,
they are conceptually similar in that they define the inferential statistic (the observed t value) as a
product of the size of the effect (how large is the correlation? how far away from zero is it?) and the
degree to which the sample’s data approximate the population’s properties (how large is the
sample?).
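
A quick numerical check, sketched in Python with the standard library, confirms that Equations 1 and 3 yield the same observed t value:

```python
import math

r, n = 0.40, 200  # Dr. Cartman's sample correlation and sample size

# Equation 1: effect size multiplied by (the square root of) the size of the study.
t_eq1 = (r / math.sqrt(1 - r ** 2)) * math.sqrt(n - 2)

# Equation 3: distance of r from the null value (zero), divided by the standard error.
s_r = math.sqrt((1 - r ** 2) / (n - 2))
t_eq3 = (r - 0) / s_r

print(round(t_eq1, 4), round(t_eq3, 4))  # both print the same value, about 6.14
```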
A Broader Perspective on Significance Testing
The significance test of the correlation is an example of significance testing more generally. As
Rosenthal and Rosnow (1991) point out, most significance tests can be conceptualized as:
$$ \text{Inferential Statistic} = \text{Effect Size} \times \text{Size of Study} $$
The t value is one kind of inferential statistic, and different inferential statistics are used for significance
tests of different descriptive statistics. For example, the test of the difference among three or more group
means uses an F value (i.e., ANOVA), and the test of differences in frequencies uses a Chi Square value.
Roughly speaking though, inferential statistics indicate the confidence that a researcher should have in
rejecting the null hypothesis. Larger values of the inferential statistic reflect stronger confidence.
Similarly, the correlation coefficient is but one kind of effect size. Other statistics that represent
various effect sizes include the degree of difference between two groups (e.g., Cohen’s d) or the
proportion of variance accounted for (e.g., R Squared or Eta Squared). Effect sizes represent the strength
of the findings in the sample – how strong is the correlation between variables or how different are two
groups? The stronger the findings in the sample, the more confident we are in rejecting the null
hypothesis, where the null hypothesis states that there is no correlation between variables in the
population or there is no difference between two populations of people.
Finally, sample size is but one facet of the size of study. Although the number of people in the
sample is generally the most important component of the “size of the study,” some inferential statistics
also consider the number of variables involved in the analysis. A procedure called multiple regression is
used to examine the degree to which two or more predictor variables (e.g., SAT, IQ, and Academic
Motivation) are related to an outcome variable (e.g., Freshman GPA). The inferential statistics associated
with multiple regression take into account the number of predictor variables being examined.
In sum, the equation above expresses an important point. The degree to which any significance
test will lead to rejection of a null hypothesis is a function of some kind of effect size and of the size of
the study that was conducted. A large effect size and a large study give the study greater power. Power
essentially reflects the likelihood of correctly rejecting a false null hypothesis.
Conclusion
Individuals who are being introduced to the concept of correlational analysis and to significance
testing might have difficulty making clear connections between the two. In textbooks and other sources
that might be used for teaching basic statistics, the conceptual foundation of significance testing of
correlations has traditionally been neglected. Hopefully, the current paper begins to remedy this neglect
and fosters a deeper understanding of this fundamental statistical procedure.
References
Aberson, C. (2002). Interpreting null results: Improving presentation and conclusions with confidence intervals. Journal of Articles in Support of the Null Hypothesis, 1, 36-42.
American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Archdeacon, T. J. (1994). Correlation and regression analysis: A historian's guide. Madison, WI: University of Wisconsin Press.
Bobko, P. (2001). Correlation and regression (2nd ed.). Thousand Oaks, CA: Sage Publications.
Capraro, M. M., & Capraro, R. M. (2003). Exploring the APA fifth edition publication manual's impact on the analytic preferences of journal editorial board members. Educational and Psychological Measurement, 63, 554-565.
Chen, P. Y., & Popovich, P. M. (2002). Correlation: Parametric and nonparametric measures. Thousand Oaks, CA: Sage Publications.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New Jersey: Lawrence Erlbaum.
Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Edwards, A. L. (1984). An introduction to linear regression and correlation (2nd ed.). New York: W. H. Freeman.
Ezekiel, M. (1941). Methods of correlation analysis. New York: Wiley.
Furr, R. M. (2004). Interpreting effect sizes in contrast analysis. Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences, 3, 1-25.
Gravetter, F. J., & Wallnau, L. B. (2004). Statistics for the behavioral sciences (6th ed.). Belmont, CA: Wadsworth.
Heldref Foundation. (1997). Guidelines for contributors. Journal of Experimental Education, 65, 95-96.
Kendall, P. C. (1997). Editorial. Journal of Consulting and Clinical Psychology, 65, 3-5.
Miles, J. N. V., & Shevlin, M. E. (2001). Applying regression and correlation: A guide for students and researchers. London: Sage Publications.
Murphy, K. R. (1997). Editorial. Journal of Applied Psychology, 82, 3-5.
Pedhazur, E. J. (1997). Multiple regression in behavioral research (3rd ed.). New York: Harcourt Brace College Publishers.
Rosenthal, R., & Rosnow, R. L. (1991). Essentials of behavioral research: Methods and data analysis (2nd ed.). New York: McGraw-Hill.
Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in behavioral research: A correlational approach. New York: Cambridge University Press.
Thompson, B. (1994). Guidelines for authors. Educational and Psychological Measurement, 54, 837-847.
Thompson, B. (1999). Improving research clarity and usefulness with effect size indices as supplements to statistical significance tests. Exceptional Children, 65, 329-338.
Wilkinson, L., & APA Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.
Table 1
Critical t values (two-tailed tests)

               Alpha Level
DF      .10       .05       .01       .001
1       6.314     12.706    63.657    636.619
2       2.920     4.303     9.925     31.599
3       2.353     3.182     5.841     12.924
4       2.132     2.776     4.604     8.610
5       2.015     2.571     4.032     6.869
6       1.943     2.447     3.707     5.959
7       1.895     2.365     3.499     5.408
8       1.860     2.306     3.355     5.041
9       1.833     2.262     3.250     4.781
10      1.812     2.228     3.169     4.587
11      1.796     2.201     3.106     4.437
12      1.782     2.179     3.055     4.318
13      1.771     2.160     3.012     4.221
14      1.761     2.145     2.977     4.140
15      1.753     2.131     2.947     4.073
16      1.746     2.120     2.921     4.015
17      1.740     2.110     2.898     3.965
18      1.734     2.101     2.878     3.922
19      1.729     2.093     2.861     3.883
20      1.725     2.086     2.845     3.850
21      1.721     2.080     2.831     3.819
22      1.717     2.074     2.819     3.792
23      1.714     2.069     2.807     3.768
24      1.711     2.064     2.797     3.745
25      1.708     2.060     2.787     3.725
26      1.706     2.056     2.779     3.707
27      1.703     2.052     2.771     3.690
28      1.701     2.048     2.763     3.674
29      1.699     2.045     2.756     3.659
30      1.697     2.042     2.750     3.646
40      1.684     2.021     2.704     3.551
60      1.671     2.000     2.660     3.460
70      1.667     1.994     2.648     3.435
80      1.664     1.990     2.639     3.416
90      1.662     1.987     2.632     3.402
100     1.660     1.984     2.626     3.390
120     1.658     1.980     2.617     3.373
140     1.656     1.977     2.611     3.361
160     1.654     1.975     2.607     3.352
180     1.653     1.973     2.603     3.345
198     1.653     1.972     2.601     3.340
∞       1.645     1.960     2.576     3.291
Table 2
Steps in conducting a typical significance test of a correlation

Step 1. Compute the observed statistic (i.e., compute the correlation), based on the sample's data.
Step 2. Compute the observed t value, based on the sample correlation and the sample size.
Step 3. Obtain the critical t value by referring to a table of the t distribution, based on a two-tailed significance level of .05 and df = N - 2.
Step 4. Compare the observed t value to the critical t value.
Step 5. Make a decision about the null hypothesis.
Figure 1
Illustrating Confidence Intervals

[Figure: a number line from -1.0 to 1.0. Dr. Cartman's 95% CI spans .28 to .51 and lies entirely above zero; Dr. Marsh's 95% CI spans -.42 to .60 and straddles zero.]
Figure 2
SPSS output of correlational analysis

Correlations

                             SAT       GPA
SAT   Pearson Correlation    1         .120
      Sig. (2-tailed)                  .670
      N                      15        15
GPA   Pearson Correlation    .120      1
      Sig. (2-tailed)        .670
      N                      15        15

Labels: the sample's correlation (r = .12), the p value (p = .67), and the sample size (N = 15).