Statistics for Education Research
Lecture 2
Normal Distributions & Sampling Distribution of Means
Instructor: Dr. Tung-hsien He
[email protected]
Normal Distributions
1. Theoretically, a variable measured an infinite
number of times would tend to display a normal
distribution.
2. Normal Distributions are described by a
mathematical equation (4.1, p. 86)
3. Not every distribution of a variable will match the
normal distribution.
4. Distributions of variables, however, will
approximate the normal distributions.

5. Normal distributions have various shapes and
curves (p. 88; note that all of these distributions are
normal distributions).
6. The shape of a normal distribution is determined
by its mean and standard deviation.
Features of Normal Distributions
1. Unimodal: Only with one mode
2. Symmetrical
3. Bell-Shaped
4. Maximum height at the mean
5. Values on the X axis are continuous
6. Asymptotic (漸進) to the X axis: the curves never
touch the X axis

7. Shapes are determined by the mean and SD (see p. 88,
Figures B/C: Which distributions have larger values
of SD? Why are they all normal distributions?).
8. The number of normal distributions is infinite
(since every combination of mean and SD defines
a different normal distribution).
9. After a variable has been tested an infinite
number of times, the scores of all the tests will
approximate the normal distribution.
Standard Normal Distribution:
1. The distribution of normally distributed standard
scores (z scores)
2. Formula: z = (X − mean)/SD (see 4.2, p. 89)
3. Mean = 0
4. SD = 1
5. The shape (proportions) of the distribution: see
Figure 4.3, p. 90
6. Determining proportions: see Table C.1, p. 618

7. Determining percentiles by using the standard normal
distribution: see Table C.2, p. 621
E.g.: the 70th percentile with mean = 85, SD = 20 (think
about how a z score is computed):
Step 1: Check Table C.2 on p. 621
Step 2: Find B (the larger area) = .70
Step 3: Find the z that corresponds to B = .70 -> z = .5244
Step 4: Use the z score formula: (X − 85)/20 = .5244 -> X =
85 + 20(.5244) = 95.488
8. Determining percentile ranks:
E.g.: PR of 102 with mean = 85, SD = 20 (i.e., what is the
PR of the raw score 102?)
Step 1: Use the z score formula: z = (102 − 85)/20 -> z = 0.85
Step 2: Check Table C.1 on p. 618
Step 3: Find z = .85; area between mean and z = .3023
Step 4: 0.5 + 0.3023 = 0.8023 -> PR of 102 = 80.23
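The two table lookups above can be cross-checked with a short script. This is only a sketch: it uses `NormalDist` from Python's standard library in place of Tables C.1 and C.2, and the variable names are illustrative, not from the textbook.

```python
from statistics import NormalDist

# Scores assumed normal with mean = 85 and SD = 20, as in the examples above.
scores = NormalDist(mu=85, sigma=20)

# Item 7: raw score at the 70th percentile (replaces the Table C.2 lookup).
score_at_70th = scores.inv_cdf(0.70)

# Item 8: percentile rank of the raw score 102 (replaces the Table C.1 lookup).
z_of_102 = (102 - 85) / 20
pr_of_102 = scores.cdf(102) * 100

print(f"70th percentile = {score_at_70th:.3f}")
print(f"z of 102 = {z_of_102:.2f}, PR of 102 = {pr_of_102:.2f}")
```

Both results should agree with the hand computations (95.488 and 80.23) up to rounding in the printed tables.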
Normal Distribution: the fundamental assumption
for inferential statistics
a. A variable will be normally distributed if it is tested
an infinite number of times.
b. The selected sample(s) must represent the
characteristics of the population from which the
sample(s) is/are selected (i.e., the sample(s) must
match the population).
c. Random selection (accompanied by random
assignment) is the only way to guarantee the
representativeness of the sample(s).

Chain of Reasoning in Inferential Statistics
a. Parameters: Populations
b. Statistics: Samples
c. Test hypotheses about parameters: two types of
hypotheses will be tested, the null
hypothesis and the alternative hypothesis.
d. Estimate parameters based on statistics
e. Inferential statistics is all about “sampling” and
“hypothesis testing”

Two Types of Samples: Probability Samples vs.
Nonprobability Samples
Probability Samples (random, not haphazard, samples)
1. Simple Random Sampling: Every member of the
target population has an equal chance of being
selected
a. Sampling with replacement: Selected subjects are
returned to the pool for the next selection
b. Sampling without replacement: Selected subjects
are excluded from the next selection
c. The two methods yield different selection probabilities.
d. Using SPSS to randomly select cases (i.e., setting the
random seed)
2. Systematic Sampling: Choosing every kth member
of a list that contains all members of a population.
3. Cluster Sampling: Clusters (naturally formed
groups) are randomly selected from the population
of clusters.
4. Stratified Random Sampling:
a. To select samples from a heterogeneous
population that contains several subpopulations
(strata);
b. Strata have to be defined well;
c. Random samples of members of each stratum are
selected.
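The four probability sampling methods listed above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure mentioned in the lecture: the roster, the strata and cluster names, the seed, and the helper functions are all hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the selection can be repeated (arbitrary value)
population = list(range(1, 101))  # hypothetical roster of 100 member IDs

# Simple random sampling
with_replacement = rng.choices(population, k=10)    # the same ID may appear twice
without_replacement = rng.sample(population, k=10)  # every ID appears at most once

def systematic_sample(members, k, rng):
    """Pick a random start within the first k members, then take every kth member."""
    start = rng.randrange(k)
    return members[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters (naturally formed groups)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical groupings of the same roster
clusters = {"class_A": population[:25], "class_B": population[25:50],
            "class_C": population[50:75], "class_D": population[75:]}
strata = {"freshmen": population[:60], "seniors": population[60:]}

every_10th = systematic_sample(population, 10, rng)
two_classes = cluster_sample(clusters, 2, rng)
five_per_stratum = stratified_sample(strata, 5, rng)
print(every_10th, list(two_classes), five_per_stratum, sep="\n")
```

Note that `rng.choices` samples with replacement and `rng.sample` without, which is the distinction drawn in items a and b of simple random sampling.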
Nonprobability Samples (no random selection is
involved):
1. Purposive Samples:
a. Starting off with a large group of potential
subjects;
b. Using screening criteria to select those subjects
who meet these criteria.
2. Convenience Samples (便利樣本): Selecting
whoever is conveniently available as subjects.
3. Quota Samples:
a. Deciding on X percent of a certain kind of subjects
and Y percent of another kind of subjects;
b. Going out to select whoever fits these kinds of
subjects until each quota is filled.
4. Snowball Samples:
a. Starting off with a convenience sample;
b. Recruiting others related to this sample,
such as family members, friends, ...
5. Returned Questionnaires in Surveys (Why?)
6. Volunteers
Sampling Distribution of the Mean: Key Concept &
Assumption for Inferential Statistics
a. Meaning: the distribution of the means of all
possible samples of a given size that could be selected
and tested an infinite number of times
(possible only in theory)
b. Interpretation: Theoretically, samples of a given size
can be drawn from a population an infinite number of
times, and each time a sample is drawn,
there will be a corresponding mean for that
particular sample. Because samples can be
drawn infinitely (only theoretically possible),
there will be an infinite number of sample means.
The sampling distribution of the mean represents the
distribution of these means.
c. Properties:
1. Shape:
(a) As sample size increases, the sampling
distribution of the mean for simple random
samples of n cases will approximate a normal
distribution.
(b) If the sample size is no smaller than 30 (n ≥ 30), the
sampling distribution of the mean will
approximate a normal distribution.
2. It is treated as a normal distribution.
3. Variance & Standard Deviation of the Sampling
Distribution of the Mean:
(a) Variance: σ²/n (6.8, p. 150)
(b) Standard Deviation (Standard Error): σ/√n (6.9, p. 150)
4. The mean of the sampling distribution of the mean equals
the population mean (μ).
5. As sample size (n) increases, the variability of the
sampling distribution of the mean decreases
(Why?)
6. See Figures 6.7 and 6.8 on pp. 152-153 for the standard
sampling distribution of the mean (z score)
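Since an infinite number of samples is possible only in theory, a simulation with a finite number of draws can illustrate the properties above. The sketch below assumes a hypothetical, deliberately skewed population and 2,000 draws per sample size; these numbers are illustrative choices, not values from the textbook.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary value)

# Hypothetical, deliberately non-normal population (exponential, mean near 50).
population = [rng.expovariate(1 / 50) for _ in range(100_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def simulate_sample_means(n, draws=2000):
    """Draw `draws` simple random samples of size n and return their means."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]

results = {}
for n in (5, 30, 100):
    means = simulate_sample_means(n)
    results[n] = {
        "mean_of_means": statistics.fmean(means),  # should sit near mu
        "observed_se": statistics.stdev(means),    # spread of the sample means
        "theoretical_se": sigma / n ** 0.5,        # sigma / sqrt(n)
    }
    print(n, results[n])
```

In a run like this, the mean of the sample means stays close to the population mean, and the observed spread shrinks roughly as σ/√n, which is one way to answer the "(Why?)" in property 5.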
Hypothesis Testing (a must-know concept for
understanding inferential statistics):
1. Meaning: Making inferences about the nature of the
population on the basis of observations of a sample
drawn from the population
2. Logic: when the difference between the hypothesized
value for the population mean and the sample mean
is computed and found to be very large, the
hypothesis is rejected.
3. See Figure 7.1 on p. 166 for detailed explanations.

Types of Hypothesis
1. A conjecture about one or more population
parameters
2. Testing a specific hypothesis does not prove
or disprove the conjecture; it only tells us how likely
the observed sample would be (i.e., the probability) if
the hypothesis were true.

a. Null Hypothesis [虛無假設]:
(1) Symbol: Ho
(2) It means “no relation between variables”: the
relation index is equal to 0
(3) It can also mean “no difference in means”:
mean1 is equal to mean2
(4) E.g. 1: Ho: μ = 455
(5) E.g. 2: the mean of the experimental group is
equal to the mean of the control group.
(6) If Ho is retained, it means:
(a) the sample data are consistent with the null
hypothesis at the chosen level of significance;
(b) a sample like the one observed is reasonably
probable when the null hypothesis is true;
(c) no relation between the two variables has been
detected;
(d) the differences in two means are so small and
nonsignificant that they can be
attributed to sampling error.
b. Alternative Hypothesis [對立假設]:
(1) Symbol: Ha
(2) Relations between variables exist (i.e., the
relation index is not equal to 0); two or more means
are different from each other (i.e.,
mean1 is not equal to mean2)
(3) Against the Null Hypothesis
(4) The researchers’ expected outcome
(researchers’ hypothesis)
(5) Example: Ha: μ ≠ 455
(6) If Ha is retained, it means:
(a) The null hypothesis (Ho) is rejected.
(b) A significant relation or significant differences
are detected.
(c) The data favor Ha at the chosen level of
significance.
(d) A sample like the one observed would be very
improbable if the null hypothesis were true.
(e) Differences in two means are so significant that
the two means are very likely to be different.
Meaning of Accepting Ho: μ = 455, but Rejecting Ha:
μ ≠ 455 when the sample mean was found to be 454:
1. Condition: We assume (hypothesize) that μ of a
population is 455. We formulate the following
hypotheses:
Ho: μ = 455, Ha: μ ≠ 455
Then we select and test a sample, and find its
sample mean = 454.
Based on the sample mean, it is plausible that the
population mean is 455. So, Ho is retained and
Ha is rejected.
2. Interpretation
a. Because the difference (μ − X̄ = 455 − 454 = 1)
between the expected mean and the observed
mean is very small, i.e., nonsignificant, Ho is
retained and Ha is rejected.
b. Since we formulated two hypotheses, namely, Ho: μ
= 455 and Ha: μ ≠ 455, and Ho is retained, we can draw a
conclusion: the sample is consistent with a population
mean of 455.
c. Where does the difference, that is, 1, come from?
It comes from sampling error!
Meaning of Rejecting Ho: μ = 455, but Accepting Ha:
μ ≠ 455 when the sample mean was found to be
80,000:
1. Condition: We assume (hypothesize) that μ of a
population is 455. We formulate the following
hypotheses:
Ho: μ = 455, Ha: μ ≠ 455
Then we select and test a sample, and find its
sample mean = 80,000.
Based on the sample mean, it is very unlikely that
the population mean is 455. So, Ho is rejected
and Ha is retained.
2. Interpretation:
a. Because it is very unlikely that the population
mean is 455, Ho is rejected in favor of Ha.
b. Because the difference between the expected
population mean and the observed mean is
very large, i.e., significant, Ho is rejected (i.e., Ho: μ = 455,
Ha: μ ≠ 455). Thus, we can reach the following
conclusion: it is very unlikely that the mean of the
population is 455.

Rule of Thumb:
1. If the difference between the expected mean and
the observed mean is very small, i.e.,
nonsignificant, you should retain Ho and reject Ha.
2. If the difference between the expected mean and
the observed mean is very large, i.e., significant,
you should reject Ho and retain Ha.

To retain or to reject Ho, that is the question (see p.
166):
a. Since it is impossible to know the true μ of a
population, we can only hypothesize its value. Suppose we
hypothesize μ to be 455 and the standard deviation
of the population, σ, to be 100. Then
we select a sample of 144 subjects and find its
sample mean = 535. We formulate the following
hypotheses:
Ho: μ = 455, Ha: μ ≠ 455
Q: Based on the sample mean, should we reject or
retain Ho?
(a) At first sight, we should reject Ho because the
difference between the observed 535 and the
hypothesized 455 is 80, and it seems very large. But
that is only our feeling. What will statistics tell us?
There are a few key points that need to be taken into
account before we can answer this question:
(b) The sampling distribution of means is our
solution to this question because:
(1) the mean of the sampling distribution of means
is the population mean;
(2) the sampling distribution of means is a normal
distribution, and the standard deviation of the
sampling distribution of means (standard error, 標
準誤) = σ/√n, where σ (population SD) = 100
(hypothesized) and n (number of subjects) = 144;
(3) When we select a sample and get its mean, we
don’t expect this mean to be perfectly equal to μ,
particularly when we do not know what μ is. But
we are sure that if we sample the population an infinite
number of times, the mean of all the sample means will
be equal to the population mean; it is exactly the mean
of the sampling distribution of means.
(4) Why may the observed mean not be equal
to the population mean? It is because when we
draw a sample, we incur errors called
“sampling errors” (any sampling procedure will
yield sampling errors, including random
selection). Thus, the observed sample mean stems
from two sources:
“sampling error + true population mean”
(c) In our example, the observed sample mean is
535 but we hypothesize the population mean to be
455. The difference between the two numbers is 80.
Since we know observed sample mean = sampling
error + true population mean, there are at least two
possible reasons to account for why the observed
mean is 535:
(1) Possible Explanation 1:
true population mean = 455, sampling error = 80
Thus, Ho: μ = 455
So, we retain Ho: μ = 455, but reject Ha: μ ≠ 455
(2) Possible Explanation 2:
true population mean ≠ 455, sampling error ≠ 80 (i.e.,
either > or < 80), but sampling error + true
population mean = 535
Thus, Ha: μ ≠ 455
So, we reject Ho: μ = 455, but retain Ha: μ ≠ 455
(d) Now we can weigh the two possible explanations
by estimating the probability of a sampling error as
large as 80.
(e) Remember:
(1) the sampling distribution of means is a normal
distribution whose mean is the population mean;
(2) the standard deviation (standard error) of the
sampling distribution of means is σ/√n.
(f) Now, we can compute the probability of the
occurrence of 535 when it is placed in the sampling
distribution of means whose mean is hypothesized
as 455.
Z = (x1 − x2)/SD -> Z score of 535:
x1 = 535
x2 = μ = 455
SD = standard error = σ/√n = 100/√144 = 100/12 =
8.33
Z = (X̄ − μ)/(σ/√n) = (535 − 455)/8.33 = 80/8.33 = 9.6
(g) What does Z = 9.6 tell us?
(1) Check the standard normal distribution graph. Z
= 9.6 falls extremely far out on the right end of
the distribution. In other words, the probability of a Z
as large as 9.6 occurring in this standard normal
distribution is extremely low.
(2) Check the z score table to find the larger portion for Z
= 9.6. 1 − the larger portion = the probability of a Z at
least as large as 9.6 when μ = 455. The smaller this
probability is, the stronger the evidence against Ho;
the larger it is, the more
likely Ho is to be retained.
(3) For z = 3.2905 (see Table C.2, p. 621), the larger area
= .9995, so probability = 1 − .9995 = .0005. The probability
for z = 9.6 will be much smaller than .0005.
(4) In other words, for a population whose mean is
hypothesized as 455 (in the z score distribution, the
mean will be 0), the probability of a sample
mean as large as 535 (9.6 in the z score distribution) is
less than 0.0005. Thus, based on our sample mean = 535,
it is extremely unlikely that the population mean
is 455.
(h) Results:
Look at the two hypotheses:
Ho: μ = 455
Ha: μ ≠ 455
The probability of a sample mean this extreme if Ho is
true is less than 0.0005.
(i) Conclusion:
(1) Based on the sample mean, that
is, 535, the probability of such a result when the
hypothesized population mean = 455 is less than
0.0005. Because this
probability is too low, Ho: μ = 455 should NOT be
retained. Ho: μ = 455 should be rejected. Instead, Ha:
μ ≠ 455 should be retained. Thus, based on our
sample mean, 535, it is extremely unlikely that
the population mean is 455.
(2) Hence, the difference between the hypothesized
population mean and the observed sample mean is
so large and so significant that sampling error
cannot explain this difference (in this
case, the sampling error should be less than 80).
(3) The probability is smaller than .05 [p
< .05]; the difference between the sample mean and the
assumed population mean is statistically
significant. The population mean should not be 455.
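The worked example can be reproduced numerically. The sketch below uses only the values given in the lecture (μ = 455, σ = 100, n = 144, sample mean = 535); the upper-tail probability is computed with `math.erfc` because such a small area cannot be read from a printed z table. Variable names are illustrative.

```python
from math import erfc, sqrt

mu_hypothesized = 455   # population mean under Ho
sigma = 100             # hypothesized population SD
n = 144                 # sample size
sample_mean = 535       # observed sample mean

standard_error = sigma / sqrt(n)                      # 100 / 12
z = (sample_mean - mu_hypothesized) / standard_error  # 80 / 8.33...

# Area to the right of z in the standard normal distribution.
# erfc keeps precision for tiny tail areas where 1 - cdf(z) would round to 0.
p_upper_tail = 0.5 * erfc(z / sqrt(2))

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.1f}")
print(f"upper-tail probability = {p_upper_tail:.3e}")
```

The printed tail area is far below the .0005 bound quoted from Table C.2, which matches the conclusion that Ho: μ = 455 should be rejected.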
Criterion for Rejecting Ho:
1. Researchers can set up the region of rejection (in
the normal distribution) to reject Ho.
2. This proportion of area is referred to as the “level of
significance” and denoted as α (i.e., α is a
probability).
3. It equals the maximum probability of rejecting a true Ho.
4. In the field of language education, α is usually set
at 0.05. α can also be set at 0.1, 0.01, 0.001, or even
smaller.

5. Depending on the type of distribution used, a
cutoff value, called a critical value in statistical
terms, will be computed.
6. For the z distribution, if α = .05, then the critical
value of the z score for rejecting or retaining Ho will be
1.6449 (check Table C.2 on p. 621).
Step 1: 1 − .05 = .95
Step 2: Find B (larger area) = .950 from Table C.2
Step 3: Find the z score corresponding to B = .950
Step 4: z = 1.6449
If α = 0.025, the critical z score will be 1.96
7. The smaller the value of α:
(1) the smaller the rejection area
(2) the more difficult it is to reject Ho
(3) the less likely to reject Ho
(4) the more likely to accept Ho
But
(5) the easier it is to reject Ha
(6) the more likely to reject Ha
(7) the more difficult it is to accept Ha
(8) the less likely to accept Ha
(9) the larger the critical value.
8. The smaller α is, the more conservative the test. In the
field of medicine, α is conventionally set at a very
conservative level such as 0.01 or 0.001,
because an extremely conservative α makes it
very unlikely to reject Ho. Thus, in order to reject Ho
in favor of Ha, the differences between two means
must be very, very large. In medical experiments,
taking new drugs must make large differences in
patients in order to reject Ho and accept Ha.
So, a very conservative α is used.
9. If the p value is lower than the level of
significance, i.e., α, it means that a result like the one
observed would be too improbable if Ho were true. Thus,
Ho should be rejected.
E.g., p = 0.00034 < α = 0.05, i.e., p < 0.05: the
data are too unlikely under Ho. So, reject
Ho and accept Ha. Results can be written as: The
difference is statistically significant at the level of
0.05.
10. If the p value is larger than the level of
significance, i.e., α, it means the observed result is
not unusual under Ho.
E.g., p = 0.45 > α = 0.05, i.e., p > 0.05: the
data are compatible with Ho. So, retain Ho
and reject Ha. Results can be written as: The
difference is not significant at the level of 0.05.
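The critical-value lookup in item 6 and the decision rule in items 9 and 10 can be sketched as two small helpers. The function names `critical_z` and `decide` are hypothetical, and `NormalDist().inv_cdf` stands in for the Table C.2 lookup.

```python
from statistics import NormalDist

def critical_z(alpha):
    """z score whose larger area (B in Table C.2) equals 1 - alpha."""
    return NormalDist().inv_cdf(1 - alpha)

def decide(p_value, alpha=0.05):
    """Apply the rule: reject Ho when p < alpha, otherwise retain Ho."""
    return "reject Ho" if p_value < alpha else "retain Ho"

for alpha in (0.05, 0.025, 0.01):
    print(f"alpha = {alpha}: critical z = {critical_z(alpha):.4f}")

print(decide(0.00034))  # the example in item 9
print(decide(0.45))     # the example in item 10
```

The loop also illustrates item 7: as α gets smaller, the critical value gets larger and the rejection area shrinks.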

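Important Note:
The p-value indicates the probability of obtaining a result
at least as extreme as the one observed if Ho is true; it is
compared with α to decide whether to retain Ho.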

Errors in Hypothesis Testing:
1. Meaning: Whether Ho is rejected or retained,
researchers take the risk of making an error in
their decision.
2. Two Types of Errors:
(1) Type I Error:
a. Meaning: Rejecting a true null hypothesis
b. Explanation: Researchers decide to reject a null
hypothesis, but this null hypothesis is actually true.
c. Reasons: Researchers reject Ho because the
sample mean falls in the rejection area. That is, the
probability of obtaining such a sample mean when Ho is
true is very low (but not 0). However, Ho may still be true,
although such a result is very unlikely. Because this
probability is not 0 (e.g., p-value = 0.0001), a true Ho will
occasionally produce a sample this extreme. Researchers
decide to reject Ho because of this low probability,
not because the probability is zero. Thus, when Ho
is rejected, researchers may make a Type I error
since Ho can be true even though it has been rejected.
(2) Type II Error:
a. Meaning: Retaining a false null hypothesis
b. Explanation: Researchers decide to retain a null
hypothesis, but this null hypothesis is actually false.
c. Reasons: Researchers retain Ho because the
sample mean does not fall in the rejection area.
That is, the sample mean is not unusual enough under
Ho to reject it. However, Ho may still be
false: a high p-value (e.g., p = 0.99) shows only that the
sample is compatible with Ho, not that Ho is true. The
true population mean may differ from the hypothesized
value by an amount too small for this sample to detect.
Thus, when Ho is retained, researchers
may make a Type II error since Ho can be false
even though it has been retained.

Level of Confidence [信心水準]: In a survey study,
the complement of the α value (1 − α).
(a) Meaning: the degree of confidence that
researchers have in not making a Type I error when they
decide to reject Ho.
(b) Interpretation: if α = .05, researchers realize that
when a true null hypothesis is tested, they may wrongly
reject it 5 times out of 100. That is, they are
95% confident in their results of rejecting Ho.

Levels of Significance vs. Types of Errors:
(1) if the value of α is reduced from 0.05 to 0.01, the
probability of making a Type I error decreases,
whereas the probability of making a Type II error
increases.
(2) if the value of α is raised from 0.01 to 0.05, the
probability of making a Type I error increases,
whereas the probability of making a Type II error
decreases.
(3) α = 0.05 is more likely to reject Ho than α = 0.01
(Why?)
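The trade-off between the two error types can be made concrete with a small calculation. The sketch below reuses μ = 455, σ = 100, and n = 144 from the earlier example and assumes, purely for illustration, that the true population mean is 475 and that the test is an upper one-tailed z test; `type_ii_error` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(alpha, mu0, true_mu, sigma, n):
    """Probability of retaining Ho: mu = mu0 when the true mean is true_mu
    (upper one-tailed z test with known sigma)."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z_critical = std_normal.inv_cdf(1 - alpha)
    # Ho is retained whenever the sample mean falls below the critical cutoff.
    return std_normal.cdf(z_critical - (true_mu - mu0) / standard_error)

# Assumed scenario: Ho says 455, but the true mean is 475.
beta_at_05 = type_ii_error(0.05, mu0=455, true_mu=475, sigma=100, n=144)
beta_at_01 = type_ii_error(0.01, mu0=455, true_mu=475, sigma=100, n=144)

print(f"alpha = 0.05 -> Type II error probability = {beta_at_05:.3f}")
print(f"alpha = 0.01 -> Type II error probability = {beta_at_01:.3f}")
```

Under these assumptions, lowering α from 0.05 to 0.01 roughly doubles the chance of a Type II error, which is the pattern described in points (1) and (2).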
5. One-Tailed [單尾] vs. Two-Tailed [雙尾] Hypothesis
Testing:
(1) One-Tailed (Directional): Ho: μ ≤ 455; Ha: μ > 455
(or Ho: μ ≥ 455; Ha: μ < 455)
(2) Two-Tailed (Non-Directional): Ho: μ = 455; Ha: μ ≠
455
(3) Critical values [決斷值] for rejecting Ho are
different: Figure 7.6 (p. 178), Figures 7.7 & 7.8 (pp.
180-181)
one-tailed critical value of z = +1.645 (or −1.645,
depending on the direction) when α = .05, whereas
two-tailed z = ±1.96
(4) A one-tailed test is more likely to reject Ho (Why?)
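One way to approach the "(Why?)" in point (4) is to compare the two critical values against the same observed z score. In this sketch the observed z of 1.80 is a made-up value chosen to fall between the two cutoffs.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

# One-tailed: the whole rejection area (.05) sits in one tail.
one_tailed_cv = std_normal.inv_cdf(1 - alpha)
# Two-tailed: the rejection area is split, .025 in each tail.
two_tailed_cv = std_normal.inv_cdf(1 - alpha / 2)

z_observed = 1.80  # hypothetical test statistic in the predicted direction

rejects_one_tailed = z_observed > one_tailed_cv
rejects_two_tailed = abs(z_observed) > two_tailed_cv

print(f"one-tailed critical z = {one_tailed_cv:.3f}, reject Ho: {rejects_one_tailed}")
print(f"two-tailed critical z = {two_tailed_cv:.3f}, reject Ho: {rejects_two_tailed}")
```

Because the one-tailed cutoff is closer to the mean, a result in the predicted direction needs to be less extreme to fall in the rejection area.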

Standard Sampling Distribution of the Mean: Using
Student’s t distribution if σ is unknown:
(1) Standard error: s/√n, where s is the standard deviation
of the sample
Student’s t distributions:
a. For small samples, the sampling distribution of the
mean departs considerably from the normal
distribution;
b. a family of distributions;
c. as sample sizes increase (n ≥ 30), the
sampling distribution of the mean approximates the
normal distribution;
d. t distributions have a mean equal to 0, like the z-score
distribution, but an SD slightly larger than 1 that
approaches 1 as sample size increases

e. All testing procedures are identical to those for the
z-score distribution
f. Each t distribution is related to degrees of freedom
(df) [自由度]: n − 1
(1) df: the number of elements of data that are free
to vary in calculating a statistic.
(2) why df: each t distribution corresponds to a df.
(3) if one restriction is added, the degrees of
freedom will be one less.
(4) x̄ = Σx/n, but s = √[Σ(x − x̄)²/(n − 1)] -> x̄ is added as a
restriction; thus, df = n − 1
(5) when df increases, the t distribution approximates the
normal distribution; when df = 29, the t distribution
is already close to a normal distribution. Thus, as a rule
of thumb, the statistical techniques that use t distributions
should have a df of at least 29, that is, n of at least 30.
g. Critical values for t distributions can be found in
Table C.3, p. 622.

Statistical Precision:
(1) the inverse of a standard error;
(2) the smaller a standard error is, the greater the
statistical precision;
(3) as the sample size is increased, the precision is
increased accordingly.

Example:
Scenario: A researcher hypothesizes that the GPA of
student athletes is less than 2.5. To test this
hypothesis, the researcher selects 20 subjects and
finds the GPA mean of this sample is 2.45, s = 0.54, s²
= 0.29, and α is set at the 0.05 level. What inferences can
the researcher make?
Computation:
1. The t distribution is used since the population variance
is not known.
2. Hypotheses:
Ho: μ = 2.5; Ha: μ < 2.5
3. t = (X̄ − μ)/standard error
4. standard error = s/√n -> 0.54/√20 = 0.12
5. t = (2.45 − 2.50)/0.12 = −0.42
6. α = 0.05, df = 19, one-tailed t critical value = −1.729
7. t = −0.42 > −1.729, outside the Ho rejection area;
thus, Ho should not be rejected but retained -> μ =
2.5
8. μ = 2.5, p > .05 means:
a. The difference between the observed mean and the
hypothesized mean is very small (i.e., nonsignificant);
b. The probability for sampling error to account for
this difference is higher than 5%;
c. The probability of a sample mean this far below 2.5
when Ho is true is larger than .05
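The computation steps of this example can be written as a short script. It is a sketch only: Python's standard library has no t distribution, so the critical value −1.729 is entered as a constant copied from the example (Table C.3), and `one_sample_t` is a hypothetical helper name.

```python
from math import sqrt

def one_sample_t(sample_mean, mu_hypothesized, s, n):
    """Return the t statistic and degrees of freedom for a one-sample t test."""
    standard_error = s / sqrt(n)
    t = (sample_mean - mu_hypothesized) / standard_error
    return t, n - 1

t, df = one_sample_t(sample_mean=2.45, mu_hypothesized=2.5, s=0.54, n=20)

# One-tailed critical value for df = 19, alpha = .05, as quoted in the example.
T_CRITICAL_ONE_TAILED = -1.729

reject_ho = t < T_CRITICAL_ONE_TAILED
print(f"t = {t:.2f}, df = {df}, reject Ho: {reject_ho}")
```

The script prints t ≈ −0.41 rather than −0.42 because it does not round the standard error to 0.12 before dividing; the decision to retain Ho is the same either way.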

Confidence Interval (CI: 信賴區間): The estimation
of μ
(a) CI: A range of values that we are confident
contains the population parameter (i.e., μ).
(b) CI = X̄ ± (tcv) × standard error
(c) E.g.: A researcher hypothesizes that the GPA of
student athletes is less than 2.5. To test this
hypothesis, the researcher selects 20 subjects and
finds the GPA mean of this sample is 2.45, s = 0.54, s²
= 0.29, and α is set at the 0.05 level. What is the CI?
Computation:
1. standard error = s/√n -> 0.54/√20 = 0.12
2. Critical value of t when n = 20, df = 19, α = 0.05,
two-tailed -> tcv = 2.093
3. CI95 = 2.45 ± (2.093)(0.12) = 2.45 ± 0.25 = (2.20, 2.70)
Interpretation:
We are 95% confident that μ falls between
2.20 and 2.70.
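The confidence interval formula can be checked the same way. In this sketch the two-tailed critical value 2.093 is taken from the example rather than computed, and `confidence_interval` is a hypothetical helper name.

```python
from math import sqrt

def confidence_interval(sample_mean, s, n, t_critical):
    """CI = sample mean ± t critical value × standard error."""
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# t critical value for df = 19, alpha = .05, two-tailed, as quoted in the example.
lower, upper = confidence_interval(sample_mean=2.45, s=0.54, n=20, t_critical=2.093)
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, the bounds match the hand computation of (2.20, 2.70).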
Statistics’ A, B, & C
A: In theory, an infinite number of samples can be
selected from a population. All the possible sample
means will be normally distributed.
B: The mean of all the possible sample means is the
population mean (hypothesized under Ho).
C: For a particular sample mean, there is a
corresponding probability. This probability
indicates how likely such a sample mean would be if the
null hypothesis were true.
