Download Review of Basic Statistical Concepts

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Review of Basic Concepts
Some abbreviations widely used in statistics.^
Sample
Statistic
Preferred
symbol
Population
Acceptable
symbol
Arithmetic mean
µ
X
Chi-square
χ2
Correlation coefficient
r
Coefficient of multiple determination
R2
Coefficient of simple determination
r2
Coefficient of variation
CV
Degrees of freedom
df
DF
Least significant difference
LSD
Multiple correlation coefficient
R
Not significant
NS
Probability of type I error
α
Probability of type II error
β
Regression coefficient
b
β
Sample size
n
N
σx
sx
Standard error of mean
SE
Standard deviation of sample
SD
s
σ
Student’s t
t
Variance
s2
σ2
Variance ratio
F
^The symbols *, **, and *** are used to show significance at the P = 0.05, 0.01, and 0.001 levels,
respectively. Significance at other levels should be designated by a supplemental note.
From: Publications Handbook and Style Manual. 1988. Amer. Soc. Agron. Inc., Crop Sci. Soc.
Amer. Inc., Soil Sci. Soc. Amer., Madison, WI.
2.1
Review of Basic Concepts - Example Problem
Comparison of the speed (-2 minutes) in seconds of two calculating machines for computing sums
of squares (modified from Cochran and Cox)
Machine A
Replication
Time
(sec)
Deviation
from
Mean
Machine B
Dev.2
Time
(sec)
Deviation
from
Mean
Dev.2
A-B
Rep. or
Pair
Totals
1
30*
8
64
14
0
0
16
44
2
21
-1
1
21*
7
49
0
42
3
22*
0
0
5
-9
81
17
27
4
22
0
0
13*
-1
1
9
35
5
19
-3
9
14*
0
0
5
33
6
29*
7
49
17
3
9
12
46
7
17
-5
25
-6
36
9
25
8
14*
-8
64
16
2
4
-2
30
9
23*
1
1
8
-6
36
15
31
10
23
1
1
24*
10
100
-1
47
Total
Σx2=214
ΣX=220
8*
ΣX=140
where x = X - X
Sums of Squares = SS = Σx 2 = Σ(X -X)2
Mean = X =
ΣX
n
Standard Deviation = s = s 2
Variance = s 2 =
Σ(X - X)2
n-1
Standard Error = sx =
s2
n
2.2
Σx2=316
Coefficient of Variation = CV =
s
X
x 100%
Confidence Limits = CL = X + tsx
t Distribution
The t distribution was first presented by William S. Gosset who published under the
psuedonym “Student” in 1908. Thus the term Student’s t test. The t test compares the deviation
of the sample mean form the population mean measured against the standard error of the mean. It
is also used to compare the difference between two means measured against the standard error of
the difference. The t distribution follows the normal distribution and varies for different df. The
standard t tables are two-tailed tables in which the probability, i.e. 5%, is distributed on both
ends of the distribution. The differences between the results of a standard treatment and a new
treatment may be either positive or negative, i.e. the result of the new treatment may be either
larger of smaller than of the standard treatment. There are also one-tailed t tables in which the
difference between the result of a standard treatment and the new treatment can be only positive
or only negative.
The t test for the difference between the sample mean and the population mean is
calculated in the following manner.
t=
X-µ
where sx =
sx
s2
n
The t test for the difference between means from two different treatments is calculated in
the following manner.
X -Xb
t= a
where sd =
sd
2
sa
na
2
+
sb
nb
Degrees of freedom (df) for looking up t value in t table:
1. If samples are from two populations, the df is the sum of the df for the two
populations.
2. If pairs of values or replicated comparisons are being compared, the df is the number
of pairs - 1.
3. In an ANOVA, the df is the df for mean square for error.
2.3
Confidence Limits
The confidence limits of a mean may be calculated by the formula
CL = X + t sx
as shown in the example.
When a t value for the 0.05 probability is used in the CL, the true mean is expected to lie
within the confidence limits indicated with a probability of 95% unless a 1-in-20 chance has
occurred.
The 95% confidence limits may be calculated for the means of each of two treatments. If
the confidence limits of the two means do not overlap, it may be concluded that the two means
are significantly different at the 95% probability level.
Confidence Limits - Example Problem
Determine the confidence limits for µ for calculating machine A, given:
ΣX = 220
n = 10
Σx 2 = 214
s2 =
Σx 2
214
= 23.77
=
9
n - 1
X
16
17
18
19
20
21
22
23
24
25
26
27
P = 0.80
19.9
24.1
P = 0.95
18.5
25.5
P = 0.99
17.0
X =
27.0
ΣX
220
=
= 22.0
n
10
2.4
28 sec
CL = 22 + tsx
sX =
s2
=
n
23.77
= 1.54
10
For P = 0.80
t(.20,9) = 1.383
CL = 22 ±(1.383) (1.54)
= 22 ±2.1
= 19.9, 24.1
For P = 0.95
t(.05,9) = 2.262
CL = 22 ± (2.262) (1.54)
= 22 ± 3.5
= 18.5, 25.5
For P = 0.99
t(.01,9) =
CL =
Thus, with a confidence of 95%, we can say that the true mean, µ, is included in the
range 18.5 to 25.5. Or, to state this another way, there is 1 chance in 20 or 5 chances in 100 that
the true mean for machine A lies outside this range.
2.5
Related documents