ANALYTICAL PROPERTIES PART II
ERT 207 ANALYTICAL CHEMISTRY
SEMESTER 1, ACADEMIC SESSION 2015/16
Overview
 CONFIDENCE INTERVALS
 STUDENT’S T / T STATISTICS
 STATISTICAL AIDS TO HYPOTHESIS TESTING
 COMPARISON OF TWO EXPERIMENTAL MEANS
 ERRORS IN HYPOTHESIS TESTING
 COMPARISON OF VARIANCES
 ANALYSIS OF VARIANCE
CONFIDENCE INTERVALS
 The confidence interval for the mean is the range of values within which the population mean (μ) is expected to lie with a certain probability.
 Sometimes the limits of the interval are called confidence limits.
 The size of the confidence interval, which is computed from the sample standard deviation, depends on how well the sample standard deviation (s) estimates the population standard deviation (σ).
CONFIDENCE INTERVALS
 Figure 1 shows a series of five normal error curves.
 In each, the relative frequency is plotted as a function of the quantity z, which is the deviation from the mean divided by the population standard deviation.
 The numbers within the shaded areas are the percentage of the total area under the curve that is included within these values of z.
CONFIDENCE INTERVALS
 Figure 1: Areas under a Gaussian curve for various values of ±z, panels (a) through (e).
CONFIDENCE INTERVALS
 From Figure 1 (a):
 50% of the area under any Gaussian curve is located between -0.67σ and +0.67σ.
 We may assume that in 50 out of 100 cases the true mean μ will fall in the interval x ± 0.67σ.
 Confidence level:
 The probability that the true mean lies within a certain interval.
 It is often expressed as a percentage.
CONFIDENCE INTERVALS
 Figure 1 (a):
 The confidence level is 50% and the confidence interval is from -0.67σ to +0.67σ.
 Significance level:
 The probability that a result is outside the confidence interval.
 A general expression for the confidence interval (CI) of the true mean based on measuring a single value x:
   CI for μ = x ± zσ
CONFIDENCE INTERVALS
 For the experimental mean of N measurements:
   CI for μ = x̄ ± zσ/√N
 Table 1 shows the values of z at various confidence levels.
 The relative size of the confidence interval as a function of N is shown in Table 2.
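As an illustration of the formula above, here is a minimal Python sketch (using NumPy and SciPy; the measurement values and the 95% level are made up for demonstration) that computes a z-based confidence interval when σ is assumed known:

```python
import numpy as np
from scipy import stats

# Hypothetical replicate measurements (mg/L glucose); values are illustrative only
measurements = np.array([1105.0, 1110.0, 1108.0, 1112.0, 1107.0])
sigma = 19.0           # population standard deviation, assumed known
confidence = 0.95      # chosen confidence level

x_bar = measurements.mean()
N = len(measurements)

# Two-tailed z value for the chosen confidence level (1.96 for 95%)
z = stats.norm.ppf(1 - (1 - confidence) / 2)

half_width = z * sigma / np.sqrt(N)
print(f"{confidence:.0%} CI for mu: {x_bar:.1f} +/- {half_width:.1f} mg/L")
```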
CONFIDENCE INTERVALS
 Table 1: values of z at various confidence levels.
 Table 2: relative size of the confidence interval as a function of N.
CONFIDENCE INTERVALS
 EXAMPLE 1:
 Determine the 80% and 95% confidence intervals for:
 (a) a single entry of 1108 mg/L glucose,
 (b) the mean value (1100.3 mg/L) of one week of data (one measurement recorded per day).
 Assume that in each part, s = 19 is a good estimate of σ.
STUDENT’S T / T STATISTICS
 The t statistic is often called Student's t.
 To account for the variability of s, we use the important statistical parameter t, which is defined in exactly the same way as z except that s is substituted for σ.
 For a single measurement with result x:
   t = (x − μ)/s
 For the mean of N measurements:
   t = (x̄ − μ)/(s/√N)
STUDENT’S T / T STATISTICS
 The confidence interval for the mean of N replicate measurements can be calculated from t:
   CI for μ = x̄ ± ts/√N
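A corresponding sketch for the t-based interval, with the t value taken from SciPy rather than Table 3 (the replicate values are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical replicate results; s is estimated from these data
data = np.array([12.6, 11.9, 13.0, 12.7])
confidence = 0.95

x_bar = data.mean()
s = data.std(ddof=1)               # sample standard deviation
N = len(data)

# Two-tailed t value for N - 1 degrees of freedom
t_val = stats.t.ppf(1 - (1 - confidence) / 2, df=N - 1)

half_width = t_val * s / np.sqrt(N)
print(f"{confidence:.0%} CI for mu: {x_bar:.2f} +/- {half_width:.2f}")
```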
STUDENT’S T / T STATISTICS
 Table 3: values of Student's t for various degrees of freedom and confidence levels.
STUDENT’S T / T STATISTICS
 Example 2:
 A clinical chemist obtained the following data for the alcohol content of a sample of blood: % C2H5OH: 0.084, 0.089, and 0.079.
 Calculate the 95% confidence interval for the mean assuming that
 (a) the three results obtained are the only indication of the precision of the method,
 (b) from previous experience on hundreds of samples, we know that the standard deviation of the method, s = 0.005% C2H5OH, is a good estimate of σ.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Hypothesis testing is the basis for many decisions made in science and engineering.
 A hypothetical model is tested by carrying out experiments; the hypothesis tests that we describe are used to determine whether the results of these experiments support the model.
 If agreement is found, the hypothetical model serves as the basis for further experiments.
 When the hypothesis is supported by sufficient experimental data, it becomes recognized as a useful theory until such time as data are obtained that refute it.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 A null hypothesis postulates that two or more observed quantities are the same.
 Specific examples of hypothesis tests that scientists often use include the comparison of:
 (1) the mean of an experimental data set with what is believed to be the true value,
 (2) the mean to a predicted or cutoff (threshold) value,
 (3) the means or the standard deviations from two or more sets of data.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Comparing an experimental mean with a known value:
 A statistical hypothesis test is used to draw conclusions about the population mean (μ) and its nearness to the known value (μ0).
 There are two contradictory outcomes that we consider in any hypothesis test:
 (1) the null hypothesis H0, which states that μ = μ0;
 (2) the alternative hypothesis Ha, which states how μ may differ from μ0.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 We might reject the null hypothesis in favor of Ha if μ is different from μ0 (μ ≠ μ0).
 Other alternative hypotheses are μ > μ0 or μ < μ0.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Suppose we are interested in determining whether the concentration of lead in an industrial wastewater discharge exceeds the maximum permissible amount of 0.05 ppm.
 Our hypothesis test would be summarized as:
   H0: μ = 0.05 ppm
   Ha: μ > 0.05 ppm
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Large-sample z test:
 If a large number of results are available so that s is a good estimate of σ, the z test is appropriate.
 1. State the null hypothesis: H0: μ = μ0
 2. Form the test statistic:
   z = (x̄ − μ0)/(σ/√N)
 3. State the alternative hypothesis Ha and determine the rejection region.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 For Ha: μ ≠ μ0, reject H0 if z ≥ zcrit or if z ≤ -zcrit (two-tailed test)
 For Ha: μ > μ0, reject H0 if z ≥ zcrit (one-tailed test)
 For Ha: μ < μ0, reject H0 if z ≤ -zcrit (one-tailed test)
 Figure 2 (a):
 There is only a 5% probability that random error will lead to a value of z ≥ zcrit or z ≤ -zcrit.
 The significance level overall is α = 0.05.
 From Table 1, the critical value of z is 1.96.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Figure 2: Rejection regions for the 95% confidence level. (a) Two-tailed test for Ha: μ ≠ μ0.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Figure 2 (continued): (c) One-tailed test for Ha: μ < μ0.
 Figure 2 (b):
 We allow the probability that z exceeds zcrit to be 5%, or the total probability in both tails to be 10%.
 The significance level overall is α = 0.10.
 The critical value from Table 1 is 1.64.
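Before the worked example, here is a minimal Python sketch of the large-sample z test; the numbers are invented, and the one-tailed alternative mirrors the wastewater case above (Ha: μ > μ0):

```python
import numpy as np
from scipy import stats

# Hypothetical inputs; values are illustrative only
x_bar = 0.052    # experimental mean (ppm Pb)
mu0 = 0.050      # permissible value under H0
sigma = 0.005    # population standard deviation, assumed known
N = 25           # number of measurements
alpha = 0.05     # significance level

z = (x_bar - mu0) / (sigma / np.sqrt(N))

# One-tailed rejection rule for Ha: mu > mu0
z_crit = stats.norm.ppf(1 - alpha)
print(f"z = {z:.2f}, z_crit = {z_crit:.2f}")
print("Reject H0" if z >= z_crit else "Fail to reject H0")
```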
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Example 3:
 A class of 30 students determined the activation energy of a chemical reaction to be 116 kJ/mol (mean value) with a standard deviation of 22 kJ/mol.
 Are the data in agreement with the literature value of 129 kJ/mol at
 (a) the 95% confidence level?
 (b) the 99% confidence level?
 Estimate the probability of obtaining a mean equal to the student value.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 For a small number of results, we use a procedure similar to the z test except that the test statistic is the t statistic.
 The null hypothesis is H0: μ = μ0, where μ0 is a specific value of μ such as an accepted value, a theoretical value, or a threshold value.
 1. State the null hypothesis: H0: μ = μ0
 2. Form the test statistic:
   t = (x̄ − μ0)/(s/√N)
 3. State the alternative hypothesis Ha and determine the rejection region.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 For Ha: μ ≠ μ0, reject H0 if t ≥ tcrit or if t ≤ -tcrit (two-tailed test)
 For Ha: μ > μ0, reject H0 if t ≥ tcrit (one-tailed test)
 For Ha: μ < μ0, reject H0 if t ≤ -tcrit (one-tailed test)
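A minimal sketch of this small-sample t test follows the same pattern as the z test sketch above, with tcrit taken from SciPy instead of Table 3 (the data and μ0 are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical small data set and reference value
data = np.array([4.92, 5.06, 4.88, 5.01])
mu0 = 5.00
alpha = 0.05

x_bar = data.mean()
s = data.std(ddof=1)
N = len(data)

t_stat = (x_bar - mu0) / (s / np.sqrt(N))

# Two-tailed critical value for N - 1 degrees of freedom
t_crit = stats.t.ppf(1 - alpha / 2, df=N - 1)
print(f"t = {t_stat:.2f}, t_crit = {t_crit:.2f}")
print("Reject H0" if abs(t_stat) >= t_crit else "Fail to reject H0")
```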
 Figure 3 (illustration of systematic error in an analytical method):
 Curve A: If the analytical method had no systematic error, or bias, random errors would give the frequency distribution of curve A.
 Curve B: The frequency distribution of results from a method that could have a significant bias due to a systematic error.
STATISTICAL AIDS TO HYPOTHESIS TESTING
 Example 4:
 A new procedure for the rapid determination of sulfur in kerosenes was tested on a sample known from its method of preparation to contain 0.123% S (μ0 = 0.123% S).
 The results for %S were 0.112, 0.118, 0.115 and 0.119.
 Do the data indicate that there is a bias in the method at the 95% confidence level?
COMPARISON OF TWO EXPERIMENTAL MEANS
 Frequently, scientists must judge whether a difference in the means of two sets of data is real or the result of random error.
 The t-test for differences in means:
 The test statistic t can be found from:
   t = (x̄1 − x̄2) / (spooled √((N1 + N2)/(N1 N2)))
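A minimal sketch of this pooled t test is shown below; the two data sets are hypothetical, and the pooled standard deviation is computed with the usual formula (an assumption, since it is not written out on the slide). SciPy's ttest_ind with equal_var=True applies the same pooled test and is used as a cross-check:

```python
import numpy as np
from scipy import stats

# Two hypothetical data sets assumed to have equal population variances
set1 = np.array([14.2, 14.5, 14.1, 14.4, 14.3])
set2 = np.array([14.6, 14.8, 14.5, 14.7])

n1, n2 = len(set1), len(set2)
s1, s2 = set1.std(ddof=1), set2.std(ddof=1)

# Usual pooled standard deviation (assumed formula, not given on the slide)
s_pooled = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_stat = (set1.mean() - set2.mean()) / (s_pooled * np.sqrt((n1 + n2) / (n1 * n2)))

# Cross-check with SciPy's pooled-variance two-sample t test
t_scipy, p_value = stats.ttest_ind(set1, set2, equal_var=True)
print(f"t (by hand) = {t_stat:.3f}, t (scipy) = {t_scipy:.3f}, p = {p_value:.3f}")
```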
COMPARISON OF TWO EXPERIMENTAL MEANS
 If there is good reason to believe that the standard deviations of the two data sets differ, the two-sample t test must be used.
 Paired data:
 Scientists and engineers often make use of pairs of measurements on the same sample in order to minimize sources of variability that are not of interest.
 The null hypothesis is H0: μd = Δ0, where Δ0 is a specific difference (often 0) and μd is the true average difference between the paired results.
 The test statistic is:
   t = (d̄ − Δ0)/(sd/√N)
 where d̄ is the average difference between the paired measurements.
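A minimal sketch of the paired t test with Δ0 = 0 (the paired values are hypothetical); SciPy's ttest_rel gives the same statistic:

```python
import numpy as np
from scipy import stats

# Hypothetical paired results from methods A and B on the same samples
method_a = np.array([10.2, 10.8, 9.9, 10.5, 10.1])
method_b = np.array([10.0, 10.5, 9.7, 10.6, 9.8])

d = method_a - method_b      # per-sample differences
delta0 = 0.0                 # hypothesized difference under H0
N = len(d)

t_stat = (d.mean() - delta0) / (d.std(ddof=1) / np.sqrt(N))

# Equivalent built-in paired test
t_scipy, p_value = stats.ttest_rel(method_a, method_b)
print(f"t (by hand) = {t_stat:.3f}, t (scipy) = {t_scipy:.3f}, p = {p_value:.3f}")
```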
COMPARISON OF TWO EXPERIMENTAL MEANS
 Example 5:
 A new automated procedure for determining glucose in serum (Method A) is to be compared to the established method (Method B).
 Both methods are performed on serum from the same six patients in order to eliminate patient-to-patient variability.
 Do the following results confirm a difference in the two methods at the 95% confidence level?
ERRORS IN HYPOTHESIS TESTING
 Type I error:
 A type I error occurs when H0 is rejected although it is actually true.
 In some sciences, a type I error is called a false negative.
 Type II error:
 A type II error occurs when H0 is accepted although it is actually false.
 It is sometimes termed a false positive.
ERRORS IN HYPOTHESIS TESTING
 The consequences of making errors in hypothesis testing are often compared to the errors made in judicial procedures.
 Convicting an innocent person is usually considered a more serious error than setting a guilty person free.
 If we make it less likely that an innocent person gets convicted, we make it more likely that a guilty person goes free.
 It is important when thinking about errors in hypothesis testing to determine the consequences of making a type I or type II error.
ERRORS IN HYPOTHESIS TESTING
 As a general rule of thumb, the largest α that is tolerable for the situation should be used.
 This ensures the smallest type II error while keeping the type I error within acceptable limits.
 For many cases in analytical chemistry, an α value of 0.05 (95% confidence level) provides an acceptable compromise.
COMPARISON OF VARIANCES
 At times, there is a need to compare the variances (or standard deviations) of two data sets.
 The normal t-test requires that the standard deviations of the data sets being compared are equal.
 F-test:
 A simple statistical test can be used to test this assumption, under the provision that the populations follow the normal (Gaussian) distribution.
COMPARISON OF VARIANCES
 The F-test is based on the null hypothesis that the two population variances under consideration are equal:
   H0: σ1² = σ2²
 The test statistic F is defined as the ratio of the two sample variances:
   F = s1²/s2²
 It is calculated and compared with the critical value of F at the desired significance level.
 The null hypothesis is rejected if the test statistic differs too much from unity.
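A minimal sketch of a one-tailed F test of this kind (the variances and degrees of freedom are hypothetical, and Fcrit comes from SciPy rather than Table 4):

```python
from scipy import stats

# Hypothetical sample variances and their degrees of freedom
s1_sq, df1 = 0.25, 9     # variance placed in the numerator (suspected larger)
s2_sq, df2 = 0.12, 9     # variance placed in the denominator
alpha = 0.05

F = s1_sq / s2_sq                          # test statistic
F_crit = stats.f.ppf(1 - alpha, df1, df2)  # one-tailed critical value

print(f"F = {F:.2f}, F_crit = {F_crit:.2f}")
print("Reject H0 (variances differ)" if F > F_crit else "Fail to reject H0")
```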
COMPARISON OF VARIANCES
 The F-test is also used in comparing more than two means and in linear regression analysis.
 Critical values of F at the 0.05 significance level are shown in Table 4.
 Table 4: critical values of F at the 0.05 significance level.
COMPARISON OF VARIANCES
 Two degrees of freedom are given, one associated with the numerator and the other with the denominator.
 The F-test can be used in either a one-tailed mode or a two-tailed mode.
COMPARISON OF VARIANCES
 Example 6:
 A standard method for the determination of the carbon monoxide (CO) level in gaseous mixtures is known from many hundreds of measurements to have a standard deviation of 0.21 ppm CO.
 A modification of the method yields a value for s of 0.15 ppm CO for a pooled data set with 12 degrees of freedom.
 A second modification, also based on 12 degrees of freedom, has a standard deviation of 0.12 ppm CO.
 Is either modification significantly more precise than the original?
ANALYSIS OF VARIANCE
 ANOVA: the methods used for multiple comparisons fall under the general category of analysis of variance.
 If ANOVA indicates a potential difference, multiple comparison procedures can be used to identify which specific population means differ from the others.
 Experimental design methods take advantage of ANOVA in planning and performing experiments.
ANALYSIS OF VARIANCE
 ANOVA detects differences in several population means by comparing the variances.
 The following are typical applications of ANOVA:
 1. Is there a difference in the results of five analysts determining calcium by a volumetric method?
 2. Will four different solvent compositions have differing influences on the yield of a chemical synthesis?
 3. Are the results of manganese determination by three different analytical methods different?
 4. Is there any difference in the fluorescence of a complex ion at six different values of pH?
ANALYSIS OF VARIANCE
 Figure 4 shows a single-factor, or one-way, ANOVA.
 The basic principle of ANOVA is to compare the variation between the different factor levels (groups) to the variation within factor levels.
 When the groups are the different analysts, this is a comparison of the between-analyst variation to the within-analyst variation (Figure 5).
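A minimal sketch of a single-factor (one-way) ANOVA using SciPy's f_oneway; analyst 1's three results are taken from Example 7 below, while the other two analysts' values are invented for illustration:

```python
import numpy as np
from scipy import stats

# Replicate results (mmol Ca): analyst 1 from Example 7, analysts 2 and 3 invented
analyst1 = np.array([10.3, 9.8, 11.4])
analyst2 = np.array([9.5, 8.6, 8.9])
analyst3 = np.array([12.1, 13.0, 12.4])

# f_oneway compares the between-group variation to the within-group variation
F, p_value = stats.f_oneway(analyst1, analyst2, analyst3)
print(f"F = {F:.2f}, p = {p_value:.4f}")

alpha = 0.05
print("Means differ" if p_value < alpha else "No significant difference among means")
```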
ANALYSIS OF VARIANCE
 Figure 4: a single-factor, or one-way, ANOVA.
 Figure 5: comparison of the between-analyst variation to the within-analyst variation.
 ANOVA Table: summarizes the sources of variation, their sums of squares, degrees of freedom, mean squares, and the resulting F statistic.
ANALYSIS OF VARIANCE
 Example 7:
 Five analysts determined calcium by a volumetric method and obtained the amounts (in mmol Ca) shown in the table below.
 Do the means differ significantly at the 95% confidence level?
EXAMPLE 1
(a)
 From Table 1, z = 1.28 and 1.96 for the 80% and 95% confidence levels.
   80% CI = 1108 ± 1.28 × 19 = 1108 ± 24.3 mg/L
   95% CI = 1108 ± 1.96 × 19 = 1108 ± 37.2 mg/L
 It can be concluded that it is 80% probable that the population mean (μ) lies in the interval 1083.7 to 1132.3 mg/L glucose.
 The probability is 95% that μ lies in the interval between 1070.8 and 1145.2 mg/L.
(b)
 For the seven measurements,
   80% CI = 1100.3 ± 1.28 × 19/√7 = 1100.3 ± 9.2 mg/L
   95% CI = 1100.3 ± 1.96 × 19/√7 = 1100.3 ± 14.1 mg/L
 From the experimental mean (x̄ = 1100.3 mg/L), it can be concluded that there is an 80% chance that μ is located in the interval between 1091.1 and 1109.5 mg/L glucose and a 95% chance that it lies between 1086.2 and 1114.4 mg/L glucose.
 Note: the intervals are considerably smaller when we use the experimental mean instead of a single value.
EXAMPLE 2
(a)
 Σxi = 0.084 + 0.089 + 0.079 = 0.252
 x̄ = 0.252/3 = 0.084
 Σxi² = 0.007056 + 0.007921 + 0.006241 = 0.021218
 s = √[(0.021218 − (0.252)²/3)/(3 − 1)] = 0.0050% C2H5OH
 From Table 3, t = 4.30 for two degrees of freedom and the 95% confidence level.
   95% CI = x̄ ± ts/√N = 0.084 ± (4.30 × 0.0050)/√3 = 0.084 ± 0.012% C2H5OH
(b)
 Because s = 0.0050% is a good estimate of σ, we can use z:
   95% CI = x̄ ± zσ/√N = 0.084 ± (1.96 × 0.0050)/√3 = 0.084 ± 0.006% C2H5OH
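A short Python check of these two intervals, with t and z supplied by SciPy instead of Tables 1 and 3:

```python
import numpy as np
from scipy import stats

data = np.array([0.084, 0.089, 0.079])   # % C2H5OH
N = len(data)
x_bar = data.mean()

# (a) s estimated from the three results, so Student's t with 2 degrees of freedom
s = data.std(ddof=1)
t_val = stats.t.ppf(0.975, df=N - 1)
print(f"(a) {x_bar:.3f} +/- {t_val * s / np.sqrt(N):.3f} % C2H5OH")      # about 0.084 +/- 0.012

# (b) s = 0.0050% is taken as a good estimate of sigma, so z is used
sigma = 0.0050
z_val = stats.norm.ppf(0.975)
print(f"(b) {x_bar:.3f} +/- {z_val * sigma / np.sqrt(N):.3f} % C2H5OH")  # about 0.084 +/- 0.006
```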
EXAMPLE 3
 μ0 is the literature value of 129 kJ/mol, so the null hypothesis is μ = 129 kJ/mol.
 The alternative hypothesis is that μ ≠ 129 kJ/mol; this is a two-tailed test.
 From Table 1, zcrit = 1.96 for the 95% confidence level, and zcrit = 2.58 for the 99% confidence level.
 The test statistic is calculated as:
   z = (x̄ − μ0)/(σ/√N) = (116 − 129)/(22/√30) = −3.24
 Since z ≤ −1.96, we reject the null hypothesis at the 95% confidence level.
 Since z ≤ −2.58, we also reject H0 at the 99% confidence level.
 To estimate the probability of obtaining a mean value of 116 kJ/mol, we need the probability of obtaining a z value this far from zero.
 From Table 1, the probability of obtaining a z value this large because of random error is only about 0.1%.
 All of these results lead us to conclude that the student mean is actually different from the literature value and not just the result of random error.
EXAMPLE 4
 The null hypothesis is H0: μ = 0.123% S; the alternative hypothesis is Ha: μ ≠ 0.123% S.
 Σxi = 0.112 + 0.118 + 0.115 + 0.119 = 0.464
 x̄ = 0.464/4 = 0.116% S
 Σxi² = 0.012544 + 0.013924 + 0.013225 + 0.014161 = 0.053854
 s = √[(0.053854 − (0.464)²/4)/(4 − 1)] = √(0.000030/3) = 0.0032% S
 The test statistic can be calculated as:
   t = (x̄ − μ0)/(s/√N) = (0.116 − 0.123)/(0.0032/√4) = −4.375
 From Table 3, the critical value of t for 3 degrees of freedom and the 95% confidence level is 3.18.
 Since t ≤ −3.18, we conclude that there is a significant difference at the 95% confidence level and thus bias in the method.
 If we were to do this test at the 99% confidence level, tcrit = 5.84.
 Since t = −4.375 is greater than −5.84, we would accept the null hypothesis at the 99% confidence level and conclude that there is no difference between the experimental and the accepted values.
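The same conclusion can be checked with SciPy's one-sample t test; because SciPy carries full precision for s, the statistic comes out near −4.4 rather than the rounded −4.375 above:

```python
import numpy as np
from scipy import stats

results = np.array([0.112, 0.118, 0.115, 0.119])   # % S
mu0 = 0.123

t_stat, p_value = stats.ttest_1samp(results, mu0)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")               # t is about -4.43

# Two-tailed critical values for 3 degrees of freedom
print("t_crit (95%):", round(stats.t.ppf(0.975, df=3), 2))  # about 3.18
print("t_crit (99%):", round(stats.t.ppf(0.995, df=3), 2))  # about 5.84
```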
EXAMPLE 5
 If μd is the true average difference between the methods, we want to test the null hypothesis H0: μd = 0 against the alternative hypothesis Ha: μd ≠ 0.
 The t-test statistic is:
   t = (d̄ − 0)/(sd/√N)
 N = 6, Σdi = 16 + 9 + 25 + 5 + 22 + 11 = 88, Σdi² = 1592, d̄ = 88/6 = 14.67
 The standard deviation of the differences:
   sd = √[(1592 − (88)²/6)/(6 − 1)] = 7.76
 The t statistic:
   t = 14.67/(7.76/√6) = 4.628
 The critical value of t is 2.57 for the 95% confidence level and 5 degrees of freedom.
 Since t > tcrit, we reject the null hypothesis and conclude that the two methods give different results.
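A short check of the paired calculation using the six differences listed above:

```python
import numpy as np
from scipy import stats

d = np.array([16, 9, 25, 5, 22, 11], dtype=float)   # per-patient differences
N = len(d)

t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(N))
t_crit = stats.t.ppf(0.975, df=N - 1)
print(f"t = {t_stat:.3f}, t_crit = {t_crit:.2f}")    # about 4.628 vs 2.57
```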
EXAMPLE 6
 Null hypothesis: H0: σstd² = σ1², where σstd² is the variance of the standard method and σ1² is the variance of the modified method.
 Alternative hypothesis: Ha: σ1² < σstd²
 Because an improvement is claimed, the variances of the modifications are placed in the denominator.
 For the 1st modification: F1 = sstd²/s1² = (0.21)²/(0.15)² = 1.96
 For the 2nd modification: F2 = (0.21)²/(0.12)² = 3.06
 For the standard procedure, sstd is a good estimate of σstd, and the number of degrees of freedom for the numerator can be taken as infinite, so Fcrit = 2.30.
 Since F1 < 2.30, we cannot reject the null hypothesis and conclude that there is no significant improvement in precision with the first modification.
 Since F2 > 2.30, we reject the null hypothesis and conclude that the second modification does appear to give better precision at the 95% confidence level.
 Is the precision of the 2nd modification significantly better than that of the 1st?
   F = s1²/s2² = (0.15)²/(0.12)² = 1.56
 In this case, Fcrit = 2.69. Since F < 2.69, we must accept H0 and conclude that the two modifications give equivalent precision.
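These F values and critical values can be checked with SciPy; a very large numerator degrees of freedom stands in for the "infinite" degrees of freedom of the standard method:

```python
from scipy import stats

s_std, s1, s2 = 0.21, 0.15, 0.12   # ppm CO

F1 = s_std**2 / s1**2              # 1.96
F2 = s_std**2 / s2**2              # about 3.06
F12 = s1**2 / s2**2                # about 1.56
print(f"F1 = {F1:.2f}, F2 = {F2:.2f}, F(mod 1 vs mod 2) = {F12:.2f}")

# Critical values of F at the 0.05 significance level
print(round(stats.f.ppf(0.95, 10**6, 12), 2))   # about 2.30 (infinite numerator df)
print(round(stats.f.ppf(0.95, 12, 12), 2))      # about 2.69
```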
EXAMPLE 7
 We obtain the mean and standard deviation for each analyst.
 The mean for analyst 1 is:
   x̄1 = (10.3 + 9.8 + 11.4)/3 = 10.5 mmol Ca
 The remaining means are obtained in the same manner, the results are summarized, and the grand mean is found.
 From the F table, the critical value of F at the 95% confidence level for 4 and 10 degrees of freedom is 3.48.
 Since the calculated F exceeds 3.48, we reject H0 at the 95% confidence level and conclude that there is a significant difference among the analysts.