Version 2012 Updated on 030212 Copyright © All rights reserved
Dong-Sun Lee, Prof., Ph.D. Chemistry, Seoul Women’s University
Chapter 6
Random Errors in Chemical Analyses
Definition of Statistics
1. A collection of data or numbers.
2. The logic, based on mathematics, of collecting, analyzing, and interpreting data for the purpose of making decisions.
Red blood cells (erythrocytes, Er) tangled in
fibrin threads (Fi) in a blood clot. Stacks of
erythrocytes in a clot are called a rouleaux
formation (Ro).
RBC count (normal): 5.0 × 10⁶ cells/µL
A glucose analyzer.
Normal value : 70 ~ 120 mg/dL
Three-dimensional plot showing absolute error in Kjeldahl nitrogen determination
for four different analysts. The results of analyst 1 are both precise and accurate.
Frequency distribution (probability) of data
In the results of a series of independent trials, random errors often occur in fairly regular and predictable patterns. These patterns can be expressed mathematically, and the most important are the normal, binomial, and Poisson distributions.
The normal curve was developed mathematically in 1733 by DeMoivre as an approximation to the binomial distribution. His paper was not rediscovered until 1924, by Karl Pearson. Laplace used the normal curve in 1783 to describe the distribution of errors. Subsequently, Gauss used the normal curve to analyze astronomical data in 1809. The normal curve is often called the Gaussian distribution; in everyday usage it is known as the bell-shaped curve.
http://www.stat.wvu.edu/SRS/Modules/Normal/normal.html
http://www.ms.uky.edu/~mai/java/stat/GaltonMachine.html
1) Gaussian normal distribution
General equation :
f(x) = (k/s) exp[ −(1/2)((x − m)/s)² ]
For the histogram shown, k = 4033.7, s = 16.0, m = 50.5.
[Figure: histogram of Gaussian-distributed data with the fitted normal curve; frequency (0 to 250) versus x (10 to 100).]
Histogram of Gaussian normal distribution.
The histogram above illustrates well the concept of the normal curve. Note the symmetry of
the graph. Data values at the low and high extremes occur infrequently. As data values move
toward the mean, the frequency increases. Note the vertical bar in the middle of the histogram.
That interval represents the mode. Not so apparent is the fact that it also represents the median
and mean.
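As a quick numeric check, the following Python sketch (not part of the original slides) evaluates the fitted function given above using the parameters reconstructed from the figure; the peak height k/s of roughly 252 is consistent with the histogram's frequency axis.

```python
import math

def gaussian(x, k, s, m):
    """Gaussian curve in the form used above: f(x) = (k/s) * exp(-0.5*((x - m)/s)**2)."""
    return (k / s) * math.exp(-0.5 * ((x - m) / s) ** 2)

# Fit parameters read from the histogram above (reconstructed values).
k, s, m = 4033.7, 16.0, 50.5
print(gaussian(50.5, k, s, m))   # peak height at x = m, ~252
print(gaussian(66.5, k, s, m))   # one standard deviation above the mean, ~0.61 of the peak
```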
Characteristics of Gaussian normal curve
1> Bell-shaped curve: the normal curve is symmetric around the mean, with only one mode.
2> Maximum frequency at an indeterminate error of zero. A normal distribution with µ = 0 and s² = 1 is called a standard normal distribution.
3> Few very large errors.
4> Symmetric, with equal numbers of positive and negative errors.
5> Exponential decrease in frequency as the magnitude of the error increases.
6> This implies that the mean, mode, and median are all the same for the normal distribution.
7> Approximately 2/3 of the probability density is within 1 standard deviation of the mean (µ ± s).
8> Approximately 95% of the probability density is within 2 standard deviations of the mean (µ ± 2s).
http://www.bcu.ubc.ca/~whitlock/bio300/LectureNotes/Distributions/Distributions.html
Area under the Curve and Z-score
The area under the normal curve indicates the
proportion of observations obtaining a score
equivalent to the z-score on the x-axis. For example,
34% of the sample scored between the mean (z=0)
and one standard deviation above the mean (z=1).
The characteristics of the normal curve make it
useful to calculate z-scores, an index of the distance
from the mean in units of standard deviations.
z-score = (score – mean) / standard deviation = (x – µ) / s
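The area between two z values can be computed from the error function in the standard library; a minimal illustrative sketch that reproduces the 34% figure quoted above and the roughly 95% within µ ± 2s:

```python
import math

def z_score(x, mean, s):
    """z = (x - mean) / s: distance from the mean in units of standard deviations."""
    return (x - mean) / s

def cumulative_area(z):
    """Area under the standard normal curve from -infinity to z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(z_score(66.5, 50.5, 16.0))                      # 1.0: one standard deviation above the mean
# Fraction of observations between the mean (z = 0) and one standard deviation above it:
print(cumulative_area(1.0) - cumulative_area(0.0))    # ~0.3413, the 34% quoted above
# Fraction within +/- 2 standard deviations of the mean:
print(cumulative_area(2.0) - cumulative_area(-2.0))   # ~0.954
```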
Gaussian normal distribution curve:
µ ± 1s = 68.26%, µ ± 2s = 95.44%,
µ ± 3s = 99.74%, µ ± 4s ≈ 100%
http://research.med.umkc.edu/tlwbiostats/curve10.html
Normal error curves.
(a) The abscissa is the deviation from the mean in the units of measurement.
(b) The abscissa is the deviation from the mean in units of s.
Bar graph and Gaussian curve describing
the lifetime of hypothetical set of electric
light bulbs.
Gaussian curves for two sets of light bulbs,
one having a standard deviation half as
great as the other. The number of bulbs
described by each curve is the same.
An experiment that produces a small
standard deviation is more precise than one
that produces a large standard deviation.
A Gaussian curve in which µ = 0 and s = 1. A Gaussian curve whose area is unity is called a normal error curve. In this case, the abscissa, x, is equal to z, defined as z = (x – µ) / s.
This sample mean is an estimate of µ, the actual mean of the population. The mean gives the center of the distribution.
The population standard deviation s, which is a measure of the precision of a population of data, is given by the equation
s = √[ Σ(xi − µ)² / N ]
where N is the number of data points making up the population.
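A minimal Python sketch of the population standard deviation formula above; the data values are hypothetical and only illustrate the calculation.

```python
import math

def population_std(data):
    """Population standard deviation: s = sqrt(sum((xi - mu)**2) / N)."""
    n = len(data)
    mu = sum(data) / n                                   # population mean
    return math.sqrt(sum((x - mu) ** 2 for x in data) / n)

# Hypothetical replicate measurements, for illustration only.
values = [50.1, 49.8, 50.4, 50.0, 49.7]
print(population_std(values))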
Area from 900 to 1000
= (area from –∞ to 1000) – (area from –∞ to 900)
= 0.949841 – 0.719629
= 0.230211
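The same subtraction of cumulative areas can be done numerically. The bulb-lifetime mean and standard deviation are not reproduced in this excerpt, so the values below (mean = 845.2 h, s = 94.2 h) are assumptions chosen to be consistent with the quoted areas.

```python
import math

def cumulative_area(x, mean, s):
    """Area under a Gaussian curve from -infinity to x."""
    z = (x - mean) / s
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed bulb-lifetime parameters (not given in this excerpt).
mean, s = 845.2, 94.2
area = cumulative_area(1000, mean, s) - cumulative_area(900, mean, s)
print(area)   # ~0.23: the fraction of bulbs expected to last between 900 and 1000 h
```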
s_pooled = √{ [ Σ(xi − x̄1)² + Σ(xj − x̄2)² + Σ(xk − x̄3)² + … ] / (n1 + n2 + n3 + … − Nt) }
The term Nt is the total number of data sets that are pooled.
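A sketch of the pooled standard deviation, assuming each data set is a list of replicate measurements sharing a common precision; the three data sets are hypothetical.

```python
import math

def pooled_std(*data_sets):
    """Pooled standard deviation of several data sets, per the formula above."""
    total_ss = 0.0      # sum of squared deviations from each set's own mean
    total_n = 0         # n1 + n2 + n3 + ...
    for data in data_sets:
        mean = sum(data) / len(data)
        total_ss += sum((x - mean) ** 2 for x in data)
        total_n += len(data)
    nt = len(data_sets)  # Nt: the number of data sets pooled
    return math.sqrt(total_ss / (total_n - nt))

# Hypothetical replicate sets from three runs of the same analysis, for illustration only.
print(pooled_std([10.1, 10.3, 10.2], [9.9, 10.0, 10.1, 10.0], [10.4, 10.2]))
```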
Variance
The variance is the square of the standard deviation.
V = s²
Coefficient of variation: a measure of precision
Relative standard deviation: RSD = s / x̄
Coefficient of variation: CV (%) = (s / x̄) × 100%
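A small illustrative sketch combining these definitions for a set of replicates, using the sample standard deviation (n − 1 in the denominator); the numbers are hypothetical.

```python
import math

def sample_stats(data):
    """Sample mean, standard deviation, variance, RSD, and CV (%) for a data set."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample standard deviation
    variance = s ** 2          # V = s^2
    rsd = s / mean             # relative standard deviation
    cv = rsd * 100.0           # coefficient of variation, in percent
    return mean, s, variance, rsd, cv

print(sample_stats([50.1, 49.8, 50.4, 50.0, 49.7]))   # hypothetical replicates
```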
Propagation of uncertainty
1) Addition and subtraction
  1.76 (±0.03)  → e1
+ 1.89 (±0.02)  → e2
− 0.59 (±0.03)  → e3
  3.06 (±e4)
e4 = √(e1² + e2² + e3²)
   = √[(0.03)² + (0.02)² + (0.03)²]
   = 0.041
Percent relative error = 0.041 × 100 / 3.06 = 1.3 %
Absolute error: 3.06 ± 0.04
Relative error: 3.06 (±1%)
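The same propagation for addition and subtraction can be checked in a few lines of Python (illustrative only).

```python
import math

def added_uncertainty(*errors):
    """Absolute uncertainty of a sum or difference: e = sqrt(e1^2 + e2^2 + ...)."""
    return math.sqrt(sum(e ** 2 for e in errors))

result = 1.76 + 1.89 - 0.59                       # 3.06
e4 = added_uncertainty(0.03, 0.02, 0.03)          # ~0.041
percent_relative = 100.0 * e4 / result            # ~1.3 %
print(result, e4, percent_relative)
```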
2) Multiplication and division; error of the product or quotient
%e4 = √[(%e1)² + (%e2)² + (%e3)²]
Ex. {1.76 (±0.03) × 1.89 (±0.02)} ÷ {0.59 (±0.02)} = 5.64?
1> Absolute error → % relative error
{1.76 (±1.7%) × 1.89 (±1.1%)} ÷ {0.59 (±3.4%)} = 5.64?
2> %e4 = √[(%e1)² + (%e2)² + (%e3)²]
       = √[(1.7)² + (1.1)² + (3.4)²]
       = 4.0 %
Result: 5.6 (±4%)
3> % relative error → absolute error
4.0 % × 5.64 = 0.23
Result: 5.6 (±0.2)
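An illustrative Python check of the relative-error rule for products and quotients, reproducing the numbers above.

```python
import math

def percent_relative_error(value, absolute_error):
    return 100.0 * absolute_error / value

def combined_percent_error(*percent_errors):
    """Relative uncertainty of a product or quotient: %e = sqrt(%e1^2 + %e2^2 + ...)."""
    return math.sqrt(sum(p ** 2 for p in percent_errors))

result = 1.76 * 1.89 / 0.59                       # ~5.64
pe = combined_percent_error(
    percent_relative_error(1.76, 0.03),           # ~1.7 %
    percent_relative_error(1.89, 0.02),           # ~1.1 %
    percent_relative_error(0.59, 0.02),           # ~3.4 %
)
print(result, pe, result * pe / 100.0)            # ~5.64, ~3.9 %, ~0.22 (0.23 with % error rounded to 4.0 %)
```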
3) Mixed operations
{1.76 (±0.03) − 0.59 (±0.02)} ÷ {1.89 (±0.02)} = 0.6190?
1> {1.76 (±0.03) − 0.59 (±0.02)} = 1.17 (±0.036)
2> Absolute error → % relative error
1.17 (±0.036) ÷ 1.89 (±0.02) = 1.17 (±3.1%) ÷ 1.89 (±1.1%)
= 0.6190 (±3.3%)
Result: 0.62 (±3%)
3> 0.6190 × 3.3% = 0.020
Result: 0.62 (±0.02)
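A short sketch of the two-step mixed-operation calculation above (illustrative only): absolute errors are combined for the subtraction, then percent relative errors are combined for the division.

```python
import math

# Step 1: the subtraction 1.76(+/-0.03) - 0.59(+/-0.02); absolute errors add in quadrature.
diff = 1.76 - 0.59                                   # 1.17
e_diff = math.sqrt(0.03 ** 2 + 0.02 ** 2)            # ~0.036

# Step 2: the division by 1.89(+/-0.02); percent relative errors add in quadrature.
result = diff / 1.89                                 # ~0.6190
pe = math.sqrt((100 * e_diff / diff) ** 2 + (100 * 0.02 / 1.89) ** 2)  # ~3.3 %

print(result, pe, result * pe / 100.0)               # ~0.62, ~3.3 %, ~0.02
```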
Significant figures
The number of significant figures is the number of digits
needed to write a given value in scientific notation without
loss of accuracy.
8.25 × 10⁴ → 3 significant figures
8.250 × 10⁴ → 4
8.2500 × 10⁴ → 5
0.801 → 3
0.0801 → 3
0.8010 → 4
Rules for determining the number of significant figures (applied in the sketch after this list):
1> Discard all initial zeros
2> Disregard all final zeros unless they follow a decimal point
3> All remaining digits, including zeros between nonzero digits, are significant
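A minimal Python sketch that applies the three rules above to count significant figures; the function name is arbitrary, and values are passed as strings so trailing zeros are preserved (a rough illustration, not a general-purpose routine).

```python
def significant_figures(value: str) -> int:
    """Count significant figures using the rules above (input as a string, e.g. '0.8010')."""
    digits = value.lstrip('+-')
    if 'e' in digits.lower():                      # scientific notation: only the coefficient counts
        digits = digits.lower().split('e')[0]
    has_decimal = '.' in digits
    digits = digits.replace('.', '').lstrip('0')   # rule 1: discard all initial zeros
    if not has_decimal:
        digits = digits.rstrip('0')                # rule 2: final zeros count only after a decimal point
    return len(digits)                             # rule 3: everything remaining is significant

for s in ('8.25e4', '8.250e4', '8.2500e4', '0.801', '0.0801', '0.8010'):
    print(s, significant_figures(s))               # 3, 4, 5, 3, 3, 4
```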
Logarithms and antilogarithms
a = 10^b
log a = b
a = antilogarithm of b
log 339 = log(3.39 × 10²) = 2.530
The 2 in 10² is the exponent; in the logarithm 2.530, the characteristic is 2 and the mantissa is 0.530. Since 3.39 has 3 digits, the mantissa is written with 3 digits.
10^2.531 = 340 (339.6)
10^2.530 = 339 (338.8)
10^2.529 = 338 (338.1)
antilog(–3.42) = 10^–3.42 = 3.8 × 10⁻⁴
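A quick numeric check of the logarithm and antilogarithm examples above, using Python's math module (illustrative only).

```python
import math

x = 339
log_x = math.log10(x)                          # the mantissa keeps 3 digits because 3.39 has 3 digits
print(round(log_x, 3))                         # ~2.53

# Nearby mantissas map to different 3-digit results:
print(10 ** 2.531, 10 ** 2.530, 10 ** 2.529)   # ~339.6, ~338.8, ~338.1

# Antilogarithm example from the slide:
print(10 ** -3.42)                             # ~3.8e-04
```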
The scale of Spectronic 20 spectrophotometer.
Graphs demonstrating choice of rulings in relation to significant figures in the data.
[Figure: pH (0.00 to 14.00) versus volume of 0.1 N NaOH added (0 to 24 mL).]
Experimental potentiometric titration curve: 0.1 N NaOH vs 0.0686 N PHP(25ml)
Example of a graph intended to show the qualitative behavior of the function y = e^(–x/6) cos x.
Calibration curve for a 50 mL buret.
Summary
Normal distribution curve
Gauss curve
z - score
Variance
Coefficient of variation
Significant figures
Q & A
Thanks.
Dong-Sun Lee / Analytical Chemistry Laboratory (CAT) / SWU.