Statistical Analysis
Significant Figures
The number of significant figures is the minimum number of digits needed to write a given value in scientific notation without loss of accuracy.
142.7 (4 sig. figs) = 1.427 × 10²
9.25 × 10⁴ (3 sig. figs)
9.250 × 10⁴ (4 sig. figs)
9.2500 × 10⁴ (5 sig. figs)
Zeros are significant when they occur
1) in the middle of a number;
2) at the end of a number, to the right of the decimal point.
Examples: 106 (3 sig. figs), 0.0106 (3 sig. figs), 0.01060 (4 sig. figs)
Estimates (the last digit read from a scale is itself an estimate):
Absorbance: 0.234 ± 1 in the last digit, i.e. 0.233 to 0.235
Transmittance: 58.3 ± 2 in the last digit, i.e. 58.1 to 58.5
Significant figures in arithmetic

  5.345            1.362 × 10⁻⁴
+ 6.728          + 3.111 × 10⁻⁴
-------          --------------
 12.073            4.473 × 10⁻⁴
In addition and subtraction the number of significant figures in the answer may exceed or be less than that in the original data.
In multiplication and division the number of significant figures is limited to the number of digits contained in the number with the fewest significant figures.
34.60 ÷ 2.46287 = 14.05
4.3179 × 10¹² × 3.6 × 10⁻¹⁹ = 1.6 × 10⁻⁶
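The two worked examples above can be checked with a short script; `round_sig` is a hypothetical helper written for this sketch, not a standard-library function.

```python
import math

def round_sig(value, n):
    """Round value to n significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, n - 1 - exponent)

# 34.60 / 2.46287: the answer keeps 4 sig. figs (the fewer of 4 and 6)
print(round_sig(34.60 / 2.46287, 4))        # 14.05
# 4.3179e12 * 3.6e-19: limited to 2 sig. figs by 3.6e-19
print(round_sig(4.3179e12 * 3.6e-19, 2))    # 1.6e-06
```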
Logarithms
n = 10ᵃ means that log n = a (the number n is the antilogarithm of a).
339 = 3.39 × 10², log 339 = 2.530 (2: characteristic, 530: mantissa)
The number of digits in the mantissa should equal the total number of significant figures.
Consequently, when we convert a logarithm to an antilogarithm, the number of sig. figs in the antilogarithm should equal the number of digits in the mantissa.
antilog(−3.42) = 10^(−3.42) = 3.8 × 10⁻⁴
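Both numerical examples above can be reproduced directly; the rounding to 3 digits in the mantissa and 2 sig. figs in the antilogarithm follows the rule just stated.

```python
import math

# log 339: the mantissa gets 3 digits because 339 has 3 sig. figs
log_n = math.log10(3.39e2)
print(round(log_n, 3))      # 2.53 (characteristic 2, mantissa .530)

# antilog(-3.42): the mantissa has 2 digits, so keep 2 sig. figs
antilog = 10 ** -3.42
print(antilog)              # ~3.8e-4 before rounding to sig. figs
```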
Experimental Error
Every measurement has some uncertainty ⇒ experimental error
Systematic (determinate) error: arises from a flaw in equipment or the design of
the experiment
Random (indeterminate) error: arises from the effects of uncontrolled (and maybe uncontrollable) variables in the measurement. It can be positive or negative and is always present (electrical noise, human readings, …). It cannot be eliminated, but it can be reduced.
Precision: describes the reproducibility of the results
Accuracy: describes how close a measured value is to the “true value”.
Absolute Uncertainty: expresses the margin of uncertainty associated with a
measurement. Buret: ± 0.02 mL
Relative Uncertainty: compares the size of the absolute uncertainty to the size of
its associated measurement.
{absolute uncertainty ÷ magnitude of measurement}
Example: 12.35 ± 0.02 → 0.02 / 12.35 = 0.002
Percent Relative Uncertainty: Relative Uncertainty × 100
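A minimal sketch of the buret example above, using the same reading and absolute uncertainty:

```python
# Buret reading 12.35 +/- 0.02 mL (values from the example above)
measurement = 12.35
absolute_uncertainty = 0.02

relative = absolute_uncertainty / measurement
percent = relative * 100
print(round(relative, 3))   # 0.002
print(round(percent, 1))    # 0.2  (percent relative uncertainty)
```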
Uncertainties in Data and Results
• Random Errors and Precision
We assume that the numerical result to which our discussion applies is obtained
with an instrument that measures a physical quantity for which the a priori range of
all physical values constitutes a continuum.
Average or mean of N measurements of an experimental variable x:
x̄ = (1/N) Σᵢ₌₁ᴺ xᵢ
Range of measured values: R = x_largest − x_smallest - not a clear measure of the precision.
Average deviation: ave. dev = (1/N) Σᵢ₌₁ᴺ |xᵢ − x̄| - of declining value in the age of computers.
A measure of precision unbiased by sample size is the variance, S²:
S² = (1/(N − 1)) Σᵢ₌₁ᴺ (xᵢ − x̄)² - Additive property! (= Estimates of random error from variable sources may be combined.) The divisor N − 1 is known as the degrees of freedom.
The number of degrees of freedom is equal to the number of independent data on which the calculation of the variance is based.
Alternative form of the variance equation:
S² = (1/(N − 1)) (Σᵢ₌₁ᴺ xᵢ² − N x̄²) - Useful for calculators and computers.
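A quick numerical check that the two variance formulas agree; the data values are illustrative, not from the notes.

```python
# Check that the two variance formulas give the same result
data = [10.2, 10.8, 11.6, 9.9, 9.4, 10.0]   # illustrative values
N = len(data)
mean = sum(data) / N

# Definitional form: S^2 = (1/(N-1)) * sum((x_i - mean)^2)
s2_def = sum((x - mean) ** 2 for x in data) / (N - 1)

# Computational form: S^2 = (1/(N-1)) * (sum(x_i^2) - N * mean^2)
s2_alt = (sum(x ** 2 for x in data) - N * mean ** 2) / (N - 1)

print(s2_def, s2_alt)   # the two agree
```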
The square root of the variance is called the estimated standard deviation:
S = [ (1/(N − 1)) Σᵢ₌₁ᴺ (xᵢ − x̄)² ]^(1/2)
- indicates the precision of individual measurements.
The precision of the mean of the measurements is given by the estimated standard deviation of the mean of N values:
Sₘ = S/√N = [ (1/(N(N − 1))) Σᵢ₌₁ᴺ (xᵢ − x̄)² ]^(1/2)
The precision of the mean can be increased by increasing the number of individual
measurements!
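The quantities S and Sₘ = S/√N can be sketched with the standard library; `statistics.stdev` uses the N − 1 divisor, matching the definition above. The data values are illustrative.

```python
import math
import statistics

data = [10.2, 10.8, 11.6, 9.9, 9.4, 10.0, 10.6]   # illustrative values
N = len(data)

s = statistics.stdev(data)     # estimated standard deviation (N - 1 divisor)
s_m = s / math.sqrt(N)         # estimated standard deviation of the mean

print(statistics.mean(data), s, s_m)   # s_m < s: the mean is more precise
```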
Rejection of Discordant Data
Grubbs Test for an Outlier
To determine whether a particular data point can be excluded on the grounds of its questionable veracity, we compute the Grubbs statistic G, defined as:
G_calculated = |questionable value − x̄| / s
If G_calculated < G_table, the questionable point should be retained.
Example: 10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.00, 9.2, 11.3,
9.5, 10.6, 11.6
The value of 7.8 appears out of line.
We get s = 1.11 and x̄ = 10.16.
G_calculated = 2.13, and on comparison with G_table = 2.285 (for 12 observations), the questionable value should be retained.
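The example above can be reproduced directly; at full precision G comes out to about 2.12 (the notes' 2.13 results from using the rounded s and x̄), and the conclusion is the same.

```python
import statistics

data = [10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2, 11.3, 9.5, 10.6, 11.6]
mean = statistics.mean(data)
s = statistics.stdev(data)

questionable = 7.8
g_calc = abs(questionable - mean) / s
print(round(g_calc, 2))   # ~2.12 at full precision

G_TABLE_12 = 2.285        # critical value for 12 observations (from the notes)
print("retain" if g_calc < G_TABLE_12 else "reject")
```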
Statistical Treatment of Random Errors
Error Frequency Distribution
For a physical quantity x we obtain a large number of measurements xᵢ (i = 1, 2, 3, …, N), which are subject to random errors εᵢ. For simplicity we assume that the true value x₀ is known, so the errors are known also. Therefore, we are concerned with the frequency of occurrence, n_ε, of errors of size ε.
The graph on the left represents the actual error frequency distribution n_ε for 376 measurements; the estimated normal error probability function P(ε) is given by the dashed curve. Estimated values of the standard deviation σ and the 95% confidence limit Δ are indicated in relation to the normal error curve. The width w is chosen as a compromise between the desirability of having the numbers in each bar as large as possible and the desirability of having the number of bars as large as possible.
P(ε) is normalized, so that ∫_{−∞}^{+∞} P(ε) dε = 1.
The significance of normalization is that the probability that a single measurement will be in error by an amount lying in the range between ε and ε + dε is equal to P(ε)dε. A probability function derived in this way is approximate. It can be assumed that the probability function is represented by a Gaussian distribution, which is called the normal error probability function:
P(ε) = (1/(√(2π) σ)) e^(−ε²/2σ²)
where the standard deviation σ is a parameter which characterizes the width of the distribution. It is the root-mean-square error expected with this probability function.
σ ≡ ⟨ε²⟩^(1/2) = [ (1/(√(2π) σ)) ∫_{−∞}^{+∞} ε² e^(−ε²/2σ²) dε ]^(1/2)
If the true value is x₀ and the errors εᵢ are known, then σ can be estimated by
σ = [ (1/N) Σᵢ₌₁ᴺ εᵢ² ]^(1/2)
The dashed curve represents a normal error probability function with this value of σ for the 376 measurements.
All the assumptions made are required for the validity of the central limit theorem.
Infinitely Large Sample
So far our discussion has dealt with the errors themselves. In real circumstances we do not know the errors εᵢ by which each measurement deviates from the true value x₀. We know only the deviations from a mean value, (xᵢ − x̄). If the errors are only random, then the mean value is the best estimate of the true value.
If we can make a very large (theoretically infinite) number of measurements then
we can determine the true mean µ exactly and the spread of the data points about
this mean would indicate the precision of the observation. The probability function
for the deviation will be
P(x − µ) = (1/(√(2π) σ)) exp[ −(x − µ)²/(2σ²) ], where σ = lim_{N→∞} [ (1/N) Σᵢ₌₁ᴺ (xᵢ − µ)² ]^(1/2)
In the absence of systematic errors µ should be equal to x₀.
The normal probability distribution function is used to establish the probability P that an error is less than a certain magnitude δ, or to establish the limiting width of the range −δ to +δ within which the integrated probability P has a certain value.
P = (1/(√(2π) σ)) ∫_{−δ}^{+δ} e^(−ε²/2σ²) dε
If δ = σ, then P = 0.6826. This means that 68.26% of all errors are less than the standard deviation in magnitude.
If P = 0.95, then δ₀.₉₅ = 1.96σ ≈ 2σ.
The value of P is given by the shaded area; the 95% confidence limit is shown in the right-hand graph. For σ to be known satisfactorily, N should be at least 20.
Correspondence between uncertainty value and confidence level

Uncertainty           ±σ      ±1.64σ   ±1.96σ   ±2.58σ   ±3.29σ
Confidence Level (%)  68.26   90       95       99       99.9
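The table entries can be reproduced from the Gaussian integral above: for δ = kσ, the integrated probability is P = erf(k/√2), which the standard library provides.

```python
import math

def confidence(k):
    """Integrated normal probability that |error| < k*sigma."""
    return math.erf(k / math.sqrt(2))

# Reproduce the table row by row (percent confidence for each multiple of sigma)
for k in (1.0, 1.64, 1.96, 2.58, 3.29):
    print(k, round(100 * confidence(k), 2))
```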
Large Finite Sample - Uncertainty in mean value
The error in the mean, εₘ, of N observations is the mean of the individual errors εᵢ:
εₘ = (1/N) Σᵢ₌₁ᴺ εᵢ
The estimated standard deviation Sₘ of the mean is given by:
Sₘ² = εₘ² = (1/N²) [ Σᵢ₌₁ᴺ εᵢ² + Σᵢ≠ⱼ εᵢεⱼ ]
For independent errors the cross terms εᵢεⱼ average to zero, so Sₘ² = S²/N, i.e. Sₘ = S/√N.
The meaning of the above equation is that the mean of a group of N independent
measurements of equal weight has a higher precision than any single of these
measurements.
For a sample of N ≥ 20 there is a 68.26% probability that the true value lies between (x̄ − Sₘ) and (x̄ + Sₘ).
We can determine a 95% confidence limit in the mean, denoted as Δ:
Δ = δₘ,₀.₉₅ = 1.96 S/√N ≅ 2S/√N
If N < 20, then the Student t distribution must be used.
The joint probability is given by
Pₘ(x̄) = (1/(√(2π) σₘ)) exp[ −(x̄ − µ)²/(2σₘ²) ]
In the case where σ is known, σₘ = σ/√N.
Small Samples (1 < N < 20) – Student t distribution function
Student t distribution functions P(τ) for ν = 1, 3, 5, …, ∞ (degrees of freedom). The quantities actually plotted are
P(τ)/k_norm = [1 + τ²/(N − 1)]^(−N/2) = [1 + τ²/ν]^(−(ν+1)/2), where N − 1 = ν and
τ ≡ (x̄ − µ)/Sₘ = (x̄ − µ)/(S/√N)
The curve for ν = ∞ is the normal error curve. The short vertical bars mark the
95% confidence level.
The t distribution curve can be used in the same way as the normal distribution
curve.
Suppose we seek to find the values of τ over which the integral of the Student probability function is a fraction P. Then we calculate the following integral:
∫_{−t}^{+t} P(τ) dτ = P
We define the limit of error δ as the value of (x̄ − µ) that corresponds to the limit of integration t:
δ = t·Sₘ = tS/√N, and for the 95% confidence limit, Δ = t₀.₉₅ Sₘ = t₀.₉₅ S/√N
The table below shows "critical" values of t for a given number of degrees of freedom ν and a given P.
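As a sketch of this formula, a 95% confidence limit can be attached to the mean of the Grubbs-test data used earlier (N = 12, ν = 11). The critical value t₀.₉₅ = 2.201 is taken from a standard two-tailed t table, since the notes' own table did not survive extraction.

```python
import math
import statistics

data = [10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2, 11.3, 9.5, 10.6, 11.6]
N = len(data)
s = statistics.stdev(data)

T_95 = 2.201   # two-tailed 95% critical t for nu = N - 1 = 11 (standard table)
delta = T_95 * s / math.sqrt(N)
print(f"{statistics.mean(data):.2f} +/- {delta:.2f}")
```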
Propagation of Errors (random/systematic)
If one determines a quantity F(x, y, z, …), where x, y, z, … are measured values with uncertainties Δ(x), Δ(y), Δ(z), …, then the error in F is given by
Δ²(F) = (∂F/∂x)² Δ²(x) + (∂F/∂y)² Δ²(y) + (∂F/∂z)² Δ²(z) + …
In certain cases the propagation of errors can be carried out very simply:
a) For F = ax ± by ± cz: Δ²(F) = a²Δ²(x) + b²Δ²(y) + c²Δ²(z)
b) For F = axyz (or axy/z, or ax/yz, or a/xyz):
Δ²(F)/F² = Δ²(x)/x² + Δ²(y)/y² + Δ²(z)/z²
c) For F = axⁿ: Δ²(F)/F² = n² Δ²(x)/x² → Δ(F)/F = n Δ(x)/x
d) For F = a·eˣ: Δ²(F) = a²e²ˣ Δ²(x) → Δ(F)/F = Δ(x)
e) For F = a ln x: Δ²(F) = (a²/x²) Δ²(x) → Δ(F) = a Δ(x)/x
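Rule (b) can be checked numerically against the general partial-derivative formula; the values and uncertainties below are illustrative.

```python
import math

# F = a*x*y/z with illustrative values and uncertainties
a, x, y, z = 1.0, 2.0, 3.0, 4.0
dx, dy, dz = 0.1, 0.2, 0.05
F = a * x * y / z

# General formula: quadrature sum of (dF/dv) * delta(v) over each variable
dF_general = math.sqrt((a * y / z * dx) ** 2 +
                       (a * x / z * dy) ** 2 +
                       (a * x * y / z ** 2 * dz) ** 2)

# Rule (b): relative errors add in quadrature
dF_rule_b = abs(F) * math.sqrt((dx / x) ** 2 + (dy / y) ** 2 + (dz / z) ** 2)

print(dF_general, dF_rule_b)   # the two agree
```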
Method of Least Squares
We use this method in order to draw the best straight line through experimental data points that have some scatter and do not lie perfectly on a straight line. The equation of the straight line is:
y = mx + b
The vertical deviation is dᵢ = yᵢ − y = yᵢ − (mxᵢ + b); some are + and some are −.
We square: dᵢ² = (yᵢ − y)² = (yᵢ − mxᵢ − b)²
Now we minimize the sum of the squares of all the deviations: SSE = Σᵢ dᵢ²
The values of m and b are found which minimize SSE:
(∂SSE/∂m)_b = 0 and (∂SSE/∂b)_m = 0
LINEST in Excel and Regression Analysis in SigmaPlot
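Setting the two partial derivatives to zero gives the usual closed-form slope and intercept, which can be sketched without any spreadsheet; the data below are illustrative points scattered about y = 2x + 1.

```python
def least_squares(xs, ys):
    """Closed-form slope m and intercept b minimizing
    SSE = sum((y_i - m*x_i - b)^2)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - m * sx) / n
    return m, b

# Illustrative data with slight scatter about y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1]
m, b = least_squares(xs, ys)
print(m, b)
```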