Information Analysis

Gaussian or Normal Distribution

[Figure: normal probability curve; x-axis X (0 to 350), y-axis Probability]

$s = \left[ \frac{\sum_i (x_i - \bar{x})^2}{n - 1} \right]^{1/2}$

$\mu$ = mean, estimated as $\bar{x}$
$\bar{x}$ = observed sample mean = $\sum x / n$
$\sigma$ = standard deviation, estimated as $s$
$n$ = sample size
$s$ = observed standard deviation
Area under curve = 1

Coefficient of Variation

As used in these slides, $C_v = \bar{x} / s$. (The coefficient of variation is often defined elsewhere as the reciprocal, $s / \bar{x}$.)

[Figure: two normal curves, both with mean 150; x-axis X (0 to 350), y-axis Probability]
First curve: $C_v$ = 150/20 = 7.5
Second curve: $C_v$ = 150/60 = 2.5

Example

100 kg of glass is recovered from municipal refuse and processed. The glass is crushed and sieved. Plot the cumulative distribution of particle size from the data below.

Sieve size (mm)      Result                                             Fraction retained
4                    10 kg remained on the sieve (90 kg went through)   10/100 = 0.10
3                    25 kg remained on the sieve                        25/100 = 0.25
2                    35 kg remained on the sieve                        35/100 = 0.35
1                    20 kg remained on the sieve                        20/100 = 0.20
<1 (pan, no holes)   10 kg went all the way through                     10/100 = 0.10

[Figure: bar chart of fraction retained versus sieve size (mm): 4, 3, 2, 1, pan]

Cumulative Distribution

Sieve size (mm)      Fraction smaller than sieve size
4                    1 - 0.1 = 0.9
3                    1 - (0.1 + 0.25) = 0.65
2                    1 - (0.35 + 0.35) = 0.3
1                    1 - (0.7 + 0.2) = 0.1

[Figure: fraction of particles smaller than the size indicated versus particle size (mm)]

Graphs

Independent variable: abscissa (x-axis). A variable is independent if its value is chosen, like sieve size in the previous example.
Dependent variable: ordinate (y-axis). A variable is dependent if its value is determined by experiment.

Probability Paper

The x-axis is linear. The y-axis is scaled so that if the distribution is normal (Gaussian), the cumulative probability plots as a straight line. If this is the case, the mean is at 0.5 (50%) and one standard deviation spans about 0.34 on either side of the mean. You can also calculate s by:

$s = \frac{2}{5}(x_{90} - x_{10})$

Example

Consider the recycled glass data from the previous example. What are the mean, the standard deviation, and the 95% interval?

The mean is the value on the x-axis where the y-axis value is 0.5: 2.4 mm.

The standard deviation is the spread around the mean such that 68% of the data fall within the range (about 34% on either side of the mean). 0.5 + 0.34 = 0.84, which corresponds to 3.5 mm, so s = 3.5 - 2.4 = 1.1 mm. Alternatively, s = 2/5 (3.9 - 1.0) = 1.16 mm.

The 95% interval is the range that contains 95% of the data, that is, between cumulative probabilities 0.025 and 0.975, or 0.2 mm and 4.8 mm.

Return Period

Return period is how often an event is expected to recur. If the annual probability of an event occurring is 5%, then the event can be expected to occur once every 20 years, that is, it has a return period of 20 years:

Return period = 1 / fractional probability

To determine return periods, first rank the time-variant data (smallest to largest or largest to smallest), then calculate the probabilities and plot the data.

Return Period Example

The data below are from a wastewater treatment plant. BOD (biochemical oxygen demand) is a measure of organic pollution in water. The BOD is measured daily. Do these data fit the normal distribution? Can they be used to calculate the mean and standard deviation? What is the worst quality expected in 30 days?

First, rank the data.

Now plot the data. We will plot m/n (which is the probability) versus the BOD.

The data fit the normal distribution fairly well. The mean is about 35 mg/L BOD.

To find the worst quality in a 30-day period, calculate: 29/30 = 0.967.
This is the fraction of days on which the quality is better than on the worst day out of 30. Enter the graph at 0.967 and read the answer: 67 mg/L BOD.

Sometimes data are analyzed after they are grouped. Often the mean is used to analyze the data.

Example: Using the data from the previous problem, estimate the highest BOD expected to occur once every 30 days using grouped data analysis.

First, define groups of BOD values.

Now plot these data. Notice how the data points form a curve. This means the data don't really fit the normal distribution, but we'll go ahead anyway.

Now P = 29/30 = 0.967, and we read 67 mg/L BOD from the graph.
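
The return-period arithmetic used in the two BOD examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this sketch, the BOD values in the usage lines are hypothetical (the plant data are not reproduced here), and reading a value from a fitted normal distribution stands in for reading the straight line on probability paper.

```python
from statistics import NormalDist


def non_exceedance_probability(return_period):
    """Fraction of periods in which the event is not exceeded: 1 - 1/T."""
    return 1.0 - 1.0 / return_period


def plotting_positions(values):
    """Rank values smallest to largest and pair each with m/n.

    m is the rank (1 = smallest) and n is the number of observations,
    matching the m/n probability used in the example above.
    """
    ranked = sorted(values)
    n = len(ranked)
    return [(m / n, x) for m, x in enumerate(ranked, start=1)]


def normal_value_at(probability, mean, std_dev):
    """Value below which the given fraction of a normal distribution lies.

    Assumption: the data really are normal, so this replaces reading
    the straight line on probability paper.
    """
    return NormalDist(mean, std_dev).inv_cdf(probability)


# Worst day in 30: 29 of 30 days are better.
p = non_exceedance_probability(30)
print(round(p, 3))  # 0.967

# Hypothetical daily BOD readings in mg/L (made up for illustration only).
hypothetical_bod = [40, 22, 35, 51, 30]
for probability, bod in plotting_positions(hypothetical_bod):
    print(f"{probability:.2f}  {bod}")
```

With a mean and standard deviation read from a plot, `normal_value_at(p, mean, std_dev)` gives the value expected to be exceeded once per return period. It will differ somewhat from a value read by eye from the graph, especially when the plotted points curve as they do in the grouped-data example.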
