WFM 5201: Data Management and Statistical Analysis
© Dr. Akm Saiful Islam

Lecture-8: Probabilistic Analysis
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
June, 2008

Frequency Analysis

- Continuous Distributions
  - Normal distribution
  - Lognormal distribution
  - Pearson Type III distribution
  - Gumbel's Extremal distribution
- Confidence Interval

Log-Normal Distribution

The lognormal distribution (sometimes spelled out as the logarithmic normal distribution) of a random variable $X$ is one for which the logarithm of $X$ follows a normal or Gaussian distribution. Denote $Y = \ln X$; then $Y$ has a normal or Gaussian distribution given by:

$$f(y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu_y}{\sigma_y}\right)^2\right], \qquad -\infty < y < \infty \tag{1}$$

Derived distribution: Since $Y = \ln X$, $\dfrac{dy}{dx} = \dfrac{1}{x}$, and the distribution of $X$ can be found as:

$$f(x) = f(y)\left|\frac{dy}{dx}\right| = \frac{1}{x\sqrt{2\pi\sigma_y^2}} \exp\left[-\frac{1}{2}\left(\frac{\ln x-\mu_y}{\sigma_y}\right)^2\right], \qquad x > 0 \tag{2}$$

Note that equation (1) gives the distribution of $Y$ as a normal distribution with mean $\mu_y$ and variance $\sigma_y^2$. Equation (2) gives the distribution of $X$ as the lognormal distribution with parameters $\mu_y$ and $\sigma_y^2$.

Estimation of parameters ($\mu_y$, $\sigma_y^2$) of lognormal distribution:

Note: $Y = \ln X$, $\bar{y} = \dfrac{\sum y_i}{n}$, $S_y^2 = \dfrac{\sum y_i^2 - n\bar{y}^2}{n-1}$

Chow (1954) Method:

1. $C_v = S_x / \bar{X}$
2. $\bar{Y} = \dfrac{1}{2}\ln\left[\dfrac{\bar{X}^2}{C_v^2 + 1}\right]$
3. $S_y^2 = \ln(C_v^2 + 1)$
4. The mean and variance of the lognormal distribution are: $E(X) = \exp(\mu_y + \sigma_y^2/2)$ and $\mathrm{Var}(X) = \mu_x^2\left(e^{\sigma_y^2} - 1\right)$
5. The coefficient of variation of the Xs is: $C_v = \sqrt{e^{\sigma_y^2} - 1}$
6. The coefficient of skew of the Xs is: $\gamma = 3C_v + C_v^3$

Thus the lognormal distribution is skewed to the right, with the skewness increasing as $C_v$ increases.
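The relations in the Chow (1954) method can be checked numerically. The sketch below is a minimal illustration, not part of the lecture: the function names are hypothetical, and the input mean and standard deviation (100 and 50) are arbitrary illustrative values rather than data from the course. It estimates $\mu_y$ and $\sigma_y$ from the moments of $X$ and then confirms that the lognormal mean and variance formulas return the original moments.

```python
import math


def lognormal_params(mean_x, std_x):
    """Estimate (mu_y, sigma_y) of Y = ln X from the mean and std of X."""
    cv = std_x / mean_x                                   # step 1
    mu_y = 0.5 * math.log(mean_x**2 / (cv**2 + 1.0))      # step 2
    sigma_y = math.sqrt(math.log(cv**2 + 1.0))            # step 3
    return mu_y, sigma_y


def lognormal_moments(mu_y, sigma_y):
    """Return mean, variance, Cv and skew of X implied by (mu_y, sigma_y)."""
    mean_x = math.exp(mu_y + sigma_y**2 / 2.0)            # step 4
    var_x = mean_x**2 * (math.exp(sigma_y**2) - 1.0)      # step 4
    cv = math.sqrt(math.exp(sigma_y**2) - 1.0)            # step 5
    skew = 3.0 * cv + cv**3                               # step 6
    return mean_x, var_x, cv, skew


# Illustrative inputs only (assumed values, not lecture data).
mu_y, sigma_y = lognormal_params(100.0, 50.0)
back_mean, back_var, cv, skew = lognormal_moments(mu_y, sigma_y)
print(f"mu_y={mu_y:.4f}, sigma_y={sigma_y:.4f}")
print(f"mean={back_mean:.2f}, var={back_var:.2f}, Cv={cv:.3f}, skew={skew:.3f}")
```

Because the skew depends only on $C_v$, doubling the standard deviation in the call above would raise the skew as well, which is the right-skew behaviour noted in the text.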
Example-1:

Use the lognormal distribution to calculate the expected relative frequency for the third class interval of the discharge data in the following table.

Frequency of the discharge of a river

Class (cfs)   Number   Observed relative frequency
 25,000          2        0.030
 35,000          3        0.045
 45,000         10        0.152
 55,000          9        0.136
 65,000         11        0.167
 75,000         10        0.152
 85,000         12        0.182
 95,000          6        0.091
105,000          0        0.000
115,000          3        0.045

Solution

With $\bar{x} = 67{,}500$ and $S_x = 21{,}000$, the lognormal parameters are:

$$C_v = S_x / \bar{x} = 21{,}000 / 67{,}500 = 0.311$$

$$\bar{y} = \frac{1}{2}\ln\left[\frac{\bar{x}^2}{C_v^2 + 1}\right] = \frac{1}{2}\ln\left[\frac{67{,}500^2}{0.311^2 + 1}\right] = 11.0737$$

$$s_y = \sqrt{\ln(C_v^2 + 1)} = \sqrt{\ln(0.311^2 + 1)} = 0.30395$$

For the third class, $x = 45{,}000$:

$$z = \frac{\ln x - \bar{y}}{s_y} = \frac{\ln 45{,}000 - 11.0737}{0.30395} = -1.182$$

So from the standard normal table we get the ordinate $p_z(z) = 0.198$, and

$$p_x(x) = \frac{p_z(z)}{x\,s_y} = \frac{0.198}{45{,}000\,(0.30395)} = 1.4476 \times 10^{-5}$$

Multiplying by the class width of 10,000:

$$f_{45{,}000} = 10{,}000\,(1.4476 \times 10^{-5}) = 0.145$$

The expected relative frequency according to the lognormal distribution is 0.145.

Example-2:

Assume the data of the previous table follow the lognormal distribution. Calculate the magnitude of the 100-year peak flood.

Solution:

The 100-year peak flow corresponds to a prob(X > x) of 0.01, so $x$ must be evaluated such that $P_x(x) = 0.99$. This can be accomplished by evaluating $z$ such that $P_z(z) = 0.99$ and then transforming to $x$. From the standard normal tables, the value of $z$ corresponding to $P_z(z) = 0.99$ is 2.326. With the values of $s_y$ and $\bar{y}$ given above:

$$y = s_y z + \bar{y} = 0.30395\,(2.326) + 11.0737 = 11.781$$

$$x = \exp(y) = 130{,}700 \text{ cfs}$$

The 100-year peak flow according to the lognormal distribution is about 130,700 cfs.
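The arithmetic in Example-1 and Example-2 can be reproduced without normal tables. The following is a minimal sketch using only the Python standard library; the variable names are illustrative, and `statistics.NormalDist` stands in for the table lookups, so the last digits may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

# Sample moments of the discharge data, as used in Example-1.
mean_x, std_x = 67_500.0, 21_000.0

cv = std_x / mean_x
mu_y = 0.5 * math.log(mean_x**2 / (cv**2 + 1.0))
sigma_y = math.sqrt(math.log(cv**2 + 1.0))
std_normal = NormalDist()  # mean 0, standard deviation 1

# Example-1: expected relative frequency of the class centred on 45,000 cfs.
x_mid, class_width = 45_000.0, 10_000.0
z = (math.log(x_mid) - mu_y) / sigma_y
density_x = std_normal.pdf(z) / (x_mid * sigma_y)   # p_x(x) = p_z(z) / (x s_y)
rel_freq = class_width * density_x

# Example-2: 100-year flood, i.e. the value with cumulative probability 0.99.
z_100 = std_normal.inv_cdf(1.0 - 1.0 / 100.0)
x_100 = math.exp(mu_y + sigma_y * z_100)

print(f"z = {z:.3f}, expected relative frequency = {rel_freq:.3f}")
print(f"z_100 = {z_100:.3f}, 100-year flood = {x_100:,.0f} cfs")
```

Changing `x_mid` to another class value from the table gives the expected relative frequency for that class, which can then be compared with the observed column.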
Extreme Value Distributions

Many times interest exists in extreme events such as the maximum peak discharge of a stream or minimum daily flows. The extreme value of a set of random variables is also a random variable. The probability distribution of this extreme value random variable will in general depend on the sample size and the parent distribution from which the sample was obtained.

Extreme value type-I: Gumbel distribution

For the Extreme Value Type I distribution, Chow (1953) derived the expression

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\} \tag{3}$$

To express $T$ in terms of $K_T$, the above equation can be written as

$$T = \frac{1}{1 - \exp\left\{-\exp\left[-\left(0.5772 + \dfrac{\pi K_T}{\sqrt{6}}\right)\right]\right\}}$$

Example-3: Gumbel

Determine the 5-year return period rainfall for Chicago using the frequency factor method and the annual maximum rainfall data given below. (Chow et al., 1988, p. 391)

Year   Rainfall (inch)    Year   Rainfall (inch)    Year   Rainfall (inch)
1913   0.49               1926   0.68               1938   0.52
1914   0.66               1927   0.61               1939   0.64
1915   0.36               1928   0.88               1940   0.34
1916   0.58               1929   0.49               1941   0.70
1917   0.41               1930   0.33               1942   0.57
1918   0.47               1931   0.96               1943   0.92
1920   0.74               1932   0.94               1944   0.66
1921   0.53               1933   0.80               1945   0.65
1922   0.76               1934   0.62               1946   0.63
1923   0.57               1935   0.71               1947   0.60
1924   0.80               1936   1.11
1925   0.66               1937   0.64

Solution

The mean and standard deviation of annual maximum rainfalls at Chicago are 0.649 inch and 0.177 inch, respectively. For $T = 5$, equation (3) gives

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{5}{5-1}\right)\right]\right\} = 0.719$$

$$x_T = \bar{x} + K_T s = 0.649 + (0.719)(0.177) = 0.78 \text{ in}$$

Log Pearson Type III

For this distribution, the first step is to take the logarithms of the hydrologic data, $y = \log x$. Usually logarithms to base 10 are used.
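For the Gumbel frequency factor of equation (3) and the Example-3 calculation above, a minimal numerical check might look like the sketch below. The function names are hypothetical, and the mean (0.649 in) and standard deviation (0.177 in) are the values used in the Example-3 arithmetic rather than values recomputed from the table.

```python
import math

EULER_TERM = 0.5772  # constant as written in equation (3)


def gumbel_frequency_factor(return_period):
    """Frequency factor K_T for the Extreme Value Type I distribution."""
    t = float(return_period)
    return -(math.sqrt(6.0) / math.pi) * (
        EULER_TERM + math.log(math.log(t / (t - 1.0)))
    )


def gumbel_return_period(k_t):
    """Inverse relation: return period T for a given frequency factor."""
    inner = math.exp(-(EULER_TERM + math.pi * k_t / math.sqrt(6.0)))
    return 1.0 / (1.0 - math.exp(-inner))


mean_x, std_x = 0.649, 0.177          # inches, as used in Example-3
k_5 = gumbel_frequency_factor(5)
x_5 = mean_x + k_5 * std_x            # x_T = mean + K_T * s
t_back = gumbel_return_period(k_5)    # should recover T = 5

print(f"K_5 = {k_5:.3f}, x_5 = {x_5:.2f} in, T recovered = {t_back:.2f}")
```

The round trip through `gumbel_return_period` is included only to confirm that the two forms of the relation are consistent with each other.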
The mean $\bar{y}$, standard deviation $s_y$, and coefficient of skewness $C_s$ are calculated for the logarithms of the data. The frequency factor $K_T$ depends on the return period $T$ and the coefficient of skewness $C_s$. When $C_s = 0$, the frequency factor is equal to the standard normal variable $z$. When $C_s \neq 0$, $K_T$ is approximated by Kite (1977) as

$$K_T = z + (z^2 - 1)k + \frac{1}{3}(z^3 - 6z)k^2 - (z^2 - 1)k^3 + zk^4 + \frac{1}{3}k^5$$

where $k = C_s/6$.

Example-4:

Calculate the 5- and 50-year return period annual maximum discharges of the Guadalupe River near Victoria, Texas, using the lognormal and log-Pearson Type III distributions. The data in cfs from 1935 to 1978 are given below. (Chow et al., 1988, p. 393)

Year    1930      1940      1950      1960      1970
0       -         55900     13300     23700     9190
1       -         58000     12300     55800     9740
2       -         56000     28400     10800     58500
3       -         7710      11600     4100      33100
4       -         12300     8560      5720      25200
5       38500     22000     4950      15000     30200
6       179000    17900     1730      9790      14100
7       17200     46000     25300     70000     54500
8       25400     6970      58300     44300     12700
9       4940      20600     10100     15200     -

Solution

It can be seen that the effect of including the small negative coefficient of skewness in the calculations is to alter slightly the estimated flow, with that effect being more pronounced at $T = 50$ years than at $T = 5$ years. Another feature of the results is that the 50-year return period estimates are about three times as large as the 5-year return period estimates; for this example, the increase in the estimated flood discharges is less than proportional to the increase in return period.
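A minimal sketch of the Example-4 calculation is given below, under stated assumptions: the flows are read from the table above with the 1936 value taken as 179,000 cfs (the digits are split in the source table), the skewness uses the common sample estimator $C_s = n\sum(y-\bar{y})^3 / [(n-1)(n-2)s_y^3]$, which the lecture does not specify, and the function names are hypothetical. Exact normal quantiles replace table lookups, so results may differ slightly from tabulated solutions.

```python
import math
from statistics import NormalDist, mean, stdev

# Annual maximum discharges (cfs), 1935-1978, in chronological order.
flows_cfs = [
    38500, 179000, 17200, 25400, 4940,                                   # 1935-39
    55900, 58000, 56000, 7710, 12300, 22000, 17900, 46000, 6970, 20600,  # 1940-49
    13300, 12300, 28400, 11600, 8560, 4950, 1730, 25300, 58300, 10100,   # 1950-59
    23700, 55800, 10800, 4100, 5720, 15000, 9790, 70000, 44300, 15200,   # 1960-69
    9190, 9740, 58500, 33100, 25200, 30200, 14100, 54500, 12700,         # 1970-78
]


def sample_skew(values):
    """Sample coefficient of skewness (assumed estimator, see lead-in)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s**3)


def kite_frequency_factor(z, c_s):
    """Kite (1977) approximation of K_T; reduces to z when c_s is 0."""
    k = c_s / 6.0
    return (z + (z**2 - 1.0) * k + (z**3 - 6.0 * z) * k**2 / 3.0
            - (z**2 - 1.0) * k**3 + z * k**4 + k**5 / 3.0)


logs = [math.log10(q) for q in flows_cfs]
y_bar, s_y, c_s = mean(logs), stdev(logs), sample_skew(logs)

estimates = {}
for T in (5, 50):
    z = NormalDist().inv_cdf(1.0 - 1.0 / T)
    lognormal = 10 ** (y_bar + z * s_y)                           # K_T = z
    log_pearson3 = 10 ** (y_bar + kite_frequency_factor(z, c_s) * s_y)
    estimates[T] = (lognormal, log_pearson3)
    print(f"T={T}: lognormal {lognormal:,.0f} cfs, "
          f"log-Pearson III {log_pearson3:,.0f} cfs")
```

Comparing the two columns of the printed output shows how much the skewness term shifts each estimate, which is the comparison discussed in the solution text.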