WFM 5201: Data Management and Statistical Analysis
Lecture-11: Frequency Analysis
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
June, 2008
© Dr. Akm Saiful Islam

Frequency Analysis

- Plotting Position Formula and Probability Plot
- Analytical Frequency Analysis
  - Normal and Log-normal distribution
  - Gumbel's Extreme Value Type I distribution
  - Log-Pearson Type III distribution

Introduction to Frequency Analysis

The magnitude of an extreme event is inversely related to its frequency of occurrence: very severe events occur less frequently than more moderate events.
The objective of frequency analysis is to relate the magnitude of extreme events to their frequency of occurrence through the use of probability distributions.
Frequency analysis is defined as the investigation of population sample data to estimate recurrence or probabilities of magnitudes.
It is one of the earliest and most frequent uses of statistics in hydrology and the natural sciences.
Early applications of frequency analysis were largely in the area of flood flow estimation.
Today nearly every phase of hydrology and the natural sciences is subjected to frequency analysis.

Methods

Two methods of frequency analysis are described: one is a straightforward plotting technique to obtain the cumulative distribution, and the other uses frequency factors.
The cumulative distribution function provides a rapid means of determining the probability of an event equal to or less than some specified quantity. The inverse is used to obtain the recurrence intervals.
Analytical frequency analysis is a simplified technique based on frequency factors, which depend on the assumed distribution and on the mean, the variance and, for some distributions, the coefficient of skew of the data.

Plotting Position Formula

The frequency of an event can be obtained by use of "plotting position" formulas. The symbols used in these formulas are:

P = the probability of occurrence
n = the number of values
m = the rank of the values in descending order, with the largest equal to 1
T = 1/P = the recurrence interval, i.e. the mean number of years between exceedances
a = c = parameters depending on n

Plotting Position relationship

[Table: plotting position relationships]

Parameters

n    10     20     30     40     50
a    0.448  0.443  0.442  0.441  0.440

n    60     70     80     90     100
a    0.440  0.440  0.440  0.439  0.439

a is generally recommended as 0.4.
For the normal distribution, a = 3/8.
For Gumbel's (EV1) distribution, a = 0.4.

Exercise-1:

Using the 23 years of annual precipitation depths for a station given in the table below, estimate the exceedance frequency and recurrence intervals of the highest ten values using the Weibull equation.

Here, n = 23

Year   Rain depth (in)   Rank, m   P = m/(n+1)   Tr (year)
1981   24                1         0.042         24
1986   23                3         0.125         8
1988   23                3         0.125         8
1978   21                4         0.167         6
1993   20                5         0.208         4.8
1999   19                8         0.333         3
1998   19                8         0.333         3
1980   19                8         0.333         3
1990   18                10        0.417         2.4
1983   18                10        0.417         2.4

Probability plot

A probability plot is a plot of a magnitude versus a probability.
Determining the probability to assign to a data point is commonly referred to as determining the plotting position.
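As a minimal sketch of the Weibull calculation in Exercise-1, the Python snippet below recomputes the rank, the exceedance probability P = m/(n+1) and the recurrence interval T = 1/P for the ten largest depths. The function name `weibull_positions` is hypothetical, and the tie rule (tied values share the largest rank in their group) is an assumption inferred from the exercise table, not a rule stated in the lecture.

```python
def weibull_positions(top_values, n):
    """Return (value, rank, P, T) rows for the largest values of a record.

    top_values: the k largest observations of a record of length n.
    Assumption: tied values share the largest rank in their group,
    which matches the ranks shown in the Exercise-1 table.
    """
    ordered = sorted(top_values, reverse=True)
    rows = []
    for value in ordered:
        # Rank m = number of observations greater than or equal to this value.
        # This is only valid because top_values holds the largest values
        # of the record and no tie extends beyond the supplied list.
        m = sum(1 for v in ordered if v >= value)
        p = m / (n + 1)      # Weibull plotting position (exceedance probability)
        t = (n + 1) / m      # recurrence interval, T = 1/P
        rows.append((value, m, p, t))
    return rows


# Ten largest annual depths (inches) from Exercise-1, record length n = 23.
top_ten = [24, 23, 23, 21, 20, 19, 19, 19, 18, 18]
rows = weibull_positions(top_ten, n=23)

for value, m, p, t in rows:
    print(f"{value:>4} {m:>4} {p:8.3f} {t:6.1f}")
```

Other tie conventions (for example, averaging the ranks of tied values) would give slightly different probabilities for the tied rows.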
A plotting position may be expressed as a probability from 0 to 1 or as a percent from 0 to 100. Which method is being used should be clear from the context.
In some discussions of probability plotting, especially in the hydrologic literature, the probability scale is used to denote $\text{prob}(X > x)$ or $1 - P_X(x)$.
One can always transform the probability scale $1 - P_X(x)$ to $P_X(x)$ or even $T_X(x)$ if desired.

Gumbel's (1958) Criteria

- The plotting position must be such that all observations can be plotted.
- The plotting position should lie between the observed frequencies $(m-1)/n$ and $m/n$, where $m$ is the rank of the observation beginning with $m = 1$ for the largest (smallest) value and $n$ is the number of years of record (if applicable) or the number of observations.
- The return period of a value equal to or larger than the largest observation and the return period of a value equal to or smaller than the smallest observation should converge toward $n$.
- The observations should be equally spaced on the frequency scale.
- The plotting position should have an intuitive meaning, be analytically simple, and be easy to use.

Steps for probability plot

1. Rank the data from the largest (smallest) to the smallest (largest) value. If two or more observations have the same value, several procedures can be used for assigning a plotting position.
2. Calculate the plotting position of each data point from the plotting position relationship table presented in an earlier slide.
3. Select the type of probability paper to be used.
4. Plot the observations on the probability paper.
5. Draw a theoretical normal line. For the normal distribution, the line should pass through the mean plus one standard deviation at 84.1% and the mean minus one standard deviation at 15.9%.
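The steps above can be sketched in Python without probability paper by pairing each ranked observation with the standard normal quantile of its plotting position. This is only an illustration under stated assumptions: the data in `sample` are made-up values, the Weibull position m/(n+1) is used as a non-exceedance probability with ascending ranks, ties are not treated specially, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_plot_points(data):
    """Return (z, x) pairs: standard normal quantile versus observed value.

    Ranks are ascending (m = 1 for the smallest value), so m/(n+1) is a
    non-exceedance probability. Ties simply receive consecutive ranks here.
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # N(0, 1)
    points = []
    for m, x in enumerate(ordered, start=1):
        p = m / (n + 1)                      # Weibull plotting position
        points.append((std_normal.inv_cdf(p), x))
    return points


def normal_line(data, p):
    """Value on the theoretical normal line at non-exceedance probability p."""
    return mean(data) + NormalDist().inv_cdf(p) * stdev(data)


# Hypothetical annual depths (inches), for illustration only.
sample = [31, 35, 38, 40, 41, 43, 44, 47, 52]
points = normal_plot_points(sample)

for z, x in points:
    print(f"z = {z:6.3f}   x = {x}")

# The line passes close to mean + s at 84.1% and mean - s at 15.9%.
print(normal_line(sample, 0.841), mean(sample) + stdev(sample))
print(normal_line(sample, 0.159), mean(sample) - stdev(sample))
```

Plotting x against z on ordinary linear axes is equivalent to plotting x against probability on normal probability paper; points that fall near the straight line support the normal assumption.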
Analytical Frequency Analysis

Chow has proposed

$$x_T = \bar{x} + K_T s$$

where $K_T$ is the frequency factor, $s$ is the standard deviation and $\bar{x}$ is the mean value.

Methods of Analytical Frequency Analysis

- Normal distribution
- Log-normal distribution
- Gumbel's Extreme Value Type I distribution
- Log-Pearson Type III distribution

Normal Distribution

The probability that $X$ is less than or equal to $x$ when $X$ is $N(\mu, \sigma^2)$ can be evaluated from

$$\text{prob}(X \le x) = P_X(x) = \int_{-\infty}^{x} (2\pi\sigma^2)^{-1/2} \, e^{-(t-\mu)^2 / 2\sigma^2} \, dt \tag{4.9}$$

The parameters $\mu$ (mean) and $\sigma^2$ (variance) are denoted as location and scale parameters, respectively.
The normal distribution is a bell-shaped, continuous and symmetrical distribution (the coefficient of skew is zero).

Standard normal distribution

Equation (4.9) cannot be evaluated analytically, so approximate methods of integration are required. If a tabulation of the integral were made, a separate table would be required for each value of $\mu$ and $\sigma$.
By using the linear transformation $Z = (X - \mu)/\sigma$, the random variable $Z$ will be $N(0, 1)$.
The random variable $Z$ is said to be standardized (it has $\mu = 0$ and $\sigma^2 = 1$), and $N(0, 1)$ is said to be the standard normal distribution.
The standard normal distribution is given by

$$p_Z(z) = (2\pi)^{-1/2} \, e^{-z^2/2} \tag{4.10}$$

and the cumulative standard normal is given by

$$\text{prob}(Z \le z) = P_Z(z) = \int_{-\infty}^{z} (2\pi)^{-1/2} \, e^{-t^2/2} \, dt \tag{4.11}$$

[Figure 4.2.1.3: Standard normal distribution]
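As a rough numerical illustration of equations (4.9) to (4.11), the sketch below evaluates the standard normal density and cumulative distribution in Python and applies the transformation Z = (X - μ)/σ. It relies on the standard identity P_Z(z) = ½[1 + erf(z/√2)], which is not stated in the lecture; the function names and the example values of μ and σ are illustrative assumptions.

```python
import math


def standard_normal_pdf(z):
    """Density of N(0, 1), equation (4.10)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def standard_normal_cdf(z):
    """prob(Z <= z) for N(0, 1), equation (4.11), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """prob(X <= x) for N(mu, sigma^2), using Z = (X - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


# Illustrative parameters (assumed values): mu = 41.5, sigma = 6.7.
mu, sigma = 41.5, 6.7
print(standard_normal_cdf(1.0))           # about 0.841
print(normal_cdf(mu + sigma, mu, sigma))  # same probability after standardizing
print(normal_cdf(mu - sigma, mu, sigma))  # about 0.159
```

Because every normal variable reduces to N(0, 1) under this transformation, a single table (or a single function such as `standard_normal_cdf`) serves for all values of μ and σ.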
Figure 4.2.1.3 shows the standard normal distribution which, along with the transformation $Z = (X - \mu)/\sigma$, contains all of the information shown in Figures 4.1 and 4.2.
Both $p_Z(z)$ and $P_Z(z)$ are widely tabulated.
Most tables utilize the symmetry of the normal distribution so that only positive values of $Z$ are shown.
Tables of $P_Z(z)$ may show $\text{prob}(Z \le z)$ or $\text{prob}(0 \le Z \le z)$.
Care must be exercised when using normal probability tables to see what values are tabulated.

Exercise-2:

Assume the following data follow a normal distribution. Find the rain depth that would have a recurrence interval of 100 years.

Year                   2000  1999  1998  1997  1996  …..
Annual Rainfall (in)   43    44    38    31    47    …..

Solution:

Mean = 41.5 in, St. Dev = 6.7 in (given)
x = Mean + Std. Dev × z
x = 41.5 + z(6.7)
P(z) = 1/T = 1/100 = 0.01
F(z) = 1.0 - P(z) = 0.99
By interpolation in Table E.4, z = 2.33
x = 41.5 + (2.33 × 6.7) = 57.11 in

[Table: Area under standardized normal distribution]

Linear Interpolation from Z-table

$$\frac{z - z_1}{z_2 - z_1} = \frac{p - p_1}{p_2 - p_1}$$

For p = 0.99, find z.
From the table, $z_1 = 2.32$, $p_1 = 0.9898$ and $z_2 = 2.33$, $p_2 = 0.9901$:

$$\frac{z - 2.32}{2.33 - 2.32} = \frac{0.99 - 0.9898}{0.9901 - 0.9898}$$

Solving gives $z = 2.32 + 0.01 \times (0.0002 / 0.0003) \approx 2.327$, which rounds to the $z = 2.33$ used in the solution.
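The interpolation and the Exercise-2 result can be checked with a few lines of Python. This is a sketch only: the function name `interpolate_z` is hypothetical, the table entries are the two quoted above, and the rounding of z to two decimals mirrors the value used in the solution.

```python
def interpolate_z(p, z1, p1, z2, p2):
    """Linear interpolation of z between two Z-table entries (z1, p1), (z2, p2)."""
    return z1 + (p - p1) * (z2 - z1) / (p2 - p1)


# Table entries bracketing F(z) = 0.99.
z = interpolate_z(0.99, z1=2.32, p1=0.9898, z2=2.33, p2=0.9901)
print(round(z, 3))  # approximately 2.327

# Exercise-2: 100-year rain depth, x = mean + z * s,
# with mean = 41.5 in and s = 6.7 in as given in the solution.
mean_depth = 41.5
std_depth = 6.7
x_100 = mean_depth + round(z, 2) * std_depth
print(round(x_100, 2))  # approximately 57.11 in
```

Using the unrounded z instead of 2.33 changes the answer only in the second decimal place, so the table precision, not the interpolation, limits the accuracy here.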