approximate probability that the yield level will be less than or equal to 250 MPa is P[Y ≤ 250] = 0.257/2 = 0.1285, or almost 13%.

1.2.7 The Transformation of Multi-Dimensional Random Variables

In many cases one has to deal with distributions of random variables that are functions of other basic random variables, that is, they are derived through a transformation of the basic variables. As an example, let (X,Y) denote a two-dimensional random vector or point in ℝ², the (x,y) plane, which through the functional transformation

U = f(X,Y) ,  V = g(X,Y)   (1.82)

produces another random vector (U,V). If the probability density p(x,y) of (X,Y) is given, the density q(u,v) of (U,V) is sought.

The probability that (X,Y) has a value within the small rectangle {(x1 < X ≤ x1+Δx1), (y1 < Y ≤ y1+Δy1)}, i.e. a value close to (x1,y1), is represented by the integral of p(x,y) over the rectangle, approximately p(x1,y1)Δx1Δy1, which corresponds to the volume of the column shown in Fig. 1.14. Through the presumed one-to-one coordinate transformation

u = f(x,y) : x = h(u,v)
v = g(x,y) : y = k(u,v)   (1.83)

the volume of the corresponding column in the (u,v) plane is given by p(h(u,v),k(u,v))|J|ΔuΔv, where |J| denotes the absolute value of the functional determinant, the Jacobian, of the inverse transformation Eq. (1.83), that is,

J = ∂(x,y)/∂(u,v) = (∂h/∂u)(∂k/∂v) − (∂h/∂v)(∂k/∂u)   (1.84)

Fig. 1.14 The Probability Density of (X,Y)

where, by supposition, the inverse transformation functions h(u,v) and k(u,v) possess continuous first-order partial derivatives. The probability that the pair (U,V) has a value within the transformed "rectangle" {(u1, u1+Δu), (v1, v1+Δv)} is therefore approximately q(u,v)ΔuΔv, and the probability density q(u,v) is given as

q(u,v) = p(h(u,v),k(u,v))·|J|   (1.85)

Example 1.21
The distribution of two basic random variables X and Y is given. A new random variable U is formed as (1) the sum, U = X+Y, and (2) the product, U = XY, of the basic variables. The distribution functions of the two U's are sought.

Solution: By introducing an auxiliary variable V = X, the problem can be treated as a transformation exercise. Therefore, let

(1) U = X+Y, V = X. Then x = h(u,v) = v and y = k(u,v) = u−v, so |J| = 1, whereby the probability density of the two variables U and V is equal to p(v, u−v). The density R(u) of the variable U = X+Y must then be obtained as the marginal density of the two-dimensional distribution of the random vector (U,V) by integration over v from −∞ to ∞,

R(u) = ∫ p(v, u−v) dv

If the two basic variables X and Y are independent, then p(x,y) = f1(x)·f2(y) and

R(u) = ∫ f1(v) f2(u−v) dv

that is, the density of the sum is the convolution of the two marginal densities.

(2) U = XY, V = X. Then x = h(u,v) = v, y = k(u,v) = u/v and |J| = 1/|v|. The density of U is again obtained by integration over v, that is,

R(u) = ∫ p(v, u/v)·(1/|v|) dv

Notice that x = 0 must be excluded, as it represents the y-axis in the xy-plane, which is transformed into the u-axis in the uv-plane where v = 0. Since both these axes support zero probability mass, this does not affect the calculation.
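As a quick numerical illustration of the convolution result in case (1), the following Python sketch compares a Monte Carlo histogram of U = X+Y with the analytic convolution density for two independent exponential variables; the unit-rate exponential is an illustrative choice, not taken from the example.

```python
# Sketch: Monte Carlo check of the convolution density of Example 1.21(1) for two
# independent Exponential(1) variables; the analytic result is R(u) = u*exp(-u).
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
u = rng.exponential(1.0, n) + rng.exponential(1.0, n)   # U = X + Y

edges = np.linspace(0.0, 10.0, 101)
hist, _ = np.histogram(u, bins=edges, density=True)     # empirical density of U
centers = 0.5 * (edges[:-1] + edges[1:])
analytic = centers * np.exp(-centers)                    # convolution of f1 and f2

print("max deviation from convolution density:", np.max(np.abs(hist - analytic)))
```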
Example 1.22
Let Y be a random variable that is constructed as the weighted sum of n random variables Xi, i = 1,2,...,n,

Y = a1X1 + a2X2 + ··· + anXn

where the constants ai, the weights, have real values. Find the expected value and the variance of Y.

Solution: To obtain the moments of Y by applying the method of transformed variables with multi-dimensional distributions would be insurmountable work, judging from the above exercise. It is therefore more advantageous to obtain the moments directly from their definition. Thus

E[Y] = E[a1X1 + a2X2 + a3X3 + ··· + anXn] = a1E[X1] + a2E[X2] + a3E[X3] + ··· + anE[Xn]   (1.86)

since the operator E[·] is linear-additive. The second moment about the mean, the variance, is obtained as follows,

Var[Y] = Σi Σj aiaj Cov[Xi,Xj]   (1.87a)

or, by introducing the correlation coefficients ρij and the standard deviations σi,

Var[Y] = Σi ai²σi² + Σ(i≠j) aiaj ρij σi σj   (1.87b)

If the random variables {Xi} are pairwise uncorrelated, Cov[Xi,Xj] = 0 for i ≠ j and

Var[Y] = Σi ai²σi²   (1.87c)

An important special case of Eq. (1.87c) is when the Xi variables are mutually independent.

1.2.8 Approximate Moments and Distributions

When dealing with random variables that have complicated functional relationships with the basic variables, it is often possible to evaluate their statistical properties approximately, and sufficiently accurately for most practical problems. Consider a random variable Y that is given as a reasonably well-behaved function Y = g(X) of the random variable X, which has a coefficient of variation VX that is not too large. Then the difference between X and its mean value will not be too large either, and by expanding g(x) in a Taylor series about the mean value μX, the following approximation is obtained:

Y ≈ g(μX) + g′(μX)(X−μX) + ½g″(μX)(X−μX)² + ···   (1.88)

in which the derivatives are denoted by g′(x) = dg/dx, g″(x) = d²g/dx², and so on. Taking the expected value of both sides of the above equation, the following relation is found,

E[Y] ≈ g(μX) + ½g″(μX)σX²   (1.89)

The evaluation of the variance of both sides is somewhat more cumbersome. By applying Eq. (1.87a) in connection with Eq. (1.88), and neglecting the higher-order terms on the assumption that the coefficient of variation VX is small, the following expression is obtained:

Var[Y] ≈ (g′(μX))²σX²   (1.90)

For instance, if |VX| << 1, then Var[(X/μX)²] << Var[X/μX], or Var[X²] << μX²Var[X] = μX²σX². Most often, only the first two terms of the expansion on the right-hand side of Eq. (1.88) are needed to obtain a reasonably accurate value for the mean, and for the variance only the first term is needed. Thus,

E[Y] = E[g(X)] ≈ g(E[X])
Var[Y] = Var[g(X)] ≈ Var[X]·(g′(μX))²   (1.91)

Similar approximations are also possible for multi-dimensional functional relationships. For instance, let Y = g(X1,X2,X3,...,Xn). The second-order approximation for the mean is obtained as

E[Y] ≈ g(μ1,μ2,...,μn) + ½ Σi Σj (∂²g/∂xi∂xj) Cov[Xi,Xj]   (1.92)

noting that E[Xi − μi] = 0, so the first-order terms vanish. The first-order approximation for the variance is

Var[Y] ≈ Σi Σj (∂g/∂xi)(∂g/∂xj) Cov[Xi,Xj]   (1.93)

the derivatives being evaluated at the mean values. When the random variables {Xi} are mutually independent, the following simpler expressions are obtained:

E[Y] ≈ g(μ1,μ2,...,μn) + ½ Σi (∂²g/∂xi²) σi²   (1.94)

and

Var[Y] ≈ Σi (∂g/∂xi)² σi²   (1.95)

where μi and σi are the mean and standard deviation of the random variable Xi.

Finally, by series expansion (cf. Eq. (1.88)), a simple approximation for the distribution of Y can be obtained. Retaining the first two terms of the right-hand side of Eq. (1.88), the following expression is obtained:

Y ≈ g(μX) − μX g′(μX) + X g′(μX)

which is of the form Y = a + bX with a = g(μX) − μX g′(μX) and b = g′(μX). Then if X has the density function p(x),

q(y) = p((y−a)/b)/|b|   (1.96)
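The following Python sketch illustrates the quality of the approximations Eqs. (1.89)–(1.91); the choice g(x) = x³ and a normal X with a small coefficient of variation are assumptions made only for the demonstration.

```python
# Sketch: first- and second-order moment approximations, Eqs. (1.89)-(1.91),
# for Y = g(X) with the illustrative choice g(x) = x**3 and X ~ N(10, 1),
# i.e. a small coefficient of variation V_X = 0.1.
import numpy as np

mu, sigma = 10.0, 1.0
g   = lambda x: x**3
dg  = lambda x: 3.0 * x**2        # g'(x)
d2g = lambda x: 6.0 * x           # g''(x)

mean_1st = g(mu)                                  # Eq. (1.91)
mean_2nd = g(mu) + 0.5 * d2g(mu) * sigma**2       # Eq. (1.89)
var_1st  = dg(mu)**2 * sigma**2                   # Eq. (1.90)

rng = np.random.default_rng(1)
y = g(rng.normal(mu, sigma, 1_000_000))           # Monte Carlo reference
print("E[Y]  :", y.mean(), " 1st order:", mean_1st, " 2nd order:", mean_2nd)
print("Var[Y]:", y.var(),  " 1st order:", var_1st)
```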
Example 1.23
In Fig. 1.15, the internal stress distribution in a cross-section of a reinforced concrete beam in pure bending is shown, just before breaking. Let the ultimate breaking stress of the concrete correspond to the random variable C with μC = 28 MPa and a coefficient of variation VC = 0.25, and the yield stress of the reinforcing steel bars correspond to the random variable S with μS = 308 MPa and VS = 0.1. The cross-section of the beam has a so-called balanced design, that is, in the ultimate state the concrete and the steel reach their ultimate limits simultaneously. The mean value and the variance of the breaking moment of the beam shall be determined, given that the steel reinforcement cross-section area is AS = 942 mm².

Fig. 1.15 The Ultimate Breaking State of a RC Beam

Solution: The ultimate breaking moment of an RC beam with a balanced design can be expressed as a function of the two stresses (cf. Fig. 1.15):

M = g(S,C) ,  g(s,c) = 5.28·10⁵·s − 1400·s²/c  Nmm

Thus M = g(S,C) is a function of the two random variables S and C, which must be independent due to the nature of the problem, i.e. the strength of the steel has nothing to do with the strength of the concrete. First the partial derivatives are evaluated:

g′s = 5.28·10⁵ − 2800·s/c ,  g″ss = −2800/c
g′c = 1400·s²/c² ,  g″cc = −2800·s²/c³

At the mean values c = μC and s = μS, g″ss = −10² and g″cc ≈ −1.21·10⁴. By the expressions Eqs. (1.94) and (1.95), it is now possible to obtain the necessary information about the ultimate breaking moment. The mean value, Eq. (1.94), is built up from

g(μS,μC) = 5.28·10⁵·308 − 1400·308²/28 = 162.6·10⁶ − 4.7·10⁶ = 157.9·10⁶ Nmm

½(g″ss·σS² + g″cc·σC²) = −(2800/28·(0.1·308)² + 2800·308²/28³·(0.25·28)²)/2 = −0.34·10⁶ Nmm

μM = (157.9 − 0.34)·10⁶ Nmm = 157.6 kNm

The variance of the breaking moment is obtained from Eq. (1.95):

σM² = (5.28·10⁵ − 2800·308/28)²·(0.1·308)² + (1400·308²/28²)²·(0.25·28)² = (15.35·10⁶)² (Nmm)²

and the coefficient of variation is VM = σM/μM = 15.35/157.6 = 0.097. Therefore the fundamental assumption for the approximation is fulfilled, that is, the coefficient of variation is sufficiently small (V ≤ 0.1). If a reliable design moment is defined as the mean value minus two standard deviations, MD = 157.6 − 2·15.35 = 126.9 kNm, the probability of failure can be computed as P[M ≤ MD] = pf. Assuming that the moment is a normally distributed variable, the probability of failure can be obtained from the standardized variable U = (M−μM)/σM (cf. Ex. 1.12), that is,

pf = P[U ≤ −2] = 1 − P[U ≤ 2] = 1 − 0.9772 = 0.0228

In other words, if 1000 beams were made according to the above conditions, 23 of them could be expected to fail.

1.2.9 Characteristic Functions

The concept of the characteristic function of a random variable can be very useful in many applications. The characteristic function of the random variable X is defined as the function

φ(ω) = E[e^{iωX}]

in which ω is a real variable and i is the imaginary unit. The expected value of the function exp(iωX) is obtained in the usual manner by integration,

φ(ω) = ∫ e^{iωx} p(x) dx   (1.97)

The two functions φ(ω) and p(x) bear a close resemblance to functions that form Fourier transform pairs (for an overview of Fourier transforms and Fourier integrals, see Chapter 7). In order that p(x) have a Fourier transform, it must satisfy Dirichlet's condition, that is, be absolutely integrable: ∫|p(x)|dx < ∞. Now since ∫p(x)dx = 1, Dirichlet's condition is fulfilled and the function φ(−ω) is the same as the Fourier transform of the probability density p(x). By way of the inverse Fourier transform, the probability density p(x) can be obtained from the characteristic function,

p(x) = (1/2π) ∫ φ(ω) e^{−iωx} dω   (1.98)

that is, the probability density p(x) is the inverse transform of φ(−ω), or its representation in the x-domain.

If the random variable X has a finite mean value, then Eq. (1.97) can be differentiated with respect to ω and

φ′(ω) = E[iX e^{iωX}]

If ω = 0, then from the above relation, φ′(0) = iE[X] and E[X] = φ′(0)/i. If the k-th moment of X is finite, then through repeated differentiation of Eq. (1.97), one finally obtains for the k-th derivative,

φ^(k)(ω) = E[i^k X^k e^{iωX}]   (1.99)

or with ω = 0,

φ^(k)(0) = i^k E[X^k]   (1.100)

Thus the k-th moment of a random variable has the same magnitude as the k-th derivative of its characteristic function at the point ω = 0, and a sign corresponding to i^k.
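A short Python sketch of Eqs. (1.97) and (1.100): the characteristic function is estimated directly as a sample average of exp(iωX), and the first moment is recovered from a finite-difference estimate of φ′(0)/i; the gamma-distributed sample is an arbitrary illustrative choice.

```python
# Sketch: the characteristic function phi(omega) = E[exp(i*omega*X)] of Eq. (1.97)
# estimated as a sample average, and the first moment recovered from Eq. (1.100)
# with a central-difference estimate of phi'(0).
import numpy as np

rng = np.random.default_rng(2)
x = rng.gamma(shape=3.0, scale=2.0, size=500_000)   # illustrative sample, E[X] = 6

def phi(omega):
    return np.mean(np.exp(1j * omega * x))

h = 1e-3
dphi0 = (phi(h) - phi(-h)) / (2.0 * h)               # numerical phi'(0)
print("phi'(0)/i   =", (dphi0 / 1j).real)            # should be close to E[X]
print("sample mean =", x.mean())
```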
Example 1.24
The probability density of the Gaussian distribution is

p(x) = (1/(σ√(2π)))·exp[−(x−μ)²/(2σ²)]

Find the moments of the distribution.

Solution: Firstly, the variable X is standardized, that is, U = (X−μ)/σ, X = σU+μ. The characteristic functions of U and X are then related through the expression

φX(ω) = E[e^{iω(σU+μ)}] = e^{iμω}·E[e^{iσωU}] = e^{iμω}·φU(σω)

As p(u) = exp(−u²/2)/√(2π), the characteristic function φU(ω) is easily obtained,

φU(ω) = e^{−ω²/2}

Therefore,

φX(ω) = exp(iμω − σ²ω²/2)   (1.101)

and the moments follow from Eq. (1.100) by repeated differentiation. It can be observed that all moments of the normal distribution are functions of μ and σ only. That is, the normal distribution is completely characterized by its mean and variance.

--------------

The log-characteristic function should also be mentioned:

ψ(ω) = loge φ(ω)   (1.102)

It can be used for obtaining the so-called semi-invariants or cumulants κs, first introduced by the Danish mathematician T. N. Thiele (see, for instance, Cramer [41]). Indeed,

ψ(ω) = Σs κs (iω)^s/s!   (1.103)

The semi-invariants of the random variable X are thus related to the moments and central moments as follows,

κ1 = m1 = μX
κ2 = μ2 = σX²
κ3 = μ3
κ4 = μ4 − 3μ2²
..........................   (1.104)

Example 1.25
From the characteristic function φ(ω), another important function, the moment generating function, can be obtained by changing the argument ω to −it, where i is the imaginary unit. Thus a real function M(t) is obtained, i.e.

M(t) = φ(−it) = E[e^{tX}]   (1.105)

Using the Taylor series expansion of the exponential function,

e^{tx} = 1 + tx + (tx)²/2! + (tx)³/3! + (tx)⁴/4! + ···

the moment generating function is expanded as follows if the first n moments are finite:

M(t) = 1 + tE[X] + t²E[X²]/2! + ··· + t^n E[X^n]/n! + ···   (1.106)

Thus the j-th moment of the random variable X is the coefficient of t^j/j! in Eq. (1.106), or

E[X^j] = M^(j)(0)   (1.107)

The characteristic function itself can be expanded in the same manner (cf. Eq. (1.100)),

φ(ω) = 1 + iωE[X] + (iω)²E[X²]/2! + ··· + (iω)^n E[X^n]/n! + ···   (1.108)

Example 1.26
The Poisson distribution for a discrete random variable X can be very useful when dealing with random arrival times, for instance accidents occurring at a certain road section, earthquake waves arriving at a specific site, etc. The discrete variable X has a Poisson distribution, that is, it can assume the values xk = k, where the k's are non-negative integers, with the probability

P[X = k] = λ^k e^{−λ}/k!   (1.109)

in which λ is a positive real constant. Obtain the characteristic function φ(ω) and the semi-invariants of the Poisson distribution.

Solution: By the definition Eq. (1.97), using the delta-function representation of the density, Eq. (1.33),

φ(ω) = Σk e^{iωk}·λ^k e^{−λ}/k! = e^{−λ} Σk (λe^{iω})^k/k! = exp[λ(e^{iω} − 1)]   (1.110)

Then

ψ(ω) = loge φ(ω) = λ(e^{iω} − 1) = λ(iω + (iω)²/2! + (iω)³/3! + ···)

Also by Eq. (1.103), all the semi-invariants are therefore equal, κ1 = κ2 = κ3 = ··· = λ, so the three first central moments of the Poisson distribution are equal to λ, cf. Eq. (1.104).
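The result of Example 1.26 is easy to check numerically; the sketch below compares the empirical characteristic function of Poisson samples with Eq. (1.110) and estimates the first three cumulants with scipy's k-statistics. The value λ = 4 and the frequency ω = 0.7 are illustrative assumptions.

```python
# Sketch: empirical check of Eq. (1.110) and of the cumulants of the Poisson
# distribution; scipy.stats.kstat returns unbiased estimates of the cumulants.
import numpy as np
from scipy import stats

lam = 4.0                                   # illustrative parameter lambda
rng = np.random.default_rng(3)
x = rng.poisson(lam, 500_000)

w = 0.7                                     # an arbitrary frequency omega
phi_mc = np.mean(np.exp(1j * w * x))
phi_exact = np.exp(lam * (np.exp(1j * w) - 1.0))
print("phi_MC    =", phi_mc)
print("phi_exact =", phi_exact)

print("kappa_1..3:", [stats.kstat(x, k) for k in (1, 2, 3)])   # all close to lambda
```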
-------------

The characteristic function Eq. (1.97) can easily be extended to describe more than one random variable. Let X, Y and Z be three random variables. Then the joint characteristic function is given by

φXYZ(ω1,ω2,ω3) = E[e^{i(ω1X+ω2Y+ω3Z)}]   (1.111)

and the log-characteristic function is given by

ψXYZ(ω1,ω2,ω3) = loge φXYZ(ω1,ω2,ω3)   (1.112)

The joint characteristic function can be obtained by integration of the joint density in the usual manner (cf. Eq. (1.97)),

φXYZ(ω1,ω2,ω3) = ∫∫∫ e^{i(ω1x+ω2y+ω3z)} p(x,y,z) dx dy dz   (1.113)

Thus the joint characteristic function is, but for the sign of the exponent, the triple Fourier transform of the joint density function. The inverse relation is obtained in the same way as for the one-dimensional characteristic function (cf. Eq. (1.98)),

p(x,y,z) = (1/2π)³ ∫∫∫ φXYZ(ω1,ω2,ω3) e^{−i(ω1x+ω2y+ω3z)} dω1 dω2 dω3   (1.114)

The marginal characteristic functions can be defined and expressed in the same manner as marginal densities, Eq. (1.55). For instance, putting ω2 = ω3 = 0 in Eq. (1.113), the one-dimensional characteristic function is obtained through double integration, i.e.

φX(ω) = E[e^{iωX}] = φXYZ(ω,0,0)   (1.115)

Now consider the sum variable Z = aX + bY. The characteristic function of Z is

φZ(ω) = E[e^{iωZ}] = E[e^{i(aωX+bωY)}]

The last exponential is the same as in Eq. (1.111), where ω1 and ω2 have been replaced by aω and bω. Therefore,

φZ(ω) = φXY(aω,bω)

If the two variables X and Y are independent, then E[e^{i(ω1X+ω2Y)}] = E[e^{iω1X}]·E[e^{iω2Y}] and

φXY(ω1,ω2) = φX(ω1)·φY(ω2)   (1.116)

Using the inverse transform, Eq. (1.114), it is easy to show that the converse statement is also true.

Example 1.27
The two discrete random variables X and Y are independent and Poisson distributed, Ex. 1.26, with the parameters α and β respectively. What is the distribution of the sum variable Z = X+Y?

Solution: From Ex. 1.26, Eq. (1.110),

φX(ω) = exp[α(e^{iω} − 1)] and φY(ω) = exp[β(e^{iω} − 1)]

By Eq. (1.116),

φZ(ω) = φX(ω)·φY(ω) = exp[(α+β)(e^{iω} − 1)]

so the sum variable Z is also Poisson distributed with the parameter (α+β). It can be proven that the converse statement is also true.

1.2.10 Laws of Large Numbers

In many cases the result or outcome of an action that is repeated a great number of times will gradually tend to a pattern that can be predicted, at least intuitively. For instance, tossing a coin many times, heads will come up in 50% of the tosses on the average. In other words, the fraction of heads among the tosses will tend to 1/2 as the number of tosses becomes sufficiently large. As another example, consider a basket that contains 100 yellow and 300 white golf balls. If a single ball is taken out at random, there is no obvious pattern to be noticed. However, when a large number of balls have been taken out of the basket, one would expect the ratio of yellow to white golf balls thus extracted to be 1/3. It is useful to dress this simple probabilistic behaviour up as a theorem. Actually, it is this theorem, called the law of large numbers, that makes probability theory useful as a basic tool for statistical evaluations.

Let Xi, i = 1,2,...,n, be independent random variables, all with the mean value μ and variance σ². By forming the statistical average, a new random variable Y is formed,

Y = (X1 + X2 + ··· + Xn)/n

The mean value of Y is easily obtained, E[Y] = μ, and the variance is Var[Y] = σ²/n. The law of large numbers may then be stated as follows. For any prescribed constant δ > 0, no matter how small,

P[|Y − μ| ≥ δ] → 0 as n → ∞   (1.117)

that is, the probability that Y and the mean μ remain apart by as much as δ tends to zero for a sufficiently large number of variables. The proof is easily carried out by using the Chebyshev inequality, Eq. (1.81), that is,

P[|Y − μ| ≥ δ] ≤ Var[Y]/δ² = σ²/(nδ²)

so the probability tends to zero as n → ∞.

The law of large numbers provides a theoretical counterpart of the interpretation of probability as the favourable fraction that was discussed in Section 1.1.1, Eq. (1.1). For this purpose, consider first a discrete random variable X that can only assume two different values with probabilities p and q = 1−p and is therefore a Bernoulli-type variable, Eq. (1.30). Without loss of generality, the two values can be taken as 0 and 1. Its probability distribution can be conveniently expressed by the following formula,

P[X = k] = p^k(1−p)^{1−k} , for k = 0 or 1

Its probability density function can be written using the delta function, Eq. (1.33):

p(x) = (1−p)δ(x) + pδ(x−1)

The expectation of any function g(X) of a Bernoulli variable is then easily obtained by integration, that is,

E[g(X)] = g(1)p + g(0)(1−p)

For instance, the k-th moment E[X^k] = p, so μ = p. Now if a long sequence of n Bernoulli trials is carried out with j successes (e.g. X = 1 is a success), the law of large numbers states that

P[|j/n − p| ≥ δ] → 0 as n → ∞   (1.118)

That is, the probability that the favourable fraction j/n differs from the actual probability p by more than a small prescribed number δ > 0 tends to zero for a large number of trials n. Sometimes a distinction is made as to whether the convergence implied by Eq. (1.118) is strong or weak.
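The behaviour expressed by Eq. (1.118), together with the Chebyshev bound used in the proof, can be illustrated with a small simulation of Bernoulli trials; p, δ and the number of repetitions below are illustrative choices.

```python
# Sketch: the favourable fraction j/n of Bernoulli(p) successes approaches p,
# Eq. (1.118); the Chebyshev bound used in the proof is p(1-p)/(n*delta**2).
import numpy as np

p, delta = 0.3, 0.01
rng = np.random.default_rng(4)
for n in (100, 10_000, 1_000_000):
    frac = rng.binomial(n, p, size=2000) / n          # 2000 repetitions of n trials
    prob_exceed = np.mean(np.abs(frac - p) >= delta)
    bound = min(p * (1 - p) / (n * delta**2), 1.0)
    print(f"n={n:>9}: P[|j/n - p| >= {delta}] ~ {prob_exceed:.3f}"
          f"   (Chebyshev bound {bound:.3f})")
```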
The weak law of large numbers as stated above is a special case of Khinchine's theorem, which states that a random variable Y, which is the average value of n IID random variables {Xi}, converges to μ in probability even if the variances Var[Xi] are infinite, that is,

lim(n→∞) P[|Y − μ| ≥ δ] = 0   (1.119)

The strong law of large numbers, on the other hand, implies that the convergence is stronger, as the name indicates, that is,

P[ lim(n→∞) Y = μ ] = 1   (1.120)

which states that Y converges to μ with probability one.

The weak law of large numbers can be proved by applying the characteristic function of the common variable X. By Eq. (1.108), the first two terms of the expansion of the characteristic function are given by

φ(ω) = 1 + iμω + o(ω)

The first derivative at ω = 0 is iμ and is continuous there. The log-characteristic function, Eq. (1.102), is ψ(ω) = loge φ(ω), and its first derivative at ω = 0 is likewise iμ. Now let φ*(ω) be the characteristic function of the average value Y. Since Y is the sum of the n independent terms Xi/n, φ*(ω) = [φ(ω/n)]^n, and it can be interpreted by studying the log-characteristic function ψ*(ω) = loge φ*(ω),

ψ*(ω) = n·ψ(ω/n) = ω·[ψ(ω/n)/(ω/n)]

In the limit, as n goes to infinity, the last fraction becomes the first derivative of the log-characteristic function taken at ω = 0, that is iμ, so the whole expression yields the value iμω. As the logarithm function is continuous, the characteristic function is recovered by exponentiation,

φ*(ω) = e^{iμω}

which is the characteristic function of a random variable that is a constant, equal to μ! Therefore, the characteristic function of the random variable Y degenerates to the form for which Y is equal to μ, such that Y tends to assume this value in the limit.

1.2.11 The Central Limit Theorem

Whereas the law of large numbers states that the average value of a number of IID variables Xi, i = 1,2,...,n, tends to a constant for large values of n, this does not mean that the sum of the variables approaches a constant multiple of n in the limit. On the contrary, the sum Sn = X1+X2+X3+···+Xn shows increasing variability as the number n grows larger, and the spread of the distribution of Sn becomes larger and larger. The mean value of Sn is obviously equal to nμ and the variance is nσ². Thus, for finite values of μ and σ, as n becomes a very large number, the centre of the distribution of Sn moves off to infinity with a very large spread.

The above revelation about the variable Sn may seem disappointing. It can nevertheless be useful to have some idea about the shape of the distribution of Sn. Actually, it can be shown that Sn has a universal limiting shape, regardless of the distribution of the summands Xi, subject to reasonable restrictions. This remarkable feature of Sn is the kernel of the Central Limit Theorem, which can be stated as follows.

Let Xi, i = 1,2,...,n, be a sequence of independent and identically distributed random variables. The mean and variance of each of the variables Xi are μ and σ². Denote by Sn the sum of the variables. Then the standardized variable

Zn = (Sn − nμ)/(σ√n)

will be asymptotically normally distributed, that is, as n goes to infinity, the distribution function of Zn approaches the standardized normal distribution N(z) with mean 0 and variance 1:

lim(n→∞) P[Zn ≤ z] = N(z)   (1.121)

for any value of z. Hence, approximately for large n,

P[Sn ≤ x] ≈ N((x − nμ)/(σ√n))   (1.121b)
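A minimal sketch of Eq. (1.121): standardized sums of exponential variables, which are themselves strongly skewed, rapidly approach the standard normal distribution function; the exponential parent and the sample sizes are illustrative assumptions.

```python
# Sketch: the standardized sum Z_n = (S_n - n*mu)/(sigma*sqrt(n)) of IID
# Exponential(1) variables (mu = sigma = 1) approaches the standard normal
# distribution function, Eq. (1.121).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
mu = sigma = 1.0
zs = np.array([-1.0, 0.0, 1.0, 2.0])
for n in (2, 10, 100):
    s = rng.exponential(1.0, (50_000, n)).sum(axis=1)
    z = (s - n * mu) / (sigma * np.sqrt(n))
    empirical = [(z <= v).mean() for v in zs]
    print(f"n={n:>3}:", np.round(empirical, 3), " normal N(z):",
          np.round(stats.norm.cdf(zs), 3))
```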
To prove this important theorem, it is assumed that the standard deviation σ of the common basic variable Xi is finite and not equal to zero. Then the characteristic function of the distribution of Xi − μ can be expressed as follows (cf. Eq. (1.108)):

φ(ω) = 1 − σ²ω²/2 + o(ω²)

Now form the variables

Yi = (Xi − μ)/(σ√n)

The characteristic function of Zn = Y1+Y2+Y3+···+Yn is obtained from the characteristic function of the Yi's by raising it to the n-th power,

φZn(ω) = [φ(ω/(σ√n))]^n = [1 − ω²/(2n) + o(ω²/n)]^n

Now recalling that e^{−x} = lim (1−x/n)^n for n → ∞, and also noting that n·o(ω²/n) becomes zero in the limit for an arbitrary value of ω, the characteristic function for Zn is obtained by going to the limit, i.e.

φZn(ω) → e^{−ω²/2}

which is the characteristic function of a standardized normal variable, cf. Eq. (1.101). Since the characteristic function of a random variable is unique, this implies that in the limit Zn has the standardized normal distribution.

The Central Limit Theorem has here been proved for a large number of random contributions that are identically distributed. This condition can be relaxed, and the Central Limit Theorem is also valid for a large sum of small random contributions that are independent but can have different probability laws. As pointed out by Kolmogorov, [5], perhaps the greatest service rendered by the classical Russian school of mathematics is the work done by Chebyshev and his students, Markov, Bernstein and Lyapunov, to formulate, expand and generalize the conditions for the laws of large numbers and the Central Limit Theorem. Under fairly general conditions, Lyapunov and Bernstein could show that for an arithmetic mean of random variables,

Y = (X1+X2+···+Xn)/n   (1.122)

where the Xi are random variables with bounded individual variances, not necessarily independent and not necessarily with identical probability distributions,

lim(n→∞) P[(Y − E[Y])/√Var[Y] ≤ z] = ½[1 + erf(z/√2)]   (1.123)

where the error function,

erf(x) = (2/√π) ∫(0..x) e^{−t²} dt   (1.124)

which is sometimes used in the literature to represent the normal distribution, has been introduced for future reference.

Finally, it can be useful to state the conditions for the above result, Eq. (1.121), for independent variables with different probability distributions. In this dress, the theorem is often referred to as Lyapounov's theorem, [71]. If for a sequence of mutually independent random variables X1,X2,...,Xk,..., with mean values μk, a positive constant δ can be found such that for n → ∞

(1/Bn^{2+δ}) Σ(k=1..n) E[|Xk − μk|^{2+δ}] → 0   (1.125)

where

Bn² = Σ(k=1..n) Var[Xk]   (1.126)

then as n → ∞

P[ Σ(k=1..n)(Xk − μk)/Bn ≤ z ] → N(z)   (1.127)

Example 1.28
The sum of n real numbers is calculated by rounding each number off to the nearest integer. If the round-off error for each number is uniformly distributed over the interval [−0.5, 0.5], find the probability distribution of the round-off error of the sum itself.

Solution: Letting Xi denote the random value of the round-off error, the mean value is obviously equal to zero, as the density is equal to one on the interval [−0.5, 0.5]. The variance of the round-off error, on the other hand, is given by the integral

Var[Xi] = ∫(−0.5..0.5) x² dx = 1/12

The probability that the sum of the round-off errors, denoted Sn, is less than any given number approaches the normal distribution for large values of n, cf. Eq. (1.121b). Hence,

P[Sn ≤ x] ≈ N(x/√(n/12))
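The normal approximation of Example 1.28 can be checked against a direct simulation; the values n = 1000 numbers and the threshold a = 5 below are illustrative assumptions.

```python
# Sketch: Example 1.28 - the total round-off error S_n of n rounded numbers,
# each error uniform on [-0.5, 0.5] with variance 1/12, is approximately N(0, n/12).
import numpy as np
from scipy import stats

n, a = 1000, 5.0                                # n numbers, threshold a
sigma_n = np.sqrt(n / 12.0)
approx = stats.norm.cdf(a / sigma_n) - stats.norm.cdf(-a / sigma_n)

rng = np.random.default_rng(6)
s = rng.uniform(-0.5, 0.5, (20_000, n)).sum(axis=1)
print("P[|S_n| <= a]: normal approximation", round(approx, 4),
      "  Monte Carlo", round(float(np.mean(np.abs(s) <= a)), 4))
```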
Example 1.29 Kolmogorov's Law of Fragmentation
Many physical problems have to do with the fragmentation of a larger piece of material into many small pieces, and also with the size of grains or particles in a large collection. A typical example is the size of gold particles in a large sample of gold sand. Many studies of such problems have shown that the logarithm of the dimension of the grains or fragmented pieces from a large sample will approximately follow the normal distribution. That is, if a typical dimension of a particle is called D = e^Y, then Y = loge D is normally distributed,

p(y) = (1/(σY√(2π)))·exp[−(y − μY)²/(2σY²)]

whereby the lognormal distribution has the following probability density:

p(d) = (1/(d·σY√(2π)))·exp[−(loge d − μY)²/(2σY²)] , 0 < d < ∞   (1.128)

Using the moment generating function of Y = loge D, Eqs. (1.102) and (1.105),

E[D^k] = E[e^{kY}] = M(k) = exp(kμY + σY²k²/2)

the mean value, k = 1, is

μD = exp(μY + ½σY²)   (1.129)

and the variance

σD² = E[D²] − (E[D])² = (exp(σY²) − 1)·exp(2μY + σY²)   (1.130)

The median of D, Ex. 1.13, is easily obtained by noting that

P[D ≤ Md] = P[e^Y ≤ Md] = P[Y ≤ loge Md] = 1/2

whereby loge Md = μY, so Md = exp(μY).

In a short paper, Kolmogorov [111], (reproduced by permission of Prof. Dr. A.N. Shirayev), presented and formalized this interesting probabilistic behaviour, which may be referred to as Kolmogorov's law of fragmentation. This work has perhaps not received the attention that it deserves. In the following, a translation that loosely follows the German text is presented. Since the paper is very short and compact, without many explanations, further explanations by the author and Dr Miguel de Icaza have been added, [88].

Following Kolmogorov, let N(t) be the total number of particles at integer times t, t = 0,1,2,3,..., due to fragmentation of a larger piece of material (rock, gold etc.). Further, N(r,t) is the total number of particles that at time t have a typical dimension (diameter, volume, weight etc.) k ≤ r; that is, the fragmentation of a piece of material of size r can only yield particles that are smaller. Now assume that between times t and t+1 the probability that one single particle of size r is fragmented into n smaller particles is pn. Introduce the relative ratio of the dimension of each of the n fragmented particles to the dimension of the parent piece r as ki = ri/r. Obviously, ki is a random variable Ki and the joint probability distribution can be written as follows:

Fn(a1,a2,...,an) = P[K1 ≤ a1, K2 ≤ a2, ..., Kn ≤ an]   (1.131)

where the n fragmented particles have been numbered according to increasing dimensions, r1 ≤ r2 ≤ r3 ≤ ... ≤ rn, so the distribution Eq. (1.131) is only defined for 0 ≤ a1 ≤ a2 ≤ ... ≤ an ≤ 1. Now, the average number of particles of dimension ≤ kr, called Q(k), that are produced by one single particle of an arbitrary dimension r between times t and t+1 is obviously

Q(k) = Σn pn Σi i·[Fn({k,...,k}i, 1,...,1) − Fn({k,...,k}i+1, 1,...,1)]   (1.132)

where each term in the sum is the probability pn times the probability that the dimensions kr of exactly i of the ordered particles fall below kr. This last statement, Eq. (1.132), is perhaps not as clear to everybody as it was to Kolmogorov. The following explanation may be given. In addition to the above event that a single particle of dimension r is broken up between times t and t+1, define the subevents: (a) J, the event that precisely j particles were produced having dimensions less than or equal to k, and (b) J*, the event that at least j particles, (≥ j), were produced having K ≤ k, in both cases given that in all n particles were produced. Obviously the two events J and (J+1)* are disjoint and J + (J+1)* = J*. Therefore, the conditional probability of the event J, called qj, is given by

P[J|n] = qj = P[J*|n] − P[(J+1)*|n]   (1.133)

The probability qj is then given by Eq. (1.133) and the distribution Eq. (1.131), that is,

qi = Fn({k,k,...,k}i, 1,1,...,1) − Fn({k,k,...,k}i+1, 1,1,...,1)   (1.134)

where the first i (respectively i+1) arguments are equal to k and the remaining ones equal to 1, and Fn({k,k,...,k}i+1, 1,1,...,1) is defined as zero if i = n. It is now possible to write up Q(k) as

Q(k) = Σn pn·mn(k)   (1.135)

where

mn(k) = Σi i·qi   (1.136)

is the conditional mean value of the number of particles with K ≤ k, given that exactly n particles were produced, and qi is the probability of having exactly i particles for which K ≤ k, Eq. (1.134). By introducing this result, i.e. the last sum Eq. (1.136), into Eq. (1.135), the original Kolmogorov equation, Eq. (1.132), is obtained.
The following reasonable assumptions are now made:

(a) the probabilities pn and the distribution Fn do not depend on the absolute dimensions of the particles, nor on the history of the fragmentation, that is, how a particle of any size was broken up at an earlier stage, nor on the destiny of other particles.
(b) the expectation Q(1), that is, the average number of particles produced by fragmentation of one single particle of size r during the time interval [t,t+1], without regard to the size of the fragmented particles k ≤ r, is both bounded and larger than one.
(c) the third absolute moment of loge k with respect to the distribution Q(k)/Q(1) is finite.
(d) at the beginning, t = 0, a certain number n(0) of particles is at hand, which can have an arbitrary distribution of sizes N(r,0).

Under these assumptions, the expectation or average number N(t) of particles at time t is

E[N(t)] = n(0)·(Q(1))^t   (1.137)

Now form the ratio

T(x,t) = E[N(e^x,t)]/E[N(t)]   (1.138)

where the function N(e^x,t) represents the number of particles at time t that have dimension k ≤ r = e^x, with x a real variable. (As pointed out by Kolmogorov, it is completely irrelevant what the dimension k really is.) The real task is now to estimate the function T(x,t). From assumptions (a) and (b), it follows that

E[N(r,t+1)] = ∫(0..1) E[N(r/k,t)] dQ(k)   (1.139)

that is, the average number of particles at time t+1 with dimension ≤ r, E[N(r,t+1)], is obtained by summing over all particles with a dimension ≤ kr that are produced from the particles available at time t, letting k run through all values between 0 and 1. Eq. (1.139) needs further explanation, since it is far from obvious. Take any partition at time t, {k1,k2,...,km}, such that 0 < L ≤ k1 ≤ k2 ≤ k3 ≤ ··· ≤ km = 1, L being a very small number. E[N(r/k(s+1),t)] is the average number of particles with dimension ≤ r/k(s+1). However, these particles are also counted in the number E[N(r/ks,t)]. Therefore, the particles E[N(r,t)] are counted in all the numbers E[N(r/ks,t)], s = 1,2,3,...,m. Now, from the E[N(r,t)] particles,

E[N(r,t)]·Q(1) = E[N(r/km,t)]·Q(km)

particles of dimension ≤ r are produced between t and t+1; from the E[N(r/k(m−1),t)] − E[N(r/km,t)] particles,

(E[N(r/k(m−1),t)] − E[N(r/km,t)])·Q(k(m−1))

particles are produced, and so on, down to the E[N(r/k1,t)] − E[N(r/k2,t)] particles, from which

(E[N(r/k1,t)] − E[N(r/k2,t)])·Q(k1)

particles are produced. Summing up and making the partition finer and finer, L → 0+, the above sum tends to the integral, whereby Eq. (1.139) is obtained. Now put

Q(k) = Q(1)·S(ξ) , ξ = loge k , 0 < k < 1 and −∞ < ξ < 0   (1.140)

Then, by Eqs. (1.138) and (1.139),

T(x,t+1) = ∫ T(x−ξ,t) dS(ξ)   (1.141)

By Eqs. (1.138) and (1.140), the two functions S(x) and T(x,0) = N(e^x,0)/n(0) fulfil all the conditions of probability distribution functions. If x → −∞, then k → 0, and both S(x) and T(x,0) tend to zero. If k → 1, then x → 0, and both functions tend to one. If x > 0, S(x) = S(0) = 1, and the upper limit 0 in integrals involving S(x) can be replaced by ∞. The two functions, therefore, have the same character as ordinary probability distribution functions, even if they should not be considered as such in the strictest sense. From the recurrence equation, Eq. (1.141), the same applies to the function T(x,t) for any integer t > 0.

From assumption (c) it follows that the integral ∫ ξ² dS(ξ) is finite. It is now possible to apply Lyapunov's theorem, Eq. (1.127), which allows the interpretation that for t → ∞,

T(x,t) → N((x − At)/(B√t))   (1.142)

in which

A = ∫ ξ dS(ξ) , B² = ∫ (ξ − A)² dS(ξ)   (1.143)

These integrals can equivalently be expressed as

A = E[loge K] , B² = Var[loge K]   (1.144)

in which A and B² represent the mean value and the variance of the variable loge K, which by Eq. (1.142) is normally distributed in the limit, so K is lognormal. As in many of the above derivations, the final result, Eq. (1.142), is far from obvious. An explanation may be sought as follows.
The recurrence equation Eq. (1.141) is a kind of Stieltjes integral for two distribution functions, [71]. In fact, consider two independent random variables U and V and their sum W = U+V (cf. Ex. 1.21). Then

FW(w) = ∫ FU(w−v) dFV(v)   (1.145)

If S(x) and T(x,t) are considered to be two such distribution functions, that is, T(x,0) is the distribution function of the variable U and S(x) that of V, then by Eqs. (1.141) and (1.145), T(x,1) is the distribution function of the variable U+V1, T(x,2) is the distribution function of the variable U+V1+V2, and T(x,n) is the distribution function of the variable U+V1+V2+···+Vn. Introducing the mean values and the variances of the individual variables and of the sum divided by n (n tending to a large number),

Ak = A = E[Vk] and An/n = E[U+V1+V2+···+Vn]/n = E[U]/n + A → A
Bk² = B² = Var[Vk] and Bn²/n = Var[U+V1+V2+···+Vn]/n = Var[U]/n + B² → B²

where A and B are given by the above integrals, Eq. (1.143). By selecting the arbitrary number δ in Lyapounov's condition, Eq. (1.125), equal to one, it suffices to show that

E[|Vk − Ak|³] ≤ α < ∞

where α is a positive bounded number. From condition (c) it follows that this third absolute moment is indeed finite, so the conditions for Lyapounov's theorem are fulfilled, and it has been shown that T(x,t) has the limiting distribution Eq. (1.142).

The Kolmogorov law of fragmentation has proved very useful in many different and often unrelated problems. For instance, the law of fragmentation has been used for the study of stress release in earthquake fracture zones, and in soil mechanics studies, [129], [131].
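The mechanism behind the law can be imitated with a very simple simulation. Following the convolution interpretation above, the log-size of a particle after t fragmentation steps is treated as a sum of independent log-ratios; the choice of K uniform on (0,1) and the values of t and m are purely illustrative assumptions, not taken from Kolmogorov's paper.

```python
# Sketch: the log-size of a particle after t fragmentation steps treated as a sum
# of independent log-ratios log(K_1) + ... + log(K_t); with K uniform on (0, 1),
# E[log K] = -1 and Var[log K] = 1, so Eqs. (1.142)-(1.143) predict a normal
# log-size with mean A*t = -t and variance B**2*t = t.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
t, m = 200, 50_000                               # fragmentation steps, particles
log_size = np.log(rng.random((m, t))).sum(axis=1)

print("mean of log-size:", round(log_size.mean(), 2), " (theory", -1.0 * t, ")")
print("var  of log-size:", round(log_size.var(), 2),  " (theory", 1.0 * t, ")")
print("skewness        :", round(stats.skew(log_size), 3), "(small for a normal limit)")
```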
1.2.12 Distributions of Extreme Values

In Exs. 1.9 and 1.18, the probability distribution of the largest value of a collection of sample values was obtained, Eq. (1.37). Distributions of extreme values from a population, or from series of measurements or observations of certain physical quantities, show up in many kinds of applications dealing with the assessment or evaluation of the maxima of these quantities. Maximum annual wind speeds at a given location, maximum earthquake magnitudes to be expected in certain areas, and the lowest values of strength to be expected in a material are but a few examples of problems that have to be addressed through the probability distributions of extreme values of the quantities involved. Due to the importance of this particular field of application of probability theory for engineers and other scientists, a brief overview of the distributions of extreme values will be presented, following the classical approach, [57], [73], [87], [176] and [239].

By Eq. (1.68) it was shown that the probability distribution of the largest or extreme value Yn of a collection of n random variables Xi, i = 1,2,...,n, that are independent and identically distributed with the parent distribution F(x), is given by the n-th power of the parent distribution, that is, with y = max{x1,x2,...,xn},

P[Yn ≤ y] = Qn(y) = (F(y))^n   (1.68)

Under certain assumptions, it can be shown that as n grows very large, n → ∞, Qn(y) approaches asymptotically a distribution that depends only on a few characteristic features of the parent distribution F(x). An asymptotic extreme value distribution, therefore, has the common property that its mean value and variance may depend on n, but its form will be independent of n. It is customary to introduce the so-called attraction coefficients αn and βn to facilitate the above implied convergence, such that

lim(n→∞) P[(Yn − αn)/βn ≤ x] = Q(x)   (1.146)

and hence, for large n,

Qn(y) ≈ Q((y − αn)/βn)   (1.147)

The limit Eq. (1.146) can be shown to have, in general, three different solutions or types of limiting distribution functions for large values of n, governed by some simple fundamental characteristics of the parent distribution F(x).

(1) If the random variable X is unlimited and (1−F(x)) is asymptotically equivalent to an exponential function for large x, an asymptotic extreme value distribution of the 1st kind is obtained.
(2) If the random variable X has a lower limit such that (X−α1) can only have positive values, and (1−F(x)) is asymptotically equivalent to cx^(−k) for large x, an asymptotic extreme value distribution of the 2nd kind is obtained.
(3) The asymptotic extreme value distribution of the 3rd kind is obtained when (X−α1) can only assume negative values and (1−F(x)) is asymptotically equivalent to c(α1−x)^k for values of x close to the limit α1.

This result, that regardless of the diverse parent distributions encountered in various applications only three distinct extreme value distributions emerge from the limiting process, is truly remarkable. In Table 1.2, the main properties of the above limiting distribution types are shown, with the attraction coefficients as they turn out during the limiting process. For the first distribution, the attraction coefficient αn, also called the localization parameter, along with the expected value, increases proportionally with loge n, whereas the spread is unaffected. For the second distribution, the localization parameter is unaffected by larger n, but the spread increases proportionally with n^(1/k). For the third distribution, the expected value is also unaffected, whereas the spread decreases proportionally with n^(−1/k).

Table 1.2 Properties of Asymptotic Extreme Value Distributions (after C. Dyrbye et al. [57])

Type   Qi(x)            αn               βn            μn − αn           σn²
I      exp[−e^(−x)]     α1 + β1·loge n   β1            βn·γ (*)          βn²π²/6
II     exp[−x^(−k)]     α1               β1·n^(1/k)    βn·Γ(1−1/k)       βn²[Γ(1−2/k) − Γ²(1−1/k)]
III    exp[−(−x)^k]     α1               β1·n^(−1/k)   −βn·Γ(1+1/k)      βn²[Γ(1+2/k) − Γ²(1+1/k)]

(*) γ = 0.5772... is Euler's constant

The three extreme value distributions that have thus been established have the following properties. For the sake of convenience, the distributions for large positive values (maxima) are presented. The minima are easily obtained by changing the variables as follows:

min{xi} = −max{−xi}   (1.148)

The extreme value distributions for the minima are thus completely analogous.

The first distribution is called either the Gumbel distribution or the Fisher-Tippett type I, after the statisticians who played a major part in the development of the theory of extreme values, and has the form

Q(y) = exp{−exp[−(y−α)/β]} , −∞ < y < ∞   (1.149)

whereas the probability density function is given by

q(y) = (1/β)·exp[−(y−α)/β]·exp{−exp[−(y−α)/β]}   (1.150)

This distribution has the mean value and variance

E[Y] = α + βγ and Var[Y] = β²π²/6   (1.151)

with the main parameters α and β > 0 and an additional constant γ = 0.5772..., an irrational number, the so-called Euler's constant. The Gumbel distribution is widely applicable in problems dealing with extreme values that can take both negative and positive values. However, it can also be applied to variables that are positive only, as its negative tail is very weak. It has been used, for example, to predict maximum wind speeds, floods and maximum earthquake magnitudes. It should also be mentioned that the requirement that all the random variables {Xi} have the same parent distribution can be shown to be relaxable, still giving the Gumbel distribution as the limiting distribution, [87].
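A small sketch of the type I limit: maxima of n unit exponential variables, whose upper tail 1−F(x) = e^(−x) is exactly exponential, should follow a Gumbel distribution with αn ≈ loge n and βn ≈ 1 (Table 1.2 with α1 = 0, β1 = 1), so the mean approaches loge n + γ and the variance π²/6. The sample sizes below are illustrative.

```python
# Sketch: maxima of n Exponential(1) variables approach a Gumbel distribution with
# alpha_n = log(n) and beta_n = 1 (Table 1.2 with alpha_1 = 0, beta_1 = 1), so
# E[Y_n] ~ log(n) + gamma and Var[Y_n] ~ pi**2/6, Eq. (1.151).
import numpy as np

euler_gamma = 0.57722
rng = np.random.default_rng(8)
for n in (50, 2000):
    y = rng.exponential(1.0, (10_000, n)).max(axis=1)   # 10 000 sample maxima
    print(f"n={n:>5}: mean {y.mean():.3f} (theory {np.log(n) + euler_gamma:.3f}),"
          f" var {y.var():.3f} (theory {np.pi**2 / 6:.3f})")
```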
It is often convenient to study the standardized Gumbel distribution, which is introduced by the coordinate transformation z = (y−α)/β, whereby

Q(z) = exp{−exp(−z)} , −∞ < z < ∞   (1.152)

The form of the standardized distribution is shown in Fig. 1.16.

Fig. 1.16 The standardized Gumbel distribution

Finally, the Gumbel distribution is sometimes used as a distribution for minimum values, Y = min{X1,X2,X3,...}, in which case it takes the form

Q(y) = 1 − exp{−exp[(y−α)/β]}

with the mean value and variance

E[Y] = α − βγ and Var[Y] = β²π²/6   (1.153)

The second distribution is often called the Fisher-Tippett type II or the Fréchet distribution, after the French statistician and mathematician, and has the form

Q(y) = exp[−((y−α)/β)^(−k)] , y ≥ α   (1.154)

Usually the lower limit of y is set at 0. The mean value and variance are easily derived:

E[Y] = α + βΓ[1−1/k], k > 1 and Var[Y] = β²{Γ[1−2/k] − Γ²[1−1/k]}, k > 2   (1.155)

The Fréchet distribution is convenient when the extreme values are positive only. It has found use in wind engineering, for example, to predict maximum wind speeds, [4]. Often the coefficient α is disregarded so as to have a two-parameter distribution only. In this form the distribution is written as follows:

Q(y) = exp[−(y/β)^(−k)] , y ≥ 0   (1.156)

The expectation and the coefficient of variation are then given as follows:

E[Y] = βΓ[1−1/k] , VY = √(Γ[1−2/k]/Γ²[1−1/k] − 1)   (1.157)

which shows that the coefficient of variation depends only on the second parameter k, making it possible to calculate k if the variation of the data is known (see Ex. 1.32). The general form of the Fréchet distribution is shown in Fig. 1.17.

Fig. 1.17 The Fréchet distribution with the parameters β = 1 and k = 3

There exists a relation between the Gumbel distribution (type I) and the Fréchet distribution (type II) similar to that between the normal and lognormal distributions. If Y is a Fréchet variable with the parameters βF and k, then Z = loge Y is a Gumbel variable with the parameters α = loge βF and 1/βG = k, whereby

P[Y ≤ y] = P[Z ≤ loge y] = exp{−exp[−k(loge y − loge βF)]} = exp[−(y/βF)^(−k)]   (1.158)

The third distribution, often called the Fisher-Tippett type III or the Weibull distribution, after the Swedish engineer and statistician, has the form for maxima

Q(y) = exp[−((α−y)/β)^k] , y ≤ α   (1.159)

but is more often used for the distribution of minima in the form (cf. Eq. (1.148))

Q(y) = 1 − exp[−((y−α)/β)^k] , β, k > 0 , α > −∞   (1.160)

Its density function is given by

q(y) = (k/β)·((y−α)/β)^(k−1)·exp[−((y−α)/β)^k] , y ≥ α   (1.161)

The mean value and variance of the Weibull distribution are found to be

E[Y] = α + βΓ[1+1/k] and Var[Y] = β²{Γ[1+2/k] − Γ²[1+1/k]}   (1.162)

in which Γ(x) is the Gamma function with the basic properties Γ(1+x) = xΓ(x) and Γ(½) = √π. Setting the lower limit for y equal to zero (α = 0) gives the following simpler version of the distribution:

Q(y) = 1 − exp[−λy^k]   (1.163)

where λ = 1/β^k. Two important special cases are found for k = 1 and k = 2. In the first case,

Q(y) = 1 − exp[−λy]   (1.164)

which is the exponential distribution. In the second case (k = 2),

Q(y) = 1 − exp[−y²/(2c²)]
q(y) = (y/c²)·exp[−y²/(2c²)] , 0 ≤ y < ∞   (1.165)

where c is the mode of the distribution (cf. Ex. 1.13). This is the classical Rayleigh distribution, which has found widespread use in various applications (cf. Ex. 2.3). The Weibull distribution has found use in material science, dealing with material strength, in its form for distributions of minima, but can also be used in applications involving maxima. It can be used for wind speed distributions as well. In Section 4.4, various forms of the Weibull distribution are shown in connection with the distribution of extreme peaks of a random signal.

The above presentation of the extreme value distributions has been both short and superficial.
It is only included for reference, that is, so that the reader has easy access to the main formulas. For further details, the reader should consult Gumbel, [73], or any other modern text on extreme value statistics, [87], [176]. Some interesting applications and an overview of the most recent developments in the theory of extreme values are found in the proceedings of a conference commemorating the centennial anniversary of Emil Gumbel's birthday, [68].
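For reference, the Weibull minima form Eq. (1.160) corresponds to scipy.stats.weibull_min with shape k, location α and scale β; the sketch below checks Eq. (1.162) against that implementation for an arbitrary illustrative parameter set.

```python
# Sketch: the Weibull minima form Eq. (1.160) as scipy.stats.weibull_min with
# shape k, location alpha and scale beta; mean and variance compared with Eq. (1.162).
import numpy as np
from scipy import stats
from scipy.special import gamma as G

alpha, beta, k = 2.0, 3.0, 1.8                   # illustrative parameters
mean_eq = alpha + beta * G(1.0 + 1.0 / k)
var_eq = beta**2 * (G(1.0 + 2.0 / k) - G(1.0 + 1.0 / k)**2)

dist = stats.weibull_min(k, loc=alpha, scale=beta)
print("mean: Eq. (1.162)", mean_eq, " scipy", dist.mean())
print("var : Eq. (1.162)", var_eq,  " scipy", dist.var())
```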