DISTRIBUTIONS ON THE SIMPLEX

G. Mateu-Figueras, V. Pawlowsky-Glahn, and C. Barceló-Vidal
Universitat de Girona, Girona; gloria.mateu@udg.es

Abstract

The simplex, the sample space of compositional data, can be structured as a real Euclidean space. This fact allows us to work with the coefficients with respect to an orthonormal basis. Over these coefficients we apply standard real analysis; in particular, we define two different laws of probability through the density function and we study their main properties.

1 Algebraic-geometric structure on $S^D$

Any vector $x = (x_1, x_2, \ldots, x_D)'$ representing proportions of some whole is subject to the unit-sum constraint $\sum_{i=1}^{D} x_i = 1$. Therefore, a suitable sample space for compositional data, consisting of such vectors of proportions (compositions), can be taken to be the unit simplex, defined as
\[ S^D = \Big\{ (x_1, x_2, \ldots, x_D)' : x_1 > 0, x_2 > 0, \ldots, x_D > 0, \ \sum_{i=1}^{D} x_i = 1 \Big\}. \]
A real vector space structure is induced by the perturbation operation, defined for any two compositions $x, x^* \in S^D$ as $x \oplus x^* = \mathcal{C}(x_1 x_1^*, x_2 x_2^*, \ldots, x_D x_D^*)'$, and the power transformation, defined for any $x \in S^D$ and $\alpha \in \mathbb{R}$ as $\alpha \odot x = \mathcal{C}(x_1^{\alpha}, x_2^{\alpha}, \ldots, x_D^{\alpha})'$. Here $\mathcal{C}(\cdot)$ denotes the closure operation, a transformation from $\mathbb{R}_+^D$ to $S^D$ that converts each vector into its composition. The Euclidean space structure is induced by the inner product defined by Aitchison (2002) for any two compositions $x, x^* \in S^D$ as
\[ \langle x, x^* \rangle_a = \frac{1}{D} \sum_{i<j} \ln\frac{x_i}{x_j} \ln\frac{x_i^*}{x_j^*}. \qquad (1) \]
The associated norm is $\|x\|_a = \sqrt{\langle x, x \rangle_a}$ and the associated distance, known as the Aitchison distance, is
\[ d_a(x, x^*) = \sqrt{ \frac{1}{D} \sum_{i<j} \Big( \ln\frac{x_i}{x_j} - \ln\frac{x_i^*}{x_j^*} \Big)^{2} }. \qquad (2) \]
The distance (2) satisfies the standard properties of a distance. Given $x, x^*, x_0 \in S^D$ and $\alpha \in \mathbb{R}$, it is easy to prove that $d_a(x, x^*) = d_a(x_0 \oplus x, x_0 \oplus x^*)$ and $d_a(\alpha \odot x, \alpha \odot x^*) = |\alpha|\, d_a(x, x^*)$. Thus we say that the distance $d_a$ is coherent, or compatible, with the structure of $S^D$. Also, the distance (2) does not depend on the particular order of the parts, because $d_a(Px, Px^*) = d_a(x, x^*)$ for any permutation matrix $P$. These properties were studied by Aitchison (1992) and Martín-Fernández (2001); for a proof see Pawlowsky-Glahn and Egozcue (2002, p. 269).

A $D$-part composition is usually expressed in terms of the canonical basis of $\mathbb{R}^D$, using the sum and the scalar product operations. In fact, any $x \in S^D$ can be written as $x = (x_1, x_2, \ldots, x_D)' = x_1 (1, 0, \ldots, 0)' + x_2 (0, 1, 0, \ldots, 0)' + \cdots + x_D (0, \ldots, 0, 1)'$. Note that the canonical basis of $\mathbb{R}^D$ is not a basis of $S^D$, and this expression is not a linear combination in $S^D$ with respect to its vector space structure. But, given that $S^D$ is a vector space of dimension $D-1$, general algebra theory assures the existence of a basis. Before we introduce a basis of $S^D$, note that the set $B^* = \{w_1, w_2, \ldots, w_D\}$, where $w_i = \mathcal{C}(1, 1, \ldots, e, \ldots, 1)'$ with the element $e$ placed in the $i$-th row, is a generating set of $S^D$. Certainly, any composition $x \in S^D$ can be expressed as, for example, $x = (\ln x_1 \odot w_1) \oplus (\ln x_2 \odot w_2) \oplus \cdots \oplus (\ln x_D \odot w_D)$. Obviously other expressions are possible, such as $x = (\ln(x_1/g(x)) \odot w_1) \oplus (\ln(x_2/g(x)) \odot w_2) \oplus \cdots \oplus (\ln(x_D/g(x)) \odot w_D)$, where $g(x)$ is the geometric mean of the composition $x$. Given $x \in S^D$, we say that one possible vector of coefficients with respect to the generating set $B^*$ is
\[ \mathrm{clr}(x) = \Big( \ln\frac{x_1}{g(x)}, \ln\frac{x_2}{g(x)}, \ldots, \ln\frac{x_D}{g(x)} \Big)'. \qquad (3) \]
We use the notation $\mathrm{clr}(x)$ to emphasize the similarity with the vector obtained by applying the centred logratio transformation to the composition $x$ (Aitchison, 1986, p. 94).
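To make these operations concrete, the following short sketch (not part of the original paper; the names closure, perturb, power, clr and a_dist are illustrative choices) implements the closure, perturbation and power transformation, the coefficients (3), and the Aitchison distance (2), using the fact that (2) coincides with the ordinary Euclidean distance between the vectors (3); the final lines check the compatibility of the distance with $\oplus$ and $\odot$.

```python
import numpy as np

def closure(v):
    """C(v): rescale a strictly positive vector so that its parts sum to 1."""
    v = np.asarray(v, dtype=float)
    return v / v.sum()

def perturb(x, y):
    """Perturbation x (+) y: component-wise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(alpha, x):
    """Power transformation alpha (.) x: component-wise power followed by closure."""
    return closure(np.asarray(x, dtype=float) ** alpha)

def clr(x):
    """Coefficients (3): log-parts centred by the log geometric mean."""
    logx = np.log(x)
    return logx - logx.mean()

def a_dist(x, y):
    """Aitchison distance (2), computed as the Euclidean distance of the clr vectors,
    which coincides with the pairwise-logratio expression in the text."""
    return np.linalg.norm(clr(x) - clr(y))

# compatibility of the distance with perturbation and powering
x  = closure([1.0, 2.0, 3.0])
xs = closure([2.0, 1.0, 1.0])
x0 = closure([5.0, 1.0, 4.0])
print(np.isclose(a_dist(x, xs), a_dist(perturb(x0, x), perturb(x0, xs))))      # True
print(np.isclose(a_dist(power(2.5, x), power(2.5, xs)), 2.5 * a_dist(x, xs)))  # True
```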
Note that, as for the centred logratio vector, the sum of the components of the vector $\mathrm{clr}(x)$ is equal to 0. As the dimension of $S^D$ is $D-1$, we can eliminate any composition of $B^*$ to obtain a basis of $S^D$. If we eliminate the last composition $w_D$ we obtain the basis $B = \{w_1, w_2, \ldots, w_{D-1}\}$, and now a composition $x$ has a unique expression as a linear combination $x = (\ln(x_1/x_D) \odot w_1) \oplus (\ln(x_2/x_D) \odot w_2) \oplus \cdots \oplus (\ln(x_{D-1}/x_D) \odot w_{D-1})$. Thus we say that the coefficients of $x$ with respect to the basis $B$ are
\[ \mathrm{alr}(x) = \Big( \ln\frac{x_1}{x_D}, \ln\frac{x_2}{x_D}, \ldots, \ln\frac{x_{D-1}}{x_D} \Big)'. \qquad (4) \]
We use the notation $\mathrm{alr}(x)$ to emphasize the similarity with the vector obtained by applying the additive logratio transformation to the composition $x$ (Aitchison, 1986, p. 113). We could suppress any other composition $w_i$ ($i = 1, 2, \ldots, D-1$) of the set $B^*$ instead of $w_D$ to obtain another basis. In this case, the components of $x$ with respect to this new basis are $(\ln(x_1/x_i), \ldots, \ln(x_{i-1}/x_i), \ln(x_{i+1}/x_i), \ldots, \ln(x_D/x_i))'$.

The inner product (1) and its associated norm ensure the existence of an orthonormal basis. With simple algebraic operations we can check that the set $\{w_1, w_2, \ldots, w_{D-1}\}$ is not orthonormal. Using the Gram-Schmidt method we can obtain from this basis an orthonormal basis, denoted $\{e_1, e_2, \ldots, e_{D-1}\}$. A composition $x$ has a unique expression as a linear combination $x = (\langle x, e_1 \rangle_a \odot e_1) \oplus (\langle x, e_2 \rangle_a \odot e_2) \oplus \cdots \oplus (\langle x, e_{D-1} \rangle_a \odot e_{D-1})$. Thus we say that the coefficients of any composition $x \in S^D$ with respect to the orthonormal basis $\{e_1, e_2, \ldots, e_{D-1}\}$ are
\[ \mathrm{ilr}(x) = (\langle x, e_1 \rangle_a, \langle x, e_2 \rangle_a, \ldots, \langle x, e_{D-1} \rangle_a)'. \qquad (5) \]
We use the notation $\mathrm{ilr}(x)$ to emphasize the similarity with the vector obtained by applying the isometric logratio transformation to the composition $x$ (Egozcue et al., 2003, p. 294). As in real space, there exists an infinite number of orthonormal bases of $S^D$.

Aitchison (1986, p. 92) provides the relationship between the alr and clr transformations. Later, Egozcue et al. (2003, p. 298) provide the relationships between the alr, clr and ilr transformations. These can now be interpreted as changes of basis, or of generating system, relating the coefficients $\mathrm{alr}(x)$, $\mathrm{clr}(x)$ and $\mathrm{ilr}(x)$:
\[ \mathrm{alr}(x) = F_{D-1,D}\,\mathrm{clr}(x), \quad \mathrm{clr}(x) = F_{D,D-1}\,\mathrm{alr}(x), \quad \mathrm{ilr}(x) = U'_{D,D-1}\,\mathrm{clr}(x), \quad \mathrm{clr}(x) = U_{D,D-1}\,\mathrm{ilr}(x), \]
\[ \mathrm{alr}(x) = F_{D-1,D}\, U_{D,D-1}\,\mathrm{ilr}(x), \quad \mathrm{ilr}(x) = U'_{D,D-1}\, F_{D,D-1}\,\mathrm{alr}(x), \qquad (6) \]
where $F_{D-1,D} = [I_{D-1} : -j_{D-1}]$, with $I_{D-1}$ the identity matrix of order $D-1$ and $j_{D-1}$ the column vector of units; $U_{D,D-1}$ is a $D \times (D-1)$ matrix with the coefficients $\mathrm{clr}(e_i)$ ($i = 1, 2, \ldots, D-1$) as columns; and $F_{D,D-1}$ is the Moore-Penrose generalized inverse of the matrix $F_{D-1,D}$, given by
\[ F_{D,D-1} = \frac{1}{D} \begin{pmatrix} D-1 & -1 & \cdots & -1 \\ -1 & D-1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & D-1 \\ -1 & -1 & \cdots & -1 \end{pmatrix}. \]
Over the coefficients with respect to an orthonormal basis we can apply standard real analysis. Certainly, it is easy to see that the operations $\oplus$ and $\odot$ are equivalent to the sum and the scalar product of the respective coefficients with respect to any basis (not necessarily orthonormal). In the particular case of coefficients with respect to an orthonormal basis, we can also apply the standard inner product and the Euclidean distance of $\mathbb{R}^{D-1}$. We cannot assure these properties for the coefficients with respect to a generating set in general, but in the particular case of the coefficients (3) all of them are fulfilled.
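The construction of an orthonormal basis and of the coefficients (5) can be sketched as follows (an illustration, not the authors' code: it restates the helpers from the previous sketch and builds one particular basis by a QR factorization, which performs the Gram-Schmidt step on the clr coefficients of $w_1, \ldots, w_{D-1}$; any other orthonormal basis would serve equally well, only the matrix $U$ changes).

```python
import numpy as np

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()

def ilr_basis(D):
    """U: D x (D-1) matrix whose columns are clr(e_i) for an orthonormal basis of S^D,
    obtained by Gram-Schmidt (QR) on the clr coefficients of w_1, ..., w_{D-1}."""
    W = np.eye(D) - np.ones((D, D)) / D        # row i holds clr(w_i)
    U, _ = np.linalg.qr(W[:D - 1].T)           # orthonormalize those D-1 vectors
    return U

def ilr(x, U):
    """Coefficients (5): <x, e_i>_a equals the Euclidean product of clr(x) and clr(e_i)."""
    return U.T @ clr(x)

def ilr_inv(v, U):
    """Recover the composition from its coefficients: closure of exp(U v)."""
    return closure(np.exp(U @ v))

D = 4
U = ilr_basis(D)
x = closure([1.0, 2.0, 3.0, 4.0])
v = ilr(x, U)
print(np.allclose(ilr_inv(v, U), x))                            # the coefficients determine x
print(np.allclose(U.T @ U, np.eye(D - 1)))                      # orthonormal basis
print(np.allclose(np.linalg.norm(v), np.linalg.norm(clr(x))))   # isometry with the clr vector
```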
Following Aitchison (1986), given a composition $x \in S^D$ we may wish to focus attention on the relative magnitude of a subset of components, so we need the definition of a subcomposition. As stated in Aitchison (1986), the formation of a $C$-part subcomposition $s$ from a $D$-part composition $x$ can be achieved as $s = \mathcal{C}(Sx)$, where $S$ is a $C \times D$ selection matrix with $C$ elements equal to 1 (one in each row and at most one in each column) and the remaining elements equal to 0. A subcomposition can be regarded as a composition in a simplex of lower dimension. When we work with compositions of many parts, it may also be of interest to amalgamate components to form a new composition. An amalgamation is any composition in a simplex of lower dimension obtained by summing some groups of components of the original composition.

A random composition $x$ is a random vector with $S^D$ as its domain. Aitchison (1997) uses the geometric interpretation of the expected value of a random vector to define the centre of a random composition as the composition $\mathrm{cen}[x]$ which minimizes the expression $\mathrm{E}[d_a^2(x, \mathrm{cen}[x])]$. We obtain $\mathrm{cen}[x] = \mathcal{C}(\exp(\mathrm{E}[\ln x]))$ or, equivalently, $\mathrm{cen}[x] = \mathcal{C}(\exp(\mathrm{E}[\ln(x/g(x))]))$ (Aitchison, 1997, p. 10; Pawlowsky-Glahn and Egozcue, 2002, p. 270); the latter authors also include the equality
\[ \mathrm{alr}(\mathrm{cen}[x]) = \mathrm{E}[\mathrm{alr}(x)]. \qquad (7) \]
Using (6) and (7) we can prove the equalities
\[ \mathrm{clr}(\mathrm{cen}[x]) = \mathrm{E}[\mathrm{clr}(x)], \qquad \mathrm{ilr}(\mathrm{cen}[x]) = \mathrm{E}[\mathrm{ilr}(x)]. \qquad (8) \]
This means that the coefficients of the composition $\mathrm{cen}[x]$ with respect to the basis $B$, to the generating set $B^*$ and to an orthonormal basis of $S^D$ are equal to the expected values of the coefficients of $x$ with respect to the basis $B$, the generating set $B^*$ and an orthonormal basis of $S^D$, respectively. Observe that we can compute $\mathrm{E}[\mathrm{alr}(x)]$, $\mathrm{E}[\mathrm{clr}(x)]$ and $\mathrm{E}[\mathrm{ilr}(x)]$ applying the standard definition, and afterwards obtain the related composition $\mathrm{cen}[x]$ through the corresponding linear combination. This result confirms that working with the coefficients with respect to a basis and applying the standard methodology is equivalent to working directly with compositions. In particular, for the centre of a random composition, we can use the alr, clr or ilr coefficients without distinction. More generally, Pawlowsky-Glahn and Egozcue (2001, p. 388) prove the equality $h(\mathrm{cen}[x]) = \mathrm{E}[h(x)]$ for any isomorphism $h$.

The variance of a real random vector can be interpreted as the expected value of the squared Euclidean distance around its expected value. This interpretation is used by Pawlowsky-Glahn and Egozcue (2002, p. 264) to define the metric variance of a random composition $x$ as $\mathrm{Mvar}[x] = \mathrm{E}[d_a^2(x, \mathrm{cen}[x])]$. The coefficients with respect to an orthonormal basis and with respect to the generating set $B^*$ provide two equivalent expressions for the metric variance:
\[ \mathrm{Mvar}[x] = \mathrm{E}\big[d_{eu}^2\big(\mathrm{ilr}(x), \mathrm{ilr}(\mathrm{cen}[x])\big)\big] = \mathrm{E}\big[d_{eu}^2\big(\mathrm{clr}(x), \mathrm{clr}(\mathrm{cen}[x])\big)\big]. \]
In this case it is not possible to use the Euclidean distance between the alr coefficients, because the basis $B$ is not orthonormal; consequently, the Aitchison distance between $x$ and $\mathrm{cen}[x]$ is not equal to the Euclidean distance between the respective alr coefficients. Certainly, Pawlowsky-Glahn and Egozcue (2001) prove the equality $\mathrm{Mvar}[x] = \mathrm{E}[d_{eu}^2(h(x), \mathrm{E}[h(x)])]$ for any isometry $h$, but we know that the alr transformation is not an isometry.

Aitchison (1997, p. 13) defines the total variability as $\mathrm{totvar}[x] = \mathrm{trace}(\Gamma)$, where $\Gamma$ is the standard covariance matrix of the coefficients (3). Later, Pawlowsky-Glahn and Egozcue (2002, p. 265) obtain the equality $\mathrm{Mvar}[x] = \mathrm{totvar}[x]$. Using the equalities (6) we can prove that $\mathrm{totvar}[x] = \mathrm{trace}(\Sigma)$, where $\Sigma$ is the standard covariance matrix of the coefficients (5). As $\Sigma = \mathrm{var}(\mathrm{ilr}(x)) = \mathrm{var}(U'\,\mathrm{clr}(x)) = U'\,\mathrm{var}(\mathrm{clr}(x))\,U = U'\Gamma U$, we obtain $\mathrm{trace}(\Sigma) = \mathrm{trace}(U'\Gamma U) = \mathrm{trace}(\Gamma U U') = \mathrm{trace}(\Gamma)$, because the matrix $UU'$ has the rows of $\Gamma$ as eigenvectors with eigenvalue 1. For the same reason it is not possible to use the covariance matrix of the coefficients (4): when we compute a covariance we implicitly use a distance, and we know that the Aitchison distance between two compositions is not equal to the Euclidean distance between the respective alr coefficients.
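As a small illustration of the centre and the metric variance (again a sketch rather than the paper's code; the Dirichlet sample is only a convenient way of generating strictly positive compositions), the estimators below follow the expressions just quoted: the centre as the closed exponential of the mean log-composition, and Mvar as the mean squared Aitchison distance to it, computed through clr coefficients.

```python
import numpy as np

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)

def clr_rows(X):
    """clr coefficients of every row of a matrix of compositions."""
    L = np.log(X)
    return L - L.mean(axis=1, keepdims=True)

def center(X):
    """cen[x] = C(exp(E[ln x])), estimated by the sample mean of the log-composition."""
    return closure(np.exp(np.log(X).mean(axis=0)))

def metric_variance(X):
    """Mvar[x] = E[d_a^2(x, cen[x])], estimated as the mean squared Euclidean distance
    of the clr coefficients from their mean (which is the clr of the estimated centre)."""
    Z = clr_rows(X)
    return ((Z - Z.mean(axis=0)) ** 2).sum(axis=1).mean()

rng = np.random.default_rng(0)
X = rng.dirichlet([2.0, 3.0, 5.0], size=2000)        # 2000 three-part compositions
print(center(X))
print(metric_variance(X),
      np.trace(np.cov(clr_rows(X), rowvar=False)))   # totvar ~= Mvar (up to the n/(n-1) factor)
```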
We know that $S^D \subset \mathbb{R}^D$ and consequently we can apply the traditional approach to define laws of probability. The additive logistic normal distribution, defined by Aitchison and Shen (1980) and studied by Aitchison (1986), and the additive logistic skew-normal distribution, introduced by Mateu-Figueras et al. (1998) and studied in Mateu Figueras (2003), are two laws of probability defined through transformations from $S^D$ to real space. In the next section we consider the Euclidean space structure of $S^D$ and we define a law of probability over the coefficients of the random composition with respect to an orthonormal basis. In particular, we define the normal and the skew-normal distributions on $S^D$. To state it clearly: the essential difference between the traditional methodology and the approach we are going to present is the measure assumed on $S^D$. In the traditional approach the measure used is the usual Lebesgue measure on $S^D$, whereas in our approach the measure is the Lebesgue measure on the coefficients with respect to an orthonormal basis of $S^D$.

2 Distributions on $S^D$: some general aspects

Given any measurable space, the Radon-Nikodym derivative of a probability $P$ with respect to a measure $\lambda$ is a measurable and non-negative function $f(\cdot)$ such that the probability of any event $A$ of the corresponding $\sigma$-algebra is
\[ P(A) = \int_A f(x)\, d\lambda(x). \]
We also call $f(\cdot)$ the density function of the probability $P$ with respect to the measure $\lambda$. When we work with random variables or random vectors in real space, we use the density function with respect to the Lebesgue measure and we call it "the density function of the random variable"; we do not mention the measure because it is understood to be the Lebesgue measure. The Lebesgue measure has an important role in real analysis because it is invariant under translations.

On $S^D$ we can define quite straightforwardly a law of probability through the density function with respect to a measure on $S^D$. But we are interested in finding a measure similar to the Lebesgue measure in real space; in particular, we look for a measure invariant under the operation $\oplus$, the internal operation of $S^D$. Taking into account the distance $d_a$, we conclude that the Lebesgue measure of real space is not adequate on $S^D$. Nevertheless, note that if $\lambda$ is not the usual Lebesgue measure in real space, we cannot make effective the calculation of any integral. As we have seen in the previous section, $S^D$ has a $(D-1)$-dimensional Euclidean space structure, and we can apply the isomorphism between $S^D$ and $\mathbb{R}^{D-1}$: we only have to identify each element of $S^D$ with its vector of coefficients with respect to an orthonormal basis. In this case we can introduce the density function of the coefficients with respect to the Lebesgue measure and apply all the standard probability theory. This function allows us to compute the probability of any event by means of an ordinary integral, i.e., if $f(\cdot)$ is the density function of the coefficients with respect to an orthonormal basis, we can compute the probability of any event $A \subset S^D$ as
\[ P(A) = \int_{A^*} f(v_1, v_2, \ldots, v_{D-1})\, dv_1\, dv_2 \cdots dv_{D-1}, \]
where $A^*$ and $(v_1, v_2, \ldots, v_{D-1})$ represent the coefficients, with respect to the orthonormal basis, that characterize the set $A$ and the composition $x$.
We have to bear in mind that if we use this methodology to compute any element of the support space, we will obtain the coefficients of this element with respect to the orthonormal basis used; next, we can obtain the corresponding composition by means of a linear combination. Mateu-Figueras and Pawlowsky-Glahn (2003) and Pawlowsky-Glahn et al. (2003) compute, respectively, the expected value of a normal model on $\mathbb{R}_+$ and the expected value of a bivariate normal model on $\mathbb{R}_+^2$ using this principle of working on coefficients with respect to an orthonormal basis. In our case it will be easy to compute the expected value of the coefficients of any random composition $x$, i.e. $\mathrm{E}[\mathrm{ilr}(x)]$. Interpreting these values as coefficients with respect to our basis of reference, we obtain, through a linear combination, the composition that we call the expected value and denote by $\mathrm{E}[x]$.

Now we introduce two laws of probability on $S^D$, indicating the expression of the function $f$. In both cases the families of distributions are closed under the $\oplus$ and $\odot$ operations and, moreover, the densities satisfy the equality
\[ f_x(x) = f_{a \oplus x}(a \oplus x), \qquad (9) \]
where $f_x$ and $f_{a \oplus x}$ represent the densities of the random compositions $x$ and $a \oplus x$ respectively, with $a$ a constant composition. This property has important consequences on $S^D$ because we often apply the centring operation (Martín-Fernández, 2001). We also compute the expected value and the covariance matrix using the coefficients with respect to the orthonormal basis and the standard procedures in real space; the results obtained are coherent with the centre and the metric variance of a random composition.

3 The normal distribution on $S^D$

3.1 Definition and properties

Definition 1. Let $(\Omega, \mathcal{F}, p)$ be a probability space. A random composition $x: \Omega \to S^D$ is said to have a normal on $S^D$ distribution with parameters $\mu$ and $\Sigma$ if the density function of its coefficients with respect to an orthonormal basis of $S^D$ is
\[ f_x(x) = (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2} (\mathrm{ilr}(x) - \mu)' \Sigma^{-1} (\mathrm{ilr}(x) - \mu) \Big). \qquad (10) \]
We use the notation $x \sim N_S^D(\mu, \Sigma)$. The subscript $S$ indicates that it is a model on the simplex and the superscript $D$ indicates the number of parts of the composition. We also use the notation $\mu$ and $\Sigma$ to indicate the expected value and the covariance matrix of the vector of coefficients $\mathrm{ilr}(x)$.

The density (10) corresponds to the density function of a normal random vector in $\mathbb{R}^{D-1}$; this is the reason why we call it the normal on $S^D$ law. We want to insist that (10) is the density of the coefficients of $x$ with respect to an orthonormal basis of $S^D$, and therefore it is a Radon-Nikodym derivative with respect to the Lebesgue measure in $\mathbb{R}^{D-1}$, the space of coefficients. This allows us to compute the probability of any event $A \subset S^D$ with an ordinary integral as
\[ P(A) = \int_{A^*} (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2} (\mathrm{ilr}(x) - \mu)' \Sigma^{-1} (\mathrm{ilr}(x) - \mu) \Big)\, d\lambda(\mathrm{ilr}(x)), \]
where $A^*$ represents the coefficients of $A$ with respect to the considered orthonormal basis, and $\lambda$ is the Lebesgue measure in $\mathbb{R}^{D-1}$.
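Definition 1 translates directly into code; the sketch below (illustrative, with made-up parameter values and the helper functions of the earlier sketches restated) evaluates the density (10) by feeding the coefficients $\mathrm{ilr}(x)$ to an ordinary multivariate normal density, and approximates the probability of a simplex event by Monte Carlo in the space of coefficients.

```python
import numpy as np
from scipy.stats import multivariate_normal

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()
def ilr_basis(D): return np.linalg.qr((np.eye(D) - np.ones((D, D)) / D)[:D - 1].T)[0]
def ilr(x, U): return U.T @ clr(x)
def ilr_inv(v, U): return closure(np.exp(U @ v))

D = 3
U = ilr_basis(D)
mu = np.array([0.3, -0.2])                        # illustrative parameters, not from the paper
Sigma = np.array([[0.40, 0.10],
                  [0.10, 0.25]])
law = multivariate_normal(mean=mu, cov=Sigma)     # the law of the coefficients

def pdf_normal_on_simplex(x):
    """Density (10): the classical normal density evaluated at the coefficients of x."""
    return law.pdf(ilr(x, U))

def prob(event, n=100_000, seed=1):
    """P(A) for A a subset of S^D: sample coefficients, map back to compositions, count."""
    V = law.rvs(size=n, random_state=seed)
    return np.mean([event(ilr_inv(v, U)) for v in V])

print(pdf_normal_on_simplex(closure([0.5, 0.3, 0.2])))
print(prob(lambda x: x[0] > 0.5))                 # probability that the first part exceeds 1/2
```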
One important aspect is that the normal on $S^D$ law is equivalent on $S^D$, in terms of probabilities, to the additive logistic normal law studied by Aitchison (1986) and defined using transformations from $S^D$ to real space. Recall that to define the additive logistic normal model we consider $x_D = 1 - x_1 - x_2 - \cdots - x_{D-1}$ and regard $S^D$ as a subset of $\mathbb{R}^{D-1}$. Its density function is the Radon-Nikodym derivative with respect to the Lebesgue measure in $\mathbb{R}^{D-1}$ and is obtained using the additive logratio transformation. Given the relations (6), it is also possible to obtain the density function using the isometric logratio transformation. Then we obtain the probability of any event $A \subset S^D$ computing the standard integral
\[ P(A) = \int_A (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2}\, D^{-1/2} \Big(\prod_{i=1}^{D} x_i\Big)^{-1} \exp\Big( -\tfrac{1}{2} (\mathrm{ilr}(x) - \mu)' \Sigma^{-1} (\mathrm{ilr}(x) - \mu) \Big)\, dx_1\, dx_2 \cdots dx_{D-1}, \qquad (11) \]
where now the vector $\mathrm{ilr}(x)$ denotes the isometric logratio transformation of the composition $x$ (Egozcue et al., 2003), and the parameters $\mu$ and $\Sigma$ are the standard expected value and covariance matrix of the ilr-transformed vector. Note that we use the same notation as in the normal on $S^D$ case because, even though the interpretation is different, the expressions of the vectors $\mathrm{ilr}(x)$, $\mu$ and of the matrix $\Sigma$ are the same. Observe that the above density is obtained through the transformation technique, because it contains the term $D^{-1/2}(\prod_{i=1}^{D} x_i)^{-1}$, the Jacobian of the isometric logratio transformation.

Now it is important to interpret the vector $\mathrm{ilr}(x)$ correctly, either as the isometric logratio vector or as the vector of coefficients with respect to an orthonormal basis. To avoid possible confusion we denote by $v = (v_1, v_2, \ldots, v_{D-1})'$ the coefficients of the composition $x$ with respect to an orthonormal basis, and by $\mathrm{ilr}(x)$ its isometric logratio transformed vector. Thus, given $x \sim N_S^D(\mu, \Sigma)$, we can write the probability of any event $A \subset S^D$ as
\[ P(A) = \int_{A^*} (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2} (v - \mu)' \Sigma^{-1} (v - \mu) \Big)\, dv_1\, dv_2 \cdots dv_{D-1}, \qquad (12) \]
where $A^*$ represents the coefficients of $A$ with respect to the chosen orthonormal basis. In both cases, expressions (11) and (12) are standard integrals of a real-valued function and we can apply all the standard procedures. In particular we can apply the change-of-variable theorem. Taking expression (12) and applying the change $v = \mathrm{ilr}(x)$, whose Jacobian is $D^{-1/2}(\prod_{i=1}^{D} x_i)^{-1}$, the change-of-variable theorem assures the equality
\[ P(A) = \int_{A^*} (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2} (v - \mu)' \Sigma^{-1} (v - \mu) \Big)\, dv_1 \cdots dv_{D-1} \]
\[ = \int_{\mathrm{ilr}^{-1}(A^*)} (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2}\, D^{-1/2}\Big(\prod_{i=1}^{D} x_i\Big)^{-1} \exp\Big( -\tfrac{1}{2} (\mathrm{ilr}(x) - \mu)' \Sigma^{-1} (\mathrm{ilr}(x) - \mu) \Big)\, dx_1 \cdots dx_{D-1}. \]
The coefficients of any element of $S^D$ with respect to an orthonormal basis are equal to its isometric logratio transformation. Thus the coefficients of $A$ are also equal to its isometric logratio transformed event, and consequently $\mathrm{ilr}^{-1}(A^*) = A$, where $\mathrm{ilr}^{-1}$ denotes the inverse transformation. Observe that the probability of any event $A \subset S^D$ is the same using both models, the normal on $S^D$ and the logistic normal, and we say that the two laws are equivalent on $S^D$ in terms of probabilities. But without any doubt the normal on $S^D$ and the additive logistic normal models are considerably different, especially as regards concepts and properties that depend on the geometry of the space.
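The change-of-variable argument can also be checked numerically. The toy example below (a sketch for $D = 2$, so that the integrals are one-dimensional; the basis with $\mathrm{clr}(e_1) = (1, -1)/\sqrt{2}$ and the parameter values are illustrative choices) compares the probability of the event $\{x_1 > 1/2\}$ computed from the coefficient density, as in (12), with the integral of the transformed density, as in (11), whose integrand carries the Jacobian $D^{-1/2}(x_1 x_2)^{-1}$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# For D = 2 one orthonormal basis has clr(e_1) = (1, -1)/sqrt(2), so the single
# coefficient of x = (x1, 1 - x1) is v(x1) = ln(x1 / (1 - x1)) / sqrt(2).
mu, sigma = 0.4, 0.8                          # illustrative parameters of the coefficient law
v = lambda x1: np.log(x1 / (1.0 - x1)) / np.sqrt(2.0)

# (a) probability of {x1 > 1/2} computed in coefficient space, as in (12):
p_coeff = norm.sf(v(0.5), loc=mu, scale=sigma)

# (b) the same probability from the transformed density, as in (11): the integrand is
# the coefficient density times the Jacobian D^{-1/2} (x1 x2)^{-1} = 1/(sqrt(2) x1 (1-x1)).
integrand = lambda x1: norm.pdf(v(x1), loc=mu, scale=sigma) / (np.sqrt(2.0) * x1 * (1.0 - x1))
p_simplex, _ = quad(integrand, 0.5, 1.0)

print(p_coeff, p_simplex)                     # the two values agree
```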
Now we provide some general properties of the normal on $S^D$ model that will help us to observe the differences with respect to the additive logistic normal model. To be coherent with the rest of the work, we revert to using the notation $\mathrm{ilr}(x)$ to indicate the coefficients of the composition $x$ with respect to an orthonormal basis of $S^D$.

Property 1. Let $x$ be a $D$-part random composition with a $N_S^D(\mu, \Sigma)$ distribution. Let $a \in S^D$ be a vector of constants and $b \in \mathbb{R}$ a scalar. Then the $D$-part composition $x^* = a \oplus (b \odot x)$ has a $N_S^D(\mathrm{ilr}(a) + b\mu,\ b^2\Sigma)$ distribution.

Proof. The ilr coefficients of the composition $x^*$ are obtained from a linear transformation of the ilr coefficients of $x$, because $\mathrm{ilr}(x^*) = \mathrm{ilr}(a) + b\,\mathrm{ilr}(x)$. We can deal with the density function of the ilr coefficients of $x$ as a density function in real space; as it is a classical normal density, we use the linear transformation property to obtain the density function of the $\mathrm{ilr}(x^*)$ vector. Therefore $x^* \sim N_S^D(\mathrm{ilr}(a) + b\mu,\ b^2\Sigma)$. Observe that the resulting parameters $\mathrm{ilr}(a) + b\mu$ and $b^2\Sigma$ are the expected value and the covariance matrix of the $\mathrm{ilr}(x^*)$ coefficients, because
\[ \mathrm{E}[\mathrm{ilr}(x^*)] = \mathrm{E}[\mathrm{ilr}(a) + b\,\mathrm{ilr}(x)] = \mathrm{ilr}(a) + b\,\mathrm{E}[\mathrm{ilr}(x)] = \mathrm{ilr}(a) + b\mu, \]
\[ \mathrm{var}[\mathrm{ilr}(x^*)] = \mathrm{var}[\mathrm{ilr}(a) + b\,\mathrm{ilr}(x)] = b^2\,\mathrm{var}[\mathrm{ilr}(x)] = b^2\Sigma. \]

Property 2. Let $x \sim N_S^D(\mu, \Sigma)$ be a random composition and $a \in S^D$ a vector of constants. Then $f_{a \oplus x}(a \oplus x) = f_x(x)$, where $f_{a \oplus x}$ and $f_x$ represent the density functions of the random compositions $a \oplus x$ and $x$ respectively.

Proof. Using Property 1 we have that $a \oplus x \sim N_S^D(\mathrm{ilr}(a) + \mu, \Sigma)$. Therefore, using $\mathrm{ilr}(a \oplus x) = \mathrm{ilr}(a) + \mathrm{ilr}(x)$,
\[ f_{a \oplus x}(a \oplus x) = (2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}\big(\mathrm{ilr}(a \oplus x) - (\mathrm{ilr}(a) + \mu)\big)'\Sigma^{-1}\big(\mathrm{ilr}(a \oplus x) - (\mathrm{ilr}(a) + \mu)\big) \Big) \]
\[ = (2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}\big(\mathrm{ilr}(a) + \mathrm{ilr}(x) - \mathrm{ilr}(a) - \mu\big)'\Sigma^{-1}\big(\mathrm{ilr}(a) + \mathrm{ilr}(x) - \mathrm{ilr}(a) - \mu\big) \Big) \]
\[ = (2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu) \Big) = f_x(x), \]
as indicated in (9). Note that Property 2 does not hold for the additive logistic normal distribution.

Property 3. Let $x$ be a $D$-part random composition with a $N_S^D(\mu, \Sigma)$ distribution. Let $x_P = Px$ be the composition $x$ with the parts reordered by a permutation matrix $P$. Then $x_P$ has a $N_S^D(\mu_P, \Sigma_P)$ distribution with
\[ \mu_P = (U'PU)\mu, \qquad \Sigma_P = (U'PU)\Sigma(U'PU)', \]
where $U$ is a $D \times (D-1)$ matrix with the vectors $\mathrm{clr}(e_i)$ ($i = 1, 2, \ldots, D-1$) as columns.

Proof. To obtain the distribution of the random composition $x_P$ in terms of the distribution of $x$, it is necessary to find a matrix relationship between the ilr coefficients of the two compositions. If we work with the clr coefficients, we have $\mathrm{clr}(x_P) = P\,\mathrm{clr}(x)$. Applying (6) we obtain $\mathrm{ilr}(x_P) = (U'PU)\,\mathrm{ilr}(x)$. As the $\mathrm{ilr}(x)$ vector has a normal distribution, we can apply the change-of-variable theorem or the linear transformation property of the normal distribution in real space to obtain a $N_S^D\big((U'PU)\mu,\ (U'PU)\Sigma(U'PU)'\big)$ distribution for the random composition $x_P$. Note that the parameters of the model agree with the expected value and the covariance matrix of the $\mathrm{ilr}(x_P)$ vector.

Property 4. Let $x$ be a $D$-part random composition with a $N_S^D(\mu, \Sigma)$ distribution. Let $s = \mathcal{C}(Sx)$ be a $C$-part subcomposition obtained from the $C \times D$ selection matrix $S$. Then $s$ has a $N_S^C(\mu_S, \Sigma_S)$ distribution, with
\[ \mu_S = (U^{*\prime} S U)\mu, \qquad \Sigma_S = (U^{*\prime} S U)\Sigma(U^{*\prime} S U)', \]
where $U$ is a $D \times (D-1)$ matrix with the clr coefficients of an orthonormal basis of $S^D$ as columns, and $U^*$ is a $C \times (C-1)$ matrix with the clr coefficients of an orthonormal basis of $S^C$ as columns.

Proof. We know that the coefficients $\mathrm{alr}(s)$ and $\mathrm{alr}(x)$ are equal to the respective alr transformed vectors. Aitchison (1986, p. 119) proves that $\mathrm{alr}(s) = F_{C-1,C}\, S\, F_{D,D-1}\,\mathrm{alr}(x)$. Applying (6) we obtain $\mathrm{ilr}(s) = U^{*\prime} F_{C,C-1} F_{C-1,C}\, S\, F_{D,D-1} F_{D-1,D} U\,\mathrm{ilr}(x)$. We can easily check that the matrices $F_{C,C-1}F_{C-1,C}$ and $F_{D,D-1}F_{D-1,D}$ have the columns of the matrices $U^*$ and $U$, respectively, as eigenvectors with eigenvalue 1. Consequently $U^{*\prime} F_{C,C-1} F_{C-1,C} = U^{*\prime}$ and $F_{D,D-1} F_{D-1,D} U = U$, and the relationship between the ilr coefficients of the subcomposition $s$ and the composition $x$ is $\mathrm{ilr}(s) = (U^{*\prime} S U)\,\mathrm{ilr}(x)$. Given the density of the $\mathrm{ilr}(x)$ vector and applying the change-of-variable theorem or the linear transformation property of the normal distribution in real space, we obtain the density of the $\mathrm{ilr}(s)$ vector, which corresponds to a $N_S^C\big((U^{*\prime} S U)\mu,\ (U^{*\prime} S U)\Sigma(U^{*\prime} S U)'\big)$ density function.

Observe that the properties of the classical normal distribution in real space have allowed us to prove the closure of the normal on $S^D$ family under perturbation, power transformation, permutation and subcompositions. Nevertheless, given $x \sim N_S^D(\mu, \Sigma)$, it has not been possible up to now to describe the distribution of any amalgamation in terms of the distribution of $x$; in particular, we have been unable to find a matrix relationship between the ilr coefficients of $x$ and those of the corresponding amalgamated composition.
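The matrix relations used in the proofs of Properties 1 and 3 can be verified numerically; the sketch below (restating the earlier helpers) checks that perturbation and powering act linearly on the ilr coefficients and that a permutation of the parts acts through the matrix $U'PU$.

```python
import numpy as np

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()
def perturb(x, y): return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))
def power(a, x): return closure(np.asarray(x, dtype=float) ** a)
def ilr_basis(D): return np.linalg.qr((np.eye(D) - np.ones((D, D)) / D)[:D - 1].T)[0]
def ilr(x, U): return U.T @ clr(x)

D = 4
U = ilr_basis(D)
x = closure([1.0, 2.0, 3.0, 4.0])
a = closure([5.0, 1.0, 1.0, 2.0])
b = 2.5

# Property 1: ilr(a (+) b (.) x) = ilr(a) + b ilr(x)
print(np.allclose(ilr(perturb(a, power(b, x)), U), ilr(a, U) + b * ilr(x, U)))

# Property 3: ilr(P x) = (U' P U) ilr(x) for a permutation matrix P
P = np.eye(D)[[2, 0, 3, 1]]
print(np.allclose(ilr(closure(P @ x), U), (U.T @ P @ U) @ ilr(x, U)))
```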
Following the methodology stated by Pawlowsky-Glahn, Egozcue, and Burger (2003), we can define the expected value of a normal on $S^D$ distributed random composition.

Property 5. Let $x$ be a $D$-part random composition with a $N_S^D(\mu, \Sigma)$ distribution, and let $\mu = (\mu_1, \mu_2, \ldots, \mu_{D-1})'$. Then $\mathrm{E}[x] = (\mu_1 \odot e_1) \oplus (\mu_2 \odot e_2) \oplus \cdots \oplus (\mu_{D-1} \odot e_{D-1})$, where $\{e_1, e_2, \ldots, e_{D-1}\}$ is an orthonormal basis of $S^D$.

Proof. The expected value of any random vector is an element of the support space. If we apply the standard definition of the expected value to the coefficients of the composition $x$ with respect to the orthonormal basis $\{e_1, e_2, \ldots, e_{D-1}\}$ using the density (10), we obtain the coefficients of the composition $\mathrm{E}[x]$ with respect to the considered orthonormal basis. Applying standard integration methods we have $\mathrm{ilr}(\mathrm{E}[x]) = \mu$. Finally, we obtain the composition $\mathrm{E}[x]$ through the linear combination $(\mu_1 \odot e_1) \oplus (\mu_2 \odot e_2) \oplus \cdots \oplus (\mu_{D-1} \odot e_{D-1})$.

Recall that the vector $\mu$ denotes the expected value of the $\mathrm{ilr}(x)$ vector; then we have $\mathrm{ilr}(\mathrm{E}[x]) = \mathrm{E}[\mathrm{ilr}(x)]$. Also, equality (8) says that the vector of coefficients of the composition $\mathrm{cen}[x]$ with respect to the orthonormal basis is $\mathrm{E}[\mathrm{ilr}(x)]$. Consequently we have $\mathrm{E}[x] = \mathrm{cen}[x]$. In the additive logistic normal case we can compute the expected value using the standard procedure but, as Aitchison (1986) warns, the integral expressions are not reducible to any simple form and it is necessary to apply Hermitian integration to obtain numerical results; and, certainly, that expected value is not equal to the composition $\mathrm{cen}[x]$ obtained in the normal on $S^D$ case.

Property 6. Let $x$ be a $D$-part random composition with a $N_S^D(\mu, \Sigma)$ distribution. Then a dispersion measure around the expected value is $\mathrm{Mvar}[x] = \mathrm{trace}(\Sigma)$.

Proof. The metric variance is defined as $\mathrm{Mvar}[x] = \mathrm{E}[d_a^2(x, \mathrm{cen}[x])]$. Given $x \sim N_S^D(\mu, \Sigma)$, we know from Property 5 that $\mathrm{cen}[x] = \mathrm{E}[x]$; then the metric variance is a dispersion measure around the expected value, $\mathrm{Mvar}[x] = \mathrm{E}[d_a^2(x, \mathrm{E}[x])]$. The distance $d_a$ between two elements is equal to the Euclidean distance $d_{eu}$ between the corresponding coefficients with respect to an orthonormal basis, so we can write $\mathrm{Mvar}[x] = \mathrm{E}[d_{eu}^2(\mathrm{ilr}(x), \mathrm{E}[\mathrm{ilr}(x)])]$. This value corresponds to the trace of the matrix $\mathrm{var}(\mathrm{ilr}(x))$. Finally, using the covariance matrix of a normal distribution in real space we obtain $\mathrm{Mvar}[x] = \mathrm{trace}(\Sigma)$.

As Aitchison (1986) warns, we cannot interpret the crude covariances or correlations. Therefore, we always compute the covariances and correlations of the coefficients with respect to the orthonormal basis. In the case of the normal on $S^D$ model, these covariances are the off-diagonal elements of the matrix $\Sigma$.

3.2 Inferential aspects

Given a compositional data set $X$, the estimates of the parameters $\mu$ and $\Sigma$ can be calculated through the sample mean and the sample covariance matrix of the ilr coefficients of the data set:
\[ \hat{\mu} = \overline{\mathrm{ilr}(X)}, \qquad \hat{\Sigma} = \widehat{\mathrm{var}}(\mathrm{ilr}(X)). \]
With these values we can obtain the estimates of $\mathrm{E}[x]$ and $\mathrm{Mvar}[x]$ as
\[ \widehat{\mathrm{E}}[x] = (\hat{\mu}_1 \odot e_1) \oplus (\hat{\mu}_2 \odot e_2) \oplus \cdots \oplus (\hat{\mu}_{D-1} \odot e_{D-1}), \qquad \widehat{\mathrm{Mvar}}[X] = \mathrm{trace}(\hat{\Sigma}). \]
The estimators $\widehat{\mathrm{E}}[x]$ and $\widehat{\mathrm{Mvar}}[X]$ are consistent and of minimum variance; this can be proved using the properties of the estimators for the normal law in real space. To validate the normal on $S^D$ law, we only have to apply a goodness-of-fit test for the multivariate normal distribution to the coefficients of the sample $X$ with respect to an orthonormal basis. Unfortunately, the most common tests of normality, such as the Anderson-Darling or the Kolmogorov-Smirnov tests, depend on the orthonormal basis chosen. But in this particular case we can reproduce the singular value decomposition and a power-perturbation characterisation of the compositional variability of the random composition, as proposed by Aitchison et al. (2003).
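A sketch of these estimation steps in code (the data set is simulated only to make the snippet runnable, and fit_normal_on_simplex is an illustrative name, not an existing routine): the parameters are estimated by the sample mean and covariance of the ilr coefficients, and the estimates of $\mathrm{E}[x]$ and $\mathrm{Mvar}[x]$ follow from them.

```python
import numpy as np

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()
def ilr_basis(D): return np.linalg.qr((np.eye(D) - np.ones((D, D)) / D)[:D - 1].T)[0]
def ilr(x, U): return U.T @ clr(x)
def ilr_inv(v, U): return closure(np.exp(U @ v))

def fit_normal_on_simplex(X, U):
    """Estimate (mu, Sigma) of the normal on S^D from the ilr coefficients of the rows
    of X, together with the induced estimates of E[x] and Mvar[x]."""
    V = np.array([ilr(x, U) for x in X])
    mu_hat = V.mean(axis=0)
    Sigma_hat = np.cov(V, rowvar=False)
    return mu_hat, Sigma_hat, ilr_inv(mu_hat, U), np.trace(Sigma_hat)

rng = np.random.default_rng(2)
X = rng.dirichlet([4.0, 2.0, 3.0], size=500)     # stand-in data set of 3-part compositions
U = ilr_basis(X.shape[1])
mu_hat, Sigma_hat, E_hat, mvar_hat = fit_normal_on_simplex(X, U)
print(E_hat, mvar_hat)
```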
3.3 Another parametrization

Aitchison (1986) introduces the additive logistic normal distribution using the additive logratio transformation, but equivalent parametrizations using the centred logratio transformation (Aitchison, 1986, p. 116) or the isometric logratio transformation (Mateu Figueras, 2003, p. 78) can be obtained. The normal on $S^D$ law is defined using the ilr coefficients. Given the similarity between the alr, clr and ilr coefficients and the additive, centred and isometric logratio vectors, it is natural to ask for the expression of the density function in terms of the alr or clr coefficients. To avoid long expressions we use $v$ to denote the coefficients of the composition $x$ with respect to an orthonormal basis and $y$ to denote its coefficients with respect to the basis $B$ introduced in Section 1. Given a $N_S^D(\mu, \Sigma)$ law, the probability of any event $A \subset S^D$ is
\[ P(A) = \int_{A^*} (2\pi)^{-(D-1)/2} |\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(v - \mu)'\Sigma^{-1}(v - \mu) \Big)\, dv, \]
where $A^*$ denotes the coefficients of $A$ with respect to the orthonormal basis. The matrix $FU$ is a change-of-basis matrix from the orthonormal basis to the basis $B$, so the transformation $v = (FU)^{-1}y$ makes this change of basis effective and we obtain
\[ P(A) = \int_{FUA^*} \frac{(2\pi)^{-(D-1)/2}}{|\Sigma|^{1/2}} \exp\Big( -\tfrac{1}{2}\big((FU)^{-1}y - \mu\big)'\Sigma^{-1}\big((FU)^{-1}y - \mu\big) \Big) \frac{1}{|FU|}\, dy \]
\[ = \int_{FUA^*} \frac{(2\pi)^{-(D-1)/2}}{|\Sigma|^{1/2}\,|FU|} \exp\Big( -\tfrac{1}{2}(y - FU\mu)'\big((FU)^{-1}\big)'\Sigma^{-1}(FU)^{-1}(y - FU\mu) \Big)\, dy \]
\[ = \int_{FUA^*} \frac{(2\pi)^{-(D-1)/2}}{|\Sigma^*|^{1/2}} \exp\Big( -\tfrac{1}{2}(y - \mu^*)'(\Sigma^*)^{-1}(y - \mu^*) \Big)\, dy, \]
where $\mu^* = FU\mu$ and $\Sigma^* = FU\Sigma(FU)'$. Using (6) it is easy to see that $\mu^* = \mathrm{E}[y]$, $\Sigma^* = \mathrm{var}[y]$ and $|\Sigma|^{1/2}|FU| = |\Sigma^*|^{1/2}$. Observe that $FUA^*$ represents the coefficients of the event $A$ with respect to the basis $B$. In conclusion, the normal on $S^D$ law can also be obtained using the $\mathrm{alr}(x)$ coefficients and the $\mu^*$, $\Sigma^*$ parametrization. But recall that the basis $B$ is not orthonormal, and therefore the Euclidean distance between two alr coefficient vectors is not equal to the Aitchison distance between the corresponding compositions; thus it makes no sense to use the density of the alr coefficients in procedures that rely on distances or scalar products. In a similar way we could work with the density of the clr coefficients. Even though the Euclidean distance between two clr coefficient vectors is equal to the Aitchison distance between the corresponding compositions, we have an additional difficulty: the density function of the $\mathrm{clr}(x)$ coefficients is degenerate.

4 The skew-normal distribution on $S^D$

4.1 Definition and properties

Definition 2. Let $(\Omega, \mathcal{F}, p)$ be a probability space. A random composition $x: \Omega \to S^D$ is said to have a skew-normal on $S^D$ distribution with parameters $\mu$, $\Sigma$ and $\varrho$ (see the Appendix) if the density function of its coefficients with respect to an orthonormal basis of $S^D$ is
\[ f_x(x) = 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu) \Big)\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(x) - \mu)\big), \qquad (13) \]
where $\Phi$ is the $N(0,1)$ distribution function and $\omega$ is the square root of $\mathrm{diag}(\Sigma)$, the matrix obtained by setting to zero all the off-diagonal elements of $\Sigma$. We use the notation $x \sim SN_S^D(\mu, \Sigma, \varrho)$. The subscript $S$ indicates that it is a model on the simplex and the superscript $D$ indicates the number of parts of the composition. As in the normal on $S^D$ case, we use the notation $\mu$ and $\Sigma$ to represent the parameters of the model, but in this case neither $\mu$ nor $\Sigma$ denotes the expected value or the covariance matrix of the $\mathrm{ilr}(x)$ vector.
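The density (13) can be evaluated exactly as written, composing a $(D-1)$-variate normal density with the univariate normal distribution function; the sketch below (illustrative parameter values, helpers restated from the earlier sketches) is one way to do it.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()
def ilr_basis(D): return np.linalg.qr((np.eye(D) - np.ones((D, D)) / D)[:D - 1].T)[0]
def ilr(x, U): return U.T @ clr(x)

def pdf_skew_normal_on_simplex(x, mu, Sigma, rho, U):
    """Density (13): twice the normal density of the coefficients, multiplied by the
    standard normal cdf of rho' omega^{-1} (ilr(x) - mu), with omega = sqrt(diag(Sigma))."""
    v = ilr(x, U)
    omega_inv = 1.0 / np.sqrt(np.diag(Sigma))
    return 2.0 * multivariate_normal(mu, Sigma).pdf(v) * norm.cdf(rho @ (omega_inv * (v - mu)))

D = 3
U = ilr_basis(D)
mu = np.array([0.0, 0.0])                          # illustrative parameter values
Sigma = np.array([[0.5, 0.1],
                  [0.1, 0.3]])
rho = np.array([2.0, -1.0])
print(pdf_skew_normal_on_simplex(closure([0.2, 0.3, 0.5]), mu, Sigma, rho, U))
```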
The density (13) corresponds to the density function of a skew-normal random vector in $\mathbb{R}^{D-1}$; this is the reason why we call it the skew-normal on $S^D$ law. We want to insist that (13) is the density of the coefficients of $x$ with respect to an orthonormal basis of $S^D$, and therefore it is a Radon-Nikodym derivative with respect to the Lebesgue measure in $\mathbb{R}^{D-1}$, the space of coefficients. This allows us to compute the probability of any event $A \subset S^D$ with an ordinary integral as
\[ P(A) = \int_{A^*} 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp[M]\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(x) - \mu)\big)\, d\lambda(\mathrm{ilr}(x)), \quad \text{with } M = -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu), \]
where $A^*$ represents the coefficients of the event $A$ with respect to the considered orthonormal basis, and $\lambda$ is the Lebesgue measure in $\mathbb{R}^{D-1}$.

As with the normal on $S^D$ law, which is equivalent on $S^D$, in terms of probabilities, to the additive logistic normal law, we can prove that the skew-normal on $S^D$ law is equivalent on $S^D$, in terms of probabilities, to the additive logistic skew-normal law studied in Mateu Figueras (2003). To define the additive logistic skew-normal distribution we consider $x_D = 1 - x_1 - x_2 - \cdots - x_{D-1}$ and regard $S^D$ as a subset of $\mathbb{R}^{D-1}$. Its density function is the Radon-Nikodym derivative of the probability with respect to the Lebesgue measure in $\mathbb{R}^{D-1}$, and it is obtained using the additive logratio transformation. In Mateu Figueras (2003, p. 96) the density function using the isometric logratio transformation is provided, and consequently the probability of any event $A \subset S^D$ is
\[ P(A) = \int_A 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2}\, D^{-1/2}\Big(\prod_{i=1}^{D} x_i\Big)^{-1} \exp[M]\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(x) - \mu)\big)\, dx_1\, dx_2 \cdots dx_{D-1}, \qquad (14) \]
with $M = -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu)$, where now the vector $\mathrm{ilr}(x)$ denotes the isometric logratio transformation of $x$. Observe that the above density function is obtained through the transformation technique, because it contains the term $D^{-1/2}(\prod_{i=1}^{D} x_i)^{-1}$, the Jacobian of the isometric logratio transformation. We also use the notation $\mu$, $\Sigma$ and $\varrho$ for the parameters of the model, but in this case neither $\mu$ nor $\Sigma$ corresponds to the expected value or the covariance matrix of the isometric logratio transformed vector.

Now it is important to interpret the $\mathrm{ilr}(x)$ vector correctly, either as the isometric logratio vector or as the vector of coefficients with respect to an orthonormal basis, both denoted as $\mathrm{ilr}(x)$. Therefore, to avoid possible confusion, we denote by $v = (v_1, v_2, \ldots, v_{D-1})'$ the coefficients of the composition $x$ with respect to an orthonormal basis, and by $\mathrm{ilr}(x)$ its isometric logratio transformed vector. Thus, given $x \sim SN_S^D(\mu, \Sigma, \varrho)$, we can write the probability of any event $A \subset S^D$ as
\[ P(A) = \int_{A^*} 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp[M^*]\, \Phi\big(\varrho'\omega^{-1}(v - \mu)\big)\, dv_1\, dv_2 \cdots dv_{D-1}, \quad \text{with } M^* = -\tfrac{1}{2}(v - \mu)'\Sigma^{-1}(v - \mu), \qquad (15) \]
where $A^*$ represents the coefficients of $A$ with respect to the chosen orthonormal basis. In both cases, expressions (14) and (15) are standard integrals of a real-valued function and we can apply all the standard procedures. In particular we can apply the change-of-variable theorem in expression (15): taking $v = \mathrm{ilr}(x)$, whose Jacobian is $D^{-1/2}(\prod_{i=1}^{D} x_i)^{-1}$, we obtain the equality
\[ P(A) = \int_{A^*} 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp[M^*]\, \Phi\big(\varrho'\omega^{-1}(v - \mu)\big)\, dv_1 \cdots dv_{D-1} \]
\[ = \int_{\mathrm{ilr}^{-1}(A^*)} 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2}\, D^{-1/2}\Big(\prod_{i=1}^{D} x_i\Big)^{-1} \exp[M]\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(x) - \mu)\big)\, dx_1 \cdots dx_{D-1}, \]
with $M^* = -\tfrac{1}{2}(v - \mu)'\Sigma^{-1}(v - \mu)$ and $M = -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu)$. The second term of this equality agrees with (14) because $\mathrm{ilr}^{-1}(A^*) = A$. Observe that the probability of any event $A \subset S^D$ is the same using both models; in this sense we say that both models are equivalent on $S^D$ in terms of probabilities. But the skew-normal on $S^D$ and the additive logistic skew-normal models present essential differences.
The two density functions differ, and some properties that depend on the structure of the space differ as well. Next we study the principal properties of the skew-normal on $S^D$ model. To be coherent with the rest of our work, we revert to using the notation $\mathrm{ilr}(x)$ to indicate the coefficients of the composition $x$ with respect to an orthonormal basis of $S^D$.

Property 7. Let $x$ be a $D$-part random composition with a $SN_S^D(\mu, \Sigma, \varrho)$ distribution. Let $a \in S^D$ be a vector of constants and $b \in \mathbb{R}$ a scalar. Then the $D$-part composition $x^* = a \oplus (b \odot x)$ has a $SN_S^D(\mathrm{ilr}(a) + b\mu,\ b^2\Sigma,\ \varrho)$ distribution.

Proof. The ilr coefficients of $x^*$ are obtained from a linear transformation of the ilr coefficients of $x$, because $\mathrm{ilr}(x^*) = \mathrm{ilr}(a) + b\,\mathrm{ilr}(x)$. We deal with the density function of the $\mathrm{ilr}(x)$ coefficients as a density function in real space. As this density is a classical skew-normal density, we use the linear transformation property (see the Appendix) to obtain the density function of the $\mathrm{ilr}(x^*)$ coefficients. Therefore $x^* \sim SN_S^D(\mathrm{ilr}(a) + b\mu,\ b^2\Sigma,\ \varrho)$.

Property 8. Let $x \sim SN_S^D(\mu, \Sigma, \varrho)$ be a random composition and $a \in S^D$ a vector of constants. Then $f_{a \oplus x}(a \oplus x) = f_x(x)$, where $f_{a \oplus x}$ and $f_x$ represent the density functions of the random compositions $a \oplus x$ and $x$ respectively.

Proof. Using Property 7 we have that $a \oplus x \sim SN_S^D(\mathrm{ilr}(a) + \mu,\ \Sigma,\ \varrho)$. Therefore, using $\mathrm{ilr}(a \oplus x) = \mathrm{ilr}(a) + \mathrm{ilr}(x)$,
\[ f_{a \oplus x}(a \oplus x) = 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}\big(\mathrm{ilr}(a) + \mathrm{ilr}(x) - \mathrm{ilr}(a) - \mu\big)'\Sigma^{-1}\big(\mathrm{ilr}(a) + \mathrm{ilr}(x) - \mathrm{ilr}(a) - \mu\big) \Big)\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(a) + \mathrm{ilr}(x) - \mathrm{ilr}(a) - \mu)\big) \]
\[ = 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(\mathrm{ilr}(x) - \mu)'\Sigma^{-1}(\mathrm{ilr}(x) - \mu) \Big)\, \Phi\big(\varrho'\omega^{-1}(\mathrm{ilr}(x) - \mu)\big) = f_x(x), \]
as indicated in (9). Note that Property 8 does not hold for the additive logistic skew-normal distribution.

Property 9. Let $x$ be a $D$-part random composition with a $SN_S^D(\mu, \Sigma, \varrho)$ distribution. Let $x_P = Px$ be the composition $x$ with the parts reordered by a permutation matrix $P$. Then $x_P$ has a $SN_S^D(\mu_P, \Sigma_P, \varrho_P)$ distribution with
\[ \mu_P = (U'PU)\mu, \qquad \Sigma_P = (U'PU)\Sigma(U'PU)', \qquad \varrho_P = \frac{\omega_P\,\Sigma_P^{-1} B'\varrho}{\sqrt{1 + \varrho'\big(\omega^{-1}\Sigma\omega^{-1} - B\,\Sigma_P^{-1} B'\big)\varrho}}, \]
where $U$ is a $D \times (D-1)$ matrix with the vectors $\mathrm{clr}(e_i)$ ($i = 1, 2, \ldots, D-1$) as columns, $B = \omega^{-1}\Sigma(U'P'U)$, and $\omega$ and $\omega_P$ are the square roots of $\mathrm{diag}(\Sigma)$ and $\mathrm{diag}(\Sigma_P)$, respectively.

Proof. In Property 3 we have seen that $\mathrm{ilr}(x_P) = (U'PU)\,\mathrm{ilr}(x)$. Applying the change-of-variable theorem or the linear transformation property of the skew-normal distribution in real space, we obtain a $SN_S^D(\mu_P, \Sigma_P, \varrho_P)$ distribution for the random composition $x_P$.

Property 10. Let $x$ be a $D$-part random composition with a $SN_S^D(\mu, \Sigma, \varrho)$ distribution. Let $s = \mathcal{C}(Sx)$ be a $C$-part subcomposition obtained from the $C \times D$ selection matrix $S$. Then $s$ has a $SN_S^C(\mu_S, \Sigma_S, \varrho_S)$ distribution, with
\[ \mu_S = (U^{*\prime} S U)\mu, \qquad \Sigma_S = (U^{*\prime} S U)\Sigma(U^{*\prime} S U)', \qquad \varrho_S = \frac{\omega_S\,\Sigma_S^{-1} B'\varrho}{\sqrt{1 + \varrho'\big(\omega^{-1}\Sigma\omega^{-1} - B\,\Sigma_S^{-1} B'\big)\varrho}}, \]
where $U$ is a $D \times (D-1)$ matrix with the clr coefficients of an orthonormal basis of $S^D$ as columns, $U^*$ is a $C \times (C-1)$ matrix with the clr coefficients of an orthonormal basis of $S^C$ as columns, $B = \omega^{-1}\Sigma(U'S'U^*)$, and $\omega$ and $\omega_S$ are the square roots of $\mathrm{diag}(\Sigma)$ and $\mathrm{diag}(\Sigma_S)$, respectively.

Proof. In Property 4 we have seen that $\mathrm{ilr}(s) = (U^{*\prime} S U)\,\mathrm{ilr}(x)$. Given the density of the $\mathrm{ilr}(x)$ coefficients and applying the change-of-variable theorem or the linear transformation property of the skew-normal distribution in real space, we obtain the density of the $\mathrm{ilr}(s)$ coefficients, which corresponds to a $SN_S^C(\mu_S, \Sigma_S, \varrho_S)$ density function.

We have seen that the skew-normal on $S^D$ family is closed under perturbation, power transformation, permutation and subcompositions.
Again, given $x \sim SN_S^D(\mu, \Sigma, \varrho)$, it has not been possible up to now to describe the distribution of any amalgamation in terms of the distribution of $x$, because we do not have a matrix relationship between the coefficients of the two compositions. We can also compute the expected value of a skew-normal on $S^D$ distributed random composition.

Property 11. Let $x$ be a $D$-part composition with a $SN_S^D(\mu, \Sigma, \varrho)$ distribution. Then $\mathrm{E}[x] = (\xi_1 \odot e_1) \oplus (\xi_2 \odot e_2) \oplus \cdots \oplus (\xi_{D-1} \odot e_{D-1})$, with $\{e_1, e_2, \ldots, e_{D-1}\}$ an orthonormal basis of $S^D$ and $\xi = \mu + \omega\delta\sqrt{2/\pi}$, where $\delta$ is a parameter related to $\varrho$ through the equality (17) and $\omega$ is the square root of $\mathrm{diag}(\Sigma)$.

Proof. The expected value of any random composition is an element of the support space. From the coefficients of $x$ with respect to the orthonormal basis $\{e_1, e_2, \ldots, e_{D-1}\}$ and from the density (13), we obtain the coefficients of the vector $\mathrm{E}[x]$ with respect to the same orthonormal basis. Using the expected value of the skew-normal distribution in real space we have $\mathrm{E}[\mathrm{ilr}(x)] = \mu + \omega\delta\sqrt{2/\pi}$, denoted by $\xi$. Finally, we obtain the composition $\mathrm{E}[x]$ through the linear combination $(\xi_1 \odot e_1) \oplus (\xi_2 \odot e_2) \oplus \cdots \oplus (\xi_{D-1} \odot e_{D-1})$.

In this case we have obtained $\mathrm{E}[\mathrm{ilr}(x)] = \xi$. Using (8) we also conclude that $\mathrm{ilr}(\mathrm{cen}[x]) = \xi$, and consequently $\mathrm{cen}[x] = \mathrm{E}[x] = (\xi_1 \odot e_1) \oplus (\xi_2 \odot e_2) \oplus \cdots \oplus (\xi_{D-1} \odot e_{D-1})$. This is an essential difference between the skew-normal on $S^D$ law and the additive logistic skew-normal law: as observed in Mateu Figueras (2003), the equality between $\mathrm{cen}[x]$ and the expected value does not hold for the additive logistic skew-normal model.

Property 12. Let $x$ be a $D$-part random composition with a $SN_S^D(\mu, \Sigma, \varrho)$ distribution. A dispersion measure around the expected value is $\mathrm{Mvar}[x] = \mathrm{trace}\big(\Sigma - (2/\pi)\omega\delta\delta'\omega\big)$, where $\omega$ is the square root of $\mathrm{diag}(\Sigma)$ and $\delta$ is the parameter related to $\varrho$ through the equality (17).

Proof. The metric variance is defined as $\mathrm{Mvar}[x] = \mathrm{E}[d_a^2(x, \mathrm{cen}[x])]$. We know from Property 11 that $\mathrm{cen}[x] = \mathrm{E}[x]$; then the metric variance is a dispersion measure around the expected value, $\mathrm{Mvar}[x] = \mathrm{E}[d_a^2(x, \mathrm{E}[x])]$. The distance $d_a$ between two compositions is equal to the Euclidean distance $d_{eu}$ between the corresponding ilr coefficients, so we can write $\mathrm{Mvar}[x] = \mathrm{E}[d_{eu}^2(\mathrm{ilr}(x), \mathrm{E}[\mathrm{ilr}(x)])]$. This value corresponds to the trace of the matrix $\mathrm{var}(\mathrm{ilr}(x))$. Finally, using the covariance matrix of a skew-normal distribution in real space we obtain $\mathrm{Mvar}[x] = \mathrm{trace}\big(\Sigma - (2/\pi)\omega\delta\delta'\omega\big)$.

4.2 Inferential aspects

Given a compositional data set $X$, the estimates of the parameters $\mu$, $\Sigma$ and $\varrho$ can be calculated by applying the maximum likelihood procedure to the ilr coefficients of the data set. These estimates cannot be expressed in analytic terms and we have to use numerical methods to compute an approximation from the sample. The estimated values $\hat{\mu}$, $\hat{\Sigma}$ and $\hat{\varrho}$ allow us to compute the estimates of the expected value and the metric variance of the composition $x$:
\[ \widehat{\mathrm{E}}[x] = (\hat{\xi}_1 \odot e_1) \oplus (\hat{\xi}_2 \odot e_2) \oplus \cdots \oplus (\hat{\xi}_{D-1} \odot e_{D-1}), \qquad \widehat{\mathrm{Mvar}}[X] = \mathrm{trace}\big(\hat{\Sigma} - (2/\pi)\hat{\omega}\hat{\delta}\hat{\delta}'\hat{\omega}\big), \]
where $\hat{\xi} = \hat{\mu} + \hat{\omega}\hat{\delta}\sqrt{2/\pi}$ and $\hat{\omega}$ is the square root of $\mathrm{diag}(\hat{\Sigma})$.

The normal model on $S^D$ is a particular case of the skew-normal model on $S^D$, because it corresponds to the case $\varrho = 0$. Thus, to decide whether a skew-normal on $S^D$ model is better than a normal on $S^D$ model, it suffices to test the null hypothesis $H_0: \varrho = 0$ against the alternative $H_1: \varrho \neq 0$, applying a likelihood ratio test to the ilr coefficients of the sample.
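Because the maximum likelihood estimates have no closed form, they must be computed numerically. The sketch below is an illustration rather than the authors' implementation: the data are simulated, the covariance matrix is parametrized through its Cholesky factor (one convenient way of keeping it positive definite), and a derivative-free optimizer is used for simplicity. The convergence difficulties mentioned in the Appendix (the inflection point at $\varrho = 0$, estimates drifting to the frontier) apply here as well.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal, norm

def closure(v): return np.asarray(v, dtype=float) / np.sum(v)
def clr(x): lx = np.log(x); return lx - lx.mean()
def ilr_basis(D): return np.linalg.qr((np.eye(D) - np.ones((D, D)) / D)[:D - 1].T)[0]
def ilr(x, U): return U.T @ clr(x)

def unpack(theta, d):
    """theta -> (mu, Sigma, rho); Sigma is built from a Cholesky factor with
    log-diagonal so that it stays positive definite during the search."""
    mu = theta[:d]
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta[d:d + d * (d + 1) // 2]
    L[np.diag_indices(d)] = np.exp(np.diag(L))
    rho = theta[-d:]
    return mu, L @ L.T, rho

def neg_loglik(theta, V):
    """Minus the log-likelihood of the skew-normal density (13) over the coefficients V."""
    d = V.shape[1]
    mu, Sigma, rho = unpack(theta, d)
    omega_inv = 1.0 / np.sqrt(np.diag(Sigma))
    ll = (np.log(2.0)
          + multivariate_normal(mu, Sigma).logpdf(V)
          + norm.logcdf((V - mu) @ (omega_inv * rho)))
    return -np.sum(ll)

# stand-in data: ilr coefficients of a simulated compositional sample
rng = np.random.default_rng(3)
X = rng.dirichlet([3.0, 2.0, 4.0], size=400)
U = ilr_basis(X.shape[1])
V = np.array([ilr(x, U) for x in X])

d = V.shape[1]
theta0 = np.concatenate([V.mean(axis=0), np.zeros(d * (d + 1) // 2), np.zeros(d)])
res = minimize(neg_loglik, theta0, args=(V,), method='Nelder-Mead',
               options={'maxiter': 20000, 'xatol': 1e-6, 'fatol': 1e-6})
mu_hat, Sigma_hat, rho_hat = unpack(res.x, d)
print(res.success, mu_hat, rho_hat)
```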
To validate the distributional assumption of skew-normality on $S^D$, we only have to apply a goodness-of-fit test for the multivariate skew-normal distribution to the ilr coefficients of the sample data set. Mateu-Figueras et al. (2003) and Dalla-Valle (2001) have recently developed some tests for the skew-normal distribution. Unfortunately, these tests also depend on the orthonormal basis chosen.

4.3 Another parametrization

We have defined the skew-normal on $S^D$ law using the ilr coefficients of the composition $x$. Nevertheless, we can obtain the expression of the density function using the alr coefficients. To avoid long expressions we denote by $v$ and $y$ the coefficients of the composition $x$ with respect to an orthonormal basis and to the basis $B$ introduced in Section 1, respectively. In terms of the $v$ coefficients, the probability of any event $A \subset S^D$ is
\[ P(A) = \int_{A^*} 2(2\pi)^{-(D-1)/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(v - \mu)'\Sigma^{-1}(v - \mu) \Big)\, \Phi\big(\varrho'\omega^{-1}(v - \mu)\big)\, dv, \]
where $A^*$ represents the coefficients of $A$ with respect to the orthonormal basis. The matrix $FU$ is a change-of-basis matrix from the orthonormal basis to the basis $B$, so the transformation $v = (FU)^{-1}y$ makes this change of basis effective and we obtain
\[ P(A) = \int_{FUA^*} \frac{2(2\pi)^{-(D-1)/2}}{|\Sigma|^{1/2}\,|FU|} \exp[M]\, \Phi\big(\varrho'\omega^{-1}((FU)^{-1}y - \mu)\big)\, dy \]
\[ = \int_{FUA^*} \frac{2(2\pi)^{-(D-1)/2}}{|\Sigma|^{1/2}\,|FU|} \exp[M^*]\, \Phi\big(\varrho'\omega^{-1}(FU)^{-1}(y - FU\mu)\big)\, dy \]
\[ = \int_{FUA^*} \frac{2(2\pi)^{-(D-1)/2}}{|\Sigma^*|^{1/2}} \exp\Big( -\tfrac{1}{2}(y - \mu^*)'(\Sigma^*)^{-1}(y - \mu^*) \Big)\, \Phi\big(\varrho^{*\prime}(\omega^*)^{-1}(y - \mu^*)\big)\, dy, \]
with
\[ M = -\tfrac{1}{2}\big((FU)^{-1}y - \mu\big)'\Sigma^{-1}\big((FU)^{-1}y - \mu\big), \qquad M^* = -\tfrac{1}{2}(y - FU\mu)'\big((FU)^{-1}\big)'\Sigma^{-1}(FU)^{-1}(y - FU\mu), \]
where $\mu^* = FU\mu$, $\Sigma^* = FU\Sigma(FU)'$, $\varrho^* = \omega^*\big((FU)^{-1}\big)'\omega^{-1}\varrho$, and $\omega^*$ is the square root of $\mathrm{diag}(\Sigma^*)$. Observe that $FUA^*$ represents the coefficients of $A$ with respect to the basis $B$. In this case neither $\mu^*$ nor $\Sigma^*$ represents the expected value or the covariance matrix of the coefficients $y$. In conclusion, an equivalent expression of the skew-normal on $S^D$ law using the $\mathrm{alr}(x)$ coefficients and the $\mu^*$, $\Sigma^*$, $\varrho^*$ parametrization can be obtained and used in practice. But recall that the basis $B$ is not orthonormal, and therefore the Euclidean distance between two alr coefficient vectors is not equal to the Aitchison distance between the corresponding compositions; thus it makes no sense to use the density of the alr coefficients in procedures that rely on distances or scalar products. We could also define the skew-normal on $S^D$ law using the coefficients with respect to the generating set $B^*$, but we would obtain a degenerate density function, and at the moment we do not have a definition of a degenerate skew-normal model in real space.

5 Conclusions

The vector space structure of the simplex allows us to define parametric models directly through the coefficients with respect to an orthonormal basis, instead of using transformations to real space while keeping the usual Lebesgue measure. We have defined the normal model on $S^D$ and the skew-normal model on $S^D$ through their densities over the coefficients with respect to an orthonormal basis. In terms of probabilities of subsets of $S^D$, the normal on $S^D$ and the skew-normal on $S^D$ laws are identical to the additive logistic normal and the additive logistic skew-normal distributions. Nevertheless, their density functions and some of their properties are different; for example, for the models defined on $S^D$ using the coefficients, the expected value is equal to the centre of the random composition. An additional reason to prove properties using the coefficients with respect to an orthonormal basis is that we can apply standard real analysis to them.

Acknowledgements

This research has received financial support from the Dirección General de Enseñanza Superior e Investigación Científica (DGESIC) of the Spanish Ministry of Education and Culture through the project BFM2000-0540, and from the Direcció General de Recerca of the Departament d'Universitats, Recerca i Societat de la Informació of the Generalitat de Catalunya through the project 2001XT 00057.

References

Aitchison, J. (1986). The Statistical Analysis of Compositional Data. London (UK): Chapman and Hall.
Aitchison, J. (1992). On criteria for measures of compositional difference. Mathematical Geology, vol. 24 (no. 4), pp. 365-379.
Aitchison, J. (1997). The one-hour course in compositional data analysis or compositional data analysis is simple. In V. Pawlowsky-Glahn (Ed.), Proceedings of IAMG'97 -- The third annual conference of the International Association for Mathematical Geology, Volume I, II and addendum, pp. 3-35. International Center for Numerical Methods in Engineering, Barcelona (E).
Aitchison, J. (2002). Simplicial inference. In M. A. G. Viana and D. S. P. Richards (Eds.), Algebraic Methods in Statistics and Probability, Volume 287 of Contemporary Mathematics Series, pp. 1-22. American Mathematical Society, Providence, Rhode Island (USA).
Aitchison, J., G. Mateu-Figueras, and K. Ng (2003). Characterisation of distributional forms for compositional data and associated distributional tests. Submitted to Mathematical Geology.
Aitchison, J. and S. Shen (1980). Logistic-normal distributions: some properties and uses. Biometrika, vol. 67 (no. 2), pp. 261-272.
Azzalini, A. and A. Capitanio (1999). Statistical applications of the multivariate skew-normal distribution. Journal of the Royal Statistical Society, Series B, vol. 61 (no. 3), pp. 579-602.
Dalla-Valle, A. (2001). A test for the hypothesis of skew-normality in a population. Submitted to Journal of Statistical Computation and Simulation.
Egozcue, J. J., V. Pawlowsky-Glahn, G. Mateu-Figueras, and C. Barceló-Vidal (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology, vol. 35 (no. 3), pp. 279-300.
Martín-Fernández, J. A. (2001). Medidas de diferencia y clasificación no paramétrica de datos composicionales [Measures of difference and non-parametric classification of compositional data]. Ph.D. thesis, Universitat Politècnica de Catalunya, Barcelona (E).
Mateu Figueras, G. (2003). Models de distribució sobre el símplex [Models of distributions on the simplex]. Ph.D. thesis, Universitat Politècnica de Catalunya, Barcelona (E).
Mateu-Figueras, G., C. Barceló-Vidal, and V. Pawlowsky-Glahn (1998). Modeling compositional data with multivariate skew-normal distributions. In A. Buccianti, G. Nardi, and R. Potenza (Eds.), Proceedings of IAMG'98 -- The fourth annual conference of the International Association for Mathematical Geology, Volume I and II, pp. 532-537. De Frede Editore, Napoli (I).
Mateu-Figueras, G. and V. Pawlowsky-Glahn (2003). Una alternativa a la distribución lognormal [An alternative to the lognormal distribution]. In J. Saralegui and E. Ripoll (Eds.), Actas del XXVII Congreso Nacional de la Sociedad de Estadística e Investigación Operativa (SEIO), pp. 1849-1858. Sociedad de Estadística e Investigación Operativa, Lleida (E).
Mateu-Figueras, G., P. Puig, and C. Barceló-Vidal (2003). Goodness-of-fit test for the skew-normal distribution with unknown parameters. In preparation.
Pawlowsky-Glahn, V., J. Egozcue, and H. Burger (2003). Proceedings of IAMG'03 -- The annual conference of the International Association for Mathematical Geology. Portsmouth (UK): International Association for Mathematical Geology. CD-ROM.
Pawlowsky-Glahn, V. and J. J. Egozcue (2001). Geometric approach to statistical analysis on the simplex. Stochastic Environmental Research and Risk Assessment (SERRA), vol. 15 (no. 5), pp. 384-398.
Pawlowsky-Glahn, V. and J. J. Egozcue (2002). BLU estimators and compositional data. Mathematical Geology, vol. 34 (no. 3), pp. 259-274.

APPENDIX. The skew-normal distribution

The multivariate skew-normal distribution was studied in detail by Azzalini and Capitanio (1999). According to them, a $D$-variate random vector $y$ is said to have a multivariate skew-normal distribution if it is continuous with density function
\[ 2(2\pi)^{-D/2}|\Sigma|^{-1/2} \exp\Big( -\tfrac{1}{2}(y - \mu)'\Sigma^{-1}(y - \mu) \Big)\, \Phi\big(\varrho'\omega^{-1}(y - \mu)\big), \qquad (16) \]
where $\Phi$ is the $N(0,1)$ distribution function and $\omega$ is the square root of $\mathrm{diag}(\Sigma)$.
The parameter $\varrho$ is a $D$-variate vector which regulates the shape of the distribution and indicates the direction of maximum skewness. When $\varrho = 0$ the random vector $y$ reduces to a $N_D(\mu, \Sigma)$ distributed vector. We use the notation $y \sim SN_D(\mu, \Sigma, \varrho)$ to symbolize a random vector with density function given by (16). We know that each component of $y$ is univariate skew-normal distributed; its marginal skewness index can be computed from the parameter $\varrho$ and it varies only in the interval $(-0.995, +0.995)$. In the multivariate case we can also consider a multivariate index of skewness, which is bounded as well, in accordance with the scalar case. Consequently the skew-normal family only allows densities with a moderate skewness.

Given $y \sim SN_D(\mu, \Sigma, \varrho)$, the expected value is $\mathrm{E}(y) = \mu + \omega\delta\sqrt{2/\pi}$ and the covariance matrix is $\mathrm{var}(y) = \Sigma - (2/\pi)\omega\delta\delta'\omega$, where $\delta$ is a $D$-variate vector related to the parameter $\varrho$ as
\[ \delta = \frac{1}{\sqrt{1 + \varrho'\omega^{-1}\Sigma\omega^{-1}\varrho}}\ \omega^{-1}\Sigma\omega^{-1}\varrho. \qquad (17) \]
Azzalini and Capitanio (1999) provide a wide range of properties of the multivariate skew-normal distribution, most of them similar to the properties of the multivariate normal distribution.

Linear transformation property. If $y \sim SN_D(\mu, \Sigma, \varrho)$ and $A$ is a $D \times H$ matrix of constants, then $y^* = A'y \sim SN_H(\mu^*, \Sigma^*, \varrho^*)$, with
\[ \mu^* = A'\mu, \qquad \Sigma^* = A'\Sigma A, \qquad \varrho^* = \frac{\omega^*(\Sigma^*)^{-1} B'\varrho}{\sqrt{1 + \varrho'\big(\omega^{-1}\Sigma\omega^{-1} - B(\Sigma^*)^{-1}B'\big)\varrho}}, \]
where $B = \omega^{-1}\Sigma A$ and $\omega$ and $\omega^*$ are, respectively, the square roots of $\mathrm{diag}(\Sigma)$ and $\mathrm{diag}(\Sigma^*)$. In particular, if $A$ is a non-singular square matrix, then $\varrho^* = \omega^* A^{-1}\omega^{-1}\varrho$.

Given a sample, to find the estimates $\hat{\mu}$, $\hat{\Sigma}$ and $\hat{\varrho}$ of the parameters $\mu$, $\Sigma$ and $\varrho$, we apply the maximum likelihood procedure. The estimates cannot be expressed in analytic terms, and we have to use numerical methods (e.g. Newton-Raphson or the generalized gradient method) to compute an approximation from a sample. There are, however, some problems. In the univariate case, for example, there is always an inflection point of the profile log-likelihood at $\varrho = 0$, and the shape of this function can be problematic and slow the convergence down. Azzalini and Capitanio (1999, p. 591) suggest substituting the parameter $\varrho$ by a new parameter $\eta = \omega^{-1}\varrho$ in the likelihood function; then the location-scale parameters and $\eta$ appear well separated in two factors of the log-likelihood function, and we can exploit this factorization property, which makes the computation of the estimates easier. Initial estimates of the parameters, necessary to start the iterative numerical procedure, can be calculated from the sample by the method of moments, using the relations of the mean vector, the covariance matrix and the skewness vector with the parameters $\mu$, $\Sigma$ and $\varrho$ (respectively $\eta$). But there are still cases where the behaviour of the maximum likelihood estimates appears unsatisfactory because, with nothing pathological in the data pattern, the shape parameter tends to its maximum value. In these cases we have to adopt the temporary solution suggested by Azzalini and Capitanio (1999, p. 591): when the maximum of $\varrho$ occurs on the frontier, re-start the maximization procedure and stop it when it reaches a log-likelihood value not significantly lower than the maximum.
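The moment formulas and the relation (17) can be checked by simulation. The sketch below uses the standard conditioning representation of the skew-normal (an ingredient taken from the literature, not stated in this appendix, so treat it as an assumption of the example): with $(x_0, z)$ jointly normal and $\mathrm{corr}(x_0, z) = \delta$, the vector $\mu + \omega z$ restricted to $x_0 > 0$ (implemented by a sign flip of $z$) follows $SN_D(\mu, \Sigma, \varrho)$.

```python
import numpy as np

rng = np.random.default_rng(4)

mu = np.array([1.0, -0.5])                     # illustrative parameter values
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
rho = np.array([3.0, -1.0])

omega = np.sqrt(np.diag(Sigma))
Sigma_bar = Sigma / np.outer(omega, omega)                          # correlation form of Sigma
delta = Sigma_bar @ rho / np.sqrt(1.0 + rho @ Sigma_bar @ rho)      # relation (17)

# sample via the conditioning representation: (x0, z) jointly normal, flip z when x0 <= 0
n = 400_000
cov = np.block([[np.ones((1, 1)), delta[None, :]],
                [delta[:, None],  Sigma_bar]])
W = rng.multivariate_normal(np.zeros(3), cov, size=n)
Z = np.where(W[:, :1] > 0, W[:, 1:], -W[:, 1:])
Y = mu + omega * Z

print(Y.mean(axis=0), mu + omega * delta * np.sqrt(2 / np.pi))          # E(y), empirical vs formula
print(np.cov(Y, rowvar=False))
print(Sigma - (2 / np.pi) * np.outer(omega * delta, omega * delta))     # var(y) from the formula
```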