Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
PROBABILITY METRICS AND UNIQUENESS OF THE SOLUTION TO THE BOLTZMANN EQUATION FOR A MAXWELL GAS G. TOSCANI AND C. VILLANI Abstract. We consider a metric for probability densities with finite variance on Rd , and compare it with other metrics. We use it for several applications, including a uniqueness result for the solution of the spatially homogeneous Boltzmann equation for a gas of true Maxwell molecules. Key-words : spatially homogeneous Boltzmann equation, probability metrics, Maxwellian molecules. 1. Introduction Denote by Ps (Rd ), s > 0, the class of all probability distributions F on Rd , d ≥ 1, such that Z |v|s dF (v) < ∞. Rd We introduce a metric on Ps (Rd ) by (1) |fb(ξ) − gb(ξ)| |ξ|s ξ∈Rd ds (F, G) = sup where fb is the Fourier transform of F , Z fb(ξ) = e−iξ·v dF (v). Rd Let us write s = m + α, where m is an integer and 0 ≤ α < 1. In order that ds (F, G) be finite, it suffices that F and G have the same moments up to order m. The norm (1) has been introduced in [6] to investigate the trend to equilibrium of the solutions to the Boltzmann equation for Maxwell molecules. There, the case s = 2 + α, α > 0, was considered. Further applications of ds , with s = 4, were studied in [3], while the cases s = 2 and s = 2 + α, α > 0, have been considered in [4] in connection with the so-called Mc Kean graphs [8]. 1 2 G. TOSCANI AND C. VILLANI In this paper, we shall be interested with the case s = 2. To understand why this case separates in a natural way from the other ones, let us briefly introduce and discuss other well-known metrics on Ps (Rd ). Let F , G in Ps (Rd ), and let Π(F, G) be the set of all probability distributions L in Ps (Rd × Rd ) having F and G as marginal distributions. Let Z (2) Ts (F, G) = inf |v − w|s dL(v, w). L∈Π(F,G) 1/s Then τs = Ts metrizes the weak-* topology T W∗ on Ps (Rd ). We note that T1 is the Kantorovich-Vasershtein distance of F and G [7, 18]. For a detailed discussion, and application of these distances to statistics and information theory, see Vajda [17]. See also [10] for a recent application to kinetic theory. The case s = 2 was introduced and studied independently by Tanaka [15] who, in the one-dimensional case d = 1, applied T2 to the study of Kac’s equation. Subsequently, the properties of T2 were studied in the multidimensional case by Murata and Tanaka [9]. Applications to the kinetic theory of rarefied gases were finally studied by Tanaka in [16]; several of these applications were given a simplified proof in [12]. The importance of Tanaka’s distance τ2 mainly relies on its convexity and superadditivity with respect to rescaled convolutions. We recall this property, that is at the basis of most of the applications of T2 . Let {X0 , Y0 }, {X1 , Y1 } be two independent pairs of random variables, and let Fi (resp. Gi ) be the probability distribution of Xi (resp. Yi ), i = 0, 1. Gλ√ ) be the probability distribution √ For 0 < √ √ λ < 1, let Fλ (resp. of λX0 + 1 − λX1 (resp. λY0 + 1 − λY1 ), i.e. µ ¶ µ ¶ 1 · · 1 (3) Fλ = d/2 F0 √ ∗ F1 √ . λ (1 − λ)d/2 1−λ λ Then, (4) T2 (Fλ , Gλ ) ≤ λT2 (F0 , G0 ) + (1 − λ)T2 (F1 , G1 ). Superadditivity is also known for convex functionals (relative entropies), like Boltzmann’s relative entropy Z f (v) f dv, (5) H(f |M ) = f (v) log f M (v) Rd where f is a probability density and M f is the Gaussian density with the same mean vector and variance as those of f . This means that the property (4) holds with T2 replaced by H and {G0 , G1 } replaced by {M f0 , M f1 }. This is a consequence of Shannon’s entropy power PROBABILITY METRICS AND UNIQUENESS 3 inequality (Cf [13, 1]). The same property holds for the relative Fisher information, Z ¯ ¯ f ¯∇ log f (v) − ∇ log M f (v)¯2 f (v) dv (6) I(f |M ) = Rd (see again [13, 1]). As discussed by Csiszar [5], by means of the relative entropy H, one can define the so-called H-neighbourhoods. Even if those do not define a topological space, in the usual sense, their topological structure is finer than the metric topology defined by the total variation distance, Z ν(f, g) = |f (v) − g(v)| dv, in the sense that (7) ν(f, g) ≤ p 2H(f |g), which is the so-called Csiszar-Kullback inequality. It turns out that these properties of superadditivity and convexity also hold for d2 , the proofs being in fact much more simple. As an illustration of the interest of these properties, we shall give a version of the central limit theorem and a very simple proof of Kac’s theorem. We shall also apply d2 to the study of the Boltzmann equation with Maxwellian molecules, ∂f (8) (t, v) = Q(f, f )(t, v) ∂t µ ¶ Z u·n = σ [f (v 0 )f (w0 ) − f (v)f (w)] dw dn, |u| 3 2 R ×S where u = v − w is the relative velocity of colliding particles with velocity v and w, and v + w |u| v + w |u| v0 = + n, w0 = − n 2 2 2 2 are the postcollisional velocities. σ(ν) is a nonnegative function which for true Maxwell molecules has a nonintegrable singularity of the form (1 − ν)−5/4 as ν → 1. Usually, one truncates σ in some way, so that it become integrable (cut-off assumption). We shall prove that d2 shares a remarkable property with Tanaka’s distance : it is nonexpanding with time along trajectories of the Boltzmann equation; that is, if f and g are two such solutions, (9) d2 (f (t), g(t)) ≤ d2 (f (0), g(0)). This holds even if σ is singular. As an immediate corollary, we obtain that the solution to the Cauchy problem for the Boltzmann equation is 4 G. TOSCANI AND C. VILLANI unique. Up to our knowledge, this is so far the only uniqueness result available for long-range interactions. The organization of the paper is as follows. First, in section 2, we investigate the connections between several distances on P2 (Rd ), including d2 and τ2 . In section 3, we establish the basic properties of superadditivity for d2 and give applications. Then, in section 4, we apply this metric to the study of the Boltzmann equation. 2. Metrics on P2 (Rd ) In order that d2 be well-defined, we need to restrict it to some space of probability densities with the same mean vector. For simplicity, we shall restrict to probability measures with zero mean vector, and we shall work on ½ ¾ Z Z d 2 (10) Dσ = F ∈ P2 (R ); vi dF (v) = 0, |v| dF (v) = d · σ where σ is some positive real number. We begin with two elementary lemmas. Lemma 1. Let (Fn ) in Dσ , Fn * F in P0 (Rd ). Then Z (11) F ∈ Dσ ⇐⇒ lim sup |v|2 1|v|≥K dFn (v) = 0. K→∞ n→∞ R Proof. It is clear that if limK→∞ supn→∞ |v|2 1|v|≥K dFn (v) = 0, then F ∈ Dσ . On the other hand, if this condition is not satisfied, then R 2 there exists ε > 0 and (Kn ) → ∞ with |v| R 1|v|≤Kn dFn (v) ≤ d · σ − ε. Since |v|2 1|v|≤Kn * |v|2 , this implies that |v|2 dF (v) ≤ dσ − ε, hence F ∈ / Dσ . ¤ Lemma 2. Let (Fn ) and F in Dσ , such that Fn * F . Then, there exists a nonnegative function φ such that φ(r)/r −→ ∞ as r → ∞, which can be chosen smooth and convex, and a constant M > 0, such that Z (12) sup φ(|v|2 ) dFn (v) ≤ M. n Proof. Using Lemma 1, copy the construction thatRwas done in [6] for one single function : there exists k1 such that supn |v|≥k1 |v|2 dFn (v) ≤ R 1/2; then there exists k2 ≥ k1 such that supn |v|≥k2 |v|2 dFn (v) ≤ 1/4; and so on... for all p ≥ 2, there exists kp ≥ kp−1 such that Z 1 sup |v|2 dFn (v) ≤ p . 2 n |v|≥kp PROBABILITY METRICS AND UNIQUENESS 5 Then choose φ(|v|2 ) = p|v|2 if |v| ∈ [kp , kp+1 ); smooth this function and slow its growth if necessary, as in [6]. ¤ In the sequel of the paper, for M > 0, we shall denote by ½ ¾ Z M d 2+α (13) P2+α (R ) = F ∈ Dσ ; |v| dF (v) ≤ M , (14) PφM (Rd ) ½ ¾ Z 2 = F ∈ Dσ ; φ(|v| ) dF (v) ≤ M for φ nonnegative, φ(r)/r −→ ∞ as r → ∞. Lemma 2 enables us to restrict to spaces PφM in all the cases when one is interested with weak convergence in Dσ . In addition to the metrics d2 and τ = τ2 which were introduced in the last section, we consider • Prokhorov’s distance ρ(F, G) : for δ ≥ 0 and U ⊂ Rd , we define © ª © ª U δ = v ∈ Rd ; d(v, U ) < δ , U δ] = v ∈ Rd ; d(v, U ) ≤ δ , where d(v, U ) = inf{kv − wk, w ∈ U }. Let © ª σ(F, G) = inf ε > 0/F (A) ≤ G(Aε ) + ε for all closed A ⊂ Rd ; we set (15) ρ(F, G) = max{σ(F, G), σ(G, F )}. • the (C m )∗ distance kF − Gk∗m : for m ≥ 1, let C m (Rd ) be the set of m-times continuously differentiable functions, endowed with its natural norm k · km . Then, let ¯ ½¯Z ¾ ¯ ¯ ∗ m ¯ ¯ (16) kF − Gkm = sup ¯ ϕ dF (v)¯ , ϕ ∈ C , kϕkm ≤ 1 . Theorem 1. Let (Fn ) in Dσ and F in P2 (Rd ). Then, the following statements are equivalent. (i) Fn * F and F ∈ Dσ ; (ii) τ (Fn , F ) −→ 0; (iii) ρ(Fn , F ) −→ 0 and F ∈ Dσ ; (iv) For any m ≥ 1, kFn − F k∗m −→ 0, and F ∈ Dσ ; (v) d2 (Fn , F ) −→ 0. Proof. Here we shall only show (i) ⇐⇒ (v). First, if d2 (Fn , F ) −→ 0, then obviously, for all ξ ∈ Rd \ {0}, fbn (ξ) −→ fb(ξ); on the other hand, fbn (0) = fb(0) = 1. This entails that Fn * F ; it remains to prove that F ∈ Dσ . But for fixed ξ, |ξ| = 1, t ≥ 0, since the first derivatives of 6 G. TOSCANI AND C. VILLANI fbn and fb coincide at the origin, and since their Fourier transforms are twice continuously differentiable, ¯ ¯ ¯ ¯ ¯ ¯ b b f (tξ) − f (tξ) ¯ 2 b ¯ ¯ ¯ n b D ( f − f )(0) · (ξ, ξ) = lim ¯ ¯ ¯ ¯ ≤ d2 (Fn , F ) −→ 0. n ¯t→0 ¯ t2 Conversely, suppose that Fn * F ∈ Dσ . By Lemma 2, there exists φ and M such that Fn ∈ PφM . By Lemma 3.1 of [6], all D2 fbn have a uniform modulus of continuity. In particular, for ε > 0, there exists δ > 0 such that ∀n |ξ| ≤ δ ⇒ |D2 fbn (ξ) − D2 fbn (0)| ≤ ε. Since Z 1h i µξ ξ ¶ fbn (ξ) − fb(ξ) 2b 2b = D fn (tξ) − D fn (0) · , (1 − t) dt |ξ|2 |ξ| |ξ| 0 iµ ξ ξ ¶ Z 1h iµ ξ ξ ¶ 1h 2b 2b 2b 2b + D fn (0) − D f (0) , − D f (tξ) − D f (0) · , (1−t) dt, 2 |ξ| |ξ| |ξ| |ξ| 0 we obtain |fbn (ξ) − fb(ξ)| ≤ ε. |ξ|2 |ξ|≤δ On the other hand, clearly, there exists D > 0 such that |fbn (ξ) − fb(ξ)| |ξ| ≥ D =⇒ ≤ ε. |ξ|2 sup b b f (ξ)| Thus d2 (Fn , F ) ≤ max{ε, supδ≤|ξ|≤D |fn (ξ)− } ≤ max{ε, supδ≤|ξ|≤D |ξ|2 fb(ξ)|}. Now, since Fn * F we have 1 b |f (ξ)− δ2 n ∀ξ, fbn (ξ) −→ fb(ξ); and since |D2 fbn (ξ)| ≤ dσ and Dfbn (0) = 0, (fbn ) is uniformly equicontinuous on the compact set {δ ≤ |ξ| ≤ D}. By Ascoli’s theorem, this entails that supδ≤|ξ|≤D |fbn (ξ) − fb(ξ)| goes to 0, and thus there exists n0 ≥ 0 such that for n ≥ n0 , d2 (Fn , F )) ≤ ε. ¤ Next, we would like to compare more precisely these different metrics. We shall say that two metrics m1 and m2 define the same weak-* uniformity on a set S ⊂ P2 (Rd ) if for all ε > 0 there exists η > 0 such that for all F, G ∈ S, m1 (F, G) ≤ η =⇒ m2 (F, G) ≤ ε, m2 (F, G) ≤ η =⇒ m1 (F, G) ≤ ε. PROBABILITY METRICS AND UNIQUENESS 7 Theorem 2. Let M > 0 and φ be fixed. Then, for any m ≥ 1, τ , ρ, k · k∗m and d2 define the same weak-* uniformity in PφM . In order to simplify the proof, we shall prove this theorem only on M P2+α , where the bounds are explicit. In all the sequel α > 0, M > 0 and m ≥ 1 will be fixed. The main part of the proof has already been performed in [6], therefore we shall only “fill the gaps”. We split the proof in three steps. M First step : τ and ρ define the same weak-* uniformity on P2+α . More precisely, α (17) (a) τ (F, G)2 ≤ (2M + 8)ρ(F, G) α+2 + 4ρ(F, G)2 ; (b) ρ(F, G) ≤ τ (F, G)3/2 . Proof. Part (a) is Theorem 5.2 of [6]. As for part (b), let β > 0, then there exists L∗ ∈ Π(F, G) such that Z T2 (F, G) = |v − w|2 dL∗ (v, w) ≥ β 2 L∗ (|v − w| ≥ β) . Chosing β = T2 (F, G)1/3 , we see that L∗ (|v − w| ≥ β) ≤ β. By Theorem 5.1 of [6], this implies that F (A) ≤ G(Aδ] ) + β for all closed A ⊂ Rd . Thus, ρ(F, G) ≤ β = τ (F, G)3/2 . ¤ Second step : For all m ≥ 1, ρ and k · k∗m define the same weak-* uniformity on P0 (Rd ). This is done by putting together Lemma 5.3 and Corollary 5.5 in [6]. Third step : For all m ≥ 1, k · k∗m and d2 define the same weak-* M uniformity on P2+α . More precisely, d+1 (18) 2 (a) kF − Gk∗d+3 ≤ Cd σ d+3 d2 (F, G) d+3 ; α 2 (b) d2 (F, G) ≤ Cd M 2+α (k|F − Gk∗1 ) 2+α . Proof. (a) is an easy adaptation of Lemma 5.7 in [6]. Let ϕ ∈ C d+3 (Rd ), kϕkd+3 ≤ 1; let R ≥ 1. Let χR be a smooth function such that 0 ≤ χR ≤ 1, χR = 1 for |v| ≤ R, χR (v) = 0 for |v| ≥ R + 1. We estimate first the tails of the distributions. ¯ Z ¯Z Z ¯ ¯ 2σ ¯ (1 − χR )ϕ d(F − G)¯ ≤ dF + dG ≤ 2 . ¯ ¯ R |v|≥R |v|≥R 8 G. TOSCANI AND C. VILLANI Then, by Parseval, ¯Z ¯ ¯Z ¯ ¯ ¯ ¯ ¯ b ¯ χR ϕ d(F − G)¯ = ¯ ϕχ ¯ ¯ ¯ ¯ dR (ξ)[f (ξ) − gb(ξ)] dξ ¯ £ ¤ |fb(ξ) − gb(ξ)| d+3 sup χd ) ≤ sup R ϕ(ξ)(1 + |ξ| 2 |ξ| Rd Rd Z |ξ|2 dξ. 1 + |ξ|d+3 Using the classical inequality © ª d+1 sup {(1 + |ξ|m )|χd |Dm (χR ϕ)(v)| ≤ Cd (1+R)d+1 , R ϕ(ξ)|} ≤ Cd sup (1 + |v|) ξ v we conclude that ¯ ¯Z ¯ ¯ ¯ ϕ d(F − G)¯ ≤ 2σ + Cd Rd+1 d2 (F, G). ¯ ¯ R2 Optimizing over R, we get the desired result. As for (b) : clearly, ¯Z ¯ ¯Z ¯ ¯ ¯ sin(ξ · v) ¯ |fb(ξ) − gb(ξ)| ¯¯ cos(ξ · v) ¯+¯ ¯. ≤ d(F − G) d(F − G) ¯ ¯ ¯ ¯ |ξ|2 |ξ|2 |ξ|2 Let us estimate for instance the first term in the right-hand side. First we note that Z Z cos(ξ · v) cos(ξ · v) − 1 d(F − G) = d(F − G). |ξ|2 |ξ|2 For fixed ξ, Φξ (v) = cos(ξ · v) − 1 |ξ|2 is a C 2 function, vanishing at the origin, as well as its first-order derivatives, and elementary estimates show that |Φ0ξ (v)| ≤ |v|, Φξ (v) ≤ |v|2 /2, hence kΦξ χR k1 ≤ CR2 , whence by definition ¯ ¯Z ¯ 1 ¯¯ Φξ χR d(F − G)¯¯ ≤ kF − Gk∗1 . ¯ 2 CR On the other hand, ¯ ¯Z ¯ ¯ ¯ (1 − χR )Φξ d(F − G)¯ ≤ 2M . ¯ ¯ Rα Optimizing over R, we get the result. ¤ PROBABILITY METRICS AND UNIQUENESS 9 3. Superadditivity Let {X0 , Y0 } and {X1 , Y1 } be two independent pairs of random variables with zero mean vector. Let Fi (resp. Gi ) denote the law of Xi (resp. Yi ). For 0 < λ < 1, the law of √ √ Xλ = λ X0 + 1 − λ X1 is ¶ ¶ µ 1 · · ∗ . (19) Fλ = d/2 F0 √ F1 √ λ (1 − λ)d/2 1−λ λ Theorem 3. d2 is superadditive with respect to the rescaled convolution, i.e. 1 (20) µ d2 (Fλ , Gλ ) ≤ λd2 (F0 , G0 ) + (1 − λ)d2 (F1 , G1 ). Corollary. The rescaled convolution is continuous with respect to d2 : with obvious notations, if d2 (F0n , G0 ) −→ 0 and d2 (F1n , G1 ) −→ 0, then d2 (Fλn , Gλ ) −→ 0. Proof. We denote by fbi and gbi the Fourier transforms of Fi and Gi . Since √ √ fbλ (ξ) = fb0 ( λ ξ)fb1 ( 1 − λ ξ), and an analogous formula holds for gbλ , we have √ √ √ √ |fb0 ( λ ξ)fb1 ( 1 − λ ξ) − gb0 ( λ ξ)gb1 ( 1 − λ ξ)| d2 (Fλ , Gλ ) = sup |ξ|2 ξ ( √ √ √ |fb1 ( 1 − λ ξ) − gb1 ( 1 − λ ξ)| b = sup (1 − λ)|f0 ( λ ξ)| (1 − λ)|ξ|2 ξ ) √ √ √ |fb0 ( λ ξ) − gb0 ( λ ξ)| +λ|gb1 ( 1 − λ ξ)| . λ|ξ|2 Since kfb0 k∞ ≤ 1 and kgb1 k∞ ≤ 1, this expression can be bounded by (1 − λ) d2 (F1 , G1 ) + λ d2 (F0 , G0 ). ¤ Remark. This property is sufficient to imply that d2 is nonexpandive along solutions of the Boltzmann equation if d = 2 (Cf. [2]) . But we shall show in the next section that this restriction can be dispended with. 10 G. TOSCANI AND C. VILLANI As an application, for X a random variable with law F , let us consider the functional dF | |fb − M (21) J(X) = J(F ) = d2 (F, M F ) = sup , |ξ|2 ξ where this time M F is the Gaussian distribution with the same mean vector and covariance matrix as F . The same proof as before shows the Theorem 4. For any two independent random variables X0 and X1 , and 0 < λ < 1, ³√ ´ √ (22) J λ X0 + 1 − λ X1 ≤ λJ(X0 ) + (1 − λ)J(X1 ) with equality if and only if X0 and X1 are gaussian variables with the same mean vector and covariance matrix. Proof. Suppose that there is equality in the proof of Theorem 3, with G0 and G1 replaced by a centered gaussian probability law (one can always reduce to this case). Suppose that F0 6= M F0 . Let us denote by g0 and g1 the densities of M F0 and M F1 . Let (ξn ) such that √ √ √ √ √ |fb0 ( λ ξn )fb1 ( 1 − λξn ) − gb0 ( λ ξn )gb1 ( 1 − λ ξn )| n→∞ √ −→ J( λ X + 1 − λ X1 ); 0 |ξn |2 the left-hand side is bounded by ( √ √ √ |fb1 ( 1 − λ ξn ) − gb1 ( 1 − λ ξn )| b (1 − λ)|f0 ( λ ξn )| (1 − λ)|ξn |2 ) √ √ √ |fb0 ( λ ξn ) − gb0 ( λ ξn )| . +λ|gb1 ( 1 − λ ξn )| λ|ξn |2 √ √ If |ξn | −→ ∞, then |gb1 (ξn )| ≤ 1/2 for large n, and J( λ X0 + 1 − λ X1 ) ≤ (1 − λ)J(X1 ) + λ/2J(X1 ), which is impossible since J(X1 ) 6= 0. Therefore, extracting a subsequence if necessary, we may assume that ξn −→ ξ ∈ Rd . The same argument shows then that ξ = 0. But by defintion, √ √ |fb0 ( λ ξn ) − gb0 ( λ ξn )| −→ 0 λ|ξn |2 as n → ∞. Hence, J(Xλ ) = 0, and J(X0 ) = J(X1 ) = 0. ¤ As remarked by Murata and Tanaka [9] and others, a functional having these properties can be used to several applications, as the following. PROBABILITY METRICS AND UNIQUENESS 11 First application : the central limit theorem. Let (Xn ) be a sequence of independent identically distributed variables with finite variance, and let (23) 1 ξn = √ (X1 + · · · + Xn ). n Then, if F denotes the common probability law of each Xn , Gn the probability law of ξn , and G the gaussian probability law with the same mean vector and covariance matrix as those of F , (24) d2 (Gn , G) −→ 0 as n → ∞. Moreover, given ε > 0, knowing d2 (F, G) and a modulus of continuity of D2 fˆ at the origin, one can compute explicitly n0 such that for n ≥ n0 , d2 (Gn , G) ≤ ε. Proof. For P and Q two probability let us denote by P ◦Q √ √ distributions, −d the rescaled convolution 2 P (·/ 2) ∗ Q(·/ 2). Let ηk = ξ2k , and Pk its probability law, then, thanks to Theorem 4, J(ηk+1 ) = J(Pk ◦ Pk ) ≤ J(Pk ) = J(ηk ) : (J(ηk )) is decreasing, hence converging to some limit l. Admit for a while that there exists Q so that for some subsequence kp , d2 (Pkp , Q) −→ 0, so that l = J(Q). Then, since the rescaled convolution is continuous with respect to d2 , J(Pkp ◦ Pkp ) −→ J(Q ◦ Q); but it is also J(Pkp +1 ) −→ l = J(Q), so that J(Q ◦ Q) = J(Q). This implies that Q is the gaussian distribution G, whence J(ηk ) −→ 0. P Now, any integer n ≥ 1 can be written n0 αk 2k with αk ∈ {0, 1}, and 1X αk 2k J(ηk ) −→ 0. J(ξn ) ≤ n Finally, to prove that (Pk ) has a weak cluster point with respect to note that the Fourier transform of to the topology³of d2 , it suffices ´n √ Fn is fbn (ξ) = fb(ξ/ n) , whose second derivatives can be readily computed, µ µ µ µ ¶¶n ¶ ¶ µ ¶n−2 n − 1 ξ ξ ξ ξ 2 = Dij fb √ Di fb √ Dj fb √ fb √ n n n n n µ µ ¶n−1 ¶ ξ ξ 2 b b +f √ Dij f √ . n n 12 G. TOSCANI AND C. VILLANI If one denotes by ψ a modulus of continuity of D2 fb near 0, i.e. ¯ ¯ ¯ 2b ¯ ¯D f (ξ) − D2 fˆ(0)¯ ≤ ψ(|ξ|), where ψ is chosen to be increasing from 0, we obtain at once that for D2 fbn one can take as modulus of continuity µ ¶ t 2 2 ψn (t) = σ t + ψ √ . n ψn is bounded, uniformly in n, by ψ1 (t) = σ 2 t2 + ψ(t). This is enough to conclude. Stated in this form, this would seem to be only an overcomplicated way of proving the central limit theorem; but the interest of this method is that it can immediately lead to explicit computations. Indeed, let ε > 0, and let us look for n0 such that for n ≥ n0 , d2 (Gn , G) ≤ ε. First, since J(Pk ) is decreasing, it we look for k0 such that J(Pk0 ) ≤ ε. The proof of Theorem 4 clearly shows that ( à à ! !) 2 1 + e−|ξ| /2 J(Pk+1 ) = J(Pk ◦ Pk ) ≤ sup inf ψ(|ξ|), J(Pk ) . 2 ξ Let η such that ψ(η) ≤ ε. As long as J(Pk ◦ Pk ) ≥ ε, the supremum can only be obtained for |ξ| ≥ η, hence J(Pk+1 ) ≤ µJ(Pk ) with 1 + e−η µ= 2 2 /2 < 1. Therefore, one can take k0 = − log J(X1 ) . log µ Now, for n ≥ 2k0 , one writes X αk 2k X αk 2k 2k0 J(X1 ) + ε ≤ J(X1 ) + ε, J(Xn ) ≤ n n n k≥k k<k 0 0 and it suffices to choose n ≥ 2k0 J(X1 )/ε for this expression to be less than 2ε. ¤ Second application : Kac’s theorem. Let X1 and X2 be two independent random variables with finite variance, such that ( f1 = X1 cos θ + X2 sin θ X (25) f2 = −X1 sin θ + X2 cos θ X PROBABILITY METRICS AND UNIQUENESS 13 are independent for some θ ∈ R \ π2 Z. Then X1 and X2 are gaussian. Proof. (26) f1 ) ≤ J(X1 ) cos2 θ + J(X2 ) sin2 θ; J(X f2 ) ≤ J(X1 ) sin2 θ + J(X2 ) cos2 θ; J(X f1 ) + J(X f2 ) ≤ J(X1 ) + J(X2 ). But hence J(X ( f1 cos θ − X f2 sin θ X1 = X (27) f1 sin θ + X f2 cos θ; X2 = X f1 ) + J(X f2 ), so that there by the same inequality, J(X1 ) + J(X2 ) ≤ J(X is equality in (26), which implies that X1 and X2 are gaussian. ¤ 4. Application to the Boltzmann equation In this section, we shall assume d = 3 for simplicity, but all the results can be generalized readily to any dimension d ≥ 2, or to the onedimensional Kac model. The Boltzmann equation (8) can be studied in weak form for a probability measure as well as for a distribution function, µ ¶ Z Z d u·n ϕ(v) dF (v) = σ {ϕ(v 0 ) − ϕ(v)} dF (v) dF (w) dn; dt |u| 3 3 2 R ×R ×S we shall study this with the conditions Z Z Z dF0 (v) dv = 1, v dF0 (v) = 0, v 2 dF0 (v) = 3; it is classical that these are preserved under the time-evolution of the Boltzmann equation. Moreover, it is equivalent to use the Fourier transform of the equation [12] : ¶ µ Z i ξ ·n hb + b − b (28) ∂t f (t, ξ) = f (ξ )f (ξ ) − fb(ξ)fb(0) dn, σ |ξ| S2 where ξ + |ξ|n + ξ = 2 (29) ξ − = ξ − |ξ|n , 2 and the initial conditions are such that fb(0) = 1, ∇fb(0) = 0, ∇2 fb(0) = −3, fb ∈ C 2 (Rd ). Note that ξ + + ξ − = ξ, and |ξ + |2 + |ξ − |2 = |ξ|2 . 14 G. TOSCANI AND C. VILLANI Theorem 5. Let F and G be two solutions of the Boltzmann equation (8). Then, for all time t ≥ 0, d2 (F (t), G(t)) ≤ d2 (F (0), G(0)). Before proving Theorem 5, we mention three useful corollaries. Corollary 5.1. Let F0 be a nonnegative measure with finite variance. Then, there exists a unique weak solution F (t) of the Boltzmann equation, such that F (0) = F0 . Corollary 5.2. If Fε (t) is a sequence of approximate solutions of the Boltzmann equation, obtained by a standard cut-off procedure for instance, then Fε converges weakly to F . This entails in particular that such results as the decrease of the Fisher information, or the decrease of Tanaka’s functional, which are known to hold for the cut-off equation, also hold for the non cut-off equation. Corollary 5.3. Let F0 be a nonnegative measure with finite variance, and F (t) the associated solution of the Boltzmann equation. Let M be the Maxwellian distribution with the same mean vector and variance that F0 . Then d2 (F (t), M ) is decreasing towards 0. Proof of Theorem 5. Let F and G be two solutions of the Boltzmann equation, and fb, gb their Fourier transforms. Then, (30) ³ ´ # ¶" b + b − Z µ fb − gb ξ·n f (ξ )f (ξ ) − gb(ξ + )b g (ξ − ) fb(ξ) − gb(ξ) ∂t = σ − dn. |ξ|2 |ξ| |ξ|2 |ξ|2 Now, we do the usual splitting ¯ ¯ ¯ ¯ ¯ fb(ξ + )fb(ξ − ) − gb(ξ + )b ¯ fb(ξ − ) − gb(ξ − ) ¯ |ξ − |2 − ¯ g (ξ ) ¯ ¯ ¯ ¯ ¯ ¯ ≤ |fb(ξ + )| ¯ ¯ ¯ ¯ ¯ ¯ |ξ|2 |ξ|2 |ξ − |2 ¯ ¯ ¯ fb(ξ + ) − gb(ξ + ) ¯ |ξ + |2 ¯ ¯ + |b g (ξ − )| ¯ ¯ ¯ ¯ |ξ|2 |ξ + |2 ¯ ¯ ¯ ¯ ¯ fb − gb ¯ µ |ξ − |2 + |ξ + |2 ¶ ¯ fb − gb ¯ ¯ ¯ ¯ ¯ ≤ sup ¯ = sup ¯ ¯ ¯. 2 2 2 ¯ |ξ| ¯ ¯ |ξ| ¯ |ξ| We set h(t, ξ) = fb(ξ) − gb(ξ) . |ξ|2 PROBABILITY METRICS AND UNIQUENESS 15 For cut-off molecules, let e be any fixed unit vector and let us denote by Z S= σ(n · e) dn S2 the total cross-section. By rotational invariance, for all ξ 6= 0, ¶ Z µ ξ·n σ = S, |ξ| and the preceding computation shows that ¯ ¯ ¯∂t h − Sh¯ ≤ Skhk∞ . (31) Gronwall’s lemma proves at once that for cut-off molecules, kh(t)k∞ is nonincreasing. Now, let us consider the case of true Maxwell molecules, where σ(ν) is singular like (1 − ν)−5/4 . This singularity corresponds to grazing collisions, i.e. ξ + ∼ ξ, ξ − ∼ 0. Since it is nonintegrable, S = ∞. Then we split the right-hand side of (28) according to |1 − ν| ≥ ε or |1 − ν| < ε. For the first term, we use the preceding estimate, while for the other, we use the fact that the singularity is cancelled by the vanishing of fb(ξ + )fb(ξ − ) − fb(ξ)fb(0) for grazing collisions. Indeed, as in [12], let us write ¯ ¯ ¯b + b − ¯ b b ¯f (ξ )f (ξ ) − f (ξ)f (0)¯ ≤ |fb(ξ + )| |fb(ξ + )−fb(ξ)|+|fb(ξ)| |fb(ξ − )−fb(0)| ≤ |∇fb(η)| |ξ + − ξ| + sup |∇fb(η)| |ξ − |. sup |η|≤sup(|ξ|,|ξ + |) |η|≤|ξ − | Since |ξ + |, |ξ − | ≤ |ξ|, |D2 fb(ξ)| ≤ d, and ∇fb(0) = 0, we conclude that |fb(ξ + )fb(ξ − ) − fb(ξ)fb(0)| ≤ C|ξ||ξ − | ≤ C|ξ|2 (1 − ν)1/2 , where C depends only on the dimension. This implies that the integrand in (28) is bounded by C(1 − ν)−3/4 , and thus the integral is convergent, uniformly in ξ and in t. As a conclusion, setting Z Sε = 1|1−n·e|≥ε σ(n · e) dn, ¯Z ¯ rε = sup ¯¯ ξ,t S2 ¶ i ¯¯ ξ ·n hb + b − b b 1|1− ξ·n |<ε σ f (ξ )f (ξ ) − f (ξ)f (0) dn¯¯ , |ξ| |ξ| 2 S we obtain that rε −→ 0 as ε → 0, and ¯ ¯ ¯∂t h(ξ, t) − Sε h(ξ, t)¯ ≤ Sε khk∞ (t) + rε . (32) µ This is equivalent to ¯ ¡ ¢¯ ¯∂t h(ξ, t)eSε t ¯ ≤ Sε kh(·, t)eSε t k∞ + rε eSε t . 16 G. TOSCANI AND C. VILLANI Integrating from 0 to t, we get Z t ¡ ¢ Sε t |h(ξ, t)|e ≤ |h(ξ, 0)| + dτ Sε kh(·, τ )eSε τ k∞ + rε eSε τ . 0 Hence, if Hε (t) = kh(·, t)e Sε t k∞ , Z t Z t Sε τ Hε (t) ≤ Hε (0) + rε e dτ + Sε H(τ ) dτ. 0 0 Now, by the generalized Gronwall inequality, Z t u(t) ≤ ϕ(t) + λ(τ )u(τ ) dτ 0 implies ½Z ¾ t Z ½Z t ¾ t dϕ dτ. dτ 0 0 τ Rt Applying this inequality with λ(τ ) = Sε and ϕ(t) = Hε (0)+ 0 rε eSε τ dτ , we obtain Hε (t) ≤ Hε (0)eSε t + trε eSε t , namely kh(·, t)k∞ ≤ kh(·, 0)k∞ + rε t. Letting ε going to 0, we obtain kh(·, t)k∞ ≤ kh(·, 0)k∞ , i.e. u(t) ≤ ϕ(0) exp λ(τ ) dτ + exp λ(τ ) dτ d2 (F (t), G(t)) ≤ d2 (F (0), G(0)). ¤ References [1] N.M. Blachman. The convolution inequality for entropy powers, IEEE Trans. Inform. Theory, 2:267–271, 1965. [2] A.V. Bobylev, G. Toscani On the generalization of the Boltzmann H-theorem for a spatially homogeneous Maxwell gas J. Math. Phys, 33:2578–2586, 1992. [3] E.A. Carlen, E. Gabetta and G. Toscani. Propagation of smoothness and the rate of exponential convergence to equilibrium for a spatially homogeneous Maxwellian gas, Commun. Math. Phys. (to appear), 1997. [4] E.A. Carlen, M.C. Carvalho and E. Gabetta. Central limit theorem for Maxwellian molecules and truncation of the Wild expansion, Preprint, 1997. [5] I. Csiszar. Information-type measures of difference of probability distributions and indirect observations, Stud. Sci. Math. Hung., 2:299–318, 1967. [6] E. Gabetta, G. Toscani and B. Wennberg. Metrics for probability distributions and the trend to equilibrium for solutions of the Boltzmann equation, J. Stat. Phys., 81 :901–934, 1995. [7] L. Kantorovich. On translation of mass(in Russian), Dokl. AN SSSR, 37 :227– 229, 1942. [8] H.P. McKean, Jr. Speed of approach to equilibrium for Kac’s caricature of a Maxwellian gas, Arch. Rat. Mech. Anal., 21:343–367, 1966. PROBABILITY METRICS AND UNIQUENESS 17 [9] H. Murata, H. Tanaka. An inequality for certain functional of multidimensional probability distributions, Hiroshima Math. J., 4:75–81, 1974. [10] R. Jordan, D. Kinderlehrer, F. Otto. The variational formulation of the FokkerPlanck equation, To appear in SIAM J. Appl. Math. Anal. [11] Yu.V. Prokhorov. Convergence of random processes and limit theorems in probability theory, Theory Prob. Appl., 1 :157–214, 1956. [12] A. Pulvirenti, G. Toscani. The theory of the nonlinear Boltzmann equation for Maxwell molecules in Fourier representation, Ann. Mat. pura ed appl., 4, Vol. 171:181–204, 1996. [13] A. Stam. Some inequalities satisfied by the quantities of information of Fisher and Shannon, Inform. Control, 2:101–112, (1959) [14] V. Strassen. The existence of probability measures with given marginals, Ann. Math. Statist., 36 :423–439, 1965. [15] H. Tanaka. An inequality for a functional of probability distributions and its application to Kac’s one-dimensional model of a Maxwellian gas, Wahrsch. Verw. Geb., 27 :47–52, 1973. [16] H. Tanaka. Probabilistic treatment of the Boltzmann equation of Maxwellian molecules, Wahrsch. Verw. Geb., 46 :67–105, 1978. [17] I. Vaida. Theory of statistical Inference and Information, Kluwer Academic Publishers, Dordrecht 1989 [18] L.N. Vasershtein. Markov processes on countable product space describing large systems of automata (in Russian), Probl. Pered. Inform., 5:64–73, 1969. [19] V.M. Zolotarev. Probability metrics, Theory Prob. Appl., 28 :278–302, 1983. Department of Mathematics, University of Pavia, via Abbiategrasso 209, 27100 Pavia, ITALY École Normale Suprieure, DMI, 45 rue d’Ulm, 75230 Paris Cedex 05