A quick introduction to Optimal Transport

Introduction The discrete case Measures The Euclidean case

Gradient flows, optimal transport, and evolution PDE's
2 - A quick introduction to Optimal Transport

Giuseppe Savaré
http://www.imati.cnr.it/~savare
Dipartimento di Matematica, Università di Pavia
GNFM Summer School
Ravello, September 13–18, 2010

Outline
1 A short historical tour
2 The “discrete” case, duality and linear programming
3 The measure-theoretic setting
4 Euclidean spaces: geometry and transport maps

Gaspard Monge (1746-1818)
1781: “La théorie des déblais et des remblais”
Problem: how to transport soil from the ground to a given configuration in the “most efficient” way.
[Figure: scanned book excerpt (p. 42, chapter 3, “The founding fathers of optimal transport”) containing Fig. 3.1, “Monge's problem of déblais and remblais”: a map T carries a point x of the déblais to a point y of the remblais. Legible text: “... minimize the total cost. Monge assumed that the transport cost of one unit of mass along a certain distance was given by the product of the mass by the distance.” and “Nowadays there is a Monge street in Paris, and therein one can find ...”]
The transport cost is proportional to the distance $|T(x) - x|$.

Leonid Kantorovich (1912-1986)
1939: Mathematical Methods of Organizing and Planning of Production (unpublished until 1960).
1942: On the translocation of masses
1948: On a problem of Monge
1975: Nobel prize, jointly with Tjalling Koopmans, “for their contributions to the theory of optimum allocation of resources”
Autobiography: http://nobelprize.org/nobel_prizes/economics/laureates/1975/kantorovich-autobio.html
Parallel contributions:
1941: Frank Hitchcock, The distribution of a product from several sources to numerous localities (Jour. Math. Phys.)
1947: Tjalling Koopmans, Optimum utilization of the transportation system.
1947: George Dantzig, simplex method.
Towards the recent theory...
• Statistical and probabilistic aspects (early 1900s: Gini, Dall'Aglio, Hoeffding, Fréchet, ...); Rachev-Rüschendorf, Mass Transportation Problems (1998)
• Particle systems, Boltzmann equation: Dobrushin, Tanaka (∼'70)
• Yann Brenier ('89): fluid mechanics, transport map, polar decomposition. Dynamical interpretation of optimal transport.
• John Mather: Lagrangian dynamical systems.
• Mike Cullen: meteorological models, semigeostrophic equations.
• Regularity, geometric and functional inequalities, Riemannian geometry, urban planning, evolution equations, etc.: L. Caffarelli, C. Evans, W. Gangbo, R. McCann, F. Otto, L. Ambrosio, G. Buttazzo, C. Villani, J. Lott, N. Trudinger, G. Loeper, T. Sturm, J. Carrillo, G. Toscani, A. Pratelli, ...
• C. Villani: Optimal transport: Old and New, Springer (2009), 978 p.

Discrete formulation
• Initial configuration of resources in $X = \{x_1, \dots, x_h\}$; at every point $x_i \in X$ the quantity $m_i = m(x_i)$ is available.
• Final configuration $Y = \{y_1, \dots, y_n\}$: at every point $y_j$ the quantity $n_j = n(y_j)$ is expected.
• The unitary cost $c_{ij} = c(x_i, y_j)$ for transporting a single unit from position $x_i$ to the destination $y_j$.
[Figure: sources $x_1, \dots, x_4$ joined to targets $y_1, y_2, y_3$; successive overlays label the unit costs $c_{11}, c_{12}, c_{13}$ out of $x_1$ and $c_{21}, c_{22}, c_{23}$ out of $x_2$, and then the transported quantities $T_{11}, T_{21}, T_{33}, T_{42}, T_{43}$.]

Admissible transference plan: choose the quantities $T_{i,j} = T(x_i, y_j)$ moved from $x_i$ to $y_j$, so that
$$T(x_i, y_j) \ge 0, \qquad \sum_{y \in Y} T(x_i, y) = m(x_i), \qquad \sum_{x \in X} T(x, y_j) = n(y_j).$$
The cost of the transference plan $T$ is
$$C(T) := \sum_{x \in X,\ y \in Y} c(x, y)\, T(x, y).$$

Optimal transport
Problem: find the best transference plan $T$, i.e. one which minimizes the cost $C(T)$ among all the admissible plans.
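The admissibility constraints and the cost $C(T)$ can be checked by hand on a small instance. The following is only a sketch: the masses `m`, `n`, the cost `c[i][j] = |i - j|` and the function names are illustrative assumptions, chosen so that the plan uses the rays $T_{11}, T_{21}, T_{33}, T_{42}, T_{43}$ that appear in the picture.

```python
def is_admissible(T, m, n):
    """Check T >= 0, row sums equal to m, column sums equal to n (integer data)."""
    nonneg = all(t >= 0 for row in T for t in row)
    rows_ok = all(sum(T[i]) == m[i] for i in range(len(m)))
    cols_ok = all(sum(row[j] for row in T) == n[j] for j in range(len(n)))
    return nonneg and rows_ok and cols_ok


def plan_cost(T, c):
    """C(T) = sum over all (i, j) of c[i][j] * T[i][j]."""
    return sum(c[i][j] * T[i][j] for i in range(len(T)) for j in range(len(T[i])))


# Hypothetical data: 4 sources, 3 targets, same total mass (7) on both sides.
m = [1, 2, 1, 3]
n = [3, 2, 2]
c = [[abs(i - j) for j in range(3)] for i in range(4)]  # assumed unit costs

# Plan supported on the rays T11, T21, T33, T42, T43 (0-based indices in code).
T = [[1, 0, 0],
     [2, 0, 0],
     [0, 0, 1],
     [0, 2, 1]]

print(is_admissible(T, m, n))  # True
print(plan_cost(T, c))         # 7
```

Integer data is used so that the equality tests are exact; with floating-point masses a tolerance would be needed.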
The linear programming structure: given positive coefficients $m_i$, $n_j$ and $c_{i,j}$, find the quantities $T_{i,j}$ minimizing the linear functional
$$C(T) = \sum_{i,j} c_{i,j} T_{i,j}$$
under the linear/convex constraints
$$T_{i,j} \ge 0, \qquad \sum_j T_{i,j} = m_i, \qquad \sum_i T_{i,j} = n_j.$$
In vector notation:
$$\min\ \vec C \cdot \vec T : \qquad A_0 \vec T \ge 0, \quad A_1 \vec T = \vec b.$$
In the discrete case existence of the optimal plan is easy; more important are 3 fundamental properties:
• Cyclical monotonicity of the optimal transference plan.
• Dual characterization, Kantorovich potentials (prices in economic terms), linear programming.
• Integrality of the transference plan, transport maps.
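To make the vector notation concrete, one possible way to assemble $\vec C$, $A_1$ and $\vec b$ is sketched below, assuming the plan is flattened row by row, so that $T_{i,j}$ sits at position $i \cdot k + j$ with $k$ the number of targets. In this layout $A_0$ is simply the identity. The function names and the $2 \times 2$ data are hypothetical.

```python
def transport_lp(m, n, c):
    """Return (C, A1, b) for the flattened unknown vec(T), row-major order."""
    h, k = len(m), len(n)
    C = [c[i][j] for i in range(h) for j in range(k)]
    # One equality per source i: sum_j T[i][j] = m[i]
    A1 = [[1 if p // k == i else 0 for p in range(h * k)] for i in range(h)]
    # One equality per target j: sum_i T[i][j] = n[j]
    A1 += [[1 if p % k == j else 0 for p in range(h * k)] for j in range(k)]
    b = list(m) + list(n)
    return C, A1, b


def flatten(T):
    return [t for row in T for t in row]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical 2 x 2 instance.
m, n = [1, 2], [2, 1]
c = [[1, 3], [2, 1]]
C, A1, b = transport_lp(m, n, c)

T = [[1, 0], [1, 1]]          # an admissible plan
t = flatten(T)
print(matvec(A1, t) == b)     # True: the equality constraints hold
print(sum(ci * ti for ci, ti in zip(C, t)))  # 4, the cost C(T)
```

The triple `(C, A1, b)`, together with the sign constraint, is the kind of input a generic linear programming solver expects; no particular solver is assumed here.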
Cyclical monotonicity
Consider an arbitrary collection of couples $(x, y)$ joined by a transport ray, i.e. $T(x, y) > 0$; in the picture: $(x_2, y_1)$, $(x_3, y_2)$, $(x_4, y_3)$.
[Figure: the rays joining $x_2$ to $y_1$, $x_3$ to $y_2$ and $x_4$ to $y_3$, with the permutation $\sigma$ acting on the targets.]
The associated (unitary) cost satisfies
$$c(x_2, y_1) + c(x_3, y_2) + c(x_4, y_3) \le c(x_2, y_2) + c(x_3, y_3) + c(x_4, y_1)$$
if one applies a (cyclical) permutation $\sigma$ of the targets: $y_1 \to y_2 \to y_3 \to y_1$.

Theorem (Rachev-Rüschendorf). If $T$ is optimal, the cost of any configuration rearranged by a cyclical permutation cannot decrease.

Cyclical monotonicity is also sufficient
Theorem. If $T$ is a cyclically monotone admissible plan then it is optimal.
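The cyclical monotonicity condition can be tested by brute force when the support of the plan is small. The sketch below is an illustration under assumptions, not an efficient algorithm: it enumerates every ordered family of distinct support pairs and compares the direct cost with the cost after shifting the targets cyclically. The function names and the $2 \times 2$ cost are hypothetical.

```python
from itertools import permutations


def support(T):
    """Pairs (i, j) joined by a transport ray, i.e. with T[i][j] > 0."""
    return [(i, j) for i, row in enumerate(T) for j, t in enumerate(row) if t > 0]


def is_cyclically_monotone(T, c):
    pairs = support(T)
    for k in range(2, len(pairs) + 1):
        for chain in permutations(pairs, k):
            direct = sum(c[i][j] for i, j in chain)
            # Source of pair r is sent to the target of pair r + 1 (cyclically).
            shifted = sum(c[chain[r][0]][chain[(r + 1) % k][1]] for r in range(k))
            if direct > shifted:
                return False
    return True


# Hypothetical cost: staying in place is free, swapping costs 2 per unit.
c = [[0, 2],
     [2, 0]]
T_good = [[1, 0], [0, 1]]   # every unit stays in place
T_bad = [[0, 1], [1, 0]]    # the two units are swapped

print(is_cyclically_monotone(T_good, c))  # True
print(is_cyclically_monotone(T_bad, c))   # False
```

The number of families grows factorially with the size of the support, so this is only meant for toy examples.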
The dual problem: optimal prices
Linear programming: the dual problem gives a crucial insight on the structure of the optimal transference plan.
Economic interpretation: a transport company offers to take care of the transportation job: they will pay the price $u(x)$ to buy a unit placed at the point $x$ and they will sell it at $y$ for the price $v(y)$.
To be competitive, the prices should be more convenient than the transportation cost $c(x, y)$:
$$v(y) - u(x) \le c(x, y) \qquad x \in X,\ y \in Y. \tag{*}$$
The total profit for the company is
$$P(u, v) := \sum_{y \in Y} n(y) v(y) - \sum_{x \in X} m(x) u(x)$$
and their problem is to find the prices which maximize the profit: $\max P(u, v)$ among all the competitive prices $(u, v)$ satisfying (*).
Clearly $C(T) \ge P(u, v)$ for every admissible transference plan $T$ and every couple of competitive prices $u, v$. Indeed, multiplying (*) by $T(x, y) \ge 0$ and summing over $x, y$, the marginal constraints give $C(T) \ge \sum_{x,y} \big(v(y) - u(x)\big) T(x, y) = P(u, v)$.

Duality theorem
Theorem (Min-max and “complementary slackness”). An admissible transference plan $T$ is optimal if and only if there exist competitive prices $(u, v)$ such that $C(T) = P(u, v)$. In particular
$$\min_T C(T) = \max_{(u,v)} P(u, v).$$
Moreover, the “slackness” $S(x, y) := c(x, y) + u(x) - v(y) \ge 0$ satisfies the “complementary slackness principle”
$$T(x, y)\, S(x, y) = 0, \quad \text{i.e.} \quad T(x, y) > 0 \Rightarrow S(x, y) = 0.$$
“If $x$ and $y$ are connected through an optimal transport ray then their respective prices $u(x)$ and $v(y)$ are maximal: $v(y) - u(x) = c(x, y)$.”
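The duality statement can be verified numerically on a toy instance. In the sketch below the data `m`, `n`, `c`, the plan `T` and the prices `u`, `v` are hypothetical values worked out by hand; the code only checks that the prices are competitive in the sense $v(y) - u(x) \le c(x, y)$, that cost and profit coincide, and that the slackness $c + u - v$ vanishes on every transport ray.

```python
def plan_cost(T, c):
    return sum(c[i][j] * T[i][j] for i in range(len(T)) for j in range(len(T[i])))


def profit(u, v, m, n):
    """P(u, v) = sum_j n[j] * v[j] - sum_i m[i] * u[i]."""
    return sum(nj * vj for nj, vj in zip(n, v)) - sum(mi * ui for mi, ui in zip(m, u))


def is_competitive(u, v, c):
    """Condition (*): v[j] - u[i] <= c[i][j] for all i, j."""
    return all(v[j] - u[i] <= c[i][j] for i in range(len(u)) for j in range(len(v)))


def slackness(u, v, c):
    """S[i][j] = c[i][j] + u[i] - v[j], nonnegative for competitive prices."""
    return [[c[i][j] + u[i] - v[j] for j in range(len(v))] for i in range(len(u))]


# Hypothetical instance and hand-computed candidates.
m, n = [1, 2], [2, 1]
c = [[1, 3], [2, 1]]
T = [[1, 0], [1, 1]]      # admissible plan
u, v = [1, 0], [2, 1]     # candidate prices

S = slackness(u, v, c)
print(is_competitive(u, v, c))              # True
print(plan_cost(T, c), profit(u, v, m, n))  # 4 4
print(S)                                    # [[0, 3], [0, 0]]
print(all(T[i][j] * S[i][j] == 0 for i in range(2) for j in range(2)))  # True
```

Since cost and profit agree, the theorem says that this `T` is optimal and that these prices maximize the profit.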
Duality via Von Neumann min-max
$$\min_T \Big\{ \sum_{i,j} c_{i,j} T_{i,j} \ :\ T_{i,j} \ge 0, \quad \sum_j T_{i,j} = m_i, \quad \sum_i T_{i,j} = n_j \Big\}.$$
Introduce Lagrange multipliers $S_{i,j} \ge 0$, $u_i$, $v_j$ for the constraints:
$$\begin{aligned}
\min_T \sum_{i,j} c_{i,j} T_{i,j}
&= \min_T \max_{S,u,v}\ \sum_{i,j} c_{i,j} T_{i,j} - \sum_{i,j} S_{i,j} T_{i,j} + \sum_i u_i \Big( \sum_j T_{i,j} - m_i \Big) - \sum_j v_j \Big( \sum_i T_{i,j} - n_j \Big) \\
&= \min_T \max_{S,u,v}\ \sum_{i,j} T_{i,j} \big( c_{i,j} - S_{i,j} + u_i - v_j \big) + \sum_j v_j n_j - \sum_i u_i m_i \\
&= \max_{S,u,v} \min_T\ \sum_{i,j} T_{i,j} \big( c_{i,j} - S_{i,j} + u_i - v_j \big) + \sum_j v_j n_j - \sum_i u_i m_i \\
&= \max_{S,u,v} \Big\{ \sum_j v_j n_j - \sum_i u_i m_i \ :\ c_{i,j} - S_{i,j} + u_i - v_j = 0 \Big\} \\
&= \max_{u,v} \Big\{ \sum_j v_j n_j - \sum_i u_i m_i \ :\ c_{i,j} + u_i - v_j \ge 0 \Big\}.
\end{aligned}$$
Here the minimum on the left is taken over admissible plans, while the minima on the right are taken over all $T$ without constraints: the inner maximum is $+\infty$ unless $T$ is admissible, and the inner minimum is $-\infty$ unless the coefficient of every $T_{i,j}$ vanishes.
Integrality
Theorem. If the initial and final configurations $m(x), n(y) \in \mathbb{N}$ are integers then there exists an integer optimal transference plan $T$, i.e. $T(x, y) \in \mathbb{N}$.
In other words, there is no need to split unitary quantities in order to realize the optimal transport.
Corollary. If $m(x) \equiv 1$ and $n(y)$ are integers, then the integer optimal transference plan $T$ is associated to a transport map $t : X \to Y$ so that $T(x, y) > 0 \Leftrightarrow y = t(x)$. If moreover $n(y) \equiv 1$ then the map $t$ is one-to-one.
Roughly speaking: from every point $x \in X$ starts a unique transport ray and mass is not split in various directions.
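When $m \equiv 1$ and $n \equiv 1$ the integrality result says that it is enough to search among one-to-one transport maps, i.e. permutations. A brute-force sketch of that search follows; the $3 \times 3$ cost matrix and the function names are hypothetical, and enumerating all permutations is only feasible for very small problems.

```python
from fractions import Fraction
from itertools import permutations


def best_transport_map(c):
    """Cheapest one-to-one map t (a permutation), found by exhaustive search."""
    size = len(c)
    return min((sum(c[i][t[i]] for i in range(size)), t)
               for t in permutations(range(size)))


def plan_from_map(t):
    """0/1 transference plan with T[i][j] = 1 exactly when j = t(i)."""
    size = len(t)
    return [[1 if j == t[i] else 0 for j in range(size)] for i in range(size)]


c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]   # hypothetical unit costs

best_cost, t = best_transport_map(c)
print(best_cost, t)        # 5 (1, 0, 2)
print(plan_from_map(t))    # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]

# A plan that splits every unit evenly is admissible but more expensive here.
split_cost = sum(Fraction(c[i][j], 3) for i in range(3) for j in range(3))
print(split_cost)          # 22/3
```

The comparison with the evenly split plan is only an illustration of the statement that splitting unit masses is not needed; it is not a proof.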
In other words, there is no need to split unitary quantities in order to realize the optimal transport. Corollary If m(x) ≡ 1 and n(y) are integers, then the transference plan T is associated to a transport map t : X → Y so that T (x, y) > 0 ⇔ y = t(x). If moreover n(y) ≡ 1 then the map t is one-to-one. Roughly speaking: from every point x ∈ X start a unique transport ray and mass is not splitted in various directions. Introduction The discrete case Measures The Euclidean case Outline 1 A short historical tour 2 The “discrete” case, duality and linear programming 3 The measure-theoretic setting 4 Euclidean spaces: geometry and transport maps 16 Introduction The discrete case Measures The Euclidean case Measure data I I I X, Y discrete spaces X, Y topological spaces (R, RN , locally compact spaces, Polish (i.e. complete and separable) spaces, Radon spaces, . . . ): here RN . The cost a (lower-semi) continuous function c : X × Y → R ∪ {+∞}. The initial and final configurations m(x), n(y) a couple of Borel measures µ, ν on X and Y . The mass is normalized to 1. Given A ⊂ X, B ⊂ Y µ(A) denotes the quantity of resources available in A, ν(B) denotes the resources expected in B. Rm Transport plan T a measure γ on X × Y : γ(A × B) is the mass coming from A and transported in B. Admissibility: the marginals of γ are thus fixed (γ is a coupling between µ and ν) γ(A × Y ) = µ(A), ν γ γ(X × B) = ν(B) |x − y| = 0 ν Γ(µ, ν) : collection of all the admissible trasnference plan/couplings. µ µ Rm The cost of a transference plan γ is X x,y Z c(x, y)T (x, y) C(γ) := c(x, y) dγ(x, y). X×Y 17 Introduction The discrete case Measures The Euclidean case Measure data I I I X, Y discrete spaces X, Y topological spaces (R, RN , locally compact spaces, Polish (i.e. complete and separable) spaces, Radon spaces, . . . ): here RN . The cost a (lower-semi) continuous function c : X × Y → R ∪ {+∞}. The initial and final configurations m(x), n(y) a couple of Borel measures µ, ν on X and Y . 
Transport and probability

Discrete setting: points $\{x_1, \dots, x_N\}$ with masses $\{m_1, \dots, m_N\}$, so that $\mu = \sum_i m_i \delta_{x_i}$. Given a transport map $t$ with $y_i = t(x_i)$, set
\[
t_\# \mu = \nu := \sum_i m_i \delta_{y_i}.
\]
In terms of measures,
\[
\nu(B) = \sum_{i:\, y_i \in B} m_i = \sum_{i:\, t(x_i) \in B} m_i = \sum_{i:\, x_i \in t^{-1}(B)} m_i = \mu(t^{-1}(B)).
\]
In general, for every Borel map $t : X \to Y$ and every Borel measure $\mu \in \mathcal{P}(X)$ we define
\[
\nu = t_\# \mu \quad\Leftrightarrow\quad \nu(B) = \mu(t^{-1}(B)) \ \text{ for every Borel set } B \subset Y.
\]
In probability: $\mathbb{P}$ is a probability measure on the probability space $\Omega$, $X : \Omega \to \mathcal{X}$ is a random variable, and $X_\# \mathbb{P} \in \mathcal{P}(\mathcal{X})$ is the law of $X$:
\[
X_\# \mathbb{P}(A) = \mathbb{P}[X \in A].
\]
Change of variable formula:
\[
\int_X \varphi(t(x))\,d\mu(x) = \int_Y \varphi(y)\,d\nu(y).
\]
Expectation:
\[
\mathbb{E}[\varphi(X)] = \int_\Omega \varphi(X(\omega))\,d\mathbb{P}(\omega) = \int_{\mathcal{X}} \varphi(x)\,d(X_\# \mathbb{P})(x).
\]
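The push-forward and the change of variable formula can be sketched for a finitely supported measure as follows. The measure `mu`, the map `t`, and the test function `phi` are hypothetical choices; the map is deliberately not injective, so two atoms merge.

```python
from collections import defaultdict

def push_forward(mu, t):
    """Return t_# mu for a finitely supported measure mu given as {point: mass}."""
    nu = defaultdict(float)
    for x, m in mu.items():
        nu[t(x)] += m          # all the mass sitting at x is moved to t(x)
    return dict(nu)

def integrate(phi, mu):
    """Integral of phi against a finitely supported measure."""
    return sum(phi(x) * m for x, m in mu.items())

mu = {-1: 0.25, 0: 0.25, 1: 0.5}   # hypothetical masses
t = lambda x: x * x                 # non-injective map: -1 and 1 both go to 1
phi = lambda y: 3 * y + 1           # test function

nu = push_forward(mu, t)
lhs = integrate(lambda x: phi(t(x)), mu)   # integral of phi(t(x)) d mu
rhs = integrate(phi, nu)                   # integral of phi(y) d nu
print(nu, lhs, rhs)
```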
The general problem

Problem. Given two Borel probability measures $\mu \in \mathcal{P}(X)$ and $\nu \in \mathcal{P}(Y)$, find an admissible transference plan $\gamma \in \Gamma(\mu,\nu)$ minimizing the total cost:
\[
\min_{\gamma \in \Gamma(\mu,\nu)} C(\gamma).
\]
Kantorovich potentials: functions $u : X \to \mathbb{R}$, $v : Y \to \mathbb{R}$ such that
\[
v(y) - u(x) \le c(x,y) \quad \text{for every } x \in X,\ y \in Y. \qquad (\Pi(c))
\]
The discrete dual functional $\sum_y v(y)\,n(y) - \sum_x u(x)\,m(x)$ becomes
\[
P(u,v) := \int_Y v(y)\,d\nu(y) - \int_X u(x)\,d\mu(x).
\]
Problem (Dual formulation). Find a couple of Kantorovich potentials $(u,v) \in \Pi(c)$ attaining
\[
\max_{\Pi(c)} P(u,v).
\]

A fundamental theorem

Assume that the cost is continuous and feasible, e.g.
\[
C(\mu \otimes \nu) = \iint_{X \times Y} c(x,y)\,d(\mu \otimes \nu)(x,y) < +\infty \quad \text{(sufficient feasibility condition)}.
\]
Theorem.

- Existence: there exist an optimal transference plan $\gamma_{\mathrm{opt}} \in \Gamma(\mu,\nu)$ and a couple of optimal Kantorovich potentials $(u_{\mathrm{opt}}, v_{\mathrm{opt}}) \in \Pi(c)$.
- Duality: $C(\gamma_{\mathrm{opt}}) = \min_{\Gamma(\mu,\nu)} C(\gamma) = \max_{\Pi(c)} P(u,v) = P(u_{\mathrm{opt}}, v_{\mathrm{opt}})$.
- Slackness: for every $(x,y) \in \operatorname{supp}(\gamma_{\mathrm{opt}})$ (i.e. $x$ and $y$ are connected by a transport ray), $c(x,y) = v_{\mathrm{opt}}(y) - u_{\mathrm{opt}}(x)$.
- Cyclical monotonicity: for every $(x_1,y_1), (x_2,y_2), \dots, (x_N,y_N)$ in the support of $\gamma_{\mathrm{opt}}$ and every permutation $\sigma : \{1,2,\dots,N\} \to \{1,2,\dots,N\}$,
\[
c(x_1,y_1) + \dots + c(x_N,y_N) \le c(x_1,y_{\sigma(1)}) + \dots + c(x_N,y_{\sigma(N)}).
\]
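Cyclical monotonicity can be tested directly when the support is a small finite set. The sketch below assumes points on the real line and the quadratic cost; the two candidate supports are hypothetical. Checking all permutations of the full list is enough here, because a permutation of a sub-family extends to the whole list by fixing the remaining indices.

```python
from itertools import permutations

def quadratic_cost(x, y):
    return (x - y) ** 2

def is_cyclically_monotone(pairs, cost, tol=1e-12):
    """Check sum_k c(x_k, y_k) <= sum_k c(x_k, y_sigma(k)) for every permutation sigma."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(pairs)
    base = sum(cost(x, y) for x, y in pairs)
    return all(
        base <= sum(cost(xs[k], ys[s[k]]) for k in range(n)) + tol
        for s in permutations(range(n))
    )

# Hypothetical supports: an order-preserving pairing and a crossing one.
monotone_support = [(0.0, 1.0), (1.0, 3.0), (2.0, 4.0)]
crossing_support = [(0.0, 4.0), (1.0, 3.0), (2.0, 1.0)]

print(is_cyclically_monotone(monotone_support, quadratic_cost))
print(is_cyclically_monotone(crossing_support, quadratic_cost))
```

The brute-force loop grows factorially with the number of pairs, so this is only meant for very small examples.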
4 Euclidean spaces: geometry and transport maps

Some important questions

- Uniqueness of the optimal transference plan
- Integrality
- Links with the geometry: the cost function $c(x,y)$ depends on the distance between $x$ and $y$ ($|x-y|$ when $X = Y = \mathbb{R}^d$)
- Regularity of Kantorovich potentials
- Existence of a transport map
- Further information when the measures $\mu = f\,\mathcal{L}^d$ and $\nu = g\,\mathcal{L}^d$ are absolutely continuous with respect to the Lebesgue measure:
\[
\mu(A) = \int_A f(x)\,dx, \qquad \nu(B) = \int_B g(y)\,dy.
\]

All these questions are closely linked! From now on we will consider the Euclidean case $X = Y = \mathbb{R}^d$.
Integrality and transport maps

At the continuous level the integrality condition can be informally stated by asking that (almost) every point $x$ is the starting point of at most one transport ray. We say that $y$ is connected to $x$ by a transport ray if $(x,y) \in \operatorname{supp} \gamma$; thus we require
\[
(x,y_1),\ (x,y_2) \in \operatorname{supp} \gamma \quad\Rightarrow\quad y_1 = y_2 =: t(x),
\]
a property which should hold for $\mu$-almost every $x$. The map $t : X \to Y$ is called the transport map induced by the plan $\gamma$. It satisfies:
\[
\text{if } A = t^{-1}(B) \text{ then } \mu(A) = \nu(B) = \gamma(A \times B).
\]
Recalling the change-of-variable formula, if $\mu = f\,dx$, $\nu = g\,dy$, and $t$ is differentiable and injective,
\[
\mu(A) = \int_A f(x)\,dx = \nu(B) = \int_B g(y)\,dy = \int_A g(t(x))\,|\det Dt(x)|\,dx,
\]
so that
\[
f(x) = g(t(x))\,|\det Dt(x)|.
\]
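The identity $f(x) = g(t(x))\,|\det Dt(x)|$ can be verified numerically in one dimension. The sketch below assumes a standard Gaussian source density and an affine map $t(x) = ax + b$, whose image measure is the Gaussian with mean $b$ and standard deviation $|a|$; the values of `a` and `b` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

a, b = 2.0, 1.0                                # hypothetical affine map t(x) = a*x + b

def t(x):
    return a * x + b

def f(x):                                      # source density: N(0, 1)
    return gaussian_pdf(x, 0.0, 1.0)

def g(y):                                      # target density: N(b, a^2)
    return gaussian_pdf(y, b, abs(a))

def residual(x):
    """f(x) - g(t(x)) * |t'(x)|, which should vanish for every x."""
    return f(x) - g(t(x)) * abs(a)

grid = [k / 10.0 for k in range(-40, 41)]
print(max(abs(residual(x)) for x in grid))
```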
Existence and uniqueness of the optimal transport map: $c(x,y) = \frac12 |x-y|^2$

Theorem (Brenier (1989)). Let $\mu = f\,dx$, $\nu = g\,dy$, $c(x,y) := \frac12 |x-y|^2$.

- There exists a unique optimal transference plan $\gamma$, and it is associated with a transport map $t$.
- The Kantorovich potentials are perturbations of convex functions; more precisely
\[
\tfrac12 |x|^2 + u(x) = \varphi(x) \quad\text{and}\quad \tfrac12 |y|^2 - v(y) = \psi(y)
\]
are convex, and $\psi$ is the Legendre transform of $\varphi$:
\[
\psi(y) = \varphi^*(y) = \sup_x \, \langle y, x \rangle - \varphi(x).
\]
- $t(x) = \nabla\varphi(x) = x + \nabla u(x)$ is the gradient of a convex function; it is essentially injective and a.e. differentiable, and $Dt = D^2\varphi$ is positive definite.
- $\varphi$ solves the Monge-Ampère equation
\[
\det D^2\varphi(x) = \frac{f(x)}{g(\nabla\varphi(x))}.
\]
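As an illustration of the theorem, the one-dimensional Gaussian case can be worked out explicitly. The choice of $\mu = N(0,1)$ and $\nu = N(b, a^2)$ with $a > 0$ is an assumption made only for this example.

```latex
\[
\begin{aligned}
&f(x) = \tfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad
 g(y) = \tfrac{1}{a\sqrt{2\pi}}\, e^{-(y-b)^2/(2a^2)}, \qquad a > 0, \\
&\varphi(x) = \tfrac{a}{2}\, x^2 + b\,x \ \text{(convex)}, \qquad
 t(x) = \varphi'(x) = a\,x + b, \\
&u(x) = \varphi(x) - \tfrac12 x^2 = \tfrac{a-1}{2}\, x^2 + b\,x, \qquad
 x + u'(x) = a\,x + b = t(x), \\
&\psi(y) = \varphi^*(y) = \sup_x \big( x y - \tfrac{a}{2} x^2 - b x \big) = \frac{(y-b)^2}{2a}, \qquad
 v(y) = \tfrac12 y^2 - \psi(y), \\
&\varphi''(x) = a = \frac{f(x)}{g(a x + b)} \quad \text{(Monge-Amp\`ere equation)}.
\end{aligned}
\]
```

The admissibility condition $v(y) - u(x) \le \frac12 |x-y|^2$ is equivalent to Young's inequality $xy \le \varphi(x) + \psi(y)$, with equality exactly when $y = \varphi'(x) = t(x)$.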
Brenier theorem

$\mu = f\,dx$, $\nu = g\,dx$ are absolutely continuous in $\mathbb{R}^d$. The optimal coupling $\gamma \in \Gamma_o(\mu,\nu)$ is concentrated on the graph of a cyclically monotone map $t$:
\[
\gamma = (i \times t)_\# \mu, \qquad W_2^2(\mu,\nu) = \int_{\mathbb{R}^d} |x - t(x)|^2\,d\mu(x),
\]
where $i$ denotes the identity map.

[Figure: the coupling $\gamma$ on $\mathbb{R}^d \times \mathbb{R}^d$, concentrated on the graph of $t$, with marginals $\mu$ and $\nu$.]

The map $t$ can be recovered from the optimal Kantorovich potentials $u$, $v$ satisfying
\[
v(y) - u(x) \le \tfrac12 |x-y|^2, \qquad \tfrac12 W_2^2(\mu,\nu) = \int v(y)\,d\nu(y) - \int u(x)\,d\mu(x),
\]
by
\[
t(x) = x + \nabla u(x) = \nabla\Big( \tfrac12 |x|^2 + u(x) \Big), \qquad \tfrac12 |x|^2 + u(x) \ \text{is convex}.
\]
Extensions and applications

- Strictly convex costs $c(x,y) = h(|x-y|)$: Gangbo-McCann, ... ('96-)
- Monge problem $c(x,y) = |x-y|$: Sudakov ('79), Ambrosio (2000), ..., Bianchini, Champion-De Pascale, ...
- Regularity: Caffarelli, ... ('92-), Wang, Trudinger, Loeper, Villani, McCann
- Isoperimetric and functional inequalities: Gromov, Villani, Otto, McCann, Maggi, Figalli, Pratelli, ...
- Hilbert and Wiener spaces: Feyel-Ustunel, Ambrosio-Gigli-S. ('04-), ...
- Riemannian manifolds, Ricci flow: McCann, Sturm, Villani, Lott, Topping, Carfora, ... ('98-)
- ...

A distance between probability measures

The quadratic cost $c(x,y) = |x-y|^2$ induces a distance between probability measures with finite quadratic moment ($\mathcal{P}_2(\mathbb{R}^d)$): the so-called Kantorovich-Rubinstein-Wasserstein distance
\[
W_2(\mu,\nu) := \big( C(\mu,\nu) \big)^{1/2} = \Big( \min_{\gamma \in \Gamma(\mu,\nu)} \iint |x-y|^2\,d\gamma(x,y) \Big)^{1/2}.
\]
This distance has a simple interpretation in the case of discrete measures: if
\[
\mu = \frac1N \sum_{k=1}^N \delta_{x_k} \quad\text{and}\quad \nu = \frac1N \sum_{k=1}^N \delta_{y_k},
\]
then
\[
W_2^2(\mu,\nu) = \min_\sigma \frac1N \sum_{k=1}^N |x_k - y_{\sigma(k)}|^2, \qquad \sigma \ \text{a permutation of } \{1,2,\dots,N\}.
\]
$(\mathcal{P}_2(\mathbb{R}^d), W_2)$ is a complete and separable metric space, and the distance $W_2$ is associated with the weak convergence of measures:
\[
W_2(\mu_n,\mu) \to 0 \quad\Leftrightarrow\quad \int \zeta(x)\,d\mu_n(x) \to \int \zeta(x)\,d\mu(x) \ \text{ for every } \zeta \in C^0(\mathbb{R}^d) \text{ with } |\zeta(x)| \le A|x|^2 + B.
\]
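The permutation formula for $W_2^2$ between two empirical measures can be evaluated by brute force when $N$ is small. The sketch below assumes points on the real line, where the minimum is also attained by matching the sorted samples (the monotone, hence cyclically monotone, pairing); the sample values are hypothetical.

```python
from itertools import permutations

def w2_squared_bruteforce(xs, ys):
    """min over permutations sigma of (1/N) * sum_k |x_k - y_sigma(k)|^2."""
    n = len(xs)
    return min(
        sum((xs[k] - ys[s[k]]) ** 2 for k in range(n))
        for s in permutations(range(n))
    ) / n

def w2_squared_sorted(xs, ys):
    """One-dimensional shortcut: pair the k-th smallest x with the k-th smallest y."""
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)

xs = [0.0, 2.0, 5.0]   # hypothetical samples of mu
ys = [1.0, 6.0, 2.0]   # hypothetical samples of nu

print(w2_squared_bruteforce(xs, ys), w2_squared_sorted(xs, ys))
```

The sorting shortcut is specific to dimension one; in higher dimension the minimum over permutations is an assignment problem.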
Weak convergence, lower semicontinuity, and compactness

Definition (Weak convergence). A sequence $\mu_n \in \mathcal{P}(\mathbb{R}^d)$ converges weakly to $\mu \in \mathcal{P}(\mathbb{R}^d)$ if
\[
\lim_{n \to +\infty} \int_{\mathbb{R}^d} \varphi(x)\,d\mu_n(x) = \int_{\mathbb{R}^d} \varphi(x)\,d\mu(x) \qquad \forall\, \varphi \in C_b^0(\mathbb{R}^d).
\]

- Test functions $\varphi$ can be equivalently chosen in $C_c^0(\mathbb{R}^d)$ or in $C_c^\infty(\mathbb{R}^d)$, as for distributional convergence.
- If $X_n \to X$ pointwise, then $(X_n)_\# \mathbb{P} \rightharpoonup X_\# \mathbb{P}$.
- If $\zeta : \mathbb{R}^d \to [0,+\infty]$ is just lower semicontinuous (no boundedness is required) and $\mu_n \rightharpoonup \mu$, then
\[
\liminf_{n \to +\infty} \int_{\mathbb{R}^d} \zeta(x)\,d\mu_n(x) \ge \int_{\mathbb{R}^d} \zeta(x)\,d\mu(x).
\]
- Prokhorov Theorem: a set $\Gamma \subset \mathcal{P}(\mathbb{R}^d)$ is weakly relatively compact iff it is tight, i.e. for every $\varepsilon > 0$ there exists a compact set $K \Subset \mathbb{R}^d$ such that
\[
\mu(\mathbb{R}^d \setminus K) \le \varepsilon \qquad \forall\, \mu \in \Gamma.
\]
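A simple way to see weak convergence and lower semicontinuity at work is the sequence of Dirac masses $\mu_n = \delta_{1/n}$, which converges weakly to $\delta_0$. The sketch below is an illustration under that assumption; the test function `phi` and the lower semicontinuous function `zeta` are hypothetical choices.

```python
import math

def dirac_integral(phi, a):
    """Integral of phi against the Dirac mass at the point a."""
    return phi(a)

def phi(x):
    """A bounded continuous test function."""
    return math.cos(x)

def zeta(x):
    """Indicator of the open set (0, +inf): lower semicontinuous, not continuous."""
    return 1.0 if x > 0 else 0.0

ns = [1, 10, 100, 1000, 10 ** 6]

# Integrals of phi against mu_n = delta_{1/n} approach phi(0) = 1.
phi_values = [dirac_integral(phi, 1.0 / n) for n in ns]

# Integrals of zeta stay equal to 1, strictly above the limit value zeta(0) = 0.
zeta_values = [dirac_integral(zeta, 1.0 / n) for n in ns]

print(phi_values[-1], dirac_integral(phi, 0.0))
print(min(zeta_values), dirac_integral(zeta, 0.0))
```

The second pair of numbers shows that the inequality in the lower semicontinuity statement can be strict.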
Optimal couplings and triangular inequality

Lower semicontinuity and tightness: the minimum problem
\[
W_2^2(\mu_1,\mu_2) := \min\Big\{ \int_{\mathbb{R}^m \times \mathbb{R}^m} |x_1 - x_2|^2\,d\boldsymbol{\mu}(x_1,x_2) \;:\; \boldsymbol{\mu} \in \Gamma(\mu_1,\mu_2) \Big\}
\]
is attained: $\Gamma_o(\mu_1,\mu_2)$ denotes the collection (a closed, convex set) of all the optimal couplings in $\mathcal{P}_2(\mathbb{R}^m \times \mathbb{R}^m)$. In general more than one optimal coupling could exist.

Connecting a sequence of measures, disintegration and Kolmogorov theorem: if
\[
\boldsymbol{\mu}_{1,2} \in \Gamma_o(\mu_1,\mu_2),\quad \boldsymbol{\mu}_{2,3} \in \Gamma_o(\mu_2,\mu_3),\quad \dots,\quad \boldsymbol{\mu}_{j,j+1} \in \Gamma_o(\mu_j,\mu_{j+1}),
\]
then there exist a probability measure $\mathbb{P}$ and random variables $X_1, X_2, X_3, \dots, X_j, X_{j+1}, \dots$ such that
\[
\boldsymbol{\mu}_{1,2} = (X_1,X_2)_\# \mathbb{P},\quad \dots,\quad \boldsymbol{\mu}_{j,j+1} = (X_j,X_{j+1})_\# \mathbb{P}.
\]
In particular
\[
W_2^2(\mu_j,\mu_{j+1}) = \mathbb{E}\big[ |X_j - X_{j+1}|^2 \big], \qquad (X_h,X_k)_\# \mathbb{P} \in \Gamma(\mu_h,\mu_k),
\]
but the latter coupling is not optimal in general if $h$, $k$ are not consecutive.

Application: $W_2$ is a distance, triangular inequality.
\[
W_2(\mu_1,\mu_3) \le W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3):
\]
\[
\begin{aligned}
W_2(\mu_1,\mu_3) &\le \Big( \mathbb{E}\big[ |X_1 - X_3|^2 \big] \Big)^{1/2} = \Big( \mathbb{E}\big[ |(X_1 - X_2) + (X_2 - X_3)|^2 \big] \Big)^{1/2} \\
&\le \Big( \mathbb{E}\big[ |X_1 - X_2|^2 \big] \Big)^{1/2} + \Big( \mathbb{E}\big[ |X_2 - X_3|^2 \big] \Big)^{1/2} = W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3).
\end{aligned}
\]
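The triangular inequality can be observed numerically on empirical measures. The sketch below assumes three one-dimensional samples of equal size, for which $W_2$ is computed by matching sorted values; in this special case the sorted samples play the role of the random variables $X_1, X_2, X_3$ on the finite probability space $\{1,\dots,N\}$ with uniform weights. The sample parameters are hypothetical.

```python
import math
import random

def w2(xs, ys):
    """W_2 between two empirical measures on the real line with the same number of atoms."""
    n = len(xs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / n)

random.seed(0)   # fixed seed so that the experiment is reproducible
n = 50
mu1 = [random.gauss(0.0, 1.0) for _ in range(n)]
mu2 = [random.gauss(2.0, 0.5) for _ in range(n)]
mu3 = [random.uniform(-1.0, 4.0) for _ in range(n)]

direct = w2(mu1, mu3)
detour = w2(mu1, mu2) + w2(mu2, mu3)
print(direct, detour)
```

In this one-dimensional setting the inequality is Minkowski's inequality for the vectors of sorted samples, which mirrors the last step of the proof above.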
“Soft” properties

- Convergence with respect to $W_2$ $\Leftrightarrow$ weak convergence + convergence of the quadratic moments.
- Completeness (if one considers all the probability measures in $\mathcal{P}_2(\mathbb{R}^m)$).
- Lower semicontinuity with respect to weak/distributional convergence.
- Convexity (but linear segments are not geodesics!).
- Existence of (constant speed, minimizing) geodesics connecting arbitrary measures $\mu_0, \mu_1$: they are curves $\mu : t \in [0,1] \mapsto \mu_t$ such that

$$W_2(\mu_0,\mu_1) = L_0^1[\mu], \qquad W_2(\mu_s,\mu_t) = |t-s|\, W_2(\mu_0,\mu_1).$$
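As a worked illustration of the remark that linear segments are not geodesics, one can compare the two interpolations between a pair of Dirac masses. The points $a \neq b$ in $\mathbb{R}^m$ and the notation $\nu_t$ are assumptions introduced only for this example.

```latex
% Illustrative example (a, b and \nu_t are assumed, not from the slides).
\begin{align*}
  &\mu_0 = \delta_a, \quad \mu_1 = \delta_b, \quad W_2(\mu_0,\mu_1) = |a-b|. \\[4pt]
  % Displacement interpolation: the mass travels along the segment from a to b.
  &\mu_t = \delta_{(1-t)a+tb}
    \;\Longrightarrow\;
    W_2(\mu_s,\mu_t) = |t-s|\,|a-b| = |t-s|\,W_2(\mu_0,\mu_1). \\[4pt]
  % Linear segment: mass |t-s| has to be moved between a and b.
  &\nu_t = (1-t)\,\delta_a + t\,\delta_b
    \;\Longrightarrow\;
    W_2^2(\nu_s,\nu_t) = |t-s|\,|a-b|^2,
    \quad
    W_2(\nu_s,\nu_t) = |t-s|^{1/2}\,W_2(\mu_0,\mu_1).
\end{align*}
```

The displacement interpolation satisfies the constant-speed identity, while the linear segment scales like $|t-s|^{1/2}$ and therefore fails it.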
