Rémi Flamary

Professional website

Discrete optimal transport

Short introduction to optimal transport (OT)

Data and definitions

OT aims at computing a minimal cost transportation between a source probability distribution 409241564a05494fd1a2d09cc061b9c8, and a target distribution e77a5e539e2d8f28add78ae5799a2e96. Here we consider the case when those distributions are only available through a finite number of samples, and can be written as

where 5deb4921eb077a641138b33e1f710b07 is the Dirac at location 602f309090abc26e1916129553aa0767. a0fd2b89929b5a31d60c7a163983cd9d and 1925160f84a2c9f3afffdfbe492e12ce are probability masses associated to the 865c0c0b4ab0e063e5caa3387c1a8741-th sample, belonging to the probability simplex, i.e. 0690f4d621738a8829bdd1a53e60e6f6. The source and target samples are in matrices 8700e03c1f59c30db807ff90c7d87fca and e5e3a02dca6b421512608a496504316c, respectively. The set of probabilistic couplings between these two distributions is then the set of doubly stochastic matrices d55bf8ca0f4d710e3d3fb9710444e0f9 defined as
where 32fa3dd9ba6571105c35ed36d9d12ae6 is a d-dimensional vector of ones.

Optimisation problem

The Kantorovitch formulation of the optimal transport is:

where e756e0fe9a70065375e5dca452382352 is the Frobenius dot product and 586cf4d1c3cfed351e86df55aa52c011 is the cost function matrix of term d5cd3e2e37dd0ee47dc82b6b414d756a related to the energy needed to move a probability mass from 8effc81c662ebe225ce08e2ff7db735f to 8a0eeb567b2b38569f9b1c785d245ca5. In our setting, this cost is chosen as the Euclidian distance between the two locations, i.e. a102136ef2cd9153a783323a1ef3a904, but other types of metric could be considered, such as Riemannian distances over a manifold.

One can also add a regularization term to the optimal transport. In this case the problem becomes

where 34b3dd1be1c4c572f94faced941f0a78 is a regularization term that can be for instance an entropy-based regularization such as
This term has been recently proposed by Cuturi [2] as a way to regularize the transport and the optimization problem is much more easy to solve thanks to an efficient algorithm called Sinkhorn-Knopp.


When the optimal transportation plan 5e715b19eab571fe23d528bd65ac0730 has been estimated,one can interpolate the distributions from the source to the target. The natural interpolation corresponds to a displacement of the diracs from the source to target distribution

where 22bafd6a4d2255a6c8b96ac73b92ea79 is the parametrizes the transport from the source distribution 409241564a05494fd1a2d09cc061b9c8 (when 3e8f7b0adf6d7024b951f29a18225e4a)to the target distribution e77a5e539e2d8f28add78ae5799a2e96 (when b73c3280b6f85a6ac520af103083f535 ).

One can also transport the source distribution by expressing the new transported distribution 851164f02a3f6bac9141b71bbf7fb67d as linear combination of elements from 9fd76ecf97aed76986a43378ab044da9. The transportation of the source samples ceaeb88542f10848ab62713a48e51c31 onto that target samples 9fd76ecf97aed76986a43378ab044da9 is computed with:

The interpolation is then expressed as

Those two interpolations have been used in numerous publisations and are illustrated in the following.


We illustrate the OT problem solution and resulting interpolation for a simples 2D dataset in the following. In this particular case the number of source and target samples are equal and the probability masses are all equal to .

Data set

data data

On the left, we plot the 2D position of the toy example, when the mouse pointer is on the image, you can see the index of each point. On the right, the corresponding cost matrix 05d05e751a80db7375eae13c25f0ca13, the xias correspond to the index of the samples on the right.

Problem solution

result_plot result


interp_dirac interp_bary
Optimisation problem Interpolation time t in [0,1]
No regul. (LP)
Entropic regul. (Sinkhorn)
0 1

This section illusttrate the two interpolation procedures (diracs and cernter of mass) for two different regularization scheme. When no regularization is used, the solution is a permutation matrix and the mass is tranfered from a unique source sample to a unique target sample. When an entropic regularization is used, the transportation matrix is not sparse anymore and the mass is spread onto several target samples. Use the controls at the end of the page to change the optimal transport regularization and interpolation time 22bafd6a4d2255a6c8b96ac73b92ea79.


A complete reference to Optimal Transport has been writtten by C. Villani [1] and is available on the web. The entropic regularization has been proposed in the machine learning community by M. Cuturi [2]. Other regularization such as Laplacian regularization have been proposed in [3].

[1] C. Villani. Optimal transport: old and new. Grundlehren der mathematischen Wissenschaften. Springer, 2009.

[2] M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transportation. In NIPS, pages 2292–2300. 2013.

[3] S. Ferradans, N. Papadakis, J. Rabin, G. Peyré, and J.-F. Aujol. Regularized discrete optimal transport. In Scale Space and Variational Methods in Computer Vision, SSVM, pages 428–439, 2013.