Class Regularized Optimal Transport
Old implementation
This code is an old implementation and is kept here for historical reasons. We strongly recommend using the POT Toolbox for domain adaptation instead.
Description
Python toolbox for computing optimal transport with class regularization.
This is the code that has been used for the numerical experiments in the paper:
N. Courty, R. Flamary, D. Tuia, Domain adaptation with regularized optimal transport, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), 2014.
Abstract: We present a new and original method to solve the domain adaptation problem using optimal transport. By searching for the best transportation plan between the probability distribution functions of a source and a target domain, a non-linear and invertible transformation of the learning samples can be estimated. Any standard machine learning method can then be applied on the transformed set, which makes our method very generic. We propose a new optimal transport algorithm that incorporates label information in the optimization: this is achieved by combining an efficient matrix scaling technique with a majorization of a non-convex regularization term. By using the proposed optimal transport with label regularization, we obtain a significant increase in performance compared to the original transport solution. The proposed algorithm is computationally efficient and effective, as illustrated by its evaluation on a toy example and a challenging real-life vision dataset, against which it achieves competitive results with respect to state-of-the-art methods.
BibTeX:
@inproceedings{courty2014domain,
  author    = {Courty, N. and Flamary, R. and Tuia, D.},
  title     = {Domain adaptation with regularized optimal transport},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)},
  year      = {2014}
}
Solvers
We provide regularized optimal transport solvers of the form:
$$
\begin{equation*}
\min_{\boldsymbol{\gamma}\in {\bf \mathcal{P}}} \quad\quad \left < \mathbf{M},\boldsymbol{\gamma} \right >_F \quad +\quad \Omega(\boldsymbol{\gamma})
\end{equation*}
$$
where ${\bf \mathcal{P}}=\{\boldsymbol{\gamma} \;|\; \boldsymbol{\gamma}\mathbf{1}=\boldsymbol{\mu}_t,\; \boldsymbol{\gamma}^\top\mathbf{1}=\boldsymbol{\mu}_s,\; \boldsymbol{\gamma}\geq 0\}$ is the convex set of matrices satisfying the marginal constraints $\boldsymbol{\mu}_s$ and $\boldsymbol{\mu}_t$, and $\mathbf{M}$ is the transportation cost matrix.
The regularization term $\Omega(\boldsymbol{\gamma})$ can be one of the following (minimal code sketches are given after this list):
- Classic LP transport: $\quad\Omega(\boldsymbol{\gamma})=0$
- Sinkhorn regularization: $\quad\Omega(\boldsymbol{\gamma})=\frac{1}{\lambda}\sum_{i,j} \gamma_{i,j} \log \gamma_{i,j}$
- Sinkhorn + class regularization: $\quad\Omega(\boldsymbol{\gamma})=\frac{1}{\lambda}\sum_{i,j} \gamma_{i,j} \log \gamma_{i,j} + \eta \sum_j \sum_c \| \gamma_{\mathcal{I}_c,j}\|_q^p$, where $\mathcal{I}_c$ contains the indices of the samples with class label $c$
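For the entropic (Sinkhorn) regularization, the solver boils down to a simple matrix-scaling loop. The snippet below is a minimal NumPy sketch given for illustration only, not the code in transport.py; the function name, the fixed number of iterations, and the argument names (a for the row marginal, b for the column marginal, M, lam) are assumptions.

```python
import numpy as np

def sinkhorn_sketch(a, b, M, lam, n_iter=1000):
    """Entropy-regularized OT via Sinkhorn-Knopp matrix scaling (illustrative sketch).

    a   : (n,) marginal enforced on the rows of the plan
    b   : (m,) marginal enforced on its columns
    M   : (n, m) transportation cost matrix
    lam : regularization parameter (the entropy term is weighted by 1/lam)
    Returns the transport plan gamma of shape (n, m).
    """
    K = np.exp(-lam * M)               # element-wise Gibbs kernel
    u = np.ones(len(a))
    for _ in range(n_iter):
        v = b / (K.T @ u)              # rescale so that column sums match b
        u = a / (K @ v)                # rescale so that row sums match a
    return u[:, None] * K * v[None, :]
```

With the constraint set written above, a would correspond to the marginal enforced on the rows of $\boldsymbol{\gamma}$ and b to the one enforced on its columns.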
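For the class regularization, the non-convex group-norm term is handled by majorization: at each outer iteration the group term is replaced by a linear upper bound, which amounts to re-running Sinkhorn on an adjusted cost $\mathbf{M} + \eta \mathbf{W}$. The rough sketch below reuses sinkhorn_sketch from the previous block and only illustrates this majorization-minimization idea; the choice p = 0.5 with q = 1, the stabilization constant, the loop lengths, and all names are illustrative assumptions, not the toolbox's actual implementation.

```python
def class_regularized_sketch(a, b, M, labels, lam, eta,
                             p=0.5, n_outer=10, eps=1e-12):
    """Majorization-minimization sketch for the class-regularized solver.

    labels : (n,) class label of each row sample (defines the groups I_c).
    Each outer step linearizes the group-norm term around the current plan
    and calls the entropic solver on the adjusted cost M + eta * W.
    """
    W = np.zeros_like(M)
    gamma = sinkhorn_sketch(a, b, M, lam)           # plain Sinkhorn as initialization
    for _ in range(n_outer):
        for c in np.unique(labels):
            idx = labels == c
            # ||gamma_{I_c, j}||_1 for every column j (q = 1 case)
            group_mass = gamma[idx, :].sum(axis=0)
            # gradient of (group_mass)**p, shared by all rows of class c
            W[idx, :] = p * (group_mass + eps) ** (p - 1)
        gamma = sinkhorn_sketch(a, b, M + eta * W, lam)
    return gamma
```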
Installation
Python dependencies
- NumPy, Matplotlib, SciPy
- cvxopt
Short documentation
- transport.py: Python module containing all the optimal transport solvers.
Entry points:
- example_visu_tranport.py: script illustrating the different OT approaches
- run_vision_dataset: loop comparing the domain adaptation approaches on the computer vision dataset
Acknowledgments
- Marco Cuturi for providing us with the MATLAB version of the Sinkhorn algorithm.