Rémi Flamary

Professional website



I am a Professor in the Applied Mathematics department and the CMAP Laboratory at École Polytechnique. I was previously an associate professor at Université Côte d'Azur, in the Department of Electronics and in the Lagrange Laboratory, which is part of the Observatoire de la Côte d'Azur. I was a PhD student and teaching assistant at the LITIS Laboratory, and my PhD advisor was Alain Rakotomamonjy at Rouen University.

On this website, you can find a list of my publications and download the corresponding software and code. Some of my teaching material, in French, is also available.

Research Interests

  • Machine learning and statistical signal processing
    • Classification, supervised learning
    • Kernel methods, Support Vector Machines
    • Optimization with sparsity, variable selection, mixed norms, non-convex regularization
    • Feature learning, data representation, kernel learning
    • Convolutional neural networks, filter learning, image reconstruction
    • Optimal transport, domain adaptation
  • Applications
    • Biomedical engineering, Brain-Computer Interfaces
    • Remote sensing and hyperspectral imaging
    • Energy and climate
    • Astronomical image processing

Wordcloud of my research interests.

Recent work

C. Vincent-Cuaz, R. Flamary, M. Corneli, T. Vayer, N. Courty, Template based Graph Neural Network with Optimal Transport Distances, Neural Information Processing Systems (NeurIPS), 2022.
Abstract: Current Graph Neural Networks (GNN) architectures generally rely on two important components: node features embedding through message passing, and aggregation with a specialized form of pooling. The structural (or topological) information is implicitly taken into account in these two steps. We propose in this work a novel point of view, which places distances to some learnable graph templates at the core of the graph representation. This distance embedding is constructed thanks to an optimal transport distance: the Fused Gromov-Wasserstein (FGW) distance, which encodes simultaneously feature and structure dissimilarities by solving a soft graph-matching problem. We postulate that the vector of FGW distances to a set of template graphs has a strong discriminative power, which is then fed to a non-linear classifier for final predictions. Distance embedding can be seen as a new layer, and can leverage on existing message passing techniques to promote sensible feature representations. Interestingly enough, in our work the optimal set of template graphs is also learnt in an end-to-end fashion by differentiating through this layer. After describing the corresponding learning procedure, we empirically validate our claim on several synthetic and real life graph classification datasets, where our method is competitive or surpasses kernel and GNN state-of-the-art approaches. We complete our experiments by an ablation study and a sensitivity analysis to parameters.
author = {Vincent-Cuaz, Cédric and Flamary, Rémi and Corneli, Marco and Vayer, Titouan and Courty, Nicolas},
title = {Template based Graph Neural Network with Optimal Transport Distances},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2022}
A. Thual, H. Tran, T. Zemskova, N. Courty, R. Flamary, S. Dehaene, B. Thirion, Aligning individual brains with Fused Unbalanced Gromov-Wasserstein, Neural Information Processing Systems (NeurIPS), 2022.
Abstract: Individual brains vary in both anatomy and functional organization, even within a given species. Inter-individual variability is a major impediment when trying to draw generalizable conclusions from neuroimaging data collected on groups of subjects. Current co-registration procedures rely on limited data, and thus lead to very coarse inter-subject alignments. In this work, we present a novel method for inter-subject alignment based on Optimal Transport, denoted as Fused Unbalanced Gromov Wasserstein (FUGW). The method aligns cortical surfaces based on the similarity of their functional signatures in response to a variety of stimulation settings, while penalizing large deformations of individual topographic organization. We demonstrate that FUGW is well-suited for whole-brain landmark-free alignment. The unbalanced feature allows to deal with the fact that functional areas vary in size across subjects. Our results show that FUGW alignment significantly increases between-subject correlation of activity for independent functional data, and leads to more precise mapping at the group level.
author = {Thual, Alexis and Tran, Huy and Zemskova, Tatiana and Courty, Nicolas and Flamary, Rémi and Dehaene, Stanislas and Thirion, Bertrand},
title = {Aligning individual brains with Fused Unbalanced Gromov-Wasserstein},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2022}
C. Vincent-Cuaz, R. Flamary, M. Corneli, T. Vayer, N. Courty, Semi-relaxed Gromov Wasserstein divergence with applications on graphs, International Conference on Learning Representations (ICLR), 2022.
Abstract: Comparing structured objects such as graphs is a fundamental operation involved in many learning tasks. To this end, the Gromov-Wasserstein (GW) distance, based on Optimal Transport (OT), has proven to be successful in handling the specific nature of the associated objects. More specifically, through the nodes connectivity relations, GW operates on graphs, seen as probability measures over specific spaces. At the core of OT is the idea of conservation of mass, which imposes a coupling between all the nodes from the two considered graphs. We argue in this paper that this property can be detrimental for tasks such as graph dictionary or partition learning, and we relax it by proposing a new semi-relaxed Gromov-Wasserstein divergence. Aside from immediate computational benefits, we discuss its properties, and show that it can lead to an efficient graph dictionary learning algorithm. We empirically demonstrate its relevance for complex tasks on graphs such as partitioning, clustering and completion.
author = {Vincent-Cuaz, Cédric and Flamary, Rémi and Corneli, Marco and Vayer, Titouan and Courty, Nicolas},
title = {Semi-relaxed Gromov Wasserstein divergence with applications on graphs},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2022}
L. Chapel, R. Flamary, H. Wu, C. Févotte, G. Gasso, Unbalanced Optimal Transport through Non-negative Penalized Linear Regression, Neural Information Processing Systems (NeurIPS), 2021.
Abstract: This paper addresses the problem of Unbalanced Optimal Transport (UOT) in which the marginal conditions are relaxed (using weighted penalties in lieu of equality) and no additional regularization is enforced on the OT plan. In this context, we show that the corresponding optimization problem can be reformulated as a non-negative penalized linear regression problem. This reformulation allows us to propose novel algorithms inspired from inverse problems and nonnegative matrix factorization. In particular, we consider majorization-minimization which leads in our setting to efficient multiplicative updates for a variety of penalties. Furthermore, we derive for the first time an efficient algorithm to compute the regularization path of UOT with quadratic penalties. The proposed algorithm provides a continuity of piece-wise linear OT plans converging to the solution of balanced OT (corresponding to infinite penalty weights). We perform several numerical experiments on simulated and real data illustrating the new algorithms, and provide a detailed discussion about more sophisticated optimization tools that can further be used to solve OT problems thanks to our reformulation.
author = {Chapel, Laetitia and Flamary, Rémi and Wu, Haoran and Févotte, Cédric and Gasso, Gilles},
title = {Unbalanced Optimal Transport through Non-negative Penalized Linear Regression},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2021}
K. Fatras, B. Bhushan Damodaran, S. Lobry, R. Flamary, D. Tuia, N. Courty, Wasserstein Adversarial Regularization for learning with label noise, Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2021.
Abstract: Noisy labels often occur in vision datasets, especially when they are obtained from crowdsourcing or Web scraping. We propose a new regularization method, which enables learning robust classifiers in presence of noisy data. To achieve this goal, we propose a new adversarial regularization scheme based on the Wasserstein distance. Using this distance allows taking into account specific relations between classes by leveraging the geometric properties of the labels space. Our Wasserstein Adversarial Regularization (WAR) encodes a selective regularization, which promotes smoothness of the classifier between some classes, while preserving sufficient complexity of the decision boundary between others. We first discuss how and why adversarial regularization can be used in the context of label noise and then show the effectiveness of our method on five datasets corrupted with noisy labels: in both benchmarks and real datasets, WAR outperforms the state-of-the-art competitors.
author = {Fatras, Kilian and Bhushan Damodaran, Bharath and Lobry, Sylvain and Flamary, Rémi and Tuia, Devis and Courty, Nicolas},
title = {Wasserstein Adversarial Regularization for learning with label noise},
journal = {Pattern Analysis and Machine Intelligence, IEEE Transactions on},
year = {2021}
C. Vincent-Cuaz, T. Vayer, R. Flamary, M. Corneli, N. Courty, Online Graph Dictionary Learning, International Conference on Machine Learning (ICML), 2021.
Abstract: Dictionary learning is a key tool for representation learning that explains the data as linear combination of few basic elements. Yet, this analysis is not amenable in the context of graph learning, as graphs usually belong to different metric spaces. We fill this gap by proposing a new online Graph Dictionary Learning approach, which uses the Gromov Wasserstein divergence for the data fitting term. In our work, graphs are encoded through their nodes' pairwise relations and modeled as convex combination of graph atoms, i.e. dictionary elements, estimated thanks to an online stochastic algorithm, which operates on a dataset of unregistered graphs with potentially different number of nodes. Our approach naturally extends to labeled graphs, and is completed by a novel upper bound that can be used as a fast approximation of Gromov Wasserstein in the embedding space. We provide numerical evidences showing the interest of our approach for unsupervised embedding of graph datasets and for online graph subspace estimation and tracking.
author = {Vincent-Cuaz, Cédric and Vayer, Titouan and Flamary, Rémi and Corneli, Marco and Courty, Nicolas},
title = {Online Graph Dictionary Learning},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2021}
K. Fatras, T. Séjourné, N. Courty, R. Flamary, Unbalanced minibatch Optimal Transport; applications to Domain Adaptation, International Conference on Machine Learning (ICML), 2021.
Abstract: Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behavior. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
author = {Fatras, Kilian and Séjourné, Thibault and Courty, Nicolas and Flamary, Rémi},
title = {Unbalanced minibatch Optimal Transport; applications to Domain Adaptation},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2021}
R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, L. Gautheron, N. T. Gayraud, H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, A. Tong, T. Vayer, POT: Python Optimal Transport, Journal of Machine Learning Research, Vol. 22, N. 78, pp 1-8, 2021.
Abstract: Optimal transport has recently been reintroduced to the machine learning community thanks in part to novel efficient optimization procedures allowing for medium to large scale applications. We propose a Python toolbox that implements several key optimal transport ideas for the machine learning community. The toolbox contains implementations of a number of founding works of OT for machine learning such as Sinkhorn algorithm and Wasserstein barycenters, but also provides generic solvers that can be used for conducting novel fundamental research. This toolbox, named POT for Python Optimal Transport, is open source with an MIT license.
author = {Rémi Flamary and Nicolas Courty and Alexandre Gramfort and Mokhtar Z. Alaya and Aurélie Boisbunon and Stanislas Chambon and Laetitia
  Chapel and Adrien Corenflos and Kilian Fatras and Nemo Fournier and Léo
  Gautheron and Nathalie T.H. Gayraud and Hicham Janati and Alain Rakotomamonjy
  and Ievgen Redko and Antoine Rolet and Antony Schutz and Vivien Seguy and
  Danica J. Sutherland and Romain Tavenard and Alexander Tong and Titouan
title = {POT: Python Optimal Transport},
journal = {Journal of Machine Learning Research},
volume = {22},
number = {78},
pages = {1-8},
year = {2021}


Release of version 0.8 of POT (Python Optimal Transport)


As the maintainer of the POT (Python Optimal Transport) toolbox, I am very happy to announce its new release, version 0.8. It contains several major new features:

  • OpenMP implementation for the exact OT solver.
  • Backend for solving OT problems on NumPy/PyTorch/JAX arrays (CPU or GPU).
  • Differentiable solvers for compatible backends.
  • Several new examples in the documentation.
  • Compiled wheels for ARM on Mac and Raspberry Pi.

More details are in the release notes.

Elected as an ELLIS Scholar


I am honored to have been elected as an ELLIS Scholar in the Paris ELLIS Unit. ELLIS is the European Lab for Learning and Intelligent Systems, whose aim is to promote machine learning and modern AI research in Europe.

Optimal Transport for Machine Learning Workshop at NeurIPS 2021


We are organizing the fourth OTML Workshop at NeurIPS 2021, on 13 December 2021, with Jason Altschuler, Charlotte Bunne, Laetitia Chapel, Alexandra Suvorikova, Marco Cuturi and Gabriel Peyré.

Plenary Speakers
Keynote Talks

This workshop is organized with the following partners: ELLIS, 3IA Côte d'Azur, Prairie Institute.