Rémi Flamary

Professional website

Home

I am a Professor in the Department of Applied Mathematics and at the CMAP laboratory of École Polytechnique. I was previously an Associate Professor (Maître de Conférences) at Université Côte d'Azur, in the Department of Electronics and at the Lagrange Laboratory of the Observatoire de la Côte d'Azur. I completed my PhD under the supervision of Alain Rakotomamonjy at the Université de Rouen and the LITIS laboratory.

On this website you will find a list of my publications, teaching and presentation materials, as well as various software and source code.

Research interests

  • Machine learning and statistical signal processing
    • Optimal Transport for machine learning, OT on graphs
    • Domain adaptation, transfer and multi-task learning, dataset shift
    • Optimization with variable selection, mixed norms, non-convex regularization
    • Representation learning, deep learning
  • Applications of machine learning
    • Biomedical signal processing, Brain-Computer Interfaces
    • Remote sensing and hyperspectral imaging
    • Energy and climate
    • Astrophysical image processing

Recent work

P. Krzakala, J. Yang, R. Flamary, F. d'Alché-Buc, C. Laclau, M. Labeau, Any2Graph: Deep End-To-End Supervised Graph Prediction With An Optimal Transport Loss, Neural Information Processing Systems (NeurIPS), 2024.
Abstract: We propose Any2graph, a generic framework for end-to-end Supervised Graph Prediction (SGP) i.e. a deep learning model that predicts an entire graph for any kind of input. The framework is built on a novel Optimal Transport loss, the Partially-Masked Fused Gromov-Wasserstein, that exhibits all necessary properties (permutation invariance, differentiability and scalability) and is designed to handle any-sized graphs. Numerical experiments showcase the versatility of the approach that outperform existing competitors on a novel challenging synthetic dataset and a variety of real-world tasks such as map construction from satellite image (Sat2Graph) or molecule prediction from fingerprint (Fingerprint2Graph).
BibTeX:
@inproceedings{krzakala2024endtoend,
author = {Paul Krzakala and Junjie Yang and Rémi Flamary and Florence d'Alché-Buc and Charlotte Laclau and Matthieu Labeau},
title = {Any2Graph: Deep End-To-End Supervised Graph Prediction With An Optimal Transport Loss},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2024}
}
T. Gnassounou, R. Flamary, A. Gramfort, Convolutional Monge Mapping Normalization for learning on biosignals, Neural Information Processing Systems (NeurIPS), 2023.
Abstract: In many machine learning applications on signals and biomedical data, especially electroencephalogram (EEG), one major challenge is the variability of the data across subjects, sessions, and hardware devices. In this work, we propose a new method called Convolutional Monge Mapping Normalization (CMMN), which consists in filtering the signals in order to adapt their power spectrum density (PSD) to a Wasserstein barycenter estimated on training data. CMMN relies on novel closed-form solutions for optimal transport mappings and barycenters and provides individual test time adaptation to new data without needing to retrain a prediction model. Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture when adapting between subjects, sessions, and even datasets collected with different hardware. Notably our performance gain is on par with much more numerically intensive Domain Adaptation (DA) methods and can be used in conjunction with those for even better performances.
BibTeX:
@inproceedings{gnassounou2023convolutional,
author = {Gnassounou, Théo and Flamary, Rémi and Gramfort, Alexandre},
title = {Convolutional Monge Mapping Normalization for learning on biosignals},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2023}
}
H. Van Assel, T. Vayer, R. Flamary, N. Courty, SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities, Neural Information Processing Systems (NeurIPS), 2023.
Abstract: Many approaches in machine learning rely on a weighted graph to encode the similarities between samples in a dataset. Entropic affinities (EAs), which are notably used in the popular Dimensionality Reduction (DR) algorithm t-SNE, are particular instances of such graphs. To ensure robustness to heterogeneous sampling densities, EAs assign a kernel bandwidth parameter to every sample in such a way that the entropy of each row in the affinity matrix is kept constant at a specific value, whose exponential is known as perplexity. EAs are inherently asymmetric and row-wise stochastic, but they are used in DR approaches after undergoing heuristic symmetrization methods that violate both the row-wise constant entropy and stochasticity properties. In this work, we uncover a novel characterization of EA as an optimal transport problem, allowing a natural symmetrization that can be computed efficiently using dual ascent. The corresponding novel affinity matrix derives advantages from symmetric doubly stochastic normalization in terms of clustering performance, while also effectively controlling the entropy of each row thus making it particularly robust to varying noise levels. Following, we present a new DR algorithm, SNEkhorn, that leverages this new affinity matrix. We show its clear superiority to state-of-the-art approaches with several indicators on both synthetic and real-world datasets.
BibTeX:
@inproceedings{van2023snekhorn,
author = {Van Assel, Hugues and Vayer, Titouan and Flamary, Rémi and Courty, Nicolas},
title = {SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2023}
}
C. Vincent-Cuaz, R. Flamary, M. Corneli, T. Vayer, N. Courty, Template based Graph Neural Network with Optimal Transport Distances, Neural Information Processing Systems (NeurIPS), 2022.
Abstract: Current Graph Neural Networks (GNN) architectures generally rely on two important components: node features embedding through message passing, and aggregation with a specialized form of pooling. The structural (or topological) information is implicitly taken into account in these two steps. We propose in this work a novel point of view, which places distances to some learnable graph templates at the core of the graph representation. This distance embedding is constructed thanks to an optimal transport distance: the Fused Gromov-Wasserstein (FGW) distance, which encodes simultaneously feature and structure dissimilarities by solving a soft graph-matching problem. We postulate that the vector of FGW distances to a set of template graphs has a strong discriminative power, which is then fed to a non-linear classifier for final predictions. Distance embedding can be seen as a new layer, and can leverage on existing message passing techniques to promote sensible feature representations. Interestingly enough, in our work the optimal set of template graphs is also learnt in an end-to-end fashion by differentiating through this layer. After describing the corresponding learning procedure, we empirically validate our claim on several synthetic and real life graph classification datasets, where our method is competitive or surpasses kernel and GNN state-of-the-art approaches. We complete our experiments by an ablation study and a sensitivity analysis to parameters.
BibTeX:
@inproceedings{vincentcuaz2022template,
author = {Vincent-Cuaz, Cédric and Flamary, Rémi and Corneli, Marco and Vayer, Titouan and Courty, Nicolas},
title = {Template based Graph Neural Network with Optimal Transport Distances},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2022}
}
A. Thual, H. Tran, T. Zemskova, N. Courty, R. Flamary, S. Dehaene, B. Thirion, Aligning individual brains with Fused Unbalanced Gromov-Wasserstein, Neural Information Processing Systems (NeurIPS), 2022.
Abstract: Individual brains vary in both anatomy and functional organization, even within a given species. Inter-individual variability is a major impediment when trying to draw generalizable conclusions from neuroimaging data collected on groups of subjects. Current co-registration procedures rely on limited data, and thus lead to very coarse inter-subject alignments. In this work, we present a novel method for inter-subject alignment based on Optimal Transport, denoted as Fused Unbalanced Gromov Wasserstein (FUGW). The method aligns cortical surfaces based on the similarity of their functional signatures in response to a variety of stimulation settings, while penalizing large deformations of individual topographic organization. We demonstrate that FUGW is well-suited for whole-brain landmark-free alignment. The unbalanced feature allows to deal with the fact that functional areas vary in size across subjects. Our results show that FUGW alignment significantly increases between-subject correlation of activity for independent functional data, and leads to more precise mapping at the group level.
BibTeX:
@inproceedings{thual2022aligning,
author = {Thual, Alexis and Tran, Huy and Zemskova, Tatiana and Courty, Nicolas and Flamary, Rémi and Dehaene, Stanislas and Thirion, Bertrand},
title = {Aligning individual brains with Fused Unbalanced Gromov-Wasserstein},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2022}
}
C. Vincent-Cuaz, R. Flamary, M. Corneli, T. Vayer, N. Courty, Semi-relaxed Gromov Wasserstein divergence with applications on graphs, International Conference on Learning Representations (ICLR), 2022.
Abstract: Comparing structured objects such as graphs is a fundamental operation involved in many learning tasks. To this end, the Gromov-Wasserstein (GW) distance, based on Optimal Transport (OT), has proven to be successful in handling the specific nature of the associated objects. More specifically, through the nodes connectivity relations, GW operates on graphs, seen as probability measures over specific spaces. At the core of OT is the idea of conservation of mass, which imposes a coupling between all the nodes from the two considered graphs. We argue in this paper that this property can be detrimental for tasks such as graph dictionary or partition learning, and we relax it by proposing a new semi-relaxed Gromov-Wasserstein divergence. Aside from immediate computational benefits, we discuss its properties, and show that it can lead to an efficient graph dictionary learning algorithm. We empirically demonstrate its relevance for complex tasks on graphs such as partitioning, clustering and completion.
BibTeX:
@inproceedings{vincent2022semi,
author = {Vincent-Cuaz, Cédric and Flamary, Rémi and Corneli, Marco and Vayer, Titouan and Courty, Nicolas},
title = {Semi-relaxed Gromov Wasserstein divergence with applications on graphs},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2022}
}
L. Chapel, R. Flamary, H. Wu, C. Févotte, G. Gasso, Unbalanced Optimal Transport through Non-negative Penalized Linear Regression, Neural Information Processing Systems (NeurIPS), 2021.
Abstract: This paper addresses the problem of Unbalanced Optimal Transport (UOT) in which the marginal conditions are relaxed (using weighted penalties in lieu of equality) and no additional regularization is enforced on the OT plan. In this context, we show that the corresponding optimization problem can be reformulated as a non-negative penalized linear regression problem. This reformulation allows us to propose novel algorithms inspired from inverse problems and nonnegative matrix factorization. In particular, we consider majorization-minimization which leads in our setting to efficient multiplicative updates for a variety of penalties. Furthermore, we derive for the first time an efficient algorithm to compute the regularization path of UOT with quadratic penalties. The proposed algorithm provides a continuity of piece-wise linear OT plans converging to the solution of balanced OT (corresponding to infinite penalty weights). We perform several numerical experiments on simulated and real data illustrating the new algorithms, and provide a detailed discussion about more sophisticated optimization tools that can further be used to solve OT problems thanks to our reformulation.
BibTeX:
@inproceedings{chapel2021unbalanced,
author = {Chapel, Laetitia and Flamary, Rémi and Wu, Haoran and Févotte, Cédric and Gasso, Gilles},
title = {Unbalanced Optimal Transport through Non-negative Penalized Linear Regression},
booktitle = {Neural Information Processing Systems (NeurIPS)},
year = {2021}
}
K. Fatras, B. Bhushan Damodaran, S. Lobry, R. Flamary, D. Tuia, N. Courty, Wasserstein Adversarial Regularization for learning with label noise, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
Abstract: Noisy labels often occur in vision datasets, especially when they are obtained from crowdsourcing or Web scraping. We propose a new regularization method, which enables learning robust classifiers in presence of noisy data. To achieve this goal, we propose a new adversarial regularization scheme based on the Wasserstein distance. Using this distance allows taking into account specific relations between classes by leveraging the geometric properties of the labels space. Our Wasserstein Adversarial Regularization (WAR) encodes a selective regularization, which promotes smoothness of the classifier between some classes, while preserving sufficient complexity of the decision boundary between others. We first discuss how and why adversarial regularization can be used in the context of label noise and then show the effectiveness of our method on five datasets corrupted with noisy labels: in both benchmarks and real datasets, WAR outperforms the state-of-the-art competitors.
BibTeX:
@article{damodaran2021wasserstein,
author = {Fatras, Kilian and Bhushan Damodaran, Bharath and Lobry, Sylvain and Flamary, Rémi and Tuia, Devis and Courty, Nicolas},
title = {Wasserstein Adversarial Regularization for learning with label noise},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
year = {2021}
}
C. Vincent-Cuaz, T. Vayer, R. Flamary, M. Corneli, N. Courty, Online Graph Dictionary Learning, International Conference on Machine Learning (ICML), 2021.
Abstract: Dictionary learning is a key tool for representation learning that explains the data as linear combination of few basic elements. Yet, this analysis is not amenable in the context of graph learning, as graphs usually belong to different metric spaces. We fill this gap by proposing a new online Graph Dictionary Learning approach, which uses the Gromov Wasserstein divergence for the data fitting term. In our work, graphs are encoded through their nodes' pairwise relations and modeled as convex combination of graph atoms, i.e. dictionary elements, estimated thanks to an online stochastic algorithm, which operates on a dataset of unregistered graphs with potentially different number of nodes. Our approach naturally extends to labeled graphs, and is completed by a novel upper bound that can be used as a fast approximation of Gromov Wasserstein in the embedding space. We provide numerical evidences showing the interest of our approach for unsupervised embedding of graph datasets and for online graph subspace estimation and tracking.
BibTeX:
@inproceedings{vincent2021online,
author = {Vincent-Cuaz, Cédric and Vayer, Titouan and Flamary, Rémi and Corneli, Marco and Courty, Nicolas},
title = {Online Graph Dictionary Learning},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2021}
}
K. Fatras, T. Séjourné, N. Courty, R. Flamary, Unbalanced minibatch Optimal Transport; applications to Domain Adaptation, International Conference on Machine Learning (ICML), 2021.
Abstract: Optimal transport distances have found many applications in machine learning for their capacity to compare non-parametric probability distributions. Yet their algorithmic complexity generally prevents their direct use on large scale datasets. Among the possible strategies to alleviate this issue, practitioners can rely on computing estimates of these distances over subsets of data, i.e. minibatches. While computationally appealing, we highlight in this paper some limits of this strategy, arguing it can lead to undesirable smoothing effects. As an alternative, we suggest that the same minibatch strategy coupled with unbalanced optimal transport can yield more robust behavior. We discuss the associated theoretical properties, such as unbiased estimators, existence of gradients and concentration bounds. Our experimental study shows that in challenging problems associated to domain adaptation, the use of unbalanced optimal transport leads to significantly better results, competing with or surpassing recent baselines.
BibTeX:
@inproceedings{fatras2021unbalanced,
author = {Fatras, Kilian and Séjourné, Thibault and Courty, Nicolas and Flamary, Rémi},
title = {Unbalanced minibatch Optimal Transport; applications to Domain Adaptation},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2021}
}
R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, L. Gautheron, N. T. Gayraud, H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, A. Tong, T. Vayer, POT: Python Optimal Transport, Journal of Machine Learning Research, Vol. 22, N. 78, pp 1-8, 2021.
Abstract: Optimal transport has recently been reintroduced to the machine learning community thanks in part to novel efficient optimization procedures allowing for medium to large scale applications. We propose a Python toolbox that implements several key optimal transport ideas for the machine learning community. The toolbox contains implementations of a number of founding works of OT for machine learning such as Sinkhorn algorithm and Wasserstein barycenters, but also provides generic solvers that can be used for conducting novel fundamental research. This toolbox, named POT for Python Optimal Transport, is open source with an MIT license.
BibTeX:
@article{flamary2021pot,
author = {Rémi Flamary and Nicolas Courty and Alexandre Gramfort and Mokhtar Z. Alaya and Aurélie Boisbunon and Stanislas Chambon and Laetitia
  Chapel and Adrien Corenflos and Kilian Fatras and Nemo Fournier and Léo
  Gautheron and Nathalie T.H. Gayraud and Hicham Janati and Alain Rakotomamonjy
  and Ievgen Redko and Antoine Rolet and Antony Schutz and Vivien Seguy and
  Danica J. Sutherland and Romain Tavenard and Alexander Tong and Titouan
  Vayer},
title = {POT: Python Optimal Transport},
journal = {Journal of Machine Learning Research},
volume = {22},
number = {78},
pages = {1-8},
year = {2021}
}

News

NeurIPS 2023

2023-12-01

I will be attending NeurIPS 2023 in New Orleans. Together with my wonderful co-authors, I will present two posters, and I am an invited speaker at the Optimal Transport for Machine Learning (OTML) workshop.

Feel free to come and see me and my collaborators at our posters or during the OTML workshop (we have posters there as well).

T. Gnassounou, R. Flamary, A. Gramfort, Convolutional Monge Mapping Normalization for learning on biosignals, Neural Information Processing Systems (NeurIPS), 2023.
H. Van Assel, T. Vayer, R. Flamary, N. Courty, SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities, Neural Information Processing Systems (NeurIPS), 2023.

The least-effort strategy for teaching machines (La stratégie du moindre effort pour apprendre aux machines)

2023-04-12

On March 13, 2023, Gabriel Peyré and I gave a talk for a general audience at Sorbonne Université (Jussieu), in which we discussed the use of optimal transport and the least-effort principle in artificial intelligence applications.

I am making the slides available, along with the link to the video on the website of the Société Mathématique de France.

Oral presentation at NeurIPS 2022

2022-11-20

Cédric Vincent-Cuaz's PhD work on Optimal Transport for graph neural networks has been accepted for a highly selective oral presentation at NeurIPS 2022.

Cédric and I will be in New Orleans for NeurIPS. Feel free to come and see us at our poster.

C. Vincent-Cuaz, R. Flamary, M. Corneli, T. Vayer, N. Courty, Template based Graph Neural Network with Optimal Transport Distances, Neural Information Processing Systems (NeurIPS), 2022.

Optimal Transport for Machine Learning tutorial at Hi! Paris Summer School 2022

2022-06-15

I will give a tutorial on Optimal Transport for Machine Learning at the Hi! Paris Summer School 2022 on July 4, 2022 at École Polytechnique in Paris/Saclay, France.

The slides are available below (in English):

  • Part 1: Intro to Optimal Transport [PDF].
  • Part 2: Optimal Transport for Machine Learning [PDF].