We kindly invite you to the following BIDSA seminar:
“Differentiable Ranking and Sorting using Optimal Transport”
Marco Cuturi (Google Brain / ENSAE)
Friday, January 17, 2020 | 12:30 | Room 3-e4-sr03 (third floor)
Bocconi University, Via Roentgen 1, Milan
Abstract:
We propose in this paper proxy operators for ranking and sorting that are differentiable. To do so, we leverage the fact that sorting can be seen as a particular instance of the optimal transport (OT) problem between two univariate uniform probability measures, the first measure being supported on the family of input values of interest, the second supported on a family of increasing target values (e.g. 1,2,...,n if the input array has n elements). Building upon this link, we propose generalized rank and sort operators by considering arbitrary target measures supported on m values, where m can be smaller than n. We recover differentiable algorithms by adding to this generic OT problem an entropic regularization, and approximate its outputs using Sinkhorn iterations. To illustrate the versatility of these operators, we use the soft-rank operator to propose a new classification training loss that is a differentiable proxy of the 0/1 loss. Using the soft-sort operator, we propose a new differentiable loss for trimmed regression.
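For readers new to the idea, the construction described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the speaker's implementation. The function name soft_rank_and_sort, the squared-distance cost, the rescaling of the inputs to [0, 1], the targets placed evenly on [0, 1], and the default values of eps and n_iters are all choices made here for illustration. It uses m = n targets and uniform weights on both sides.

```python
import math

def soft_rank_and_sort(x, eps=0.05, n_iters=200):
    """Soft ranks and soft-sorted values from entropy-regularized OT.

    Hypothetical helper for illustration; needs len(x) >= 2 and
    at least two distinct input values.
    """
    n = len(x)
    lo, hi = min(x), max(x)
    # Assumption: rescale inputs to [0, 1] so eps has a comparable scale.
    z = [(xi - lo) / (hi - lo) for xi in x]
    # Increasing target values, here n evenly spaced points on [0, 1].
    y = [j / (n - 1) for j in range(n)]
    # Gibbs kernel K_ij = exp(-cost_ij / eps) with a squared-distance cost.
    K = [[math.exp(-((zi - yj) ** 2) / eps) for yj in y] for zi in z]
    a = b = 1.0 / n  # uniform weights on inputs and on targets
    u = [1.0] * n
    v = [1.0] * n
    # Sinkhorn iterations: alternately rescale to match column and row sums.
    for _ in range(n_iters):
        v = [b / sum(K[i][j] * u[i] for i in range(n)) for j in range(n)]
        u = [a / sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
    # Approximate transport plan between inputs (rows) and targets (columns).
    P = [[u[i] * K[i][j] * v[j] for j in range(n)] for i in range(n)]
    # Soft rank of x_i: average target position 1..n under row i of P.
    ranks = [sum(P[i][j] * (j + 1) for j in range(n)) / sum(P[i])
             for i in range(n)]
    # Soft-sorted value j: average input value under column j of P.
    sorted_vals = [sum(P[i][j] * x[i] for i in range(n))
                   / sum(P[i][j] for i in range(n))
                   for j in range(n)]
    return ranks, sorted_vals

ranks, sorted_vals = soft_rank_and_sort([0.3, 2.0, -1.0, 0.9])
print(ranks)        # smooth surrogates for the hard ranks 2, 4, 1, 3
print(sorted_vals)  # smooth surrogates for the sorted input values
```

Every step is built from sums, products and exponentials, so the outputs vary smoothly with the inputs. A smaller eps brings the outputs closer to the hard ranks and sorted values, at the price of slower convergence of the iterations.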
Speaker:
Marco Cuturi joined Google Brain in Paris in October 2018. He graduated from ENSAE (2001) and ENS Cachan (Master MVA, 2002), and holds a PhD in applied mathematics obtained in 2005 at Ecole des Mines de Paris. He worked as a post-doctoral researcher at the Institute of Statistical Mathematics, Tokyo, between 11/2005 and 3/2007, and in the financial industry between 4/2007 and 9/2008. After working as a lecturer at the ORFE department of Princeton University between 2/2009 and 8/2010, he was at the Graduate School of Informatics of Kyoto University between 9/2010 and 9/2016, as a tenured associate professor from 10/2013. He then joined ENSAE, the French national school for statistics and economics, in 9/2016, where he still teaches. His recent proposal to solve optimal transport using an entropic regularization has re-ignited interest in optimal transport and Wasserstein distances in the machine learning community. His work has recently focused on applying that loss function to problems involving probability distributions, e.g. topic models / dictionary learning for text and images, parametric inference for generative models, regression with a Wasserstein loss and probabilistic embeddings for words.
Best regards,
Giacomo Zanella
Please note: if you are a guest and do not have a Bocconi ID card to access the Bocconi buildings, please confirm your participation by sending an email to c.monetti(a)unibocconi.it