{ "id": "1905.11885", "version": "v1", "published": "2019-05-28T15:28:07.000Z", "updated": "2019-05-28T15:28:07.000Z", "title": "Differentiable Sorting using Optimal Transport:The Sinkhorn CDF and Quantile Operator", "authors": [ "Marco Cuturi", "Olivier Teboul", "Jean-Philippe Vert" ], "categories": [ "cs.LG", "stat.ML" ], "abstract": "Sorting an array is a fundamental routine in machine learning, one that is used to compute rank-based statistics, cumulative distribution functions (CDFs), quantiles, or to select closest neighbors and labels. The sorting function is however piece-wise constant (the sorting permutation of a vector does not change if the entries of that vector are infinitesimally perturbed) and therefore has no gradient information to back-propagate. We propose a framework to sort elements that is algorithmically differentiable. We leverage the fact that sorting can be seen as a particular instance of the optimal transport (OT) problem on $\\mathbb{R}$, from input values to a predefined array of sorted values (e.g. $1,2,\\dots,n$ if the input array has $n$ elements). Building upon this link , we propose generalized CDFs and quantile operators by varying the size and weights of the target presorted array. Because this amounts to using the so-called Kantorovich formulation of OT, we call these quantities K-sorts, K-CDFs and K-quantiles. We recover differentiable algorithms by adding to the OT problem an entropic regularization, and approximate it using a few Sinkhorn iterations. We call these operators S-sorts, S-CDFs and S-quantiles, and use them in various learning settings: we benchmark them against the recently proposed neuralsort [Grover et al. 2019], propose applications to quantile regression and introduce differentiable formulations of the top-k accuracy that deliver state-of-the art performance.", "revisions": [ { "version": "v1", "updated": "2019-05-28T15:28:07.000Z" } ], "analyses": { "keywords": [ "optimal transport", "quantile operator", "sinkhorn cdf", "differentiable sorting", "deliver state-of-the art performance" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }