{ "id": "1303.3235", "version": "v2", "published": "2013-03-13T17:56:28.000Z", "updated": "2015-03-22T17:44:37.000Z", "title": "On the Entropy of Couplings", "authors": [ "Mladen Kovačević", "Ivan Stanojević", "Vojin Šenk" ], "comment": "18 pages (single-column). Compared to v1, the material is reorganized, Section IV.C is removed (the results will possibly appear elsewhere), Propositions 3.2 and 3.7 are added. Accepted for publication in Information and Computation", "categories": [ "cs.IT", "math.IT" ], "abstract": "In this paper, some general properties of Shannon information measures are investigated over sets of probability distributions with restricted marginals. Certain optimization problems associated with these functionals are shown to be NP-hard, and their special cases are found to be essentially information-theoretic restatements of well-known computational problems, such as the SUBSET SUM and the 3-PARTITION problems. The notion of minimum entropy coupling is introduced and its relevance is demonstrated in information-theoretic, computational, and statistical contexts. Finally, a family of pseudometrics (on the space of discrete probability distributions) defined by these couplings is studied, in particular their relation to the total variation distance, and a new characterization of the conditional entropy is given.", "revisions": [ { "version": "v1", "updated": "2013-03-13T17:56:28.000Z", "abstract": "This paper studies bivariate distributions with fixed marginals from an information-theoretic perspective. In particular, continuity and related properties of various information measures (Shannon entropy, conditional entropy, mutual information, R\\'enyi entropy) on the set of all such distributions are investigated. The notion of minimum entropy coupling is introduced, and it is shown that it defines a family of (pseudo)metrics on the space of all probability distributions in the same way as the so-called maximal coupling defines the total variation distance. Some basic properties of these pseudometrics are established, in particular their relation to the total variation distance, and a new characterization of the conditional entropy is given. Finally, some natural optimization problems associated with the above information measures are identified and shown to be NP-hard. Their special cases are found to be essentially information-theoretic restatements of well-known computational problems, such as the \\textsc{Subset sum} and \\textsc{Partition} problems.", "comment": "11 pages (double-column)", "journal": null, "doi": null }, { "version": "v2", "updated": "2015-03-22T17:44:37.000Z" } ], "analyses": { "subjects": [ "94A17", "60E99", "68Q17" ], "keywords": [ "total variation distance", "paper studies bivariate distributions", "conditional entropy", "information measures", "natural optimization problems" ], "note": { "typesetting": "TeX", "pages": 18, "language": "en", "license": "arXiv", "status": "editable", "adsabs": "2013arXiv1303.3235K" } } }