arXiv:2305.05098 Abstract | arXiv Analytics

arXiv:2305.05098 [cs.LG]

Who Needs Decoders? Efficient Estimation of Sequence-level Attributes

Yassir Fathullah, Puria Radmard, Adian Liusie, Mark J. F. Gales

Published 2023-05-09, Version 1

State-of-the-art sequence-to-sequence models often require autoregressive decoding, which can be highly expensive. However, for some downstream tasks such as out-of-distribution (OOD) detection and resource allocation, the actual decoding output is not needed, just a scalar attribute of this sequence. In these scenarios, where, for example, knowing the quality of a system's output (to predict poor performance) matters more than knowing the output itself, is it possible to bypass autoregressive decoding? We propose Non-Autoregressive Proxy (NAP) models that can efficiently predict general scalar-valued sequence-level attributes. Importantly, NAPs predict these metrics directly from the encodings, avoiding the expensive autoregressive decoding stage. We consider two sequence-to-sequence tasks: Machine Translation (MT) and Automatic Speech Recognition (ASR). In OOD detection for MT, NAPs outperform a deep ensemble while being significantly faster. NAPs are also shown to be able to predict performance metrics such as BERTScore (MT) or word error rate (ASR). For downstream tasks, such as data filtering and resource optimization, NAPs generate performance predictions that outperform predictive uncertainty while being highly efficient at inference.
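As a rough illustration of what "predicting a scalar attribute directly from the encodings" could look like, here is a minimal sketch in plain Python. It is not the authors' architecture, which the abstract does not specify. The mean pooling, the linear-plus-sigmoid head, and the names `mean_pool`, `predict_attribute`, `weights`, and `bias` are all assumptions made for illustration, and the random vectors merely stand in for real encoder outputs.

```python
import math
import random


def mean_pool(encodings):
    """Average a sequence of encoder state vectors (T x D) into one D-vector."""
    length = len(encodings)
    dim = len(encodings[0])
    return [sum(step[d] for step in encodings) / length for d in range(dim)]


def predict_attribute(encodings, weights, bias):
    """Hypothetical proxy head: pooled encoding -> scalar score in (0, 1).

    No decoder is run; the score depends only on the encoder states.
    """
    pooled = mean_pool(encodings)
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy stand-in for the encoder output of one input: 6 time steps, 4 dimensions.
rng = random.Random(0)
encodings = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(6)]

# Illustrative (untrained) head parameters; in practice these would be
# learned by regressing onto the attribute of interest.
weights = [0.5, -0.25, 0.1, 0.0]
bias = 0.0

score = predict_attribute(encodings, weights, bias)
print(f"predicted attribute: {score:.3f}")
```

In a setting like the data filtering mentioned above, such a score could be compared against a threshold to decide whether an input is worth passing to the full autoregressive system at all.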

Categories: cs.LG, cs.AI, cs.CL

Keywords: efficient estimation, needs decoders, naps generate performance predictions, predict poor performance prevails, downstream tasks

Related articles:

arXiv:2211.03782 [cs.LG] (Published 2022-11-07)

On minimal variations for unsupervised representation learning

Vivien Cabannes, Alberto Bietti, Randall Balestriero

arXiv:2309.17002 [cs.LG] (Published 2023-09-29)

Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks

Hao Chen et al.

arXiv:2307.08623 [cs.LG] (Published 2023-07-14)

HYTREL: Hypergraph-enhanced Tabular Data Representation Learning

Pei Chen, Soumajyoti Sarkar, Leonard Lausen, Balasubramaniam Srinivasan, Sheng Zha, Ruihong Huang, George Karypis
