
arXiv:1509.08062 [cs.LG]

End-to-End Text-Dependent Speaker Verification

Georg Heigold, Ignacio Moreno, Samy Bengio, Noam Shazeer

Published 2015-09-27, Version 1

In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time. Such an approach will result in simple and efficient systems, requiring little domain-specific knowledge and making few model assumptions. We implement the idea by formulating the problem as a single neural network architecture, including the estimation of a speaker model on only a few utterances, and evaluate it on our internal "Ok Google" benchmark for text-dependent speaker verification. The proposed approach appears to be very effective for big data applications like ours that require highly accurate, easy-to-maintain systems with a small footprint.
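The abstract does not spell out the scoring function, so the following is only a minimal sketch of how a test utterance and a few reference utterances might be reduced to a single verification score. It assumes that each utterance has already been turned into a fixed-size embedding (in the paper this would come from the neural network), that the speaker model is the average of the reference embeddings, and that the score is a logistic function of the cosine similarity. The function names and the parameters `w` and `b` are hypothetical and chosen for illustration.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length embedding vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_score(test_embedding, reference_embeddings, w=1.0, b=0.0):
    """Map one test embedding and a few reference embeddings to one score.

    Assumption: the speaker model is the mean of the reference embeddings,
    and w and b are hypothetical stand-ins for parameters that would be
    learned jointly with the embedding network.
    """
    speaker_model = mean_vector(reference_embeddings)
    similarity = cosine_similarity(test_embedding, speaker_model)
    # A logistic function squashes the similarity into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-(w * similarity + b)))


# Toy two-dimensional embeddings, invented for illustration only.
references = [[1.0, 0.0], [0.8, 0.2]]
same_speaker = verification_score([0.9, 0.1], references)
different_speaker = verification_score([0.0, 1.0], references)
```

In this toy setup the test embedding that points in the same direction as the averaged references receives the higher score. In an end-to-end system the loss would be computed on this final score, so the embedding network and the scoring parameters would be trained together.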

Comments: submitted to ICASSP 2016

Categories: cs.LG, cs.SD

Keywords: end-to-end text-dependent speaker verification, single neural network architecture, big data applications, requiring little domain-specific knowledge, reference utterances

Related articles:

arXiv:2104.13968 [cs.LG] (Published 2021-04-28)

Tail-Net: Extracting Lowest Singular Triplets for Big Data Applications

Gurpreet Singh, Soumyajit Gupta

arXiv:2403.11395 [cs.LG] (Published 2024-03-18)

Automated data processing and feature engineering for deep learning and big data applications: a survey

Alhassan Mumuni and Fuseini Mumuni

arXiv:1512.04011 [cs.LG] (Published 2015-12-13)

L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework