Publication

International Conference

Interspeech 2024
2024-09-01

Improving Noise Robustness in Self-supervised Pre-trained Model for Speaker Verification [link]

Chan-yeong Lim, Hyun-seo Shin, Ju-ho Kim, Jungwoo Heo, Kyo-Won Koo, Seung-bin Kim, Ha-Jin Yu


Abstract

Adopting self-supervised pre-trained models (PMs) in speaker verification (SV) has yielded remarkable performance, but their noise robustness remains largely unexplored. In automatic speech recognition, additional training strategies are used to enhance model robustness before fine-tuning, improving performance in noisy environments. However, directly applying these strategies to SV risks distorting speaker information. We propose noise adaptive warm-up training for speaker verification (NAW-SV). NAW-SV guides the PM to extract consistent representations under noisy conditions using teacher-student learning. To prevent the distortion of speaker information in this approach, we introduce a novel loss function, the extended angular prototypical network loss, which incorporates speaker information while exploring a noise-robust speaker embedding space. We validated the proposed framework on a noise-synthesized VoxCeleb1 test set, demonstrating promising robustness.
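The extended angular prototypical network loss is defined in the paper itself; as background, the standard angular prototypical objective it builds on can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the batch layout (N speakers × M utterances, last utterance as query, mean of the rest as prototype) follows the common formulation, and the scale `w=10.0` and bias `b=-5.0` are illustrative initial values (in practice they are learnable parameters).

```python
import numpy as np

def angular_prototypical_loss(embeddings, w=10.0, b=-5.0):
    """Standard angular prototypical loss (background sketch, not NAW-SV itself).

    embeddings: array of shape (N, M, D) -- N speakers, M utterances each,
    D-dimensional speaker embeddings.
    """
    N, M, D = embeddings.shape
    queries = embeddings[:, -1]                # (N, D): last utterance per speaker
    protos = embeddings[:, :-1].mean(axis=1)   # (N, D): prototype = mean of the rest

    # Cosine similarity between every query and every prototype,
    # scaled by learnable-in-practice w and shifted by b.
    qn = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    pn = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    logits = w * (qn @ pn.T) + b               # (N, N)

    # Softmax cross-entropy where the matching speaker lies on the diagonal.
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When each speaker's query embedding aligns with its own prototype and is near-orthogonal to other speakers' prototypes, the loss approaches zero; mismatched queries drive it up, which is what pushes same-speaker utterances together in the embedding space.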



All works on this site are copyrighted by IRLab; unauthorized reproduction or appropriation is prohibited under the Copyright Act (Article 96).

COPYRIGHT © IRLab Ltd. ALL RIGHTS RESERVED.