Learning Techniques for Speaker Recognition I

Session time: Wednesday 20:30-21:30 (GMT+8), October 28

Wed-2-12-1 In defence of metric learning for speaker recognition

Joon Son Chung (University of Oxford), Jaesung Huh (Naver Corporation), Seongkyu Mun (Naver Corp.), Minjae Lee (Naver Corporation), Hee Soo Heo (Naver Corporation), Soyeon Choe (Naver Corporation), Chiheon Ham (Naver Corporation), Sunghwan Jung (Naver Corporation), Bong-Jin Lee (Naver Corporation) and Icksang Han (Naver Corporation)

Abstract: The objective of this paper is open-set speaker recognition of unseen speakers, where ideal embeddings should condense information into a compact utterance-level representation with small intra-speaker and large inter-speaker distance. A popular belief in speaker recognition is that networks trained with classification objectives outperform metric learning methods. In this paper, we present an extensive evaluation of the most popular loss functions for speaker recognition on the VoxCeleb dataset. We demonstrate that the vanilla triplet loss shows competitive performance compared to classification-based losses, and that networks trained with our proposed metric learning objective outperform state-of-the-art methods.
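
For readers unfamiliar with the vanilla triplet loss mentioned in the abstract, the following is a minimal sketch of its common textbook form, not the authors' implementation and not their proposed objective. It assumes L2-normalized embeddings, squared Euclidean distance, and a margin of 0.3; the function names and the margin value are illustrative assumptions that do not come from the abstract.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    anchor and positive are embeddings of the same speaker, negative is an
    embedding of a different speaker. The margin default is an illustrative
    assumption, not a value taken from the paper.
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    # Zero loss once the negative is farther than the positive by at least margin.
    return max(0.0, squared_distance(a, p) - squared_distance(a, n) + margin)


if __name__ == "__main__":
    # Well-separated triplet: same-speaker pair coincides, other speaker is orthogonal.
    print(triplet_loss([1.0, 0.0], [2.0, 0.0], [0.0, 1.0]))
    # Violating triplet: the "positive" is far away and the "negative" coincides.
    print(triplet_loss([1.0, 0.0], [0.0, 1.0], [3.0, 0.0]))
```

The loss is zero when the same-speaker distance is already smaller than the different-speaker distance by the margin, which is one way to encourage the small intra-speaker and large inter-speaker distances the abstract describes.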
