General Topics in Speech Recognition

Thursday 20:30-21:30 (GMT+8), October 29

Thu-2-8-2 Training Keyword Spotting Models on Non-IID Data with Federated Learning

Andrew Hard (Google Inc.), Kurt Partridge (Google Inc.), Cameron Nguyen (Google Inc.), Niranjan Subrahmanya (Google Inc.), Aishanee Shah (Google Inc.), Pai Zhu (Google Inc.), Ignacio Moreno (Google Inc.), and Rajiv Mathews (Google Inc.)

Abstract: We demonstrate that a production-quality keyword-spotting model can be trained on-device using federated learning and achieve false accept and false reject rates comparable to those of a centrally trained model. To overcome the algorithmic constraints associated with fitting on-device data (which are inherently not independent and identically distributed, i.e. non-IID), we conduct thorough empirical studies of optimization algorithms and hyperparameter configurations using large-scale federated simulations. To overcome resource constraints, we replace memory-intensive MTR data augmentation with SpecAugment, which reduces the false reject rate by 56%. Finally, to label examples (given the zero visibility into on-device data), we explore teacher-student training.
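
The abstract names SpecAugment as the replacement augmentation but gives no configuration. As a rough illustration only, the sketch below applies the two masking steps usually associated with SpecAugment (one frequency mask and one time mask) to a spectrogram stored as a list of frames. The function name `mask_spectrogram`, the mask widths, and the toy input are assumptions made for this example, not values from the paper, and time warping is omitted.

```python
import random


def mask_spectrogram(spec, max_freq_width=2, max_time_width=3, seed=None):
    """Return a masked copy of `spec` (a list of frames, each a list of bins).

    Hypothetical helper: widths and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    num_frames = len(spec)
    num_bins = len(spec[0])
    out = [list(frame) for frame in spec]  # copy so the input is untouched

    # Frequency mask: zero a band of f consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_width, num_bins))
    f0 = rng.randint(0, num_bins - f)
    for frame in out:
        for k in range(f0, f0 + f):
            frame[k] = 0.0

    # Time mask: zero t consecutive frames across all bins.
    t = rng.randint(0, min(max_time_width, num_frames))
    t0 = rng.randint(0, num_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * num_bins

    return out


# Toy input: 10 frames of 8 bins, all ones (not real audio features).
spec = [[1.0] * 8 for _ in range(10)]
masked = mask_spectrogram(spec, seed=0)
```

Because masking only overwrites values in features that are already computed, a sketch like this needs no additional audio (such as noise recordings) to be stored alongside the training example, which is in the spirit of the resource constraint the abstract describes.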

