Home
About

About the Conference Welcome from the Chair Conference Committees Area Chairs Organizers ISCA
Calls

Papers Surveys Satellite Workshops Tutorials Show & Tell Special Sessions & Challenges Areas & Topics Important Dates
Authors

Author Resources Submission Policy ISCA Ethics Paper Submission Presentation Guidelines
Program

Program at a Glance Technical Program Presentation Videos Presentation Guidelines Keynotes Satellite Workshops Tutorials Special Sessions & Challenges Show & Tell
Student Information

Student Events Travel Grants
Venue & Travel

Conference Venue & Accommodations Transportations Visa About Shanghai
Registration

Registration Overview & Fees ISCA Membership ISCA Code of Conduct Online Registration
Sponsorships & Exhibition

Sponsors Virtual Booth Satellite Events Acknowledgement
Contact

Contact Us

Program

Program at a Glance

Technical Program

Presentation Videos

Presentation Guidelines

Satellite Workshops

Special Sessions & Challenges

New Trends in Self-Supervised Speech Processing

Position: Home > Program > Technical Program > Thursday 19:15-20:15(GMT+8), October 29 > New Trends in Self-Supervised Speech Processing >

Thu-SS-1-6-7 Understanding Self-Attention of Self-Supervised Audio Transformers

Shu-wen Yang(National Taiwan University), Andy T. Liu(College of Electrical Engineering and Computer Science, National Taiwan University) and Hung-yi Lee(National Taiwan University (NTU))

Abstract: Self-supervised Audio Transformers (SAT) enable great success in many downstream speech applications like ASR, but how they work has not been widely explored yet. In this work, we present multiple strategies for the analysis of attention mechanisms in SAT. We categorize attentions into explainable categories, where we discover each category possesses its own unique functionality. We provide a visualization tool for understanding multi-head self-attention, importance ranking strategies for identifying critical attention, and attention refinement techniques to improve model performance.

Paper

prev Thu-SS-1-6-6 Defense for black-box attacks on anti-spoofing models by self-supervised learning

next Thu-SS-1-6-8 A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning

About

About the Conference

Welcome from the Chair

Conference Committees

Calls

Satellite Workshops

Special Sessions & Challenges

Important Dates

Program

Program at a Glance

Technical Program

Presentation Videos

Presentation Guidelines

Satellite Workshops

Special Sessions & Challenges

Student Information

Venue & Travel

Conference Venue & Accommodations

Transportations

Sponsorships & Exhibition

Satellite Events

Acknowledgement