Wed-3-9-9 Rescore in a Flash: Compact, Cache Efficient Hashing Data Structures for N-gram Language Models

Grant Strimel (Amazon.com), Ariya Rastrow (Amazon.com), Gautam Tiwari (Amazon.com), Adrien Pierard (Amazon.com) and Jon Webb (Amazon.com)
Abstract: We introduce DashHashLM, an efficient data structure that stores an n-gram language model compactly while making minimal trade-offs on runtime lookup latency. The data structure implements a finite state transducer with a lossless structural compression and outperforms comparable implementations when considering lookup speed in the small-footprint setting. DashHashLM introduces several optimizations to language model compression which are designed to minimize expected memory accesses. We also present variations of DashHashLM appropriate for scenarios with different memory and latency constraints. We detail the algorithm and justify our design choices with comparative experiments on a speech recognition task. Specifically, we show that with roughly a 10% increase in memory size, compared to a highly optimized, compressed baseline n-gram representation, our proposed data structure can achieve up to a 6x query speedup.
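To make the abstract's idea concrete, below is a minimal Python sketch of a hash-based n-gram store with standard back-off lookup: a flat, open-addressed table mapping n-gram keys to (log-probability, backoff) pairs, so a query touches only a small number of table slots. This is not the DashHashLM algorithm itself, only an illustration of the general class of structure the abstract describes; all names here (NGramHashLM, fnv1a_64) are hypothetical.

    # Illustrative sketch only, not the paper's data structure.
    def fnv1a_64(word_ids):
        """64-bit FNV-1a hash over a tuple of word ids."""
        h = 0xCBF29CE484222325
        for w in word_ids:
            for b in int(w).to_bytes(4, "little"):
                h ^= b
                h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
        return h

    class NGramHashLM:
        """Open-addressed table: n-gram tuple -> (logprob, backoff)."""

        def __init__(self, capacity):
            # Capacity must exceed the number of stored n-grams.
            self.capacity = capacity
            self.keys = [None] * capacity      # stored n-gram tuples
            self.values = [None] * capacity    # (logprob, backoff) pairs

        def _probe(self, ngram):
            # Linear probing keeps collision resolution on nearby slots.
            idx = fnv1a_64(ngram) % self.capacity
            while self.keys[idx] is not None and self.keys[idx] != ngram:
                idx = (idx + 1) % self.capacity
            return idx

        def insert(self, ngram, logprob, backoff=0.0):
            idx = self._probe(tuple(ngram))
            self.keys[idx] = tuple(ngram)
            self.values[idx] = (logprob, backoff)

        def score(self, ngram):
            """log P(w | history), backing off to shorter histories."""
            ngram = tuple(ngram)
            idx = self._probe(ngram)
            if self.keys[idx] == ngram:
                return self.values[idx][0]
            if len(ngram) == 1:
                return float("-inf")           # unseen unigram
            # Add the history's backoff weight (if stored), drop the
            # oldest history word, and recurse on the shorter n-gram.
            hist = ngram[:-1]
            hidx = self._probe(hist)
            bow = self.values[hidx][1] if self.keys[hidx] == hist else 0.0
            return bow + self.score(ngram[1:])

    # Example usage (toy word ids):
    lm = NGramHashLM(capacity=1 << 10)
    lm.insert([7], -2.0, backoff=-0.5)         # unigram with backoff weight
    lm.insert([99], -3.0)                      # another unigram
    lm.insert([7, 42], -1.0)                   # bigram log P(42 | 7)
    print(lm.score([7, 42]))                   # direct hit: -1.0
    print(lm.score([7, 99]))                   # backs off: -0.5 + (-3.0)

The real system additionally compresses the model losslessly and lays entries out to minimize expected memory accesses per query, which this sketch does not attempt to reproduce.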