Rishika Agarwal (Apple), Xiaochuan Niu (Apple), Pranay Dighe (Apple), Srikanth Vishnubhotla (Apple), Sameer Badaskar (Apple) and Devang Naik (Apple)
Abstract:
False triggers in voice assistants are unintended invocations of
the assistant, which not only degrade the user experience but
may also compromise privacy. False trigger mitigation (FTM)
is a process to detect the false trigger events and respond appropriately
to the user. In this paper, we propose a novel solution
to the FTM problem by introducing a parallel ASR decoding
process with a special language model trained from “out-of-domain”
data sources. Such a language model is complementary
to the existing language model optimized for the assistant task.
A bidirectional lattice RNN (Bi-LRNN) classifier trained from
the lattices generated by the complementary language model
shows a 38.34% relative reduction of the false trigger (FT) rate
at the fixed rate of 0.4% false suppression (FS) of correct invocations,
compared to the current Bi-LRNN model. In addition,
we propose to train a parallel Bi-LRNN model based on
the decoding lattices from both language models, and examine
various ways of implementation. The resulting model leads to
a further reduction of 10.8% in the false trigger rate.
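
The metric quoted in the abstract (FT rate at a fixed FS rate, and the relative reduction between two models) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names are hypothetical, and the sketch assumes that each utterance receives a scalar classifier score, that a higher score means "more likely an intended invocation", and that an utterance is suppressed when its score falls strictly below the threshold.

```python
import math


def threshold_at_fs(true_scores, max_fs_rate):
    """Pick the largest threshold that suppresses at most `max_fs_rate`
    of the true (intended) invocations.

    Assumption: an utterance is suppressed when score < threshold.
    """
    ranked = sorted(true_scores)
    # Number of true invocations we are allowed to suppress.
    allowed = math.floor(max_fs_rate * len(ranked))
    # Clamp so a rate of 1.0 does not index past the end of the list.
    allowed = min(allowed, len(ranked) - 1)
    # At most `allowed` scores lie strictly below ranked[allowed].
    return ranked[allowed]


def ft_rate(false_scores, threshold):
    """Fraction of unintended invocations that are NOT suppressed."""
    accepted = sum(1 for s in false_scores if s >= threshold)
    return accepted / len(false_scores)


def relative_reduction(baseline_rate, improved_rate):
    """Relative reduction of `improved_rate` with respect to `baseline_rate`."""
    return (baseline_rate - improved_rate) / baseline_rate
```

With scores from a baseline model and a candidate model, one would call `threshold_at_fs(true_scores, 0.004)` for each model, evaluate `ft_rate` at the resulting thresholds, and compare the two values with `relative_reduction`.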