Speech coding and audio preprocessing for mitigating and detecting audio adversarial examples on automatic speech recognition

K Rajaratnam, B Alshemali, J Kalita - Machine Learning in …, 2018 - faculty.uccs.edu
Machine Learning in Computer Vision and Natural Language Processing, 2018
Abstract
An adversarial attack is an exploitative process in which minute changes are made to a natural input, causing that input to be misclassified by a neural model. Due to recent trends in speech processing, this has become a noticeable issue in speech recognition models. In late 2017, an attack was shown to be quite effective against the Speech Commands classification model. Limited-vocabulary classifiers, such as the Speech Commands model, are frequently used to manage automated attendants in traditional telephony and voice over IP (VoIP) contexts. This research therefore examines how effectively VoIP speech coding mitigates audio adversarial attacks compared with more primitive forms of audio preprocessing. It shows that an ensemble defense in tandem with speech coding is more robust than other preprocessing defenses in mitigating adversarial examples. This research also proposes a new metric for evaluating preprocessing defenses against adversarial attacks and explores using speech coding and various other forms of preprocessing to detect adversarial examples.
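As a rough illustration of the general idea of a preprocessing ensemble, the sketch below applies several simple transforms to an input, classifies each transformed copy, and takes a majority vote; an input is flagged as possibly adversarial when the raw prediction disagrees with that vote. This is a minimal sketch under stated assumptions, not the paper's method: the abstract does not specify the transforms, the voting rule, or the detection rule, and no speech codec is implemented here. The names `quantize`, `smooth`, `toy_classify`, `ensemble_predict`, and `looks_adversarial` are hypothetical, and `toy_classify` is a stand-in for a real Speech Commands model.

```python
from collections import Counter

import numpy as np


def quantize(audio, bits=8):
    """Reduce amplitude resolution (assumed stand-in for a 'primitive' preprocessing step)."""
    levels = 2 ** (bits - 1)
    return np.round(audio * levels) / levels


def smooth(audio, width=5):
    """Moving-average low-pass filter over `width` samples."""
    kernel = np.ones(width) / width
    return np.convolve(audio, kernel, mode="same")


def toy_classify(audio):
    """Hypothetical classifier: label depends only on the peak amplitude."""
    return "yes" if np.max(np.abs(audio)) > 0.9 else "no"


# Assumed ensemble members; a real system might include a speech codec here.
PREPROCESSORS = [
    lambda a: smooth(a, width=3),
    lambda a: smooth(a, width=5),
    quantize,
]


def ensemble_predict(classify, audio, preprocessors):
    """Classify each preprocessed copy and return the majority label."""
    votes = [classify(p(audio)) for p in preprocessors]
    return Counter(votes).most_common(1)[0][0]


def looks_adversarial(classify, audio, preprocessors):
    """Flag inputs whose raw label disagrees with the ensemble's label."""
    return classify(audio) != ensemble_predict(classify, audio, preprocessors)
```

With a constant clean signal and a copy carrying a single large spike (a toy perturbation, far cruder than a real attack), the raw classifier's label flips on the perturbed copy while the smoothed copies still vote for the original label, so the disagreement check fires only on the perturbed input.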