
SoK: A Modularized Approach to Study the Security of Automatic Speech Recognition Systems

Published: 19 May 2022
Abstract

    With the wide use of Automatic Speech Recognition (ASR) in applications such as human-machine interaction, simultaneous interpretation, and audio transcription, its security protection becomes increasingly important. Although recent studies have brought to light the weaknesses of popular ASR systems that enable out-of-band signal attacks, adversarial attacks, and others, and have further proposed various remedies (signal smoothing, adversarial training, etc.), a systematic understanding of ASR security (both attacks and defenses) is still missing, especially of how realistic such threats are and how general existing protections could be. In this article, we present our systematization of knowledge for ASR security and provide a comprehensive taxonomy of existing work based on a modularized workflow. More importantly, we align the research in this domain with that on the security of Image Recognition Systems (IRS), which has been extensively studied, using the domain knowledge of the latter to understand where we stand in the former. Both IRS and ASR are perceptual systems. Their similarities allow us to systematically study the existing literature on ASR security against the spectrum of attacks and defenses proposed for IRS, and to pinpoint directions toward more advanced attacks and more effective protection in ASR. Their differences, especially the greater complexity of ASR compared with IRS, reveal unique challenges and opportunities in ASR security. In particular, our experimental study shows that transfer attacks across ASR models are feasible, even in the absence of knowledge about the models (even their types) and training data.
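    To make the closing claim concrete, below is a minimal, hypothetical sketch of how such a transfer attack is typically evaluated: an adversarial perturbation is crafted with white-box gradients on a surrogate model, then replayed against an independently built target model whose internals the attacker never sees. This is not the paper's experimental setup; the toy 1-D convolutional classifiers, random waveform batch, perturbation budget, and FGSM attack step are all illustrative assumptions.

```python
# Minimal transfer-attack sketch (illustrative only, NOT the paper's setup).
# Perturbations are crafted with white-box gradients on a surrogate model,
# then tested against a separate target model that is never differentiated.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

NUM_CLASSES = 10   # assumed: a 10-word speech-command vocabulary
SAMPLES = 16000    # 1 second of 16 kHz audio

def make_model(width: int) -> nn.Module:
    """Tiny 1-D CNN over raw waveforms; different widths stand in for
    different (and, here, untrained) architectures."""
    return nn.Sequential(
        nn.Conv1d(1, width, kernel_size=80, stride=16), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        nn.Linear(width, NUM_CLASSES),
    )

surrogate = make_model(32)  # attacker's white-box model
target = make_model(64)     # victim model: no gradient access, no queries

def fgsm(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
         eps: float = 0.02) -> torch.Tensor:
    """One Fast Gradient Sign Method step on the surrogate's loss."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach().clamp(-1.0, 1.0)

# A toy batch of "utterances", labeled with the surrogate's own predictions.
x = torch.rand(16, 1, SAMPLES) * 2 - 1
y = surrogate(x).argmax(dim=1)
x_adv = fgsm(surrogate, x, y)

# Transfer rate: how often the perturbation also flips the target's output.
flipped = target(x_adv).argmax(dim=1) != target(x).argmax(dim=1)
print(f"transfer rate on target: {flipped.float().mean():.2f}")
```

    On trained speech-command models, the printed transfer rate would estimate how often surrogate-crafted perturbations also change the target's output, which is the quantity behind the abstract's transferability claim.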



      Published In

      ACM Transactions on Privacy and Security, Volume 25, Issue 3
      August 2022, 288 pages
      ISSN: 2471-2566
      EISSN: 2471-2574
      DOI: 10.1145/3530305

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 19 May 2022
      Online AM: 29 March 2022
      Accepted: 01 January 2022
      Revised: 01 November 2021
      Received: 01 March 2021
      Published in TOPS Volume 25, Issue 3


      Author Tags

      1. Adversarial attacks
      2. machine learning security
      3. speech system security

      Qualifiers

      • Research-article
      • Refereed


      Cited By

      • (2024) Disposable identities: Solving web tracking. Journal of Information Security and Applications 84, 103821. DOI: 10.1016/j.jisa.2024.103821. Online publication date: Aug-2024.
      • (2024) Secure speech-recognition data transfer in the internet of things using a power system and a tried-and-true key generation technique. Cluster Computing. DOI: 10.1007/s10586-024-04649-3. Online publication date: 29-Jul-2024.
      • (2023) Towards the transferable audio adversarial attack via ensemble methods. Cybersecurity 6, 1. DOI: 10.1186/s42400-023-00175-8. Online publication date: 5-Dec-2023.
      • (2023) Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition. IEEE Transactions on Dependable and Secure Computing 20, 5, 3970–3987. DOI: 10.1109/TDSC.2022.3220673. Online publication date: 1-Sep-2023.
      • (2023) Combining Deep Learning with Domain Adaptation and Filtering Techniques for Speech Recognition in Noisy Environments. 2023 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), 1–6. DOI: 10.1109/ROPEC58757.2023.10409492. Online publication date: 18-Oct-2023.
      • (2023) Knowledge Distillation Based Defense for Audio Trigger Backdoor in Federated Learning. GLOBECOM 2023 - 2023 IEEE Global Communications Conference, 4271–4276. DOI: 10.1109/GLOBECOM54140.2023.10437601. Online publication date: 4-Dec-2023.
      • (2023) Defense Mechanisms Against Audio Adversarial Attacks: Recent Advances and Future Directions. Edge Computing and IoT: Systems, Management and Security, 166–175. DOI: 10.1007/978-3-031-28990-3_12. Online publication date: 31-Mar-2023.
      • (2022) Automatic Speech Recognition Method Based on Deep Learning Approaches for Uzbek Language. Sensors 22, 10, 3683. DOI: 10.3390/s22103683. Online publication date: 12-May-2022.
      • (2022) Detecting Audio Adversarial Examples in Automatic Speech Recognition Systems Using Decision Boundary Patterns. Journal of Imaging 8, 12, 324. DOI: 10.3390/jimaging8120324. Online publication date: 9-Dec-2022.
      • (2022) Your Voice is Not Yours? Black-Box Adversarial Attacks Against Speaker Recognition Systems. 2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), 692–699. DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00094. Online publication date: Dec-2022.
