[2406.06798] The Reasonable Effectiveness of Speaker Embeddings for Violence Detection