[2301.10295] Object Segmentation with Audio Context