[2202.00259] Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics