[2312.08514] TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking