[2406.19263] Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding