[2302.01275] ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs