[2402.18096] No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization