vAttention
Aditya Desai, Shuo Yang, et al.
|
Oct 1, 2025


vAttention combines top-k and sampling-based sparse attention with explicit approximation guarantees, targeting reliable deployment of sparse decoding methods.