SLA

Jintao Zhang, Haoxu Wang, Kai Jiang, Shuo Yang, Kaiwen Zheng, Haocheng Xi, Ziteng Wang, Hongzhou Zhu, Min Zhao, Ion Stoica, Joseph E. Gonzalez, Jun Zhu, Jianfei Chen | Sep 29, 2025

SLA (Sparse-Linear Attention) fuses sparse and linear attention into a single fine-tunable mechanism. It classifies attention weights as critical, marginal, or negligible, which accelerates diffusion transformers without sacrificing generation quality.
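The three-way classification can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `classify_blocks`, the parameters `top_frac` and `bottom_frac`, and the rank-based rule are all assumptions made for illustration, and the importance scores are taken as given rather than predicted. One plausible reading of the summary is that critical weights get exact (sparse) attention, marginal weights get a cheaper linear approximation, and negligible weights are skipped; the `ROUTE` table below encodes that reading as an assumption as well.

```python
# Hypothetical routing table: an assumed reading of the summary,
# not a confirmed description of the method.
ROUTE = {
    "critical": "exact (sparse) attention",
    "marginal": "linear attention",
    "negligible": "skipped",
}


def classify_blocks(scores, top_frac=0.25, bottom_frac=0.25):
    """Label each score 'critical', 'marginal', or 'negligible' by rank.

    scores: one importance score per attention block (assumed input).
    top_frac / bottom_frac: assumed fractions of blocks placed in the
    critical and negligible classes; everything else is marginal.
    """
    if top_frac < 0 or bottom_frac < 0 or top_frac + bottom_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    n = len(scores)
    # Indices sorted from most to least important.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    n_top = int(n * top_frac)        # floor, so the classes never overlap
    n_bottom = int(n * bottom_frac)

    labels = ["marginal"] * n
    for i in order[:n_top]:
        labels[i] = "critical"
    for i in order[n - n_bottom:]:
        labels[i] = "negligible"
    return labels


if __name__ == "__main__":
    example_scores = [0.50, 0.02, 0.20, 0.01, 0.15, 0.03, 0.05, 0.04]
    for score, label in zip(example_scores, classify_blocks(example_scores)):
        print(f"{score:.2f} -> {label}: {ROUTE[label]}")
```

With the defaults, a quarter of the blocks are labeled critical and a quarter negligible, so half fall in the marginal class. A real system would choose these fractions, and the scoring itself, to suit the model and its accuracy budget.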