Prism

Shan Yu, Shuo Yang, et al. | May 1, 2025

Prism targets cost-efficient multi-LLM serving by sharing GPUs among models and dynamically coordinating memory across them.