Prism focuses on cost-efficient multi-LLM serving through GPU sharing and dynamic memory coordination across models.
Prism
Shan Yu, Shuo Yang, et al. | May 1, 2025

