UCCL is an extensible software transport layer designed for modern GPU networking workloads such as collective communication, KV-cache transfer, and RL weight transfer. It demonstrates substantial gains over NCCL on cloud GPUs and accelerates distributed training and LLM serving.
UCCL
Yang Zhou, Zhongjie Chen, Ziming Mao, ChonLam Lao, Shuo Yang, Pravein Govindan Kannan, Jiaqi Gao, Yilong Zhao, Yongji Wu, Kaichao You, Fengyuan Ren, Zhiying Xu, Costin Raiciu, Ion Stoica | Apr 24, 2025
