Hi, my name is Shuo Yang.

I build full-stack machine learning systems.

I am a Ph.D. student in EECS at UC Berkeley, advised by Ion Stoica. I work on full-stack machine learning systems, from kernel optimization and efficient system design to text and multimodal algorithms, with the goal of making modern AI workloads efficient on real hardware.


About

I am a member of Sky Computing Lab and LMSYS. My work spans the full stack of machine learning systems: kernel optimization at the hardware-software boundary, efficient system design for large-scale inference and generation, and text and multimodal algorithms that benefit from those systems advances.

I am especially interested in algorithm-system co-design: building methods that are not only theoretically appealing but also practical and efficient when deployed at scale. Recent projects span LLM serving, sparse attention, exact GPU K-Means, and efficient video generation.

Recent highlights include the Amazon AI PhD Fellowship, a research scientist internship at Amazon Neuron Science, and an upcoming research internship at Meta.

Previously, I graduated from the ACM Honors Class at Shanghai Jiao Tong University.

Selected Publications

Flash-KMeans
Fastest K-Means · Kernel Optimization · 500 stars
Fast and memory-efficient exact K-Means designed as a systems primitive.
Quant VideoGen
Quantization · KV Cache · World Model
Long-video generation via 2-bit KV-cache quantization.
Sparse VideoGen2
NeurIPS 2025 Spotlight · Semantic Permutation · Video Generation
Semantic-aware permutation for efficient sparse attention in video generation.
Sparse VideoGen
ICML 2025 · Sparse Attention · Video Generation
Accelerating video diffusion transformers with spatial-temporal sparsity.
BlendServe
ASPLOS 2026 · Offline Inference · LLM Serving
Resource-aware batching for offline inference of autoregressive large models.