The last 5 uploaded publications
Locality-aware Fair Scheduling in LLM Serving
Shiyi Cao, Yichuan Wang, Ziming Mao, P.-h.J. Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yuanhang Zhang, Yang Zhou, Ying Sheng, Joseph E. Gonzalez, Ion Stoica (2025). Locality-aware Fair Scheduling in LLM Serving. DOI: https://doi.org/10.48550/arxiv.2501.14312.
Preprint, 85 days ago
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nick Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica (2023). S-LoRA: Serving Thousands of Concurrent LoRA Adapters. DOI: https://doi.org/10.48550/arxiv.2311.03285.
Preprint, 85 days ago
DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training
Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Xuezhe Ma, Hao Zhang (2023). DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training. DOI: https://doi.org/10.48550/arxiv.2310.03294.
Preprint, 85 days ago
Fairness in Serving Large Language Models
Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica (2023). Fairness in Serving Large Language Models. DOI: https://doi.org/10.48550/arxiv.2401.00588.
Preprint, 85 days ago
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, Ion Stoica (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. DOI: https://doi.org/10.48550/arxiv.2306.05685.
Preprint, 85 days ago