The last 5 uploaded publications
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications
Kuntai Du, Bowen Wang, Chen Zhang, Yi‐Ming Cheng, Qing Lan, H. S. Sang, Yihua Cheng, Jiayi Yao, Xiaoxuan Liu, Yong Qiao, Ion Stoica, Junchen Jiang (2025). PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications. DOI: https://doi.org/10.48550/arxiv.2505.07203.
Preprint, 29 days ago
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications
Kuntai Du, Bowen Wang, Chen Zhang, Yi‐Ming Cheng, Qing Lan, H. S. Sang, Yihua Cheng, Jiayi Yao, Xiaoxuan Liu, Yong Qiao, Ion Stoica, Junchen Jiang (2025). PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications. DOI: https://doi.org/10.1145/3731569.3764834.
Article, 29 days ago