Preprint
en
2025

Locality-aware Fair Scheduling in LLM Serving

DOI: 10.48550/arxiv.2501.14312
arXiv: arxiv.org/abs/2501.14312

Shiyi Cao, Yichuan Wang, Ziming Mao, P.-h.J. Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yuanhang Zhang, Yang Zhou, Ying Sheng, Joseph E. Gonzalez, Ion Stoica (University of California, Berkeley)

Abstract

Large language model (LLM) inference workloads dominate a wide variety of modern AI applications, ranging from multi-turn conversation to document analysis. Balancing fairness and efficiency is critical for managing diverse client workloads with varying prefix patterns. Unfortunately, existing fair scheduling algorithms for LLM serving, such as Virtual Token Counter (VTC), fail to take prefix locality into consideration and thus suffer from poor performance. On the other hand, locality-aware scheduling algorithms in existing LLM serving frameworks tend to maximize the prefix cache hit rate without considering fair sharing among clients. This paper introduces the first locality-aware fair scheduling algorithm, Deficit Longest Prefix Match (DLPM), which maintains a high degree of prefix locality with a fairness guarantee. We also introduce a novel algorithm, Double Deficit LPM (D$^2$LPM), which extends DLPM to the distributed setting and balances fairness, locality, and load balancing. Our extensive evaluation demonstrates that DLPM and D$^2$LPM ensure fairness while maintaining high throughput (up to 2.87$\times$ higher than VTC) and low per-client latency (up to 7.18$\times$ lower than a state-of-the-art distributed LLM serving system).
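The abstract names the two ingredients DLPM combines, longest prefix match and a per-client deficit, but does not spell out the mechanics. The Python sketch below illustrates how such a combination could work on a single worker. It is not the paper's algorithm: the names `Request`, `schedule`, and `quantum`, the refill rule, and the choice to charge a client one unit per prompt token are all assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Request:
    client: str
    tokens: list[int]  # prompt token ids (hypothetical representation)


def shared_prefix_len(a: list[int], b: list[int]) -> int:
    """Length of the common prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def cached_prefix_len(tokens: list[int], cache: list[list[int]]) -> int:
    """Longest prefix of `tokens` that matches any cached sequence."""
    return max((shared_prefix_len(tokens, c) for c in cache), default=0)


def schedule(queue: list[Request], cache: list[list[int]], quantum: int) -> list[Request]:
    """Return a dispatch order; inputs are copied, not mutated.

    Assumed rule: among clients with a positive deficit, dispatch the
    request with the longest cached prefix, then charge that client one
    unit per prompt token. When no waiting client has a positive deficit,
    every waiting client receives `quantum` more units.
    """
    assert quantum > 0, "a non-positive quantum would never refill"
    pending = list(queue)
    cache = [list(c) for c in cache]
    deficit = {r.client: quantum for r in pending}
    order = []
    while pending:
        eligible = [r for r in pending if deficit[r.client] > 0]
        if not eligible:
            # Refill step (assumption): top up clients that still have work.
            for client in {r.client for r in pending}:
                deficit[client] += quantum
            continue
        # Locality step: prefer the longest prefix match; ties keep queue order.
        best = max(eligible, key=lambda r: cached_prefix_len(r.tokens, cache))
        pending.remove(best)
        deficit[best.client] -= len(best.tokens)
        cache.append(list(best.tokens))  # the served prompt is now cached
        order.append(best)
    return order
```

With a cache holding `[1, 2, 3, 4]`, three requests from client A that share that prefix, one unrelated request from client B, and `quantum=5`, a pure longest-prefix-match policy would serve all of A's requests first. In this sketch A exhausts its deficit after one request, so B is served second and the order of clients is A, B, A, A.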

How to cite this publication

Shiyi Cao, Yichuan Wang, Ziming Mao, P.-h.J. Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yuanhang Zhang, Yang Zhou, Ying Sheng, Joseph E. Gonzalez, Ion Stoica (2025). Locality-aware Fair Scheduling in LLM Serving. arXiv preprint arXiv:2501.14312. DOI: https://doi.org/10.48550/arxiv.2501.14312.


Publication Details

Type: Preprint
Year: 2025
Authors: 13
Datasets: 0
Total Files: 0
Language: en
DOI: https://doi.org/10.48550/arxiv.2501.14312
