SCORPIO — SLO-aware LLM serving
Serving the right requests at the right time by enforcing per-request TTFT and TPOT SLOs.
The idea — least-deadline-first + reject the doomed
Figure: Incoming requests flow into the LDF Scheduler, which sends each one either to Served or to REJECT.
Incoming requests each carry a deadline (short bar = tight, long bar = relaxed).
The LDF scheduler reorders them — tightest deadline goes first.
Requests whose deadline is already unattainable are immediately rejected,
freeing capacity for requests that can still be served in time.
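
The reorder-and-reject idea above can be sketched in a few lines of Python. This is a minimal illustration, not SCORPIO's actual scheduler: the `Request` class, the `ldf_schedule` function, and the assumption that each request's `service_time` is known in advance are all hypothetical, and a single server processes one request at a time.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Only the deadline takes part in ordering, so the heap pops the tightest one first.
    deadline: float                              # absolute time by which the request must finish
    name: str = field(compare=False)
    service_time: float = field(compare=False)   # assumed known estimate (hypothetical)


def ldf_schedule(requests, now=0.0):
    """Serve requests tightest-deadline-first; reject those that can no longer finish in time."""
    heap = list(requests)
    heapq.heapify(heap)
    served, rejected = [], []
    clock = now
    while heap:
        req = heapq.heappop(heap)
        if clock + req.service_time > req.deadline:
            # Doomed: serving it would waste capacity without meeting its deadline.
            rejected.append(req.name)
            continue
        clock += req.service_time
        served.append(req.name)
    return served, rejected


requests = [
    Request(deadline=10.0, name="relaxed", service_time=2.0),
    Request(deadline=3.0, name="urgent", service_time=2.0),
    Request(deadline=1.0, name="doomed", service_time=2.0),
    Request(deadline=6.0, name="medium", service_time=2.0),
]
served, rejected = ldf_schedule(requests)
print("served:", served)      # served: ['urgent', 'medium', 'relaxed']
print("rejected:", rejected)  # rejected: ['doomed']
```

The request named "doomed" needs 2.0 time units but its deadline is 1.0, so it is dropped before it consumes any capacity, and the other three all finish on time.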
The result — SLO-aware vs FCFS
FCFS — first-come-first-served ignores deadlines
Figure: under FCFS, some requests are marked met SLO and others missed.
SCORPIO — LDF + credit batching keeps all requests on time
Goodput: up to 14.4x higher
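
To see why ordering and rejection matter for goodput, here is a toy comparison under stated assumptions: all requests arrive at time 0, a single server handles one request at a time, service times are known, and goodput is approximated as the count of requests that finish by their deadline. The `simulate` function and the workload are hypothetical, credit batching is not modeled, and the numbers are illustrative only; they are not meant to reproduce the 14.4x figure.

```python
def simulate(requests, policy):
    """Return how many requests finish by their deadline under the given policy.

    requests: list of (name, service_time, deadline) tuples in arrival order.
    policy:   "fcfs" serves in arrival order and never rejects;
              "ldf_reject" serves tightest deadline first and skips doomed requests.
    """
    if policy == "fcfs":
        queue = list(requests)
    elif policy == "ldf_reject":
        queue = sorted(requests, key=lambda r: r[2])
    else:
        raise ValueError(f"unknown policy: {policy}")

    clock = 0.0
    met = 0
    for _name, service_time, deadline in queue:
        if policy == "ldf_reject" and clock + service_time > deadline:
            continue  # rejected: cannot finish in time, so spend no capacity on it
        clock += service_time
        if clock <= deadline:
            met += 1
    return met


# Hypothetical workload: (name, service_time, deadline)
workload = [
    ("relaxed", 4.0, 20.0),
    ("urgent", 4.0, 4.5),
    ("medium", 4.0, 9.0),
    ("doomed", 4.0, 3.0),
]

fcfs_met = simulate(workload, "fcfs")
ldf_met = simulate(workload, "ldf_reject")
print("FCFS on-time requests:", fcfs_met)        # FCFS on-time requests: 1
print("LDF + reject on-time requests:", ldf_met) # LDF + reject on-time requests: 3
```

In this workload FCFS spends its first slot on the relaxed request, so the urgent and medium ones miss their deadlines, and it still spends time on the doomed one. Deadline ordering with rejection serves the three feasible requests on time.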
Legend: urgent deadline, medium deadline, relaxed deadline, met SLO, missed SLO, rejected (doomed).