projects

Projects worth knowing about, spanning the LLM stack and visualization.

training

DeepSpeed PyTorch

ZenFlow

Stall-free asynchronous offloading for LLM training. Integrated into DeepSpeed via an official PR.

paper code blog demo

mLoRA

Fine-tuning LoRA adapters via highly efficient pipeline parallelism across multiple GPUs.

paper code demo

DLRover-RM

Resource optimization for deep recommendation model training in the cloud.

paper code demo

storage

TStore

Tensor-centric storage layer for AI model hubs — compress checkpoints by exploiting their internal structure.

ZipLLM

Efficient LLM storage via model-aware synergistic data deduplication and compression.

paper code demo

inference

MorphServe

Workload-aware LLM serving via runtime layer swapping and KV cache resizing.

λScale

Fast scaling for serverless LLM inference — cold start is no longer a death sentence.

paper code demo

Scorpio

SLO-aware LLM serving — TTFT / TPOT guards with credit-based batching for heterogeneous deadlines.

paper code demo

visualization

IGenBench

First benchmark for text-to-infographic generation — 600 tests, 30 types, automated reliability checks.

ViviDoc

Human-agent collaborative system for generating interactive educational documents from a single topic input.