
ZipLLM — efficient LLM storage

Model-aware deduplication + BitX delta compression to store many LLM checkpoints cheaply.

The idea — dedup + delta compression

Diagram: three checkpoints with four tensors each (T0 to T3).

Base model: T0 (shared), T1 (shared), T2 (shared), T3 (shared); 1× stored
Fine-tune A: T0 (shared), T1 (unique), T2 (shared), T3 (unique)
Fine-tune B: T0 (shared), T1 (shared), T2 (unique), T3 (unique)

Each of the four unique tensors is stored as a BitX delta.

Step 1, deduplication: identical tensors across checkpoints collapse to one shared copy.

Step 2, BitX delta compression: each unique tensor is XOR-ed (⊕) against the corresponding base tensor. When the fine-tuned weights stay close to the base, the XOR result is mostly zero bits, so it compresses losslessly to a small delta.

The result — storage shrinks by ~54%

Chart: storage required for the three checkpoints, relative to the naïve baseline.

Naïve: 3 full copies, 100% (baseline)
Dedup only: shared tensors stored once, plus unique tensors in full, −25%
ZipLLM (dedup + BitX): shared tensors plus BitX-compressed deltas, −54%