
TStore — tensor-centric compression

Treating tensors, not models, as the unit of storage: clustering similar tensors, then delta-compressing each one against its cluster center.

The idea — cluster similar tensors, then delta-compress

[Figure: tensors from Model A (existing) and Model B (new) are each fingerprinted and assigned to Cluster A or Cluster B. On disk, each cluster holds one center tensor (ctr) plus a delta per member.]
Full center tensors stored once per cluster; members stored as tiny deltas.

Each incoming model is decomposed into tensors. A cheap bit-level TensorSketch fingerprint predicts which existing cluster each tensor is most similar to. The tensor is then delta-compressed against its cluster center and stored as a small residual — only the difference is kept on disk.
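The flow above (fingerprint, pick the nearest cluster, store a residual) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not TStore's implementation: the `fingerprint` function is a simple stand-in that samples the high bits of float32 values rather than the actual TensorSketch, the delta is a byte-wise XOR compressed with zlib, and `ToyTensorStore`, `max_distance`, and the tensor names are hypothetical.

```python
import struct
import zlib


def to_bytes(values):
    """Pack a flat list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def from_bytes(raw):
    """Inverse of to_bytes."""
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


def fingerprint(raw, samples=64):
    """Stand-in fingerprint (NOT TensorSketch): the high 16 bits of evenly
    spaced float32 values, i.e. sign, exponent and top mantissa bits."""
    count = len(raw) // 4
    step = max(1, count // samples)
    picked = list(range(0, count, step))[:samples]
    return b"".join(raw[4 * i + 2 : 4 * i + 4] for i in picked)


def bit_distance(fp_a, fp_b):
    """Hamming distance between two equal-length fingerprints."""
    return sum(bin(x ^ y).count("1") for x, y in zip(fp_a, fp_b))


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToyTensorStore:
    def __init__(self, max_distance=32):
        self.max_distance = max_distance  # assumed similarity threshold
        self.centers = []   # one (fingerprint, raw bytes) pair per cluster
        self.members = {}   # name -> (cluster index, compressed XOR delta)

    def put(self, name, values):
        raw = to_bytes(values)
        fp = fingerprint(raw)
        # Find the closest existing center with the same byte length.
        best_idx, best_dist = None, None
        for idx, (center_fp, center_raw) in enumerate(self.centers):
            if len(center_raw) != len(raw):
                continue
            dist = bit_distance(fp, center_fp)
            if best_dist is None or dist < best_dist:
                best_idx, best_dist = idx, dist
        # No sufficiently close cluster: this tensor becomes a new center.
        if best_idx is None or best_dist > self.max_distance:
            self.centers.append((fp, raw))
            best_idx = len(self.centers) - 1
        # Keep only the compressed difference from the center.
        center_raw = self.centers[best_idx][1]
        delta = zlib.compress(xor_bytes(raw, center_raw))
        self.members[name] = (best_idx, delta)
        return best_idx

    def get(self, name):
        """Rebuild a tensor from its cluster center and stored delta."""
        idx, delta = self.members[name]
        center_raw = self.centers[idx][1]
        return from_bytes(xor_bytes(center_raw, zlib.decompress(delta)))


base = [i * 0.001 for i in range(1000)]
tuned = list(base)
tuned[10] += 0.5                      # a near-copy of base
store = ToyTensorStore()
store.put("model_a.weight", base)     # becomes the center of cluster 0
store.put("model_b.weight", tuned)    # joins cluster 0 as a small delta
```

Because the XOR of two nearly identical byte strings is mostly zeros, the compressed delta for the near-copy is far smaller than the raw tensor, and `get` reconstructs the float32 values exactly.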

The result — storage footprint shrinks ~70%

[Figure: storage bar on a 0% to 100% scale, with ~30% used and ~70% saved.]

TStore achieves a ~70% storage reduction, roughly 37% better than the prior state of the art, by storing one full center tensor per cluster and only a small delta residual for every other tensor.
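To make the arithmetic behind a reduction figure concrete, the sketch below computes the fraction saved when full centers plus compressed deltas replace the original tensors. The function name and all sizes are illustrative assumptions picked so the numbers are easy to follow; they are not measurements from TStore.

```python
def storage_reduction(original_bytes, center_bytes, delta_bytes):
    """Fraction of space saved versus storing every tensor in full.

    original_bytes: sizes of all tensors before compression
    center_bytes:   sizes of the full center tensors kept on disk
    delta_bytes:    sizes of the compressed residuals kept on disk
    """
    stored = sum(center_bytes) + sum(delta_bytes)
    return 1.0 - stored / sum(original_bytes)


# Hypothetical example: ten 100 MB tensors in two clusters.
originals = [100] * 10        # 1000 MB if stored naively
centers = [100] * 2           # two full centers: 200 MB
deltas = [12.5] * 8           # eight residuals: 100 MB
saved = storage_reduction(originals, centers, deltas)
print(f"{saved:.0%}")         # prints "70%"
```

The savings grow as clusters get larger and deltas get smaller, since the fixed cost of each full center is spread over more members.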

Legend: cluster-center tensor (full); delta (compressed residual); TensorSketch fingerprint; saved space.