
TStore — tensor-centric compression

Treating tensors, not models, as the unit of storage — clustering similar tensors then delta-compressing against each cluster center.

The idea — cluster similar tensors, then delta-compress

[Diagram: tensors from Model A (existing) and Model B (new) are fingerprinted (◆) and assigned to Cluster A or Cluster B. On disk, each cluster holds one center tensor (ctr) plus one delta (△) per member.]

Full center tensors stored once per cluster; members stored as tiny deltas.

Each incoming model is decomposed into tensors. A cheap bit-level TensorSketch fingerprint predicts which existing cluster each tensor is most similar to. The tensor is then delta-compressed against its cluster center and stored as a small residual — only the difference is kept on disk.
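The fingerprint, assign, and delta steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TStore's implementation: the `fingerprint` function here is a toy stand-in (the top byte of each float32) rather than the real TensorSketch, the residual is a plain byte-wise XOR compressed with `zlib`, and every function name is hypothetical.

```python
import random
import struct
import zlib


def pack_f32(values):
    """Serialize a list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def fingerprint(raw):
    """Toy bit-level fingerprint: the top byte (sign + high exponent bits)
    of every float32. A stand-in, not the real TensorSketch."""
    return raw[3::4]


def hamming(a, b):
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def nearest_cluster(raw, centers):
    """Return the key of the center whose fingerprint is closest."""
    fp = fingerprint(raw)
    return min(centers, key=lambda k: hamming(fp, fingerprint(centers[k])))


def delta_encode(raw, center):
    """XOR against the center, then compress the mostly-zero residual."""
    residual = bytes(x ^ y for x, y in zip(raw, center))
    return zlib.compress(residual, 9)


def delta_decode(blob, center):
    """Invert delta_encode: decompress, then XOR with the same center."""
    residual = zlib.decompress(blob)
    return bytes(x ^ y for x, y in zip(residual, center))


# Synthetic data (assumption): two unrelated centers of equal shape.
rng = random.Random(0)
base_a = [rng.gauss(0.0, 0.02) for _ in range(2000)]
base_b = [rng.gauss(0.0, 0.02) for _ in range(2000)]
centers = {"A": pack_f32(base_a), "B": pack_f32(base_b)}

# A "new" tensor that is a slightly perturbed copy of center A.
new_tensor = pack_f32([v * (1.0 + 1e-6 * rng.gauss(0.0, 1.0)) for v in base_a])

cluster = nearest_cluster(new_tensor, centers)
blob = delta_encode(new_tensor, centers[cluster])
assert delta_decode(blob, centers[cluster]) == new_tensor  # lossless round trip
print(cluster, len(new_tensor), len(blob))
```

Because the new tensor differs from its center only in the low mantissa bits, most residual bytes are zero and compress well. Only the fingerprints are compared during assignment, so the full tensor is read once, for the final delta.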

The result — storage footprint shrinks ~70%

[Chart: storage used on a 0% to 100% scale. About 30% used, 70% saved.]

TStore achieves a ~70% storage reduction — roughly 37% better than the prior state-of-the-art — by storing one full center tensor per cluster and only tiny delta residuals for all other tensors.
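To see how one full center plus small deltas adds up to a large saving, the arithmetic can be written out as a short helper. This is a back-of-the-envelope sketch: it assumes all tensors in a cluster have the same size, and both `members_per_cluster` and `delta_ratio` are hypothetical parameters, not figures reported for TStore.

```python
def stored_fraction(members_per_cluster, delta_ratio):
    """Fraction of the original bytes kept on disk when one tensor per
    cluster is stored in full and every other member is stored as a delta
    whose size is delta_ratio times the size of a full tensor."""
    if members_per_cluster < 1:
        raise ValueError("a cluster needs at least one tensor")
    full_center = 1.0
    deltas = (members_per_cluster - 1) * delta_ratio
    return (full_center + deltas) / members_per_cluster


# Illustrative inputs only: 10 tensors per cluster, deltas at 20% of full size.
fraction = stored_fraction(10, 0.2)
print(f"stored: {fraction:.0%}, saved: {1 - fraction:.0%}")
```

With these made-up inputs the footprint is (1 + 9 × 0.2) / 10 = 0.28 of the original. The saving grows as clusters get larger or deltas get smaller, and disappears for a cluster of one.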

Legend: ctr = cluster-center tensor (full); △ = delta (compressed residual); ◆ = TensorSketch fingerprint; remaining area = saved space.