TStore treats tensors, not whole models, as the unit of storage: it clusters similar tensors and delta-compresses each one against its cluster center.
Each incoming model is decomposed into tensors. A cheap bit-level TensorSketch fingerprint predicts which existing cluster each tensor is most similar to. The tensor is then delta-compressed against its cluster center and stored as a small residual — only the difference is kept on disk.
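The decompose, fingerprint, assign, and delta-compress steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TStore's implementation: the `fingerprint` function is a hypothetical stand-in for TensorSketch (it samples the high-order bytes of float32 values), the XOR-plus-zlib delta is one possible lossless residual encoding, and `TensorStore`, `max_distance`, and all other names are invented for this example.

```python
import struct
import zlib


def to_bytes(values):
    """Serialize a flat list of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def fingerprint(raw, n_samples=16):
    """Hypothetical bit-level fingerprint (an assumption, not TensorSketch).

    Keeps the two high-order bytes (sign, exponent, top mantissa bits)
    of evenly spaced float32 values.
    """
    n = len(raw) // 4
    step = max(1, n // n_samples)
    return b"".join(raw[4 * i + 2 : 4 * i + 4] for i in range(0, n, step))


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def xor_bytes(a, b):
    """Bytewise XOR; applying it twice with the same operand is the identity."""
    return bytes(x ^ y for x, y in zip(a, b))


class TensorStore:
    """Toy store: one full center per cluster, compressed XOR deltas otherwise."""

    def __init__(self, max_distance=8):
        self.centers = []   # list of (fingerprint, raw bytes)
        self.entries = {}   # tensor name -> (center index, compressed delta or None)
        self.max_distance = max_distance  # assumed similarity threshold in bits

    def put(self, name, values):
        raw = to_bytes(values)
        fp = fingerprint(raw)
        best, best_d = None, None
        for idx, (center_fp, center_raw) in enumerate(self.centers):
            if len(center_raw) != len(raw):
                continue  # only same-sized tensors can share a cluster here
            d = hamming(fp, center_fp)
            if best_d is None or d < best_d:
                best, best_d = idx, d
        if best is None or best_d > self.max_distance:
            # No similar cluster: this tensor becomes a new center, stored in full.
            self.centers.append((fp, raw))
            self.entries[name] = (len(self.centers) - 1, None)
        else:
            # Similar cluster found: keep only the compressed residual.
            delta = xor_bytes(raw, self.centers[best][1])
            self.entries[name] = (best, zlib.compress(delta))

    def get(self, name):
        idx, blob = self.entries[name]
        center_raw = self.centers[idx][1]
        raw = center_raw if blob is None else xor_bytes(zlib.decompress(blob), center_raw)
        return list(struct.unpack(f"<{len(raw) // 4}f", raw))


if __name__ == "__main__":
    base = [i * 0.01 for i in range(1024)]
    tuned = list(base)
    tuned[5] += 0.001  # a "fine-tuned" copy that differs in one element
    store = TensorStore()
    store.put("base/layer0.weight", base)
    store.put("tuned/layer0.weight", tuned)
    print("centers:", len(store.centers))
    print("delta bytes:", len(store.entries["tuned/layer0.weight"][1]))
```

Because the residual is a bytewise XOR, reconstruction is exact at float32 precision, and a tensor that differs from its center in only a few positions yields a delta that is mostly zero bytes and compresses well. A real system would need a fingerprint that is robust to changes spread across the whole tensor, which this sampled fingerprint is not.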
TStore reduces storage by about 70%, roughly 37% better than the prior state of the art, by storing one full center tensor per cluster and only a small delta residual for every other tensor.