Predictive Coding For Lossless Dataset Compression

Madeleine Barowsky, Alexander Mariona, Flavio P. Calmon

Length: 00:05:54
08 Jun 2021

Lossless compression of datasets is a problem of significant theoretical and practical interest. It arises naturally when storing, sending, or archiving large collections of information for scientific research. The encoding bitrate can be greatly improved if the compressed dataset is allowed to decompress to a permutation of the original data, since the order of the items then no longer has to be encoded. We prove that dataset compression is equivalent to compressing a permutation-invariant structure of the data, and we implement such a scheme via predictive coding. We benchmark our compression procedure against state-of-the-art compression utilities on the popular machine-learning datasets MNIST and CIFAR-10 and outperform them for multiple parameter sets.
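The idea of decoding to a permutation can be illustrated with a minimal sketch. This is not the authors' scheme: it assumes the dataset items are plain integers (hypothetical stand-ins for images), uses sorting as the permutation-invariant form, and uses "previous item" as the simplest possible predictor. The function names order_bits, encode, and decode are made up for this illustration.

```python
import math
import random


def order_bits(n):
    """Bits needed to name one ordering of n distinct items: log2(n!)."""
    return math.lgamma(n + 1) / math.log(2)


def encode(dataset):
    """Sort the items, then store each one as its residual from the previous item."""
    residuals = []
    previous = 0  # assumed starting prediction
    for value in sorted(dataset):
        residuals.append(value - previous)
        previous = value
    return residuals


def decode(residuals):
    """Rebuild the sorted items by accumulating the residuals."""
    items = []
    previous = 0
    for residual in residuals:
        previous += residual
        items.append(previous)
    return items


random.seed(0)
data = [random.randrange(1 << 16) for _ in range(1000)]
restored = decode(encode(data))

# The decoder returns a permutation of the input, not the original order.
assert sorted(restored) == sorted(data)
print(f"bits that would be spent on order alone for 1000 distinct items: {order_bits(1000):.0f}")
```

After sorting, neighbouring items are close together, so the residuals are small non-negative numbers that an entropy coder can typically represent in fewer bits than the raw values. The order_bits helper shows how much information the ordering of n distinct items carries, which is what a permutation-tolerant decoder is free to discard.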

Chairs:
Chaker Larabi