Argonne National Laboratory

Improving Floating Point Compression through Binary Masks

Title: Improving Floating Point Compression through Binary Masks
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Gomez, L. A. Bautista; Cappello, F.
Conference Name: IEEE BigData 2013
Conference Location: Santa Barbara, California
Other Numbers: ANL/MCS-P5009-0813

Modern scientific instruments such as particle accelerators, telescopes, and supercomputers produce extremely large amounts of data. This data must be processed on systems with high computational capabilities, such as supercomputers. Because scientific data is growing in size at an exponential rate, storing and accessing it is becoming expensive in both time and space. Most of this data is stored in floating point representation. Scientific applications running on supercomputers spend a large number of CPU cycles reading and writing floating point values, which makes data compression an attractive way to increase computing efficiency. Given the accuracy requirements of scientific computing, we focus only on lossless data compression. In this paper we propose a masking technique that partially decreases the entropy of scientific datasets, allowing for a better compression ratio and higher throughput. We evaluate several data partitioning techniques for selective compression and compare these schemes with several existing compression strategies. Our approach shows up to 15% improvement in compression ratio while reducing compression time, in some cases to only half of the original.
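
The abstract does not specify how the binary masks are built, so the following is only a minimal sketch of the general idea of lossless masking before compression, not the method proposed in the paper. It assumes one well-known mask, XOR of each 64-bit pattern with that of the preceding value, which tends to zero out the sign, exponent, and leading mantissa bits when neighbouring values are close. The function names (`xor_mask`, `xor_unmask`, `compress_masked`, `decompress_masked`) and the choice of zlib as the back-end compressor are assumptions made for illustration.

```python
import struct
import zlib


def xor_mask(values):
    """Return 64-bit integers: each double's bit pattern XORed with its predecessor's.

    The first value is kept unmasked so the sequence can be rebuilt exactly.
    """
    bits = [struct.unpack("<Q", struct.pack("<d", v))[0] for v in values]
    if not bits:
        return []
    return [bits[0]] + [cur ^ prev for prev, cur in zip(bits, bits[1:])]


def xor_unmask(masked):
    """Invert xor_mask: a running XOR restores the original bit patterns."""
    values = []
    prev = 0
    for m in masked:
        prev ^= m  # prev now holds the original bit pattern of this element
        values.append(struct.unpack("<d", struct.pack("<Q", prev))[0])
    return values


def compress_masked(values):
    """Mask the doubles, serialize them big-endian, and compress with zlib."""
    masked = xor_mask(values)
    raw = struct.pack(f">{len(masked)}Q", *masked)
    return zlib.compress(raw)


def decompress_masked(blob):
    """Decompress, deserialize, and unmask to recover the original doubles."""
    raw = zlib.decompress(blob)
    masked = list(struct.unpack(f">{len(raw) // 8}Q", raw))
    return xor_unmask(masked)
```

Because XOR is its own inverse, the round trip is bit-exact, which matches the lossless requirement stated in the abstract. Whether such a mask actually improves compression ratio or throughput depends on the dataset and the compressor, and would need to be measured.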