Improving Floating Point Compression through Binary Masks
|Title||Improving Floating Point Compression through Binary Masks|
|Publication Type||Conference Paper|
|Year of Publication||2013|
|Authors||Gomez, LA Bautista, Cappello, F|
|Conference Name||IEEE BigData 2013|
|Conference Location||Santa Barbara, California|
Modern scientific technology, such as particle accelerators, telescopes, and supercomputers, is producing extremely large amounts of data. This data must be processed on systems with high computational capability, such as supercomputers. Because scientific data is growing in size at an exponential rate, storing and accessing it is becoming expensive in both time and space. Most of this data is stored in floating point representation. Scientific applications running on supercomputers spend a large number of CPU cycles reading and writing floating point values, which makes data compression an attractive way to increase computing efficiency. Given the accuracy requirements of scientific computing, we focus only on lossless data compression. In this paper we propose a masking technique that partially decreases the entropy of scientific datasets, allowing for a better compression ratio and higher throughput. We evaluate several data partitioning techniques for selective compression and compare these schemes with several existing compression strategies. Our approach shows up to a 15% improvement in compression ratio while reducing compression time, in some cases to half of the original.
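The abstract does not say how the binary masks are built, so the following is only a minimal sketch of the general idea of lossless masking before compression, not the paper's method. It assumes a simple stand-in mask: each 64-bit floating point pattern is XORed with the previous value, which zeroes the sign and exponent bits whenever neighbouring values share them. The function names (`xor_mask`, `unmask`, `compressed_size`) and the use of zlib as the back-end compressor are illustrative assumptions.

```python
import struct
import zlib


def xor_mask(values):
    """Return the 64-bit patterns of `values`, each XORed with its predecessor.

    Assumption: the mask is simply the previous value's bit pattern
    (the first value is XORed with 0). This is a stand-in, not the
    mask construction used in the paper.
    """
    n = len(values)
    bits = struct.unpack(f"<{n}Q", struct.pack(f"<{n}d", *values))
    masked, prev = [], 0
    for b in bits:
        masked.append(b ^ prev)
        prev = b
    return masked


def unmask(masked):
    """Invert xor_mask exactly, recovering the original doubles bit for bit."""
    bits, prev = [], 0
    for m in masked:
        prev ^= m  # XOR is its own inverse, so this restores the original pattern
        bits.append(prev)
    n = len(bits)
    return list(struct.unpack(f"<{n}d", struct.pack(f"<{n}Q", *bits)))


def compressed_size(words):
    """Size in bytes of the zlib-compressed little-endian 64-bit words."""
    return len(zlib.compress(struct.pack(f"<{len(words)}Q", *words), 6))


if __name__ == "__main__":
    # Hypothetical smooth dataset: all values lie in [1, 2), so they share
    # the same sign and exponent bits.
    data = [1.0 + i * 1e-3 for i in range(1000)]
    raw = list(struct.unpack(f"<{len(data)}Q", struct.pack(f"<{len(data)}d", *data)))
    masked = xor_mask(data)
    assert unmask(masked) == data  # the transform is lossless
    print("raw bytes after zlib:   ", compressed_size(raw))
    print("masked bytes after zlib:", compressed_size(masked))
```

Whether masking actually helps depends on the dataset and the compressor, so the script prints both sizes rather than assuming an outcome. The only property guaranteed by construction is that the transform is reversible, which is what lossless compression requires.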