Unix Data Compression Shootout
Fri 23 May 2025 by R.L. Dane
I wanted to try a new-to-me compressor, lz4, but it turned into a full ADHD-fueled file compression shoot-out.
Dang, lz4 is crazy fast!
Data/setup
The corpus is a 2.29 GiB uncompressed tar file consisting of several years' worth of GPS data in various plain-text formats.
The computer is a ThinkPad X260 with the CPU governor set to performance. The CPU is an Intel i5-6200U.
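Every run in the raw output at the end has the same shape: feed the tarball to a compressor on stdin and count the bytes that come out. A minimal sketch of that pattern as a reusable function follows; the name `bench` is hypothetical (not something from my session), and it assumes bash so that the `time` keyword can wrap a function.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustrative name): run one compressor over a file
# and print the compressed size in bytes. Nothing is written to disk.
bench() {
    local corpus=$1
    shift
    # "$@" is the compressor command plus its flags, e.g. gzip -9
    "$@" < "$corpus" | wc -c | tr -d ' '
}

# Example use, assuming corpus.tar exists and the tools are installed:
#   for c in "gzip" "zstd -1" "lz4"; do
#       printf '%s: ' "$c"
#       time bench corpus.tar $c    # $c left unquoted so the flags split
#   done
```

Piping into `wc -c` keeps the disk out of the measurement on the output side, so only the read of the corpus and the compression itself are timed.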
Outcome
Chart: (grouped by compressor)
command/compressor  time (user, s)  size (bytes)  ratio
none/cat                     0.077    2462955520
gzip                        57.283     338289587   7.28
gzip -1                     22.682     400956710   6.14
gzip -9                    113.047     325547190   7.57
bzip2                      319.847     262857414   9.37
bzip2 -1                   255.654     278217711   8.85
bzip2 -9                   326.718     262857414   9.37
bzip3                      205.822     231173201  10.65
zstd                        12.520     321229917   7.67
zstd -1                      8.812     317234226   7.76
zstd -9                     63.019     282940675   8.70
zstd -11                   101.278     281894351   8.74
zstd --ultra -22          7317.944     230075751  10.70
xz                        1476.153     228082956  10.80
xz -1                      201.569     290137816   8.49
xz -9e                    4683.144     212748984  11.58
lz4                          5.744     549838913   4.48
lz4 -1                       5.762     549838913   4.48
lz4 -9                      74.670     434543206   5.67
Sorted by size: (descending)
command/compressor  time (user, s)  size (bytes)  ratio
none/cat                     0.077    2462955520
lz4                          5.744     549838913   4.48
lz4 -1                       5.762     549838913   4.48
lz4 -9                      74.670     434543206   5.67
gzip -1                     22.682     400956710   6.14
gzip                        57.283     338289587   7.28
gzip -9                    113.047     325547190   7.57
zstd                        12.520     321229917   7.67
zstd -1                      8.812     317234226   7.76
xz -1                      201.569     290137816   8.49
zstd -9                     63.019     282940675   8.70
zstd -11                   101.278     281894351   8.74
bzip2 -1                   255.654     278217711   8.85
bzip2                      319.847     262857414   9.37
bzip2 -9                   326.718     262857414   9.37
bzip3                      205.822     231173201  10.65
zstd --ultra -22          7317.944     230075751  10.70
xz                        1476.153     228082956  10.80
xz -9e                    4683.144     212748984  11.58
Sorted by time: (ascending)
command/compressor  time (user, s)  size (bytes)  ratio
none/cat                     0.077    2462955520
lz4                          5.744     549838913   4.48
lz4 -1                       5.762     549838913   4.48
zstd -1                      8.812     317234226   7.76
zstd                        12.520     321229917   7.67
gzip -1                     22.682     400956710   6.14
gzip                        57.283     338289587   7.28
zstd -9                     63.019     282940675   8.70
lz4 -9                      74.670     434543206   5.67
zstd -11                   101.278     281894351   8.74
gzip -9                    113.047     325547190   7.57
xz -1                      201.569     290137816   8.49
bzip3                      205.822     231173201  10.65
bzip2 -1                   255.654     278217711   8.85
bzip2                      319.847     262857414   9.37
bzip2 -9                   326.718     262857414   9.37
xz                        1476.153     228082956  10.80
xz -9e                    4683.144     212748984  11.58
zstd --ultra -22          7317.944     230075751  10.70
Chart: (compression ratio / time score)
command/compressor  time (user, s)  size (bytes)  ratio  ratio/time
zstd --ultra -22          7317.944     230075751  10.70      0.0015
xz -9e                    4683.144     212748984  11.58      0.0025
xz                        1476.153     228082956  10.80      0.0073
bzip2 -9                   326.718     262857414   9.37      0.0287
bzip2                      319.847     262857414   9.37      0.0293
bzip2 -1                   255.654     278217711   8.85      0.0346
xz -1                      201.569     290137816   8.49      0.0421
bzip3                      205.822     231173201  10.65      0.0518
gzip -9                    113.047     325547190   7.57      0.0669
lz4 -9                      74.670     434543206   5.67      0.0759
zstd -11                   101.278     281894351   8.74      0.0863
gzip                        57.283     338289587   7.28      0.1271
zstd -9                     63.019     282940675   8.70      0.1381
gzip -1                     22.682     400956710   6.14      0.2708
zstd                        12.520     321229917   7.67      0.6124
lz4 -1                       5.762     549838913   4.48      0.7774
lz4                          5.744     549838913   4.48      0.7798
zstd -1                      8.812     317234226   7.76      0.8811
none/cat                     0.077    2462955520   1.00     12.9870 (nonsensical)
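For anyone wanting to recompute the last two columns: the ratio is the original size divided by the compressed size, and ratio/time is that ratio divided by user seconds. The sketch below reproduces the gzip row with awk. The function names `ratio` and `score` are made up for illustration, and the use of the unrounded ratio for the score is an inference from the numbers (the bzip3 row only comes out as 0.0518 that way).

```shell
#!/usr/bin/env bash
# ratio: original bytes / compressed bytes, rounded to two decimals.
ratio() {
    awk -v o="$1" -v c="$2" 'BEGIN { printf "%.2f\n", o / c }'
}

# score: unrounded ratio / user seconds, rounded to four decimals.
score() {
    awk -v o="$1" -v c="$2" -v t="$3" 'BEGIN { printf "%.4f\n", (o / c) / t }'
}

ratio 2462955520 338289587           # gzip row, prints 7.28
score 2462955520 338289587 57.283    # gzip row, prints 0.1271
```

The score rewards speed heavily, which is why the slow, high-ratio settings sink to the top of this chart and `cat` gets its nonsensical win.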
Conclusion
lz4 is the fastest compressor (5.7 s of user time), but zstd -1 still kicks butt: 8.8 s for a 7.76 ratio against lz4's 4.48.
While it doesn't do well on the overall score (0.0518), bzip3 still provides excellent compression (ratio 10.65) in a reasonable amount of time (about 206 s).
Raw output
tmp $ lscpu |grep i5
Model name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
tmp $ time cat < corpus.tar |wc -c
2462955520
real 0m1.388s
user 0m0.077s
sys 0m1.497s
tmp $ time gzip < corpus.tar |wc -c
338289587
real 0m57.971s
user 0m57.283s
sys 0m0.633s
tmp $ time bzip2 < corpus.tar |wc -c
262857414
real 5m21.280s
user 5m19.847s
sys 0m1.192s
tmp $ time bzip3 < corpus.tar |wc -c
231173201
real 3m26.608s
user 3m25.822s
sys 0m0.712s
tmp $ time zstd < corpus.tar |wc -c
321229917
real 0m11.717s
user 0m12.520s
sys 0m1.278s
tmp $ time xz < corpus.tar |wc -c
228082956
real 6m15.579s
user 24m36.153s
sys 0m1.481s
tmp $ time lz4 < corpus.tar |wc -c
549838913
real 0m2.190s
user 0m5.744s
sys 0m0.833s
tmp $ time lz4 -9 < corpus.tar |wc -c
434543206
real 0m25.151s
user 1m14.670s
sys 0m0.869s
tmp $ time zstd -9 < corpus.tar |wc -c
282940675
real 1m2.564s
user 1m3.019s
sys 0m1.351s
tmp $ time zstd -11 < corpus.tar |wc -c
281894351
real 1m40.556s
user 1m41.278s
sys 0m1.292s
tmp $ time zstd --ultra -22 < corpus.tar |wc -c
230075751
real 122m1.384s
user 121m57.944s
sys 0m2.642s
tmp $ time xz -9e < corpus.tar |wc -c
212748984
real 78m3.870s
user 78m3.144s
sys 0m1.345s
tmp $
tmp $ time xz -1 < corpus.tar |wc -c
290137816
real 0m50.878s
user 3m21.569s
sys 0m1.083s
tmp $ time zstd -1 < corpus.tar |wc -c
317234226
real 0m8.282s
user 0m8.812s
sys 0m1.162s
tmp $ time gzip -1 < corpus.tar |wc -c
400956710
real 0m23.496s
user 0m22.682s
sys 0m0.721s
tmp $ time gzip -9 < corpus.tar |wc -c
325547190
real 1m55.453s
user 1m53.047s
sys 0m0.730s
tmp $ time bzip2 -1 < corpus.tar |wc -c
278217711
real 4m16.753s
user 4m15.654s
sys 0m1.376s
tmp $ time bzip2 -9 < corpus.tar |wc -c
262857414
real 5m27.726s
user 5m26.718s
sys 0m1.157s
tmp $ time lz4 -1 < corpus.tar |wc -c
549838913
real 0m2.212s
user 0m5.762s
sys 0m0.832s
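The charts list user time in plain seconds, while the shell prints it as minutes and seconds (24m36.153s for xz). A small sketch of that conversion, assuming the `NmS.SSSs` format shown in the output above; `to_seconds` is a hypothetical name, not a tool I used.

```shell
#!/usr/bin/env bash
# Convert a bash "time" value such as 24m36.153s into seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, p, "m")      # p[1] = minutes, p[2] = seconds plus trailing "s"
        sub(/s$/, "", p[2])   # drop the unit suffix
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

to_seconds 24m36.153s     # xz user time, prints 1476.153
to_seconds 121m57.944s    # zstd --ultra -22 user time, prints 7317.944
```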