[clang] [llvm] [HIP] change compress level (PR #83605)
Yaxun Liu via cfe-commits
cfe-commits at lists.llvm.org
Sat Mar 2 22:13:01 PST 2024
================
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const llvm::MemoryBuffer &Input,
Input.getBuffer().size());
llvm::compression::Format CompressionFormat;
+ int Level;
- if (llvm::compression::zstd::isAvailable())
+ if (llvm::compression::zstd::isAvailable()) {
CompressionFormat = llvm::compression::Format::Zstd;
- else if (llvm::compression::zlib::isAvailable())
+ // Use a high zstd compress level by default for better size reduction.
+ const int DefaultZstdLevel = 20;
----------------
yxsamliu wrote:
Level 20 is a sweet spot for both compression ratio and compression time, as shown in the following table for the Blender 4.1 bundled bitcode for 6 GPU architectures:
| Zstd Level | Size Before (bytes) | Size After (bytes) | Compression Ratio (before/after) | Compression Time (s) | Decompression Time (s) |
|------------|---------------------|--------------------|------------------|----------------------|------------------------|
| 6 | 68,459,756 | 32,612,291 | 2.10 | 0.8891 | 0.1809 |
| 9 | 68,459,756 | 31,445,373 | 2.18 | 1.4200 | 0.1742 |
| 15 | 68,459,756 | 28,063,493 | 2.44 | 9.7994 | 0.1712 |
| 18 | 68,459,756 | 24,952,891 | 2.74 | 11.4201 | 0.1796 |
| 19 | 68,459,756 | 24,690,733 | 2.77 | 13.4060 | 0.1820 |
| 20 | 68,459,756 | 4,394,993 | 15.58 | 2.0946 | 0.1320 |
| 21 | 68,459,756 | 4,394,399 | 15.59 | 2.1500 | 0.1318 |
| 22 | 68,459,756 | 4,394,429 | 15.59 | 2.6635 | 0.1309 |
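The ratio column can be reproduced from the byte counts in the table. The following is a minimal sketch; the helper name `compressionRatio` and the constant names are illustrative and are not part of the patch.

```cpp
#include <cstdint>

// Compression ratio as used in the table: original size / compressed size.
// (Hypothetical helper, for illustration only.)
constexpr double compressionRatio(std::uint64_t Before, std::uint64_t After) {
  return static_cast<double>(Before) / static_cast<double>(After);
}

// Byte counts copied from the table rows for levels 19 and 20.
constexpr std::uint64_t SizeBefore = 68459756;
constexpr std::uint64_t SizeLevel19 = 24690733;
constexpr std::uint64_t SizeLevel20 = 4394993;

// How much smaller the level-20 output is relative to the level-19 output.
constexpr double Level19To20 = compressionRatio(SizeLevel19, SizeLevel20);
```

With these inputs, `compressionRatio(SizeBefore, SizeLevel20)` comes out at about 15.58, and `Level19To20` at about 5.6, which is the jump between the two rows.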
Level 20 and level 19 differ in several compression parameters (https://github.com/facebook/zstd/blob/a58b48ef0e543980888a4d9d16c9072ff22135ca/lib/compress/clevels.h#L48 ); the meaning of these parameters is defined at https://github.com/facebook/zstd/blob/a58b48ef0e543980888a4d9d16c9072ff22135ca/lib/zstd.h#L1299. It seems that either the largest match distance or the fully searched segment makes the difference.
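One way to make the "largest match distance" hypothesis concrete is to compare the zstd window size (2^windowLog bytes, the maximum back-reference distance) with the size of one per-arch binary. The sketch below rests on two assumptions that should be checked against the linked sources: that the six binaries are roughly equal in size, and that the windowLog values for levels 19 and 20 are 23 and 25 as I read them from the default table in clevels.h. The helper names are hypothetical.

```cpp
#include <cstdint>

// zstd's windowLog is the base-2 log of the maximum back-reference distance.
constexpr std::uint64_t windowBytes(unsigned WindowLog) {
  return std::uint64_t{1} << WindowLog;
}

// True if a match can reach back one full segment, i.e. to the same offset
// in the previous arch's binary. (Hypothetical helper, illustration only.)
constexpr bool canReachPreviousSegment(unsigned WindowLog,
                                       std::uint64_t SegmentBytes) {
  return windowBytes(WindowLog) >= SegmentBytes;
}

// ASSUMPTION: the 68,459,756-byte bundle holds six equally sized binaries.
constexpr std::uint64_t AssumedSegmentBytes = 68459756 / 6;

// ASSUMPTION: windowLog values as read from the linked clevels.h table;
// verify against the source before relying on them.
constexpr unsigned AssumedWindowLogLevel19 = 23; // 8 MiB window
constexpr unsigned AssumedWindowLogLevel20 = 25; // 32 MiB window
```

Under these assumptions, `canReachPreviousSegment(AssumedWindowLogLevel19, AssumedSegmentBytes)` is false and the level-20 equivalent is true, which would be consistent with the jump in the table. It does not rule out the other parameters as contributing factors.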
clang-offload-bundler simply concatenates the binaries for the different GPU architectures. Parallelization does not help much unless zstd compression itself can be parallelized.
https://github.com/llvm/llvm-project/pull/83605