[clang] [llvm] [HIP] change compress level (PR #83605)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Sat Mar 2 22:13:01 PST 2024


================
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const llvm::MemoryBuffer &Input,
       Input.getBuffer().size());
 
   llvm::compression::Format CompressionFormat;
+  int Level;
 
-  if (llvm::compression::zstd::isAvailable())
+  if (llvm::compression::zstd::isAvailable()) {
     CompressionFormat = llvm::compression::Format::Zstd;
-  else if (llvm::compression::zlib::isAvailable())
+    // Use a high zstd compress level by default for better size reduction.
+    const int DefaultZstdLevel = 20;
----------------
yxsamliu wrote:

Level 20 is a sweet spot for both compression ratio and compression time, as shown in the following table for Blender 4.1 bundled bitcode for 6 GPU architectures:

| Zstd Level | Size Before (bytes) | Size After (bytes) | Compression Ratio | Compression Time (s) | Decompression Time (s) |
|------------|---------------------|--------------------|-------------------|----------------------|------------------------|
| 6          | 68,459,756          | 32,612,291         | 2.10              | 0.8891               | 0.1809                 |
| 9          | 68,459,756          | 31,445,373         | 2.18              | 1.4200               | 0.1742                 |
| 15         | 68,459,756          | 28,063,493         | 2.44              | 9.7994               | 0.1712                 |
| 18         | 68,459,756          | 24,952,891         | 2.74              | 11.4201              | 0.1796                 |
| 19         | 68,459,756          | 24,690,733         | 2.77              | 13.4060              | 0.1820                 |
| 20         | 68,459,756          | 4,394,993          | 15.58             | 2.0946               | 0.1320                 |
| 21         | 68,459,756          | 4,394,399          | 15.59             | 2.1500               | 0.1318                 |
| 22         | 68,459,756          | 4,394,429          | 15.59             | 2.6635               | 0.1309                 |

Levels 19 and 20 differ in several compression parameters (https://github.com/facebook/zstd/blob/a58b48ef0e543980888a4d9d16c9072ff22135ca/lib/compress/clevels.h#L48 ); the meaning of each parameter is documented at https://github.com/facebook/zstd/blob/a58b48ef0e543980888a4d9d16c9072ff22135ca/lib/zstd.h#L1299 . It seems that either the largest match distance (`windowLog`) or the fully searched segment (`chainLog`) makes the difference.
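
One way to sanity-check the match-distance hypothesis is to compare the zstd window size (2^windowLog bytes) with the distance between similar code in neighbouring per-architecture sections of the bundle. The sketch below is illustrative only. The `windowLog` values 23 and 25 are my reading of the linked clevels.h rows for levels 19 and 20 and should be checked against the source. Treating the six sections as equal in size is a simplification, and `windowBytes` and `reachable` are hypothetical helper names, not zstd or LLVM APIs.

```cpp
#include <cstdint>

// Window size in bytes for a given zstd windowLog (window = 2^windowLog).
constexpr std::uint64_t windowBytes(unsigned windowLog) {
  return std::uint64_t{1} << windowLog;
}

// True if data `distance` bytes back still lies inside the match window.
constexpr bool reachable(std::uint64_t distance, unsigned windowLog) {
  return distance <= windowBytes(windowLog);
}

// Uncompressed bundle size, taken from the table above.
constexpr std::uint64_t BundleBytes = 68459756;
// Assumption: the 6 per-architecture sections are roughly equal in size,
// so similar code in adjacent sections is about one section apart.
constexpr std::uint64_t PerArchBytes = BundleBytes / 6;

// Assumed windowLog values (to be verified against clevels.h).
constexpr unsigned AssumedWindowLogLevel19 = 23; // 8 MiB window
constexpr unsigned AssumedWindowLogLevel20 = 25; // 32 MiB window

static_assert(!reachable(PerArchBytes, AssumedWindowLogLevel19),
              "previous section is outside an 8 MiB window");
static_assert(reachable(PerArchBytes, AssumedWindowLogLevel20),
              "previous section is inside a 32 MiB window");
```

Under these assumptions, an 8 MiB window cannot reach back to the corresponding code in the previous architecture's section, while a 32 MiB window can. That would be consistent with the jump in the table between levels 19 and 20, but it does not by itself show that `windowLog` is the cause.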

clang-offload-bundler just concatenates the binaries for the different GPU architectures. Parallelizing the bundler itself would not help much unless the zstd compression step can be parallelized.


https://github.com/llvm/llvm-project/pull/83605

