[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 7 09:16:04 PST 2024


================
@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer &FirstInput,
 }
 
 OffloadBundlerConfig::OffloadBundlerConfig() {
+  if (llvm::compression::zstd::isAvailable()) {
+    CompressionFormat = llvm::compression::Format::Zstd;
+    // Use a high zstd compress level by default for better size reduction.
----------------
yxsamliu wrote:

> Also, I've just discovered that zstd already has `--long` option: https://github.com/facebook/zstd/blob/b293d2ebc3a5d29309390a70b3e7861b6f5133ec/lib/zstd.h#L394
> 
> ```
>     ZSTD_c_enableLongDistanceMatching=160, /* Enable long distance matching.
>                                      * This parameter is designed to improve compression ratio
>                                      * for large inputs, by finding large matches at long distance.
>                                      * It increases memory usage and window size.
>                                      * Note: enabling this parameter increases default ZSTD_c_windowLog to 128 MB
>                                      * except when expressly set to a different value.
>                                      * Note: will be enabled by default if ZSTD_c_windowLog >= 128 MB and
>                                      * compression strategy >= ZSTD_btopt (== compression level 16+) */
> ```
> 
> This sounds like something we could use here.

Thanks, this option looks promising. Here are some benchmark results for a fat binary containing 13 code objects, each about 2.7 MB.

The following data are without `--long`. Each row lists the compression level, original size -> compressed size (compression ratio), compression speed, and decompression speed.
```
$ zstd -b1 -e22 -f --ultra tmp.o
 1#tmp.o             :  34864866 ->   9169246 (3.802), 657.0 MB/s ,1691.0 MB/s 
 2#tmp.o             :  34864866 ->   7352667 (4.742), 626.3 MB/s ,1903.8 MB/s 
 3#tmp.o             :  34864866 ->   6885718 (5.063), 488.1 MB/s ,1900.2 MB/s 
 4#tmp.o             :  34864866 ->   6700508 (5.203), 416.7 MB/s ,1897.2 MB/s 
 5#tmp.o             :  34864866 ->   6405252 (5.443), 236.4 MB/s ,1918.8 MB/s 
 6#tmp.o             :  34864866 ->   6336706 (5.502), 211.8 MB/s ,1941.4 MB/s 
 7#tmp.o             :  34864866 ->   6170409 (5.650), 153.5 MB/s ,2032.5 MB/s 
 8#tmp.o             :  34864866 ->   6121226 (5.696), 131.1 MB/s ,2071.5 MB/s 
 9#tmp.o             :  34864866 ->   6098948 (5.717), 124.9 MB/s ,2080.4 MB/s 
10#tmp.o             :  34864866 ->   2555599 (13.64), 179.4 MB/s ,3504.2 MB/s 
11#tmp.o             :  34864866 ->   2545375 (13.70), 119.4 MB/s ,3516.8 MB/s 
12#tmp.o             :  34864866 ->   2542711 (13.71), 107.2 MB/s ,3518.4 MB/s 
13#tmp.o             :  34864866 ->   2601619 (13.40),  58.4 MB/s ,3507.6 MB/s 
14#tmp.o             :  34864866 ->   2590656 (13.46),  46.2 MB/s ,3520.4 MB/s 
15#tmp.o             :  34864866 ->   2518599 (13.84),  28.4 MB/s ,3557.4 MB/s 
16#tmp.o             :  34864866 ->   2527122 (13.80),  20.8 MB/s ,3348.5 MB/s 
17#tmp.o             :  34864866 ->   2277125 (15.31),  19.0 MB/s ,3370.6 MB/s 
18#tmp.o             :  34864866 ->   2138918 (16.30),  15.0 MB/s ,3182.2 MB/s 
19#tmp.o             :  34864866 ->   2118238 (16.46),  8.82 MB/s ,3194.5 MB/s 
20#tmp.o             :  34864866 ->   2041007 (17.08),  8.31 MB/s ,3178.4 MB/s 
21#tmp.o             :  34864866 ->   2039075 (17.10),  5.21 MB/s ,3170.6 MB/s 
22#tmp.o             :  34864866 ->   2038568 (17.10),  3.60 MB/s ,3171.5 MB/s 
```
The following data are with `--long`:

```
$ zstd --long -b1 -e22 -f --ultra tmp.o
 1#tmp.o             :  34864866 ->   3281430 (10.62), 375.0 MB/s ,3531.9 MB/s 
 2#tmp.o             :  34864866 ->   2854143 (12.22), 360.6 MB/s ,3536.7 MB/s 
 3#tmp.o             :  34864866 ->   2648807 (13.16), 325.4 MB/s ,3462.7 MB/s 
 4#tmp.o             :  34864866 ->   2548618 (13.68), 309.6 MB/s ,3345.9 MB/s 
 5#tmp.o             :  34864866 ->   2540406 (13.72), 265.8 MB/s ,3297.8 MB/s 
 6#tmp.o             :  34864866 ->   2518788 (13.84), 251.9 MB/s ,3296.0 MB/s 
 7#tmp.o             :  34864866 ->   2451360 (14.22), 206.5 MB/s ,3446.9 MB/s 
 8#tmp.o             :  34864866 ->   2421083 (14.40), 186.5 MB/s ,3522.7 MB/s 
 9#tmp.o             :  34864866 ->   2406717 (14.49), 172.0 MB/s ,3472.2 MB/s 
10#tmp.o             :  34864866 ->   2392819 (14.57), 139.6 MB/s ,3439.4 MB/s 
11#tmp.o             :  34864866 ->   2386599 (14.61), 113.0 MB/s ,3415.2 MB/s 
12#tmp.o             :  34864866 ->   2385088 (14.62), 104.5 MB/s ,3430.0 MB/s 
13#tmp.o             :  34864866 ->   2389264 (14.59),  69.5 MB/s ,3422.9 MB/s 
14#tmp.o             :  34864866 ->   2382705 (14.63),  61.2 MB/s ,3428.6 MB/s 
15#tmp.o             :  34864866 ->   2372640 (14.69),  51.2 MB/s ,3446.7 MB/s 
16#tmp.o             :  34864866 ->   2209022 (15.78),  20.5 MB/s ,3483.3 MB/s 
17#tmp.o             :  34864866 ->   2168474 (16.08),  18.2 MB/s ,3381.5 MB/s 
18#tmp.o             :  34864866 ->   2065724 (16.88),  14.2 MB/s ,3187.8 MB/s 
19#tmp.o             :  34864866 ->   2042810 (17.07),  8.50 MB/s ,3195.7 MB/s 
20#tmp.o             :  34864866 ->   2040443 (17.09),  8.13 MB/s ,3173.0 MB/s 
21#tmp.o             :  34864866 ->   2038794 (17.10),  5.11 MB/s ,3174.3 MB/s 
22#tmp.o             :  34864866 ->   2038375 (17.10),  3.54 MB/s ,3177.8 MB/s 
```
From the data we can see that with `--long`, even at compression level 3, the compressed file is smaller than a single code object, and compression is still very fast. Higher compression levels improve the compression ratio only slightly while being much slower. Therefore, compression level 3 seems more favorable.
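
As a quick sanity check of that reading, here is a minimal C++ sketch that recomputes the ratio and finds the lowest level whose output drops below the size of one code object. The names `Row`, `ratio`, and `lowestLevelBelowOneObject` are hypothetical helpers for illustration only and are not part of the PR. The per-object size is assumed to be the total size divided by 13, which only approximates the "about 2.7MB" figure because it ignores bundle headers. Only a few rows from each run are copied in.

```cpp
#include <cstddef>
#include <cstdint>

// One row of `zstd -b` output (hypothetical struct, for illustration only).
struct Row {
  int level;                     // compression level
  std::uint64_t compressedBytes; // compressed size in bytes
  double compressMBps;           // compression speed in MB/s (kept for reference)
};

constexpr std::uint64_t OriginalBytes = 34864866; // size of tmp.o
constexpr int NumCodeObjects = 13;                // code objects in the fat binary

// Compression ratio as printed by zstd: original size / compressed size.
constexpr double ratio(const Row &r) {
  return static_cast<double>(OriginalBytes) /
         static_cast<double>(r.compressedBytes);
}

// Lowest level (in table order) whose output is smaller than the average
// code object, or -1 if no row qualifies.
// Assumption: average object size = total size / number of objects.
template <std::size_t N>
constexpr int lowestLevelBelowOneObject(const Row (&rows)[N]) {
  const double avgObjectBytes =
      static_cast<double>(OriginalBytes) / NumCodeObjects;
  for (std::size_t i = 0; i < N; ++i)
    if (static_cast<double>(rows[i].compressedBytes) < avgObjectBytes)
      return rows[i].level;
  return -1;
}

// Selected rows copied from the run without `--long`.
constexpr Row WithoutLong[] = {
    {3, 6885718, 488.1}, {9, 6098948, 124.9},
    {10, 2555599, 179.4}, {22, 2038568, 3.60}};

// Selected rows copied from the run with `--long`.
constexpr Row WithLong[] = {
    {1, 3281430, 375.0}, {2, 2854143, 360.6},
    {3, 2648807, 325.4}, {22, 2038375, 3.54}};
```

For these rows, `lowestLevelBelowOneObject(WithLong)` evaluates to 3 (at 325.4 MB/s), while `lowestLevelBelowOneObject(WithoutLong)` evaluates to 10 (at 179.4 MB/s).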

https://github.com/llvm/llvm-project/pull/83605

