[all-commits] [llvm/llvm-project] 3dab7f: [CMake] Add clang-bolt target

Amir Ayupov via All-commits all-commits at lists.llvm.org
Fri Sep 23 01:10:49 PDT 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 3dab7fede2019c399d793c43ca9ea5a4f2d5031f
      https://github.com/llvm/llvm-project/commit/3dab7fede2019c399d793c43ca9ea5a4f2d5031f
  Author: Amir Ayupov <aaupov at fb.com>
  Date:   2022-09-23 (Fri, 23 Sep 2022)

  Changed paths:
    M clang/CMakeLists.txt
    A clang/cmake/caches/BOLT.cmake
    M clang/utils/perf-training/perf-helper.py

  Log Message:
  -----------
  [CMake] Add clang-bolt target

This patch adds a `CLANG_BOLT_INSTRUMENT` option that applies BOLT instrumentation
to Clang, performs a bootstrap build with the resulting Clang, merges the resulting
fdata files into a single profile file, and uses it to perform BOLT optimization
on the original Clang binary.
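
The instrument, merge, and optimize steps described above can be sketched as plain
command lines. This is only an illustration, not the CMake logic added by the patch:
the helper name `bolt_commands`, the output file names, and the particular set of
optimization flags are assumptions. The bootstrap build that produces the fdata
files runs between the first and second command and is not shown.

```python
import shlex
from pathlib import Path


def bolt_commands(clang, fdata_files, workdir,
                  llvm_bolt="llvm-bolt", merge_fdata="merge-fdata"):
    """Return the command lines of a manual BOLT cycle; nothing is executed."""
    clang = Path(clang)
    workdir = Path(workdir)
    instrumented = workdir / (clang.name + ".inst")  # hypothetical file name
    merged = workdir / "merged.fdata"                # hypothetical file name
    optimized = workdir / (clang.name + ".bolt")     # hypothetical file name

    # Step 1: emit an instrumented copy of Clang that writes one profile per process.
    instrument = [
        llvm_bolt, str(clang), "-o", str(instrumented),
        "-instrument",
        f"--instrumentation-file={workdir / 'prof.fdata'}",
        "--instrumentation-file-append-pid",
    ]
    # Step 2 (after the bootstrap build): merge the per-process profiles.
    merge = [merge_fdata, "-o", str(merged), *map(str, fdata_files)]
    # Step 3: optimize the original binary with the merged profile.
    # The flag selection below is an assumption, not taken from the patch.
    optimize = [
        llvm_bolt, str(clang), "-o", str(optimized),
        f"-data={merged}",
        "-reorder-blocks=ext-tsp", "-reorder-functions=hfsort+",
        "-split-functions", "-split-all-cold", "-dyno-stats",
    ]
    return {"instrument": instrument, "merge": merge, "optimize": optimize}


if __name__ == "__main__":
    cmds = bolt_commands("clang", ["a.fdata", "b.fdata"], "work")
    for step, cmd in cmds.items():
        print(f"{step}: {shlex.join(cmd)}")
```

To actually run the cycle, each list could be passed to `subprocess.check_call`
in order, with the profile-collecting build inserted after the first step.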

The projects and targets used for bootstrap/profile collection are configurable via
`CLANG_BOLT_INSTRUMENT_PROJECTS` and `CLANG_BOLT_INSTRUMENT_TARGETS`.
The defaults are "llvm" and "count" respectively, which results in a profile with
~5.3B dynamically executed instructions.

The intended use of the functionality is through the BOLT CMake cache file, similar
to the PGO 2-stage build:
```
cmake <llvm-project>/llvm -C <llvm-project>/clang/cmake/caches/BOLT.cmake
ninja clang++-bolt # pulls clang-bolt
```

Stats with a recent checkout (clang-16), pre-built BOLT and Clang, 72vCPU/224G
| CMake configure with host Clang + BOLT.cmake | 1m6.592s
| Instrumenting Clang with BOLT | 2m50.508s
| CMake configure `llvm` with instrumented Clang | 5m46.364s (~5x slowdown)
| CMake build `not` with instrumented Clang | 0m6.456s
| Merging fdata files | 0m9.439s
| Optimizing Clang with BOLT | 0m39.201s
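
As a rough cross-check of the stats above, the stage times can be summed and the
quoted slowdown recomputed. This sketch assumes the six stages run one after
another and that the "~5x" figure compares the two configure rows; the helper
name `seconds` and the dictionary keys are made up for illustration.

```python
import re


def seconds(stamp):
    """Convert a `time`-style stamp such as '2m50.508s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Wall-clock times copied from the stats table.
stages = {
    "configure with host Clang + BOLT.cmake": "1m6.592s",
    "instrument Clang": "2m50.508s",
    "configure llvm with instrumented Clang": "5m46.364s",
    "build with instrumented Clang": "0m6.456s",
    "merge fdata files": "0m9.439s",
    "optimize Clang": "0m39.201s",
}

total = sum(seconds(stamp) for stamp in stages.values())
slowdown = (seconds(stages["configure llvm with instrumented Clang"])
            / seconds(stages["configure with host Clang + BOLT.cmake"]))

print(f"total: {total:.1f} s")
print(f"configure slowdown: {slowdown:.1f}x")
```

Under those assumptions the whole cycle takes about 639 seconds (a little under
11 minutes), and the instrumented configure is about 5.2 times slower.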

Building Clang:
```
cmake ../llvm-project/llvm -DCMAKE_C_COMPILER=... -DCMAKE_CXX_COMPILER=... \
  -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang \
  -DLLVM_TARGETS_TO_BUILD=Native -GNinja
```

| | Release | BOLT-optimized
| cmake | 0m24.016s | 0m22.333s
| ninja clang | 5m55.692s | 4m35.122s
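
The table above can be turned into relative numbers with a few lines of
arithmetic. This is a sketch over the single measurement per cell shown here,
so it carries the same caveats as the raw timings; the names `to_seconds` and
`improvement` are hypothetical.

```python
def to_seconds(minutes, secs):
    """Combine the minutes and seconds of a 'XmY.YYYs' timing."""
    return minutes * 60 + secs


def improvement(release, bolt):
    """Return the speedup factor and the percentage of wall time saved."""
    return {
        "speedup": release / bolt,
        "saved_pct": 100 * (release - bolt) / release,
    }


# (Release, BOLT-optimized) timings copied from the table.
timings = {
    "cmake": (to_seconds(0, 24.016), to_seconds(0, 22.333)),
    "ninja clang": (to_seconds(5, 55.692), to_seconds(4, 35.122)),
}

results = {name: improvement(*pair) for name, pair in timings.items()}
for name, r in results.items():
    print(f"{name}: {r['speedup']:.2f}x faster, {r['saved_pct']:.1f}% less time")
```

For these numbers, `ninja clang` comes out roughly 1.29x faster (about 22.7%
less wall time) and the `cmake` configure step about 1.08x faster (about 7%).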

I know it's not rigorous, but it shows a ballpark figure.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D132975



