[PATCH] D132975: [clang][BOLT] Add clang-bolt target

Thu Sep 1 15:49:59 PDT 2022

Amir added a comment.

Hi Petr, thank you for your comments!

In D132975#3763264 <https://reviews.llvm.org/D132975#3763264>, @phosek wrote:

> This was already on my list of build system features I'd like to implement and I'm glad someone else is already looking into it, thank you! I have two high level comments about your approach.
>
> The first one is related to the use of Clang build as the training data. I think that Clang build is both unnecessarily heavyweight, but also not particularly representative of typical workloads (most Clang users don't use it to build Clang). Ideally, we would give vendors the flexibility to supply their own training data. I'd prefer reusing the existing perf-training <https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training> setup to do so. In fact, I'd imagine most vendors would likely use the same training data for both PGO and BOLT and that use case should be supported.

Agree that perf-training might be useful for vendors. I'll try to enable it in a follow-up diff.

Please note that the target for profile collection is not hardcoded to clang; it's configurable via CLANG_BOLT_INSTRUMENT_PROJECTS and CLANG_BOLT_INSTRUMENT_TARGETS. Right now it's the llvm/not tool (the smallest possible). Also, that the

> The second one is related to applicability. I don't think this mechanism should be limited only to Clang. Ideally, it should be possible to instrument and optimize other tools in the toolchain distribution as well; LLD is likely going to be the most common one after Clang.

I thought about it, and I think we can accommodate optimizing arbitrary targets by providing an interface to instrument specified target(s) via `-DBOLT_INSTRUMENT_TARGETS`. For each of the target binaries, CMake would create targets like `bolt-instrument-$TARGET` and `bolt-optimize-$TARGET`.
For `bolt-instrument-$TARGET`, BOLT would instrument the target binary, placing the instrumented binary next to the original one (e.g. `target`-bolt.inst). End users would then run those instrumented binaries on representative workloads to collect the profile. For `bolt-optimize-$TARGET`, BOLT would post-process the profiles and create an optimized binary (`target`-bolt).
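
To make the proposal concrete, here is a rough sketch of how the per-target rules could look in CMake. This is not code from the diff: `BOLT_INSTRUMENT_TARGETS` is the proposed variable, while the profile location (`${CMAKE_BINARY_DIR}/$TARGET.fdata`) and the particular set of optimization flags are assumptions chosen for illustration.

```cmake
# Hypothetical sketch only. BOLT_INSTRUMENT_TARGETS is the proposed interface;
# the .fdata path and the optimization flags below are illustrative assumptions.
set(BOLT_INSTRUMENT_TARGETS "" CACHE STRING
  "Semicolon-separated list of executable targets to instrument/optimize with BOLT")

foreach(target IN LISTS BOLT_INSTRUMENT_TARGETS)
  # Where the instrumented binary writes its profile at run time (assumed path).
  set(fdata ${CMAKE_BINARY_DIR}/${target}.fdata)

  # bolt-instrument-$TARGET: write $TARGET-bolt.inst next to the original binary.
  add_custom_target(bolt-instrument-${target}
    COMMAND llvm-bolt $<TARGET_FILE:${target}>
            -o $<TARGET_FILE:${target}>-bolt.inst
            -instrument --instrumentation-file=${fdata}
    COMMENT "Instrumenting ${target} with BOLT"
    VERBATIM)
  add_dependencies(bolt-instrument-${target} ${target} llvm-bolt)

  # bolt-optimize-$TARGET: consume the collected profile and write $TARGET-bolt.
  # Assumes the user has already run the instrumented binary so ${fdata} exists,
  # and that the target was linked with relocations (e.g. --emit-relocs) if
  # function reordering is wanted.
  add_custom_target(bolt-optimize-${target}
    COMMAND llvm-bolt $<TARGET_FILE:${target}>
            -o $<TARGET_FILE:${target}>-bolt
            -data=${fdata}
            -reorder-blocks=ext-tsp -reorder-functions=hfsort
    COMMENT "Optimizing ${target} with BOLT"
    VERBATIM)
  add_dependencies(bolt-optimize-${target} ${target} llvm-bolt)
endforeach()
```

With something like this, configuring with `-DBOLT_INSTRUMENT_TARGETS="clang;lld"` would give `bolt-instrument-clang`, `bolt-optimize-clang`, `bolt-instrument-lld` and `bolt-optimize-lld`. Profile collection stays a manual step between the two targets, which keeps the choice of training workload with the vendor.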

I appreciate your suggestions. Do you think we can move incrementally from this diff towards more general uses in follow-up diffs?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132975/new/

https://reviews.llvm.org/D132975