[llvm] Use thinlto and pgo for x86_64 windows release packaging (PR #71067)

via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 6 04:12:17 PST 2023


zmodem wrote:

> Why stop with LTO+PGO? BOLTing Clang/LLD should yield another sizeable improvement.

My understanding was that BOLT for PE/COFF binaries isn't production ready.

> Regarding the external source file used for profiling (pgo_training-1.ii), I don't know LLVM policy, but maybe it would be better to host it on our side, so we're sure it does not disappear one of those days. By the way, what is the difference with pgo_training-2.ii, which is available too? If you have links to explain those files, it would be nice to add as a comment.

Thinking about this some more, using `pgo_training-1.ii` is convenient for Chromium, but it may not be right for LLVM. As you say, we should probably store it somewhere in the project, but at 11 MB we can't just put it in Git. Additionally, it may need updating now and then.

I've pushed a new commit to train by building Clang's Sema.cpp instead. It requires running CMake one more time, but I think that's okay. The performance seems to be the same.
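For readers unfamiliar with the flow, here is a rough sketch of the stages such a PGO build goes through, written as a Python function that only assembles command lines and runs nothing. It is not the code in this PR. The directory layout, the `clang.profdata` name, the `profiles/*.profraw` location, the `clang-cl.exe` path and the `sema_target` argument are all assumptions for illustration, and host compiler selection for the final build is left out. The CMake options `LLVM_BUILD_INSTRUMENTED`, `LLVM_PROFDATA_FILE` and `LLVM_ENABLE_LTO` are the documented LLVM ones.

```python
from pathlib import Path


def pgo_stages(src, work, sema_target, thinlto=False):
    """Return (name, argv) pairs for a PGO build; nothing is executed here."""
    src, work = Path(src), Path(work)
    # Hypothetical build directories, one per stage.
    instr, train, final = work / "instrumented", work / "train", work / "final"
    profdata = work / "clang.profdata"  # hypothetical file name
    base = ["cmake", "-GNinja", "-DCMAKE_BUILD_TYPE=Release",
            "-DLLVM_ENABLE_PROJECTS=clang;lld", "-S", str(src / "llvm")]
    # Assumed location of the instrumented compiler inside its build tree.
    instr_clang = str(instr / "bin" / "clang-cl.exe")

    final_cfg = base + ["-B", str(final), f"-DLLVM_PROFDATA_FILE={profdata}"]
    if thinlto:
        final_cfg.append("-DLLVM_ENABLE_LTO=Thin")

    return [
        ("configure-instrumented",
         base + ["-B", str(instr), "-DLLVM_BUILD_INSTRUMENTED=IR"]),
        ("build-instrumented", ["ninja", "-C", str(instr), "clang"]),
        # The extra CMake run: a tree that compiles with the instrumented clang.
        ("configure-training",
         base + ["-B", str(train),
                 f"-DCMAKE_C_COMPILER={instr_clang}",
                 f"-DCMAKE_CXX_COMPILER={instr_clang}"]),
        # Training: build only the target that compiles Sema.cpp (caller-supplied).
        ("train", ["ninja", "-C", str(train), sema_target]),
        # The glob is a placeholder; the caller must expand it before running.
        ("merge-profiles",
         ["llvm-profdata", "merge", f"-output={profdata}",
          str(instr / "profiles" / "*.profraw")]),
        ("configure-final", final_cfg),
        ("build-final", ["ninja", "-C", str(final), "clang", "lld"]),
    ]


if __name__ == "__main__":
    for name, argv in pgo_stages("llvm-project", "build", "<Sema.cpp target>"):
        print(f"{name}: {' '.join(argv)}")
```

The "configure-training" stage corresponds to the additional CMake run mentioned above; passing `thinlto=True` only changes the final configure step.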

(To answer your questions anyway, the origin of pgo_training-1.ii is documented here: https://source.chromium.org/chromium/chromium/src/+/main:tools/clang/scripts/build.py;l=1044 and the story of the -2 version is here: https://bugs.chromium.org/p/chromium/issues/detail?id=984067)

> I know it's a boring request, but could you evaluate separately benefits from PGO alone, and then adding ThinLTO? [..] Build times should be checked too ideally.

Build times on my workstation:

1. Baseline (using the checked-in version of build_llvm_package.py): 1h 30 minutes
2. PGO only: 1h 40 minutes (+11%)
3. PGO + ThinLTO: 3h 44 minutes (+149%)

Performance, using the clang produced above to do a release build of clang (metrics are best-of-two):

1. Baseline: 4m 55s
2. PGO only: 3m 49s (-22%)
3. PGO + ThinLTO: 3m 38s (-26%)
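The percentages in the two lists above can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the timings are copied from the lists, while the helper names and dictionary keys are made up for illustration. It also prints the marginal effect of adding ThinLTO on top of PGO, which is derived from the same numbers.

```python
def seconds(h=0, m=0, s=0):
    """Convert an h/m/s timing to seconds."""
    return h * 3600 + m * 60 + s


def pct_change(baseline, value):
    """Percentage change of value relative to baseline, rounded to an integer."""
    return round(100 * (value - baseline) / baseline)


# Time to produce the package (first list).
build = {
    "baseline": seconds(h=1, m=30),
    "pgo": seconds(h=1, m=40),
    "pgo_thinlto": seconds(h=3, m=44),
}

# Time for the resulting clang to do a release build of clang (second list).
compile_time = {
    "baseline": seconds(m=4, s=55),
    "pgo": seconds(m=3, s=49),
    "pgo_thinlto": seconds(m=3, s=38),
}

for label, table in (("build", build), ("compile", compile_time)):
    for config in ("pgo", "pgo_thinlto"):
        change = pct_change(table["baseline"], table[config])
        print(f"{label:8s} {config:12s} vs baseline: {change:+d}%")
    # Marginal effect of ThinLTO once PGO is already enabled.
    marginal = pct_change(table["pgo"], table["pgo_thinlto"])
    print(f"{label:8s} pgo_thinlto  vs pgo:      {marginal:+d}%")
```

Relative to PGO alone, ThinLTO comes out at roughly +124% build time for about -5% compile time.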

I was surprised that PGO helped so much by itself. Given that ThinLTO builds are so much slower to produce, I'm tempted to say we should just go with PGO for now. What do you think?

> Finally, aarch64 build could benefit from that too, but that can be done later.

Yes, and hopefully we can refactor some of the code to be shared at that point.

https://github.com/llvm/llvm-project/pull/71067

