[PATCH] D147954: [SimplifyCFG] Merge branch_weights for direct callsites and introduce options to preserve value profile for indirect callsites

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 19 17:31:05 PDT 2023


mingmingl added a comment.

In D147954#4281685 <https://reviews.llvm.org/D147954#4281685>, @wenlei wrote:

> Just to make sure I understand it correctly. The solution to the issue Teresa raised,
>
>>> In other cases (non-LTO and instrumentation PGO) ICP appears to be performed right after profile annotation.
>>
>> However, we also perform SimplifyCFG at least once before profile annotation (both sample and instrumentation PGO). Wouldn't we still end up merging indirect calls in some cases too aggressively, because they don't have any profile metadata so we won't know any better? I'm currently testing a change to avoid speculation in SimplifyCFG before profile annotation to avoid doing incorrectly aggressive transformations. We might want to do something similar here too: setPreserveIndirectCallInstWithProfInHoistAndSink(false) for all pre-PGO annotation invocations of SimplifyCFG in PGO compiles.
>
> is to not rely on pre-link vs post-link to decide when to enable/disable hoist/sink for calls, but to check whether calls have target value profile remaining (i.e. ICP yet to happen) so we can always avoid destroying profile for ICP regardless of where simpilifyCFG is run. Is that correct?

Here is a summary of how dropped metadata happens in an instrumented FDO + thinlto build

1. SimplifyCFG pass run multiple times in prelink pipeline and postlink pipeline, and each run could be configured with different SimplifyCFGOptions (sink or no sink, hoist or no hoist, etc)
2. The dropped metadata is observed in a instrumented fdo + lto build, when prelink pipeline runs simplifycfg with hoist && sink, so postlink ICP couldn't see the `!prof` metadata.

For the questions, here are my thoughts

- could SimplifyCFG undesirably prematurely sink/hoist callsites before we got a chance to instrument them,
  - the answer is that it shouldn't. Instead SimplifyCFG pass should preserve the callsites for instrumentation (I'm not very familiar with callsite splitting that's mentioned above, but guessing from the name it wants to split callsites and expose them)
- could SimplifyCFG sink/hoist callsites in a binary `foo` over which SamplePGO profiles are collected
  - if `foo` gets two indirect callsites `c1` and `c2` merged, hypothetically one debug location is kept in the merged callsite, and target values of both `c1` and `c2` are attributed to this debug location, so it looks like a) hoist && sink could affect the collected profiles -> we want two value profiles, one for `c1` and another for `c2`, but we only get one profile with merged target values (since profiled binary only has one callsite) b) as long as collected sample-pgo profiles have two value profiles, in an AutoFDO build, we should see both callsites exposed and annotated.

About why we get two value profiles in the first place in an instrumentation + LTO build

- I took a look at pass builder just now, looks like earlier simplify-cfg passes in one pipeline don't turn on 'sink' or 'hoist' [1], and later simplify-cfg passes in a pipeline do turn on sink and host. Similarly, for the small test case (https://gcc.godbolt.org/z/WxE8zshqE), hoist or sink doesn't happen until the 6th SimplifyCFG run (out of 8 SimplifyCFG runs in total). That probably explains why we are able to get two different indirect callsites exposed and instrumented in the first place.

- In `PassBuilderPipelines.cpp`, sink and hoist are set to `true` inside
  1. buildFunctionSimplificationPipeline <https://github.com/llvm/llvm-project/blob/f4f588ab0bcdc31a05212c629b16bbc4e9a93377/llvm/lib/Passes/PassBuilderPipelines.cpp#L718> (and buildFunctionSimplificationPipeline is called inside buildInlinerPipeline)
  2. addVectorPassses <https://github.com/llvm/llvm-project/blob/f4f588ab0bcdc31a05212c629b16bbc4e9a93377/llvm/lib/Passes/PassBuilderPipelines.cpp#L1135>

[1] https://gcc.godbolt.org/z/8E39cTosE prints the passes with options in `opt` since printing in `clang` is difficult <https://github.com/llvm/llvm-project/issues/28692>


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147954/new/

https://reviews.llvm.org/D147954



More information about the llvm-commits mailing list