[PATCH] D147954: [SimplifyCFG] Merge branch_weights for direct callsites and introduce options to preserve value profile for indirect callsites

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 18 15:42:11 PDT 2023


mingmingl added a comment.

In D147954#4277742 <https://reviews.llvm.org/D147954#4277742>, @tejohnson wrote:

> Instead of making this false by default and requiring it to be set from the command line for pre-LTO compile steps, it would be better to add a "setPreserveIndirectCallInstWithProfInHoistAndSink" interface and set this to false when appropriate from the pass pipeline builder (see other examples where the SimplifyCFGOptions are set from there). That way it can be controlled more exactly.

Added `setPreserveIndirectCallInstWithProfInHoistAndSink` method for pass builder pipeline usage.

I'm a little confused why we would disable this only in the pre-LTO compile step right now: I suppose this is specific to SamplePGO because ICP is not performed during the pre-LTO compile step?

(I was confused when seeing it as well) It happens in IFDO since in this particular case, prelink pipeline doesn't run ICP pass; the `!prof` metadata is dropped in prelink build, so postlink ICP didn't see it.

In other cases (non-LTO and instrumentation PGO) ICP appears to be performed right after profile annotation.

> However, we also perform SimplifyCFG at least once before profile annotation (both sample and instrumentation PGO). Wouldn't we still end up merging indirect calls in some cases too aggressively, because they don't have any profile metadata so we won't know any better? I'm currently testing a change to avoid speculation in SimplifyCFG before profile annotation to avoid doing incorrectly aggressive transformations. We might want to do something similar here too: setPreserveIndirectCallInstWithProfInHoistAndSink(false) for all pre-PGO annotation invocations of SimplifyCFG in PGO compiles.



In D147954#4277744 <https://reviews.llvm.org/D147954#4277744>, @mingmingl wrote:

> In D147954#4276844 <https://reviews.llvm.org/D147954#4276844>, @nikic wrote:
>
>> Could you please start by implementing metadata merging, and then return to this patch only if strictly necessary? I'd rather not have another SimplifyCFG option for such a niche use case, if it's possible to get most of the benefit via standard metadata merging.
>
> (quick reply to describe the plan) okay, let me implement something as described and see how it works for this use case
>
> 1. do metadata merging for direct callsites 2) conservatively not hoisting/sinking indirect callsites for indirect callsites if `!prof` presents -> indirect-call-promotion pass will clear `!prof` for indirect-callsite (since there are no other uses of  this type of value profiles) so the SimplifyCFG pass after indirect-call-promotion should be able to do a full cleanup if the indirect-call is still in the IR but not speculatively devirtualized.

Actually merging two different virtual calls with the same counters could be counterproductive so I kept the option.

In D147954#4277932 <https://reviews.llvm.org/D147954#4277932>, @davidxl wrote:

> In D147954#4276616 <https://reviews.llvm.org/D147954#4276616>, @mingmingl wrote:
>
>> In D147954#4276502 <https://reviews.llvm.org/D147954#4276502>, @davidxl wrote:
>>
>>> In D147954#4276299 <https://reviews.llvm.org/D147954#4276299>, @wenlei wrote:
>>>
>>>> I guess you do this only for calls and limit it for pre-link because you want to minimize the restriction on optimization, while making post-link inlining operate on accurate call profile?
>>
>> Yep this is the plan, but for post-link indirect-call-promotion (which could be subsequently inlined after ICP)
>>
>>>> Code motion is generally destructive for profile quality. Instead of blocking all sinking and hoisting, have you considered trying to update profile for sinking/hoisting when there's no merging, but only block those that involves merging (i.e. head/tail merge) with incompatible profile? I imagine in some cases, allow sinking/hoisting but updating profile properly may help better inlining later.
>>
>> For my understanding, is https://gcc.godbolt.org/z/cMWY9Tqzv an example (based on c++ source <https://gcc.godbolt.org/z/4qnEbboTT> with manual `!prof`) where updating profile would be useful? I think sinking/hoisting would merge instructions but I might miss something.
>>
>> - In this example, two direct callsites are merged but both `branch_weights` metadata (`!prof 0`, `!prof 1`) are dropped which is sub-optimal. It's likely better in majority of cases by merging two `!prof` into one and accumulate counters.
>>
>> Nevertheless, (after thinking about direct callsites above), I modified the implementation (and option names) to look at indirect calls only primarily because 1) the desired behavior upon icall (indicating `!prof` is value profile as opposed to `branch_weights` for direct call) is probably clearer and 2) the pattern shows up (as C++ code shows)  with polymorphism, and meanwhile added a 'FIXME' to mention the current 'hoisting' is still suboptimal since it dropped `branch_weights` on two identical direct calls.
>>
>> I concur it's better to decide on option name once (`preserve-call` vs `preserve-indirect-call`) so I'm fine with implementing the sum-of-branch-weight counters or a better alternative in this patch (and name the option back to more general `preserve-call...` Let me know your thoughts!
>>
>>> This one is mainly for profile based devirtualization. Related to inlining, there is a pre-pass to do the unmerging:
>>
>> For my information, in which pass does the unmerging happens?
>
> There is a CallSiteSplitting Pass that does it.

Ack, thanks!

>>> arg = select cond, 1, val
>>> call foo(arg)
>>>
>>>>
>>> -
>>>
>>> if (cond) {
>>>
>>>   call foo(1)
>>>
>>> } else
>>>
>>>   call foo(val)




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147954/new/

https://reviews.llvm.org/D147954



More information about the llvm-commits mailing list