[PATCH] D147954: [SimplifyCFG]Prevent premature simplification of indirect callsites with profile data

David Li via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 18 09:45:54 PDT 2023


davidxl added a comment.

In D147954#4276616 <https://reviews.llvm.org/D147954#4276616>, @mingmingl wrote:

> In D147954#4276502 <https://reviews.llvm.org/D147954#4276502>, @davidxl wrote:
>
>> In D147954#4276299 <https://reviews.llvm.org/D147954#4276299>, @wenlei wrote:
>>
>>> I guess you do this only for calls and limit it for pre-link because you want to minimize the restriction on optimization, while making post-link inlining operate on accurate call profile?
>
> Yep this is the plan, but for post-link indirect-call-promotion (which could be subsequently inlined after ICP)
>
>>> Code motion is generally destructive for profile quality. Instead of blocking all sinking and hoisting, have you considered trying to update profile for sinking/hoisting when there's no merging, but only block those that involves merging (i.e. head/tail merge) with incompatible profile? I imagine in some cases, allow sinking/hoisting but updating profile properly may help better inlining later.
>
> For my understanding, is https://gcc.godbolt.org/z/cMWY9Tqzv an example (based on c++ source <https://gcc.godbolt.org/z/4qnEbboTT> with manual `!prof`) where updating profile would be useful? I think sinking/hoisting would merge instructions but I might miss something.
>
> - In this example, two direct callsites are merged but both `branch_weights` metadata (`!prof 0`, `!prof 1`) are dropped which is sub-optimal. It's likely better in majority of cases by merging two `!prof` into one and accumulate counters.
>
> Nevertheless, (after thinking about direct callsites above), I modified the implementation (and option names) to look at indirect calls only primarily because 1) the desired behavior upon icall (indicating `!prof` is value profile as opposed to `branch_weights` for direct call) is probably clearer and 2) the pattern shows up (as C++ code shows)  with polymorphism, and meanwhile added a 'FIXME' to mention the current 'hoisting' is still suboptimal since it dropped `branch_weights` on two identical direct calls.
>
> I concur it's better to decide on option name once (`preserve-call` vs `preserve-indirect-call`) so I'm fine with implementing the sum-of-branch-weight counters or a better alternative in this patch (and name the option back to more general `preserve-call...` Let me know your thoughts!
>
>> This one is mainly for profile based devirtualization. Related to inlining, there is a pre-pass to do the unmerging:
>
> For my information, in which pass does the unmerging happens?

There is a CallSiteSplitting Pass that does it.

>> arg = select cond, 1, val
>> call foo(arg)
>>
>>>
>> -
>>
>> if (cond) {
>>
>>   call foo(1)
>>
>> } else
>>
>>   call foo(val)




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147954/new/

https://reviews.llvm.org/D147954



More information about the llvm-commits mailing list