[PATCH] D147954: [SimplifyCFG]Prevent premature simplification of indirect callsites with profile data

Mingming Liu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 17 23:41:28 PDT 2023


mingmingl added a comment.

In D147954#4276616 <https://reviews.llvm.org/D147954#4276616>, @mingmingl wrote:

> In D147954#4276502 <https://reviews.llvm.org/D147954#4276502>, @davidxl wrote:
>
>> In D147954#4276299 <https://reviews.llvm.org/D147954#4276299>, @wenlei wrote:
>>
>>> I guess you do this only for calls and limit it for pre-link because you want to minimize the restriction on optimization, while making post-link inlining operate on accurate call profile?
>
> Yep this is the plan, but for post-link indirect-call-promotion (which could be subsequently inlined after ICP)
>
>>> Code motion is generally destructive for profile quality. Instead of blocking all sinking and hoisting, have you considered trying to update profile for sinking/hoisting when there's no merging, but only block those that involves merging (i.e. head/tail merge) with incompatible profile? I imagine in some cases, allow sinking/hoisting but updating profile properly may help better inlining later.
>
> For my understanding, is https://gcc.godbolt.org/z/cMWY9Tqzv an example (based on c++ source <https://gcc.godbolt.org/z/4qnEbboTT> with manual `!prof`) where updating profile would be useful? I think sinking/hoisting would merge instructions but I might miss something.
>
> - In this example, two direct callsites are merged but both `branch_weights` metadata (`!prof 0`, `!prof 1`) are dropped which is sub-optimal. It's likely better in majority of cases by merging two `!prof` into one and accumulate counters.
>
> Nevertheless, (after thinking about direct callsites above), I modified the implementation (and option names) to look at indirect calls only primarily because 1) the desired behavior upon icall (indicating `!prof` is value profile as opposed to `branch_weights` for direct call) is probably clearer and 2) the pattern shows up (as C++ code shows)  with polymorphism, and meanwhile added a 'FIXME' to mention the current 'hoisting' is still suboptimal since it dropped `branch_weights` on two identical direct calls.
>
> I concur it's better to decide on option name once (`preserve-call` vs `preserve-indirect-call`) so I'm fine with implementing the sum-of-branch-weight counters or a better alternative in this patch (and name the option back to more general `preserve-call...` Let me know your thoughts!

My original thought is,

- given that SimplifyCFG runs multiple times, one simpler way is to conservatively prevent premature hoisting/sinking for both direct calls and indirect-calls in early stage of the pipeline and preserve the `!prof` (no need to sum counters for `branch_weight` of direct callsites), let passes that use `!prof` sees the original IR and make transformations (e.g., inline hot calls or whatever), and do aggressive simplifycfg, in other words, do not preserve `!prof` in {direct, indirect} calls in later stage of the pipeline.

Hoist direct calls and sum counters is probably better since 
//(use two callsites as example)//

1. when both direct callsites are hot, they are hoisted once and both inlined -> faster (in terms of compile time) and more reliable than inline twice and do common code sequence elimination.
2. if one callsite is cold and another is hot, sum counters gets both callsites inlined (since there is only one hot callsites after hoist), compared with leave cold call outlined && inline hot calls.

>> This one is mainly for profile based devirtualization. Related to inlining, there is a pre-pass to do the unmerging:
>
> For my information, in which pass does the unmerging happens?
>
>> arg = select cond, 1, val
>> call foo(arg)
>>
>>>
>> -
>>
>> if (cond) {
>>
>>   call foo(1)
>>
>> } else
>>
>>   call foo(val)




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147954/new/

https://reviews.llvm.org/D147954



More information about the llvm-commits mailing list