[PATCH] D145171: [FSAFDO] Improve FS discriminator encoding

Thu Mar 9 10:06:32 PST 2023

xur added a comment.

In D145171#4179636 <https://reviews.llvm.org/D145171#4179636>, @hoy wrote:

> In D145171#4179269 <https://reviews.llvm.org/D145171#4179269>, @xur wrote:
>
>> The performance for removing BBSize from discriminator hash is out: it shows a slightly gain on performance, like ~0.2% vs with BBsize in the hash. I will remove BBSize from the hash.
>
> Good to know removing it helps perf. Can you please update the summary as well? LGTM otherwise.

Definitely. I will update the summary and the check-in message.

In D145171#4179924 <https://reviews.llvm.org/D145171#4179924>, @wenlei wrote:

> In D145171#4179258 <https://reviews.llvm.org/D145171#4179258>, @xur wrote:
>
>> In D145171#4174356 <https://reviews.llvm.org/D145171#4174356>, @wenlei wrote:
>>
>>>> This reduces the hash conflicts.
>>>
>>> Curious how did you check/detect conflicts/collisions?
>>
>> Most from eyeballing the change -- I manually look at some functions.
>> The other indicator was the number of discriminators created in some lines in templates header.  For some line, we overflow the bits.
>
> I'm wondering if we can have a way to systemically detect and report such cases -- that should make us aware when this is happening which can throttle perf.

I have another follow-up patch that will track and report the match. It does not track the conflict. But let me think about it to add this support.

>>>> one should expect a performance gain over the binary without this change.
>>>
>>> How big is the gain you saw?
>>
>> We measured this for a variety of programs -- we are seeing improvement range from 0.7% to 2.0% on top of current FSAFDO.
>> The gain usually is higher for iterative profiles. (This is with the BBSize in the hash).
>
> That's quite promising. Just to double check, 0.7%-2.0% was the additional improvement from this patch alone, right? What's the total perf improvement from FSAFDO after this change that you saw? 
> We will measure this internally too.

The total improvement really depends on programs. On average, I would say 1.5% to 2.0%.  For some programs, like clang itself, FSAFDO improve ~3% over AFDO.
There is also a small change on the create_llvm_prof tool side. We will update the tool soon.

I'm also looking forward to hearing the performance number on your tests.

> The change looks good, thanks!

================
Comment at: llvm/lib/CodeGen/MIRFSDiscriminator.cpp:153
+      if (!ImprovedFSDiscriminator)
+        DiscriminatorCurrPass += getCallStackHashV0(BB, I, DIL);
       DiscriminatorCurrPass &= BitMaskThisPass;
----------------
hoy wrote:
> The callsite hash is being removed from the discriminator for V1. I guess the discriminator conflicts will be way less than V0?
Yes, the callsite hash is not in the discriminator hash for V1 -- the callsite hash is now the part of the key for the map. We now can have discriminators with the same value but they will not conflict with each other as they belong to different maps. This indirectly increases the range of the discriminators and thus results in less conflicts.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145171/new/

https://reviews.llvm.org/D145171