[PATCH] D158889: [AsmPrinter][PGO] Adds optional dumping of branch probabilities for PGO metrics.

Thu Aug 31 22:23:51 PDT 2023

mtrofin added a comment.

In D158889#4633333 <https://reviews.llvm.org/D158889#4633333>, @wenlei wrote:

> In D158889#4621393 <https://reviews.llvm.org/D158889#4621393>, @red1bluelost wrote:
>
>> In D158889#4619737 <https://reviews.llvm.org/D158889#4619737>, @hoy wrote:
>>
>>> In D158889#4618519 <https://reviews.llvm.org/D158889#4618519>, @wenlei wrote:
>>>
>>>> We have something similar internally, but didn't upstream because we are not sure if the use case is too narrow to justify burdening the code base with all the related complexity.  cc @hoy
>>>
>>> Yeah, we internally serialize machine block execution counts to the binary.  Would it be helpful to write those counts into some bb section similar to the `llvm_bb_addr_map` section?
>>
>> This is interesting and similar but our metrics are only focusing on branch weights at the moment. We've focused on branch weights so far since they are a direct result of Profile Loading.
>>
>> At least for the current state of the metrics here and consumed in D158890 <https://reviews.llvm.org/D158890>, there is not a use for block counts but it could be a future extension of the metrics.
>
> We're probably among the few people that will actually benefit from something like this, but honestly I'm still a bit unsure whether the use case is common enough to justify built-in support like this.

Even if relatively few groups would build up on such information, arguably PGO (as a technique) is probably the most impactful tool we have for improving performance. Like I think we're all observing here, profiles aren't necessarily well maintained throughout passes - it's actually easy to write a pass that accidentally and silently drops it, or that corrupts it somehow. I think having primitives in llvm that can help anyone interested build validation tooling and detect and fix such bugs would end up helping the community way more than their cost.

Maybe we can even, eventually, have a layer of defense doing this on a build bot with e.g. llvm-test-suite benchmarks. (I have a rfc for doing some even simpler validation transparently as part of unittests, would send it after the long weekend)

> However, if we make something like this part of llvm, I suggest we at least make it as general purpose as possible. A few comments related to that:
>
> 1. Try to incorporate block counts/frequencies as well. Most of the researches on profile quality use a block overlap metric which relies on block counts rather than branch probabilities. Our internal version also uses block counts, as branch weights cannot represent the profile for branchless code.

+1

> 2. Instead of coupling this with a specific consumer, Pin tool in your case, and in the next patch, I suggest we build general support to decode such metadata section, so tools like llvm-objdump can be used to inspect its payload.

+1!

Actually - @red1bluelost - sorry if this would creep up the scope of your work, but since we're talking design choices, would summarizing in a RFC be maybe a good step? Then we can discuss the motivation and design choices in the community, and since there is similar work (@wenlei's earlier remark), their experience will surely help!

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158889/new/

https://reviews.llvm.org/D158889