[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

Wed Aug 26 17:26:46 PDT 2020

hoy added a comment.

In D86193#2240596 <https://reviews.llvm.org/D86193#2240596>, @wmi wrote:

> In D86193#2240502 <https://reviews.llvm.org/D86193#2240502>, @hoy wrote:
>
>> In D86193#2240353 <https://reviews.llvm.org/D86193#2240353>, @wmi wrote:
>>
>>>> There are some optimizations such as if-convert, tail call elimination, that were initially blocked by the pseudo probe intrinsic but is now unblocked by fixes included in this change. With the current change we do not see perf degradation out of SPEC and one of our internal large services.
>>>> The main optimizations left blocked intentionally are those that merge blocks for smaller code size, such as tail merge which is the opposite of jump threading. We believe that those optimizations are not very beneficial for performance and AutoFDO.
>>>
>>> If the optimizations are not very beneficial for performance and AutoFDO and should be blocked, it may be better to block them in a more general way and not depend on pseudo probe, because blocking them may also be beneficial for debug info based AutoFDO.
>>
>> In theory, yes, we should have a black list of transforms (mainly related to block merge) that are not needed by AutoFDO and block them. In reality it might take quite some efforts to figure them out. Pseudo probe, on the other hand, starts with blocking those transforms in the first place and relax the ones that might actually help AutoFDO.
>>
>>> Another reason is that pseudo probe looks pretty much like debug information to me. They are used to annotate the IR but shouldn't affect the transformation. Binaries built w/wo debug information are required to be identical in LLVM. I think that requirement could be applied on pseudo probe as well. It is even better to have some test to enforce it so that no change in the future could break the requirement.
>>
>> Good point! Yes, pseudo probe is implemented in a similar way with the debug intrinsics. However they are not guaranteed to not affect the codegen since its main purpose is to achieve an accurate profile correlation with low cost. Regarding the cost, it sits somewhere between the debug intrinsics and the PGO instrumentation and close to a zero cost in practice.
>
> I see. It makes sense to fix up some important transformations to achieve the goal of low cost. To achieve the goal of not affecting codegen needs a lot more effort to test and fix up all over the pipeline. I don't mean to have it ready in the patch, but I think it maybe something worthy to strive for later on.

Sounds good, we will be accumulating a list of AutoFDO-unfriendly transforms over time.

>> Agreed that it would be better to have tests protect the pseudo probe cost from going too high, but not sure which optimizations we should start with. Maybe to start with some critical optimizations like inlining, vectorization?
>
> The test I have in my mind comes from debug info. It is to bootstrap llvm with and without debug information. The test is to check whether the binaries built after stripping the debug information are identical. I am thinking pseudo probe can have such test setup somewhere sometime in the future. Same as above, it doesn't have to be ready currently.

I like the idea. It would catch a regression on pseudo probe with new optimization changes. Let me think about it. Thanks!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86193/new/

https://reviews.llvm.org/D86193