[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.
Hongtao Yu via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 26 16:39:47 PDT 2020
hoy added a comment.
In D86193#2240353 <https://reviews.llvm.org/D86193#2240353>, @wmi wrote:
>> There are some optimizations such as if-convert, tail call elimination, that were initially blocked by the pseudo probe intrinsic but is now unblocked by fixes included in this change. With the current change we do not see perf degradation out of SPEC and one of our internal large services.
>> The main optimizations left blocked intentionally are those that merge blocks for smaller code size, such as tail merge which is the opposite of jump threading. We believe that those optimizations are not very beneficial for performance and AutoFDO.
>
> If the optimizations are not very beneficial for performance and AutoFDO and should be blocked, it may be better to block them in a more general way and not depend on pseudo probe, because blocking them may also be beneficial for debug info based AutoFDO.
In theory, yes, we should have a black list of transforms (mainly related to block merge) that are not needed by AutoFDO and block them. In reality it might take quite some efforts to figure them out. Pseudo probe, on the other hand, starts with blocking those transforms in the first place and relax the ones that might actually help AutoFDO.
> Another reason is that pseudo probe looks pretty much like debug information to me. They are used to annotate the IR but shouldn't affect the transformation. Binaries built w/wo debug information are required to be identical in LLVM. I think that requirement could be applied on pseudo probe as well. It is even better to have some test to enforce it so that no change in the future could break the requirement.
Good point! Yes, pseudo probe is implemented in a similar way with the debug intrinsics. However they are not guaranteed to not affect the codegen since its main purpose is to achieve an accurate profile correlation with low cost. Regarding the cost, it sits somewhere between the debug intrinsics and the PGO instrumentation and close to a zero cost in practice. Agreed that it would be better to have tests protect the pseudo probe cost from going too high, but not sure which optimizations we should start with. Maybe to start with some critical optimizations like inlining, vectorization?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D86193/new/
https://reviews.llvm.org/D86193
More information about the llvm-commits
mailing list