[PATCH] D138990: [AArch64] Enable the select optimize pass for AArch64

Sotiris Apostolakis via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 30 09:07:32 PST 2022


apostolakis added a comment.

Thanks Dave for this extensive evaluation and reporting!

The performance improvements, even for non-PGO builds, are somewhat expected: unlike x86, the AArch64 backend currently has almost no logic for making this decision and simply prefers predication aggressively.
So, even without profile information, the loop-level heuristics, albeit conservative, allow for some obvious cases to be converted to branches.

Note that internally at Google, we have already enabled the select-optimize pass for all instrPGO-optimized builds, including AArch64. The performance improvements for AArch64 appear to be even more significant than for x86.
For non-PGO builds, we have also seen significant improvements on some AArch64 microbenchmarks, but I have not had time to investigate further. So your efforts are more than welcome.

Regarding compilation time, the impact should be small. For non-PGO builds, we essentially only have the loop-level heuristic, which makes two passes over all the instructions in each loop and, for each instruction, iterates over its operands. So the cost per loop is roughly 2*N*K, where N is the number of instructions in the loop and K is the operand count per instruction; this is essentially O(N), given that the operand count is a small bounded number.
In practice, the constant factors might be noticeable for some programs with big loops.
Enabling it for -O3 only sounds reasonable. Note that you do not want to enable it for size-optimizing builds (although checks within the pass already prevent it from being used in those cases).
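
To make the 2*N*K estimate concrete, here is a minimal sketch that only counts operand visits for a toy loop body. It is not the pass's code: ToyInst and countOperandVisits are hypothetical names introduced for illustration, and the model assumes each pass touches every operand of every instruction exactly once.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction; only its operand count matters.
struct ToyInst {
  std::size_t NumOperands;
};

// Counts the operand visits made by a heuristic that sweeps the loop body
// NumPasses times and inspects every operand of every instruction.
// With N instructions of K operands each, the result is NumPasses * N * K.
std::size_t countOperandVisits(const std::vector<ToyInst> &Loop,
                               std::size_t NumPasses = 2) {
  std::size_t Visits = 0;
  for (std::size_t Pass = 0; Pass < NumPasses; ++Pass)
    for (const ToyInst &I : Loop)
      Visits += I.NumOperands; // one visit per operand
  return Visits;
}
```

Under this model, a loop of 100 instructions with 3 operands each costs 2*100*3 = 600 operand visits, and doubling the loop size doubles the count, which is the linear behavior described above.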


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138990/new/

https://reviews.llvm.org/D138990


