[llvm] [AArch64] Avoid generating LDAPUR on certain cores (PR #124274)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 29 02:10:28 PST 2025
davemgreen wrote:
> Hi,
>
> Sorry for being late - I acknowledge this has already been merged. Nevertheless, I think the logic for this should be reversed. Instead of explicitly disabling the fold for the few cores above, in my opinion we should explicitly enable it only where we know it's safe to do so (and assume default disabled).
>
> The penalty for getting it wrong on the affected cores outweighs the penalty for missing out on the fold on unaffected cores (as in this latter case, it's just an extra GP instruction), and many users/package providers compile with `-march` rather than an `-mcpu`.
Hi. I was thinking about the same thing too and wasn't sure which way to go on it. Just to be clear, it is always safe to use the instructions; they are not incorrect, they are just slower than ldapr on the affected cores. The benefit of using ldapur is relatively minor compared to using a register increment, but it is a little better in terms of codesize, performance and register pressure.
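As a rough illustration of the fold being discussed, here is a minimal C++ sketch of an acquire load at a non-zero offset. The function name loadSecond is made up for this example, and the instruction sequences in the comments are only what one would roughly expect, not captured compiler output; the exact registers and whether the fold happens depend on the compiler version and target features.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical example: acquire load from the second element of an array
// of atomics, i.e. from the address Base + 8.
//
// Illustrative expectation only (not verified compiler output):
//   with the fold:     ldapur x0, [x0, #8]
//   without the fold:  add    x8, x0, #8
//                      ldapr  x0, [x8]
// The second form costs one extra general-purpose instruction and one
// extra scratch register, which is the trade-off described above.
std::uint64_t loadSecond(const std::atomic<std::uint64_t> *Base) {
  return Base[1].load(std::memory_order_acquire);
}
```

Both forms implement the same acquire load, so the choice between them is purely a tuning decision.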
Anyone using the default -march=armv8 will not use ldapur, so will not see any problems. I was thinking of it in terms of enabling the tuning feature for -mcpu=generic, for which the main problem is: when do we stop avoiding them? In 5 years? 10? Do we end up never using the instruction because some CPUs had an issue with it? There might be something we can do where we tie it to the architecture revision though, and make -mcpu=generic + armv8.4->9.3 avoid the instructions, but anything after that use them. I will see if I can put together a patch so we can see what it looks like.
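To make the architecture-revision idea concrete, here is a minimal sketch of what such a policy could look like. This is not code from the patch or from LLVM: ArchRev, shouldUseLdapur and the two boolean parameters are hypothetical names, and ordering revisions lexicographically by (major, minor) is a simplifying assumption.

```cpp
#include <tuple>

// Hypothetical model of an architecture revision, e.g. {8, 4} for armv8.4-a.
struct ArchRev {
  int Major;
  int Minor;
};

// Assumption: compare revisions lexicographically, so every armv8.x
// sorts before every armv9.x. Real revision ordering is more subtle.
static bool operator<(ArchRev A, ArchRev B) {
  return std::tie(A.Major, A.Minor) < std::tie(B.Major, B.Minor);
}

// Hypothetical policy: decide whether to emit ldapur.
//  - Below armv8.4 the instruction is treated as unavailable.
//  - For a specific -mcpu, follow that core's tuning flag.
//  - For -mcpu=generic, avoid ldapur for armv8.4 through armv9.3
//    and use it for anything newer.
bool shouldUseLdapur(bool IsGenericCPU, bool CPUAvoidsLdapur, ArchRev Rev) {
  const ArchRev FirstWithLdapur{8, 4}; // start of the range named above
  const ArchRev LastAvoided{9, 3};     // end of the range named above
  if (Rev < FirstWithLdapur)
    return false;
  if (!IsGenericCPU)
    return !CPUAvoidsLdapur;
  return LastAvoided < Rev;
}
```

With this sketch, generic code built for armv8.4 or armv9.3 keeps the register increment plus ldapr, generic code for a later revision uses ldapur, and an explicit -mcpu still decides for itself through its tuning flag.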
https://github.com/llvm/llvm-project/pull/124274