[llvm] [AArch64] Fix heuristics for folding "lsl" into load/store ops. (PR #86894)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 29 00:53:26 PDT 2024
https://github.com/davemgreen commented:
Thanks for looking into this, it is one thing off my todo list.
> It turns out the current commit message isn't precisely right. Current Cortex cores (X2 and later) have free shift by 1 for integer loads... but no free shift by 1/4 for floating-point loads. Not sure if it's worth explicitly modeling int-shift vs. float-shift.
That might be an inaccuracy in the optimization guide more than a difference between int/fp. An lsl #4 is still an extra operation, but I do not believe it comes up very often.
> https://reviews.llvm.org/D155470#4527270 suggests that we should default to AddrLSLSlow14... I'm not sure if that's the right choice. An explicit shift is guaranteed to increase latency, but an extra integer micro-op generated by a folded shift might not matter in a lot of cases.
Yeah, I agree. With more new cores having faster #1 shifts, I think it should be OK as the default.
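
For readers following along, the trade-off discussed above could be sketched roughly as below. This is a simplified, hypothetical model and not the code in the PR: `SubtargetModel`, `shouldFoldShift`, and the use-count policy are all assumptions made for illustration. Only the name `AddrLSLSlow14` comes from the thread, and it is assumed here to mean that an address operand shifted by 1 or 4 costs an extra micro-op.

```cpp
// Hypothetical stand-in for a subtarget description; not an LLVM type.
struct SubtargetModel {
  // Assumed meaning of AddrLSLSlow14: "lsl #1" and "lsl #4" folded into a
  // load/store address cost an extra integer micro-op on this core.
  bool addrLSLSlow14;
};

// Returns true if folding "lsl #shift" into the addressing mode of the
// load/store users looks profitable under this simplified model.
bool shouldFoldShift(const SubtargetModel &st, unsigned shift,
                     unsigned numMemUses, bool optForSize) {
  // Folding removes an instruction, so it is always a win for code size.
  if (optForSize)
    return true;

  // Shifts other than the slow ones are treated as free when folded.
  bool slowShift = st.addrLSLSlow14 && (shift == 1 || shift == 4);
  if (!slowShift)
    return true;

  // Assumed policy for slow shifts: with a single user, an explicit lsl
  // would add latency anyway, so fold. With several users, one shared
  // explicit lsl avoids paying the extra micro-op on every access.
  return numMemUses <= 1;
}
```

Under these assumptions, a core without the feature folds every shift, while a core with it keeps an explicit `lsl #1` or `lsl #4` only when several memory operations would share it.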
https://github.com/llvm/llvm-project/pull/86894