[llvm] [AArch64] Fix heuristics for folding "lsl" into load/store ops. (PR #86894)

Wed Mar 27 17:56:32 PDT 2024

efriedma-quic wrote:

I looked at some of the older reviews, and rechecked my research... a few more notes:

- It turns out the current commit message isn't precisely right. Current Cortex cores (X2 and later) have a free shift by 1 for integer loads... but no free shift by 1 or 4 for floating-point loads. Not sure if it's worth explicitly modeling int-shift vs. float-shift.
- https://reviews.llvm.org/D155470#4527270 suggests that we should default to AddrLSLSlow14... I'm not sure if that's the right choice. An explicit shift is guaranteed to increase latency, but an extra integer micro-op generated by a folded shift might not matter in a lot of cases.
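
To make the int-shift vs. float-shift distinction from the first note concrete, here is a minimal C++ sketch of what modeling the two separately could look like. It is not LLVM code: `AddrShiftModel`, its two fields, and `foldedShiftCostsExtra` are hypothetical names invented for illustration, and the sketch assumes that only shift amounts 1 and 4 are ever penalized (which is what the `AddrLSLSlow14` name suggests).

```cpp
// Hypothetical per-core model; these names do not exist in LLVM.
struct AddrShiftModel {
  bool slowIntShift1;  // integer load with a folded "lsl #1" costs extra
  bool slowFPShift14;  // FP load with a folded "lsl #1" or "lsl #4" costs extra
};

// Returns true if folding "lsl #shift" into the load's addressing mode is
// expected to cost an extra micro-op under this (assumed) model.
// Assumption: other shift amounts are never penalized.
bool foldedShiftCostsExtra(const AddrShiftModel &model, unsigned shift,
                           bool isFPLoad) {
  if (isFPLoad)
    return model.slowFPShift14 && (shift == 1 || shift == 4);
  // Integer loads cannot use a shift of 4, so only 1 matters here.
  return model.slowIntShift1 && shift == 1;
}
```

Under the behavior described above for X2 and later, the model would be `AddrShiftModel{false, true}`: a folded shift by 1 is free for an integer load but not for a floating-point load. Whether an extra micro-op actually justifies emitting an explicit shift is the open question in the second note, so this predicate alone would not settle the heuristic.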

https://github.com/llvm/llvm-project/pull/86894