[llvm] [RISCV] Account for factor in interleave memory op costs (PR #111511)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 11 12:24:10 PDT 2024


lukel97 wrote:

> Other factors are handled one segment at a time. If the segment is more than DLEN bits then each segment will require multiple cycles. So the cost is something like VL*ceil((SEW*factor)/DLEN).

At NF > 4 I'm seeing the same behaviour on the Banana Pi too, where the cost scales with VL. Do we need some sort of subtarget hook to specify which factors go down the wide-load fast path and which ones don't?
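
For concreteness, here is a rough sketch of the per-segment model described in the quoted comment, reading the ratio as segment width over DLEN (which is what the "more than DLEN bits" wording implies). The function name and parameters are hypothetical and not part of the RISC-V TTI; the numbers in the checks are illustrative, not measurements from any core.

```cpp
#include <cstdint>

// Hypothetical helper, not an existing LLVM API.
// Estimates cycles for a segmented access that is processed one segment
// at a time: each segment is SEWBits * Factor bits wide and takes
// ceil(SegmentBits / DLenBits) cycles, repeated for each of the VL segments.
constexpr uint64_t segmentedAccessCost(uint64_t VL, uint64_t SEWBits,
                                       uint64_t Factor, uint64_t DLenBits) {
  uint64_t SegmentBits = SEWBits * Factor;
  // Integer ceiling division.
  uint64_t CyclesPerSegment = (SegmentBits + DLenBits - 1) / DLenBits;
  return VL * CyclesPerSegment;
}

// Illustrative values only (DLEN = 128 is an assumption).
// Factor 8, SEW 32: 256-bit segments need 2 cycles each.
static_assert(segmentedAccessCost(8, 32, 8, 128) == 16, "");
// Factor 3, SEW 32: 96-bit segments fit in one DLEN-wide beat.
static_assert(segmentedAccessCost(8, 32, 3, 128) == 8, "");
```

Under this model the cost is linear in VL for every factor that misses the fast path, so a subtarget hook would only need to say which factors take the wide-load path.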

> For x280, I think factor 2 is handled pretty much like a unit stride load. It reads/writes two registers at once.

Does it also have some sort of shuffling overhead similar to the costing currently in this PR?

https://github.com/llvm/llvm-project/pull/111511

