[llvm] [RISCV] Account for factor in interleave memory op costs (PR #111511)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 11 12:24:10 PDT 2024
lukel97 wrote:
> Other factors are handled one segment at a time. If the segment is more than DLEN bits then each segment will require multiple cycles. So the cost is something like VL*ceil(DLEN/(SEW*factor)).
At NF > 4 I'm seeing the same behaviour on the Banana Pi too, where the cost scales with VL. Do we need some sort of subtarget hook to specify which factors take the wide-load fast path and which don't?
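To make the segment-at-a-time model concrete, here is a rough sketch of how such a cost could be computed. The function names and the `MaxFastPathFactor` threshold are hypothetical and not part of LLVM's cost model or this PR. It also assumes that "multiple cycles per segment" means ceil((SEW*factor)/DLEN) DLEN-wide beats per segment, which is my reading of the description above rather than something confirmed for any particular core.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): number of DLEN-wide beats needed to
// move one segment, where a segment is `Factor` fields of SEW bits each.
constexpr uint64_t beatsPerSegment(uint64_t SEW, uint64_t Factor,
                                   uint64_t DLEN) {
  uint64_t SegmentBits = SEW * Factor;
  return (SegmentBits + DLEN - 1) / DLEN; // ceil(SegmentBits / DLEN)
}

// Hypothetical cost of the slow path: one segment is processed per element,
// so the cost scales linearly with VL.
constexpr uint64_t segmentAtATimeCost(uint64_t VL, uint64_t SEW,
                                      uint64_t Factor, uint64_t DLEN) {
  return VL * beatsPerSegment(SEW, Factor, DLEN);
}

// Hypothetical stand-in for a subtarget hook: factors up to
// MaxFastPathFactor are assumed to take the wide-load fast path.
constexpr bool usesFastPath(uint64_t Factor, uint64_t MaxFastPathFactor) {
  return Factor <= MaxFastPathFactor;
}
```

For example, with VL = 8, SEW = 32, factor 8 and DLEN = 128, each segment is 256 bits and needs two beats, so this sketch gives a cost of 16. The value that would feed `MaxFastPathFactor` is exactly what a per-subtarget hook would have to supply.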
> For x280, I think factor 2 is handled pretty much like a unit-stride load. It reads/writes two registers at once.
Does it also have some sort of shuffling overhead similar to the costing currently in this PR?
https://github.com/llvm/llvm-project/pull/111511