[PATCH] D145155: [RISCV] Enable interleaved access vectorization

Mon Mar 6 08:28:39 PST 2023

reames requested changes to this revision.
reames added inline comments.
This revision now requires changes to proceed.

================
Comment at: llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp:347
+      CostKind == TTI::TCK_RecipThroughput)
+    return Factor;
+  return BaseT::getInterleavedMemoryOpCost(Opcode, VecTy, Factor, Indices,
----------------
This doesn't look right.  We need to account for the cost of the actual memory op, plus the interleave (shuffle) cost, if any.

At a minimum, we need the full cost of the wide memory op as a baseline.  I can't imagine hardware with an optimized segment-2 load/store that beats the cost of a normal load/store op of the same width.

The only question left is whether we need to explicitly model the shuffle cost.  Depending on the hardware, we may or may not have an optimized segment load/store.  

I think it's probably safest to cost model this as if we're going to do a wide load followed by a shuffle.  We can reduce that cost if we have a target which a) actually has faster segment-2, and b) cares about the cost difference.  
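
To make the shape of that concrete, here is a rough standalone sketch, not the actual TTI code.  Everything in it is a hypothetical stand-in: the struct, its fields, the function name, and the HasFastSegmentOps flag are invented for illustration, and costs are plain integers rather than InstructionCost.

```cpp
#include <cstdint>

// Hypothetical inputs; none of these names exist in LLVM.
struct InterleaveCostInputs {
  uint64_t WideMemOpCost;  // cost of one load/store covering the whole interleave group
  uint64_t ShuffleCost;    // cost of the (de)interleaving shuffle
  bool HasFastSegmentOps;  // assumed target property: segment op is as cheap as a plain wide op
};

// Conservative model: wide memory op followed by a shuffle.
// The result is never lower than the cost of the wide memory op itself.
inline uint64_t interleavedMemoryOpCost(const InterleaveCostInputs &In) {
  // Only a target that really has fast segment load/store drops the shuffle term.
  if (In.HasFastSegmentOps)
    return In.WideMemOpCost;
  return In.WideMemOpCost + In.ShuffleCost;
}
```

The point is just that the wide memory op is the floor, and the shuffle term is the only part a target should be able to opt out of.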

================
Comment at: llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll:5
+
+define void @load_store_factor2(ptr %p) {
+; CHECK-LABEL: @load_store_factor2(
----------------
Can you add a couple of tests for SEW < 64 bits?

Also, you should probably add an actual CostModel test rather than relying on indirectly testing this through the vectorizer.  

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145155/new/

https://reviews.llvm.org/D145155