[PATCH] D103230: [AMDGPU] Disable NSA for BVH instructions when appropriate

Fri Jul 30 00:59:33 PDT 2021

critson marked 4 inline comments as done.
critson added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4813-4814
+    LLT OpTy = LLT::fixed_vector(Ops.size(), 32);
+    Register MergedOps = MRI.createGenericVirtualRegister(OpTy);
+    B.buildMerge(MergedOps, Ops);
+    Ops.clear();
----------------
foad wrote:
> Nit: we generally avoid explicit createGenericVirtualRegister calls. You can write: `Register MergedOps = B.buildMerge(OpTy, Ops).getReg(0);`
Sure, I think I just copied the style of the code above.

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7414
+      // Build a single vector containing all the operands so far prepared.
+      const unsigned LaneCount = NumVAddrs <= 8 ? 8 : 16;
+      while (Ops.size() < LaneCount)
----------------
foad wrote:
> Same question as for globalisel: do we need to round up at all here? Rounding up to 8 certainly seems odd now that we have v5, v6, v7 classes.
The minimum BVH size is 256 bits (8 VGPRs), so the new MIMG v5/v6/v7 classes are not relevant here.
I have, however, rewritten this code so that it only does anything when there are more than 8 VGPRs.
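
To make the difference concrete, here is a minimal standalone sketch of the two padding rules. It is a model, not the code in the patch: `Operand`, `kUndef`, `padAlways` and `padAboveEight` are hypothetical names, a plain integer stands in for each 32-bit operand, and the list size is used in place of `NumVAddrs`. `padAlways` mirrors the rounding in the snippet quoted above; `padAboveEight` is only one assumed reading of "only do anything above 8 VGPRs".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for one 32-bit address operand (hypothetical; the real code uses SDValue).
using Operand = std::uint32_t;
// Hypothetical placeholder value for an undefined padding lane.
constexpr Operand kUndef = 0;

// Rounding rule from the quoted snippet: pad to 8 lanes, or to 16 when more than 8 are used.
std::vector<Operand> padAlways(std::vector<Operand> ops) {
  assert(ops.size() <= 16 && "model only covers up to 16 lanes");
  const std::size_t laneCount = ops.size() <= 8 ? 8 : 16;
  while (ops.size() < laneCount)
    ops.push_back(kUndef);
  return ops;
}

// Assumed reading of the rewrite: lists of up to 8 operands are left untouched,
// and only larger lists are padded to 16 lanes.
std::vector<Operand> padAboveEight(std::vector<Operand> ops) {
  assert(ops.size() <= 16 && "model only covers up to 16 lanes");
  if (ops.size() <= 8)
    return ops;
  while (ops.size() < 16)
    ops.push_back(kUndef);
  return ops;
}
```

Under this reading, the two rules differ only for lists shorter than 8 operands, which the 256-bit minimum should rule out for BVH anyway.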

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103230/new/

https://reviews.llvm.org/D103230