[PATCH] D103230: [AMDGPU] Disable NSA for BVH instructions when appropriate

Tue Jul 27 02:19:59 PDT 2021

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4813-4814
+    LLT OpTy = LLT::fixed_vector(Ops.size(), 32);
+    Register MergedOps = MRI.createGenericVirtualRegister(OpTy);
+    B.buildMerge(MergedOps, Ops);
+    Ops.clear();
----------------
Nit: we generally avoid explicit createGenericVirtualRegister calls. You can write: `Register MergedOps = B.buildMerge(OpTy, Ops).getReg(0);`

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7414
+      // Build a single vector containing all the operands so far prepared.
+      const unsigned LaneCount = NumVAddrs <= 8 ? 8 : 16;
+      while (Ops.size() < LaneCount)
----------------
Same question as for GlobalISel: do we need to round up at all here? Rounding up to 8 certainly seems odd now that we have v5, v6, v7 classes.
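
To make the question concrete, here is a rough standalone sketch of the two policies. The helper names are hypothetical and are not LLVM APIs. It assumes NumVAddrs is at most 16, that a register class exists for every width up to 8 dwords, and that anything wider still has to be padded to 16.

```cpp
// Hypothetical helpers, not LLVM APIs: they only model the lane-count choice.
// Assumes NumVAddrs <= 16.

// Policy in the patch: pad the address vector to 8 or 16 dwords.
constexpr unsigned roundedLaneCount(unsigned NumVAddrs) {
  return NumVAddrs <= 8 ? 8 : 16;
}

// Alternative (assumption): a register class exists for every width up to
// 8 dwords, including v5, v6 and v7, so only wider vectors are padded to 16.
constexpr unsigned exactLaneCount(unsigned NumVAddrs) {
  return NumVAddrs <= 8 ? NumVAddrs : 16;
}

// Number of padding elements that would be appended for a given lane count.
constexpr unsigned paddingNeeded(unsigned NumVAddrs, unsigned LaneCount) {
  return LaneCount - NumVAddrs;
}

// With 5 address dwords the patch appends 3 padding elements, while the
// alternative appends none.
static_assert(paddingNeeded(5, roundedLaneCount(5)) == 3, "rounded policy");
static_assert(paddingNeeded(5, exactLaneCount(5)) == 0, "exact policy");
```

If the exact-width variant is viable, the padding loop only runs for address counts above 8.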

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7416
+      while (Ops.size() < LaneCount)
+        Ops.push_back(DAG.getConstant(0, DL, MVT::i32));
+
----------------
Can we use undef (e.g. `DAG.getUNDEF(MVT::i32)`) instead of zero to avoid having to materialise a constant?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103230/new/

https://reviews.llvm.org/D103230