[llvm] [NVPTX] Vectorize and lower 256-bit global loads/stores for sm_100+/ptx88+ (PR #139292)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Fri May 9 10:24:31 PDT 2025


================
@@ -1328,14 +1349,33 @@ bool NVPTXDAGToDAGISel::tryLDGLDU(SDNode *N) {
     Opcode = pickOpcodeForVT(
         EltVT.getSimpleVT().SimpleTy, NVPTX::INT_PTX_LDG_G_v4i8_ELE,
         NVPTX::INT_PTX_LDG_G_v4i16_ELE, NVPTX::INT_PTX_LDG_G_v4i32_ELE,
-        std::nullopt, NVPTX::INT_PTX_LDG_G_v4f32_ELE, std::nullopt);
+        NVPTX::INT_PTX_LDG_G_v4i64_ELE, NVPTX::INT_PTX_LDG_G_v4f32_ELE,
+        NVPTX::INT_PTX_LDG_G_v4f64_ELE);
     break;
   case NVPTXISD::LDUV4:
     Opcode = pickOpcodeForVT(
         EltVT.getSimpleVT().SimpleTy, NVPTX::INT_PTX_LDU_G_v4i8_ELE,
         NVPTX::INT_PTX_LDU_G_v4i16_ELE, NVPTX::INT_PTX_LDU_G_v4i32_ELE,
         std::nullopt, NVPTX::INT_PTX_LDU_G_v4f32_ELE, std::nullopt);
     break;
+  case NVPTXISD::LoadV8:
+    switch (EltVT.getSimpleVT().SimpleTy) {
+    case MVT::i32:
+      Opcode = NVPTX::INT_PTX_LDG_G_v8i32_ELE;
+      break;
----------------
AlexMaclean wrote:

Is there a reason we're not using `pickOpcodeForVT` for this case?

https://github.com/llvm/llvm-project/pull/139292


More information about the llvm-commits mailing list