[llvm] [NVPTX] Vectorize and lower 256-bit global loads/stores for sm_100+/ptx88+ (PR #139292)
Drew Kersnar via llvm-commits
llvm-commits at lists.llvm.org
Fri May 9 10:35:42 PDT 2025
================
@@ -1328,14 +1349,33 @@ bool NVPTXDAGToDAGISel::tryLDGLDU(SDNode *N) {
     Opcode = pickOpcodeForVT(
         EltVT.getSimpleVT().SimpleTy, NVPTX::INT_PTX_LDG_G_v4i8_ELE,
         NVPTX::INT_PTX_LDG_G_v4i16_ELE, NVPTX::INT_PTX_LDG_G_v4i32_ELE,
-        std::nullopt, NVPTX::INT_PTX_LDG_G_v4f32_ELE, std::nullopt);
+        NVPTX::INT_PTX_LDG_G_v4i64_ELE, NVPTX::INT_PTX_LDG_G_v4f32_ELE,
+        NVPTX::INT_PTX_LDG_G_v4f64_ELE);
     break;
   case NVPTXISD::LDUV4:
     Opcode = pickOpcodeForVT(
         EltVT.getSimpleVT().SimpleTy, NVPTX::INT_PTX_LDU_G_v4i8_ELE,
         NVPTX::INT_PTX_LDU_G_v4i16_ELE, NVPTX::INT_PTX_LDU_G_v4i32_ELE,
         std::nullopt, NVPTX::INT_PTX_LDU_G_v4f32_ELE, std::nullopt);
     break;
+  case NVPTXISD::LoadV8:
+    switch (EltVT.getSimpleVT().SimpleTy) {
+    case MVT::i32:
+      Opcode = NVPTX::INT_PTX_LDG_G_v8i32_ELE;
+      break;
----------------
dakersnar wrote:
At the time, it was because I didn't want to change the helper to take optionals for all of its arguments; currently you can only pass nullopt for the i64 and f64 slots. But I think that seems like a reasonable change.
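
For illustration, a minimal standalone sketch of what "optionals for all the arguments" could look like. This is not the actual `pickOpcodeForVT` from the NVPTX backend: `SimpleTy`, `pickOpcode`, `pickV8LoadOpcode`, and the `HYPO_*` opcode values are hypothetical stand-ins so that the example compiles without LLVM headers. Only the i32 slot is filled for the 8-element case, since that is the only case visible in the hunk above.

```cpp
#include <optional>

// Hypothetical stand-in for MVT::SimpleValueType (assumption, not LLVM's enum).
enum class SimpleTy { i8, i16, i32, i64, f32, f64, Other };

// Hypothetical opcode numbers standing in for NVPTX::INT_PTX_* values.
enum HypoOpcode : unsigned { HYPO_LDG_V8I32 = 1 };

// Every per-type slot is optional, so a caller can pass std::nullopt for any
// element type that has no matching instruction, not only i64 and f64.
constexpr std::optional<unsigned>
pickOpcode(SimpleTy VT, std::optional<unsigned> OpI8,
           std::optional<unsigned> OpI16, std::optional<unsigned> OpI32,
           std::optional<unsigned> OpI64, std::optional<unsigned> OpF32,
           std::optional<unsigned> OpF64) {
  switch (VT) {
  case SimpleTy::i8:
    return OpI8;
  case SimpleTy::i16:
    return OpI16;
  case SimpleTy::i32:
    return OpI32;
  case SimpleTy::i64:
    return OpI64;
  case SimpleTy::f32:
    return OpF32;
  case SimpleTy::f64:
    return OpF64;
  default:
    return std::nullopt; // Unsupported element type.
  }
}

// With all slots optional, an 8-element case could reuse the helper instead
// of a nested switch; types without an opcode are simply left empty.
constexpr std::optional<unsigned> pickV8LoadOpcode(SimpleTy VT) {
  return pickOpcode(VT, /*OpI8=*/std::nullopt, /*OpI16=*/std::nullopt,
                    /*OpI32=*/HYPO_LDG_V8I32, /*OpI64=*/std::nullopt,
                    /*OpF32=*/std::nullopt, /*OpF64=*/std::nullopt);
}
```

Under these assumptions the caller checks the returned optional and bails out when it is empty, the same way it would for any unsupported type.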
https://github.com/llvm/llvm-project/pull/139292