[llvm] [NVPTX] legalize v2i32 to improve compatibility with v2f32 (PR #153478)
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 2 10:54:21 PDT 2025
Artem-B wrote:
This patch appears to fix the crash.
That said, we should be able to lower the v2i32 -> v2i16 truncation into a single PRMT instruction. LLVM currently generates two PRMT instructions that could be combined. We could use custom lowering for the op, as that would help on all GPU variants.
```
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 3ac7c2874408..48e539037dcc 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -638,6 +638,9 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
   // No support for these operations with v2f32/v2i32
   setOperationAction(ISD::INSERT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32}, Expand);
   setOperationAction(ISD::VECTOR_SHUFFLE, {MVT::v2f32, MVT::v2i32}, Expand);
+
+  setOperationAction(ISD::TRUNCATE, MVT::v2i16, Expand);
+
   // Need custom lowering in case the index is dynamic.
   if (STI.hasF32x2Instructions())
     setOperationAction(ISD::EXTRACT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32},
```
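To illustrate the single-PRMT idea mentioned above, here is a rough host-side sketch that emulates `prmt.b32` in its default mode and shows one selector that would pack the low halves of two 32-bit values into a v2i16-style word. It is not taken from the patch: the helper names `prmt` and `truncV2I32ToV2I16` are hypothetical, the selector `0x5410` is derived from the PTX byte-numbering rules rather than from LLVM output, and it assumes element 0 of the v2i32 sits in `lo` and element 1 in `hi`.

```cpp
#include <cstdint>

// Emulates PTX prmt.b32 in default mode: bytes 0-3 come from `a`,
// bytes 4-7 come from `b`. Each selector nibble picks one source byte;
// if bit 3 of the nibble is set, the byte's sign bit is replicated instead.
constexpr uint32_t prmt(uint32_t a, uint32_t b, uint32_t sel) {
  uint64_t src = (static_cast<uint64_t>(b) << 32) | a;
  uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    uint32_t nibble = (sel >> (4 * i)) & 0xF;
    uint32_t byte = static_cast<uint32_t>(src >> (8 * (nibble & 0x7))) & 0xFF;
    if (nibble & 0x8)
      byte = (byte & 0x80) ? 0xFF : 0x00;
    result |= byte << (8 * i);
  }
  return result;
}

// Hypothetical helper: truncates {lo, hi} (a v2i32) to a packed v2i16.
// Selector 0x5410 takes bytes 0,1 of `lo` and bytes 0,1 of `hi` (indices 4,5).
constexpr uint32_t truncV2I32ToV2I16(uint32_t lo, uint32_t hi) {
  return prmt(lo, hi, 0x5410);
}
```

For example, `truncV2I32ToV2I16(0x11223344, 0xAABBCCDD)` keeps `0x3344` from the first element and `0xCCDD` from the second, giving `0xCCDD3344`.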
https://github.com/llvm/llvm-project/pull/153478