[llvm] [NVPTX] legalize v2i32 to improve compatibility with v2f32 (PR #153478)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 2 10:54:21 PDT 2025


Artem-B wrote:

This patch appears to fix the crash.

That said, we should be able to lower the v2i32->v2i16 truncation into a single PRMT instruction. LLVM currently generates two PRMT instructions that could be combined. We could use custom lowering for the op, which would help on all GPU variants.

```
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 3ac7c2874408..48e539037dcc 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -638,6 +638,9 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
   // No support for these operations with v2f32/v2i32
   setOperationAction(ISD::INSERT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32}, Expand);
   setOperationAction(ISD::VECTOR_SHUFFLE, {MVT::v2f32, MVT::v2i32}, Expand);
+
+  setOperationAction(ISD::TRUNCATE, MVT::v2i16, Expand);
+
   // Need custom lowering in case the index is dynamic.
   if (STI.hasF32x2Instructions())
     setOperationAction(ISD::EXTRACT_VECTOR_ELT, {MVT::v2f32, MVT::v2i32},
```
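
To illustrate the single-PRMT point, here is a rough host-side sketch that emulates `prmt.b32` in its default mode and uses it to pack the low halves of two 32-bit lanes. The helper names `prmt` and `truncV2I32ToV2I16` are hypothetical and not part of the patch, and the sketch assumes the two i32 elements are available as separate 32-bit values; the selector `0x5410` picks bytes 0 and 1 of the first operand and bytes 0 and 1 of the second.

```cpp
#include <cstdint>

// Emulates PTX prmt.b32 in default (generic) mode. Hypothetical helper for
// illustration only. Bytes are numbered 0..3 in `a` and 4..7 in `b`.
std::uint32_t prmt(std::uint32_t a, std::uint32_t b, std::uint32_t selector) {
  const std::uint64_t pool = (static_cast<std::uint64_t>(b) << 32) | a;
  std::uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
    // Each 4-bit selector field controls one byte of the result.
    const std::uint32_t nibble = (selector >> (4 * i)) & 0xF;
    std::uint32_t byte =
        static_cast<std::uint32_t>(pool >> (8 * (nibble & 0x7))) & 0xFF;
    // Bit 3 of the field replicates the sign bit of the selected byte.
    if (nibble & 0x8)
      byte = (byte & 0x80) ? 0xFF : 0x00;
    result |= byte << (8 * i);
  }
  return result;
}

// Truncates lanes x0 and x1 to 16 bits each and packs them as a v2i16:
// lane 0 in the low half, lane 1 in the high half. One permute suffices.
std::uint32_t truncV2I32ToV2I16(std::uint32_t x0, std::uint32_t x1) {
  return prmt(x0, x1, 0x5410);
}
```

For example, `truncV2I32ToV2I16(0x11223344, 0xAABBCCDD)` yields `0xCCDD3344`.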

https://github.com/llvm/llvm-project/pull/153478

