[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)

Princeton Ferro via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 10 14:19:42 PDT 2025


================
@@ -653,12 +658,16 @@ NVPTXTargetLowering::NVPTXTargetLowering(const NVPTXTargetMachine &TM,
 
   // Operations not directly supported by NVPTX.
   for (MVT VT : {MVT::bf16, MVT::f16, MVT::v2bf16, MVT::v2f16, MVT::f32,
-                 MVT::f64, MVT::i1, MVT::i8, MVT::i16, MVT::v2i16, MVT::v4i8,
-                 MVT::i32, MVT::i64}) {
+                 MVT::v2f32, MVT::f64, MVT::i1, MVT::i8, MVT::i16, MVT::v2i16,
+                 MVT::v4i8, MVT::i32, MVT::i64}) {
     setOperationAction(ISD::SELECT_CC, VT, Expand);
     setOperationAction(ISD::BR_CC, VT, Expand);
   }
 
+  // Not directly supported. TLI would attempt to expand operations like
+  // FMINIMUM(v2f32) using invalid SETCC and VSELECT nodes.
+  setOperationAction(ISD::VSELECT, MVT::v2f32, Expand);
----------------
Prince781 wrote:

`FMINIMUM` / `FMAXIMUM` on `v2f16` are legal and lower to `{min,max}.NaN.f16x2`, so TLI never expands them. I believe we can expand `VSELECT` for every type other than `v4i8` (for which we have a combiner rule).
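
For readers unfamiliar with what expanding `VSELECT` means in practice: one way legalization can handle it is to unroll the node into an independent scalar select per lane (a mask-based expansion with bitwise operations is another possibility). The standalone C++ model below only illustrates that per-lane semantics for a two-lane f32 vector. `V2F32`, `V2I1`, and `expandVSelect` are hypothetical names for this sketch, not LLVM APIs, and it does not reproduce the legalizer code.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for v2f32 and v2i1 values; these are not LLVM types.
using V2F32 = std::array<float, 2>;
using V2I1 = std::array<bool, 2>;

// Models an unrolled VSELECT: every lane becomes its own scalar select,
// picking from TrueV when the condition lane is set and from FalseV otherwise.
V2F32 expandVSelect(const V2I1 &Cond, const V2F32 &TrueV,
                    const V2F32 &FalseV) {
  V2F32 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = Cond[I] ? TrueV[I] : FalseV[I];
  return Result;
}
```

Each lane is chosen independently, which is why the expanded form only needs scalar `SELECT` nodes and never a vector `SETCC` result on `v2f32`.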

https://github.com/llvm/llvm-project/pull/126337

