[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)

Tue Jun 10 09:30:41 PDT 2025

================
@@ -3454,19 +3499,15 @@ SDValue NVPTXTargetLowering::LowerFormalArguments(
       unsigned I = 0;
       for (const unsigned NumElts : VectorInfo) {
         const EVT EltVT = VTs[I];
-        const EVT LoadVT = [&]() -> EVT {
-          // i1 is loaded/stored as i8.
-          if (EltVT == MVT::i1)
-            return MVT::i8;
-          // getLoad needs a vector type, but it can't handle
-          // vectors which contain v2f16 or v2bf16 elements. So we must load
-          // using i32 here and then bitcast back.
-          if (EltVT.isVector())
-            return MVT::getIntegerVT(EltVT.getFixedSizeInBits());
-          return EltVT;
-        }();
-
-        const EVT VecVT = EVT::getVectorVT(F->getContext(), LoadVT, NumElts);
+        // i1 is loaded/stored as i8
+        const EVT LoadVT = EltVT == MVT::i1 ? MVT::i8 : EltVT;
+        // If the element is v2f16/v2bf16/v2f32/etc, then it really holds 2
+        // packed elements of type f16/bf16/f32.
+        const unsigned PackingAmt =
----------------
AlexMaclean wrote:

It seems like this change, which impacts v2 16-bit types as well, is contributing most of the test diffs by causing lots of ld.params to be re-ordered. Maybe this can be pulled out too?

https://github.com/llvm/llvm-project/pull/126337