[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 10 09:30:41 PDT 2025


================
@@ -3595,6 +3635,9 @@ NVPTXTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
   const auto VectorInfo = VectorizePTXValueVTs(VTs, Offsets, RetAlign);
   unsigned I = 0;
   for (const unsigned NumElts : VectorInfo) {
+    // amount to subdivide. If not v2f32, we don't consider packing
+    const unsigned PackingAmt = VTs[I] == MVT::v2f32 ? 2 : 1;
----------------
AlexMaclean wrote:

Why only do this for v2f32?
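
For context, the choice being questioned can be sketched in isolation. This is a standalone illustration, not code from the PR: `SimpleVT` and `packingAmt` are hypothetical names standing in for LLVM's `MVT` and the ternary in the hunk, and `v2f16` is included only as an example of another two-element vector type that the check as written treats as unpacked.

```cpp
// Hypothetical stand-in for the few value types relevant here.
// This is NOT LLVM's MVT enum; it exists only to make the sketch compile alone.
enum class SimpleVT { f32, v2f16, v2f32 };

// Mirrors the ternary in the hunk: only v2f32 gets a packing amount of 2,
// every other type falls through to 1 (no subdivision considered).
constexpr unsigned packingAmt(SimpleVT VT) {
  return VT == SimpleVT::v2f32 ? 2u : 1u;
}

// Compile-time checks of the behaviour described above.
static_assert(packingAmt(SimpleVT::v2f32) == 2, "v2f32 is subdivided by 2");
static_assert(packingAmt(SimpleVT::v2f16) == 1, "v2f16 is not considered");
static_assert(packingAmt(SimpleVT::f32) == 1, "scalars are not considered");
```

Whether other vector types should also take the packed path depends on how the rest of `LowerReturn` uses `PackingAmt`, which the quoted hunk does not show.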

https://github.com/llvm/llvm-project/pull/126337

