[llvm] [NVPTX] support packed f32 instructions for sm_100+ (PR #126337)

Alex MacLean via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 22 15:24:02 PDT 2025


================
@@ -5681,6 +6079,150 @@ static SDValue combineADDRSPACECAST(SDNode *N,
   return SDValue();
 }
 
+static SDValue PerformFP_ROUNDCombine(SDNode *N,
+                                      TargetLowering::DAGCombinerInfo &DCI) {
+  SDLoc DL(N);
+  SDValue Op = N->getOperand(0);
+  SDValue Trunc = N->getOperand(1);
+  EVT NarrowVT = N->getValueType(0);
+  EVT WideVT = Op.getValueType();
+
+  // v2[b]f16 = fp_round (v2f32 A)
+  // -> v2[b]f16 = (build_vector ([b]f16 = fp_round (extractelt A, 0)),
+  //                             ([b]f16 = fp_round (extractelt A, 1)))
----------------
AlexMaclean wrote:

Why is this required?
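
For context, the comment in the hunk describes splitting one vector fp_round into two scalar fp_rounds whose results are recombined with build_vector. The standalone model below is only a sketch of that shape, not the code from this PR: `roundToBF16` and `scalarizedRound` are hypothetical names, bf16 is assumed as the narrow type, rounding is assumed to be round-to-nearest-even, and NaN inputs are not handled.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a scalar "bf16 = fp_round f32".
// Assumes round-to-nearest-even; NaN inputs are deliberately not handled.
static uint16_t roundToBF16(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  // Add 0x7FFF plus the lowest kept bit so that ties round to even.
  uint32_t Bias = 0x7FFFu + ((Bits >> 16) & 1u);
  return static_cast<uint16_t>((Bits + Bias) >> 16);
}

// Models: v2bf16 = build_vector (fp_round (extractelt A, 0)),
//                               (fp_round (extractelt A, 1))
static std::array<uint16_t, 2> scalarizedRound(const std::array<float, 2> &A) {
  return {roundToBF16(A[0]), roundToBF16(A[1])};
}
```

Each lane is rounded independently, so the result matches what an element-wise vector round would produce; the open question is why the combine is needed at all.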

https://github.com/llvm/llvm-project/pull/126337

