[llvm] AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (PR #141883)

Shilei Tian via llvm-commits llvm-commits at lists.llvm.org
Wed May 28 19:21:24 PDT 2025


================
@@ -2749,6 +2752,20 @@ bool AMDGPULegalizerInfo::legalizeMinNumMaxNum(LegalizerHelper &Helper,
   return Helper.lowerFMinNumMaxNum(MI) == LegalizerHelper::Legalized;
 }
 
+bool AMDGPULegalizerInfo::legalizeFPTrunc(LegalizerHelper &Helper,
+                                          MachineInstr &MI,
+                                          MachineRegisterInfo &MRI) const {
+  Register DstReg = MI.getOperand(0).getReg();
+  LLT DstTy = MRI.getType(DstReg);
+  assert (DstTy.isVector() && DstTy.getNumElements() > 2);
+  LLT EltTy = DstTy.getElementType();
+  assert (EltTy == S16 && "Only handle vectors of half");
+
+  // Split vector to packs.
+  return Helper.fewerElementsVector(MI, 0, LLT::fixed_vector(2, EltTy)) ==
----------------
shiltian wrote:

```suggestion
  return Helper.fewerElementsVector(MI, /*TypeIdx=*/0, LLT::fixed_vector(2, EltTy)) ==
```
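
For context, the suggestion applies the LLVM coding-standards convention of labelling a bare literal argument with an inline `/*ParamName=*/` comment, so that the call site shows what the `0` binds to. Below is a minimal standalone sketch of that convention, together with the kind of split into packs of two that the diff comment describes. `splitIntoPacks` and `splitV4` are hypothetical helpers written for illustration only. They are not LLVM APIs and do not claim to mirror how `fewerElementsVector` handles leftover elements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): returns the sizes of the pieces
// produced when NumElts elements are split into packs of PackSize.
// Any leftover elements form a final, smaller piece (an assumption made
// for this sketch only).
std::vector<std::size_t> splitIntoPacks(std::size_t NumElts,
                                        std::size_t PackSize) {
  std::vector<std::size_t> Pieces;
  if (PackSize == 0)
    return Pieces; // Nothing sensible to do; avoid looping forever.
  for (std::size_t Remaining = NumElts; Remaining > 0;) {
    std::size_t Piece = Remaining < PackSize ? Remaining : PackSize;
    Pieces.push_back(Piece);
    Remaining -= Piece;
  }
  return Pieces;
}

// Hypothetical call site: the argument comments name the parameter that
// each literal binds to, so the reader need not look up the declaration.
std::vector<std::size_t> splitV4() {
  return splitIntoPacks(/*NumElts=*/4, /*PackSize=*/2);
}
```

Without the comments, `splitIntoPacks(4, 2)` leaves the reader to guess which literal is the element count and which is the pack size.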

https://github.com/llvm/llvm-project/pull/141883

