[llvm] AMDGPU: Custom lower fptrunc vectors for f32 -> f16 (PR #141883)
Changpeng Fang via llvm-commits
llvm-commits at lists.llvm.org
Thu May 29 00:32:59 PDT 2025
================
@@ -2749,6 +2753,21 @@ bool AMDGPULegalizerInfo::legalizeMinNumMaxNum(LegalizerHelper &Helper,
   return Helper.lowerFMinNumMaxNum(MI) == LegalizerHelper::Legalized;
 }
+bool AMDGPULegalizerInfo::legalizeFPTrunc(LegalizerHelper &Helper,
+                                          MachineInstr &MI,
+                                          MachineRegisterInfo &MRI) const {
+  Register DstReg = MI.getOperand(0).getReg();
+  LLT DstTy = MRI.getType(DstReg);
+  assert(DstTy.isVector() && DstTy.getNumElements() > 2);
+  LLT EltTy = DstTy.getElementType();
+  assert(EltTy == S16 && "Only handle vectors of half");
+
+  // Split vector to packs.
+  LLT PkTy = LLT::fixed_vector(2, EltTy);
+  return Helper.fewerElementsVector(MI, /*TypeIdx=*/0, PkTy) ==
+         LegalizerHelper::Legalized;
+}
+
----------------
changpeng wrote:
> This is just ordinary fewerElementsVector, this doesn't need to be custom

Are you suggesting `.fewerElementsIf`? I am having difficulty writing a predicate that checks both the Src and Dst types. I think the FP cast is special in that Src and Dst have different types.
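
For illustration only, here is a rough sketch of what a predicate that inspects both type indices might look like. It does not use LLVM headers: `Ty` and `Query` are hypothetical stand-ins for `LLT` and `LegalityQuery`, and `isWideF32ToF16Trunc` and `toPackOfTwo` are made-up names. It assumes that for G_FPTRUNC type index 0 is the destination and type index 1 is the source.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::LLT: element count plus element width.
struct Ty {
  unsigned NumElts; // 1 means scalar
  unsigned EltBits;
  bool isVector() const { return NumElts > 1; }
  bool operator==(const Ty &O) const {
    return NumElts == O.NumElts && EltBits == O.EltBits;
  }
};

// Hypothetical stand-in for llvm::LegalityQuery.
// Assumption: Types[0] is the destination, Types[1] is the source.
struct Query {
  std::vector<Ty> Types;
};

using Predicate = std::function<bool(const Query &)>;
using Mutation = std::function<std::pair<unsigned, Ty>(const Query &)>;

// Matches a vector truncation from 32-bit to 16-bit elements with more
// than two elements, checking the source and destination types together.
Predicate isWideF32ToF16Trunc() {
  return [](const Query &Q) {
    const Ty &Dst = Q.Types[0];
    const Ty &Src = Q.Types[1];
    return Dst.isVector() && Dst.NumElts > 2 && Dst.EltBits == 16 &&
           Src.NumElts == Dst.NumElts && Src.EltBits == 32;
  };
}

// Requests a two-element pack for type index 0, keeping the element width.
Mutation toPackOfTwo() {
  return [](const Query &Q) {
    return std::make_pair(0u, Ty{2, Q.Types[0].EltBits});
  };
}
```

In the real rule set the pair would presumably be passed as `.fewerElementsIf(Predicate, Mutation)`. Whether splitting type index 0 this way also splits the source correctly for G_FPTRUNC is exactly the open question in this thread, and the sketch does not settle it.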
https://github.com/llvm/llvm-project/pull/141883