[PATCH] D154528: [AMDGPU][GlobalISel] Generate fast fp64-to-fp16 conversions in unsafe mode.

Thu Jul 6 08:15:16 PDT 2023

kosarev added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/VOP1Instructions.td:519-522
+def : GCNPat<
+    (f16 (fpround f64:$src)),
+    (V_CVT_F16_F32_e64 SRCMODS.NONE, (V_CVT_F32_F64_e64 SRCMODS.NONE, $src))
+>;
----------------
foad wrote:
> kosarev wrote:
> > arsenm wrote:
> > > Should just let the expansion split them up, this is missing the source modifiers
> > Updated to match modifiers. The expansion would combine back to a single fpround f64, so doesn't work.
> "The expansion would combine back to a single fpround f64" - that sounds like the combine is broken then? It should respect legality.
It's trivial to explain it to GISel that f16 = fp_round f64 should be expanded while f16 = fp_round f32 is legal as it is. However, in SDAG, if I'm not wrong, we can only differentiate by the resulting type, and if we choose custom expansion for all to-f16 roundings, then SDAG will try to combine whatever they result in.

It's not an immediate problem for this patch because we always do FP_TO_FP16, because we don't have 16-bit registers here yet. For True16, however, if we want it be a chain of two FP_ROUNDs, the expansion will be combined back. But then if we need the pattern for the True16 case anyway, then I'm not sure doing special work in GlobalISel and for the non-16bit cases is worth it. Would appreciate your opinions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154528/new/

https://reviews.llvm.org/D154528