[llvm] [AMDGPU] Fix canonicalization of truncated values. (PR #83054)

Tue Mar 5 02:35:34 PST 2024

================
@@ -26258,6 +26258,24 @@ SDValue DAGCombiner::visitFP_TO_FP16(SDNode *N) {
   if (N0->getOpcode() == ISD::FP16_TO_FP)
     return N0->getOperand(0);
 
+  // fold (fp_to_fp16 (freeze (fp16_to_fp (fp_to_fp16 op))))
----------------
jayfoad wrote:

Off the top of my head, we could either:
1. Add a "must canonicalize" version of fp_round and/or fp_extend. But this does not sound very extensible, since we might also need "must canonicalize" versions of any other fp op that could be optimized away. Or:
2. Do not optimize away fcanonicalize as a DAG combine, but detect at instruction selection time when it is redundant because the input is already canonicalized. I guess this is what @arsenm suggested earlier (sorry I'm just catching up now). Is it hard to do this (without significant regressions)?
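
To make option 2 concrete, here is a minimal toy sketch in standalone C++. It is not LLVM code: `Node`, `Op`, `isKnownCanonical`, `emit` and `selectTree` are hypothetical names invented for illustration, and the table of which ops produce canonical results is an assumption of the toy, not a statement about any real target. The idea is only that fcanonicalize stays in the graph until "selection", where it is dropped if its input is already known to be canonical.

```cpp
#include <string>
#include <vector>

// Toy opcodes; they only mimic the names of the real ISD opcodes.
enum class Op { Input, FAdd, FPRound, FPExtend, FCanonicalize };

// Toy expression node (a tree, not a real SelectionDAG node).
struct Node {
  Op op;
  std::vector<const Node *> operands;
};

// Hypothetical predicate: is the result of this node known to be canonical?
// Assumption of this toy: only FAdd and FCanonicalize are known to
// canonicalize; everything else is conservatively treated as unknown.
bool isKnownCanonical(const Node &n) {
  switch (n.op) {
  case Op::FAdd:
  case Op::FCanonicalize:
    return true;
  case Op::Input:
  case Op::FPRound:
  case Op::FPExtend:
    return false;
  }
  return false;
}

// Toy "instruction selection": emit operands first, then the node itself.
// An fcanonicalize is emitted only when its input is not known canonical.
void emit(const Node &n, std::vector<std::string> &out) {
  for (const Node *operand : n.operands)
    emit(*operand, out);
  switch (n.op) {
  case Op::Input:
    out.push_back("input");
    break;
  case Op::FAdd:
    out.push_back("add");
    break;
  case Op::FPRound:
    out.push_back("round");
    break;
  case Op::FPExtend:
    out.push_back("extend");
    break;
  case Op::FCanonicalize:
    if (!isKnownCanonical(*n.operands[0]))
      out.push_back("canonicalize");
    break;
  }
}

// Convenience wrapper returning the emitted instruction names in order.
std::vector<std::string> selectTree(const Node &root) {
  std::vector<std::string> out;
  emit(root, out);
  return out;
}
```

In this sketch, fcanonicalize(fadd(x, y)) selects to just the add, while fcanonicalize(x) on a raw input keeps the explicit canonicalize. Because the node survives until selection, earlier folds cannot silently remove the guarantee.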

https://github.com/llvm/llvm-project/pull/83054