[llvm] [InstCombine] Eliminate fptrunc/fpext if fast math flags allow it (PR #115027)

Wed Jan 29 07:32:43 PST 2025

================
@@ -1940,6 +1940,31 @@ Instruction *InstCombinerImpl::visitFPExt(CastInst &FPExt) {
       return CastInst::Create(FPCast->getOpcode(), FPCast->getOperand(0), Ty);
   }
 
+  // fpext (fptrunc(x)) -> x, if the fast math flags allow it
+  if (auto *Trunc = dyn_cast<FPTruncInst>(Src)) {
+    // Whether this transformation is possible depends on the fast math flags of
+    // both the fpext and fptrunc.
+    FastMathFlags SrcFlags = Trunc->getFastMathFlags();
+    FastMathFlags DstFlags = FPExt.getFastMathFlags();
+    // Trunc can introduce inf and change the encoding of a nan, so the
+    // destination must have the nnan and ninf flags to indicate that we don't
+    // need to care about that. We are also removing a rounding step, and that
+    // requires both the source and destination to allow contraction.
+    if (DstFlags.noNaNs() && DstFlags.noInfs() && SrcFlags.allowContract() &&
----------------
john-brawn-arm wrote:

The LLVM LangRef is a bit unhelpful as to what the contract flag allows, as it just says "Allow floating-point contraction" without defining what it means by contraction. The definition of contraction in the C23 standard (in section 6.5.1) is clearer:

> A floating expression may be contracted, that is, evaluated as though it were a single operation, thereby omitting rounding errors implied by the source code and the expression evaluation method. The FP_CONTRACT pragma in <math.h> provides a way to disallow contracted expressions. Otherwise, whether and how expressions are contracted is implementation-defined.

with a footnote saying that, in a contracted expression, the intermediate operations are evaluated as if to infinite range and precision. If the expression `(double)(float)x` (where `x` is a double) is contracted, then the double-to-float and float-to-double conversions are evaluated as if to infinite range and precision, so neither one rounds and the result is just `x`.
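
To make the removed rounding step concrete, here is a minimal sketch of what the uncontracted expression does to a value that is not exactly representable in float. The helper names `round_trip_through_float` and `round_trip_is_identity` are hypothetical and not part of the patch; the `volatile` is only an assumption-level guard to keep the compiler from folding the two casts away.

```cpp
// Hypothetical helper, not part of the patch: evaluates (double)(float)x
// with the intermediate rounding to float actually performed.
double round_trip_through_float(double x) {
  // volatile forces the narrowed value to be materialised as a float, so
  // the truncation cannot be optimised out even under fast-math flags.
  volatile float narrowed = static_cast<float>(x);
  return static_cast<double>(narrowed);
}

// Hypothetical helper: true when the uncontracted round trip already
// returns x, i.e. when removing the fptrunc/fpext pair changes nothing.
bool round_trip_is_identity(double x) {
  return round_trip_through_float(x) == x;
}
```

For a value such as 0.5, which float represents exactly, the round trip is the identity. For 0.1 it is not, because the nearest float differs from the nearest double; this difference is the rounding error that contraction is permitted to omit.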

So long as what clang is doing is correct, I don't think it matters that it's the first compiler to do it.

https://github.com/llvm/llvm-project/pull/115027