[llvm] [InstCombine] Eliminate fptrunc/fpext if fast math flags allow it (PR #115027)
Andy Kaylor via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 30 09:57:01 PST 2025
================
@@ -1940,6 +1940,31 @@ Instruction *InstCombinerImpl::visitFPExt(CastInst &FPExt) {
       return CastInst::Create(FPCast->getOpcode(), FPCast->getOperand(0), Ty);
   }
+  // fpext (fptrunc(x)) -> x, if the fast math flags allow it
+  if (auto *Trunc = dyn_cast<FPTruncInst>(Src)) {
+    // Whether this transformation is possible depends on the fast math flags of
+    // both the fpext and fptrunc.
+    FastMathFlags SrcFlags = Trunc->getFastMathFlags();
+    FastMathFlags DstFlags = FPExt.getFastMathFlags();
+    // Trunc can introduce inf and change the encoding of a nan, so the
+    // destination must have the nnan and ninf flags to indicate that we don't
+    // need to care about that. We are also removing a rounding step, and that
+    // requires both the source and destination to allow contraction.
+    if (DstFlags.noNaNs() && DstFlags.noInfs() && SrcFlags.allowContract() &&
----------------
andykaylor wrote:
I'm also not comfortable with using `contract` in this way. I can see your point that this meets the C23 definition for FP_CONTRACT, but I don't think it's what users would expect. My feeling is that users probably associate the contract flag with FMA. The C23 standard extends the definition, presumably in anticipation of similar hardware instructions that are able to fuse other combinations of operations, but the transformation proposed in this PR is most likely circumventing an explicit user instruction intended to truncate a value.
In order to get the 'contract' flag in isolation from clang, for example, you'd need to use the `-ffp-contract=fast` option. The documentation for this option says, "Specify when the compiler is permitted to form fused floating-point operations, such as fused multiply-add (FMA)." There's nothing there that indicates it will allow the compiler to disregard changes in precision implied (or explicitly requested) by the source code. If I were using this option, my intention would be to enable FMA formation across expressions. I would not be happy if the compiler changed my results in other ways.
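To make the concern concrete, here is a minimal C++ sketch of the source-level pattern that lowers to `fpext(fptrunc(x))`. The helper names `roundTripThroughFloat` and `checkRoundTripExamples` are hypothetical and not part of the PR. The sketch assumes IEEE-754 binary32/binary64 types and the default round-to-nearest mode, and uses `volatile` only to keep the compiler from folding the conversion away.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the PR): an explicit narrowing followed by a
// widening, i.e. the fpext(fptrunc(x)) pattern the proposed fold would remove.
double roundTripThroughFloat(double X) {
  volatile float Narrow = static_cast<float>(X); // fptrunc: rounds to float
  return static_cast<double>(Narrow);            // fpext: exact widening
}

// Illustrates that the round trip is value-changing in general, assuming
// IEEE-754 float/double and round-to-nearest.
void checkRoundTripExamples() {
  // 0.5 is exactly representable as a float, so nothing is lost.
  assert(roundTripThroughFloat(0.5) == 0.5);

  // 1 + 2^-30 needs more than float's 24 significand bits; it rounds to 1.0.
  double JustAboveOne = 1.0 + std::ldexp(1.0, -30);
  assert(JustAboveOne != 1.0);
  assert(roundTripThroughFloat(JustAboveOne) == 1.0);

  // 0.1 is not exactly representable in either type, and the float and
  // double approximations differ.
  assert(roundTripThroughFloat(0.1) != 0.1);
}
```

If the fold were applied, `roundTripThroughFloat(JustAboveOne)` would return `JustAboveOne` instead of `1.0`, which is the kind of result change a user who wrote the explicit cast would probably not expect from an FMA-oriented option.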
This gets us into a problematic situation. I think we'd all agree that the full set of fast-math flags should enable this transformation. However, there isn't a flag for general value-changing optimizations, so if we don't allow this with `contract` we're going to have to either look at the "unsafe-fp-math" function attribute or do something more or less arbitrary in requiring some other combination of flags.
@jcranmer-intel What's your opinion on this?
https://github.com/llvm/llvm-project/pull/115027