[llvm] [InstCombine] Eliminate fptrunc/fpext if fast math flags allow it (PR #115027)
Joshua Cranmer via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 14:15:32 PST 2025
================
@@ -1940,6 +1940,31 @@ Instruction *InstCombinerImpl::visitFPExt(CastInst &FPExt) {
return CastInst::Create(FPCast->getOpcode(), FPCast->getOperand(0), Ty);
}
+ // fpext (fptrunc(x)) -> x, if the fast math flags allow it
+ if (auto *Trunc = dyn_cast<FPTruncInst>(Src)) {
+ // Whether this transformation is possible depends on the fast math flags of
+ // both the fpext and fptrunc.
+ FastMathFlags SrcFlags = Trunc->getFastMathFlags();
+ FastMathFlags DstFlags = FPExt.getFastMathFlags();
+ // Trunc can introduce inf and change the encoding of a nan, so the
+ // destination must have the nnan and ninf flags to indicate that we don't
+ // need to care about that. We are also removing a rounding step, and that
+ // requires both the source and destination to allow contraction.
+ if (DstFlags.noNaNs() && DstFlags.noInfs() && SrcFlags.allowContract() &&
----------------
jcranmer-intel wrote:
I've got an RFC on the `contract` semantics almost ready to go (let's see if I'm successful in getting it published this week). The working definition I have, which generalizes it beyond just FMA formation, is essentially "the new expression would evaluate to the same result as the old one if both were evaluated at infinite range and precision, and the new expression has at most one instruction that causes rounding" (there are a few other conditions, but that's the main one).
The main goal is to generalize to cover cases like `fms` instructions (`a * b - c`), but when I was reviewing a lot of the C documentation for `#pragma STDC FP_CONTRACT`, I also found that `fpext; libm; fptrunc` is another case that seems to be contemplated for `FP_CONTRACT`.
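To make the "at most one rounding" condition concrete, here is a minimal sketch comparing a separately rounded multiply-add with `std::fma`. The function names are made up for illustration, and the `volatile` is only there to stop the compiler from fusing the two operations on its own; it assumes IEEE 754 `double` with round-to-nearest-even.

```cpp
#include <cmath>

// Two roundings: the product is rounded to double, then the sum is rounded.
// `volatile` forces the intermediate product to be materialized as a double,
// so the compiler cannot contract the expression into an FMA by itself.
double mul_then_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// One rounding: std::fma evaluates a * b + c as if with infinite precision
// and rounds the final result once.
double fused(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = 1 + 2^-27`, `b = 1 - 2^-27`, and `c = -1`, the exact product is `1 - 2^-54`, which rounds to `1.0` as a `double`. So `mul_then_add` returns `0.0`, while `fused` returns `-2^-54`. Both agree at infinite precision; they differ only in how many rounding steps occur.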
By this definition of `contract`, this optimization would be correct. But I'll also admit to a vague unease about putting this under `contract`: round-tripping via a smaller type tends to feel intentional, so removing it should require some stronger signal of intent. Of all of the FMFs, `contract` is also the flag that users are most likely to enable by default, so it helps to be a little less aggressive in the optimizations it permits.
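For reference, the round trip in question can be modeled in source as a `double` to `float` to `double` conversion. This is only a sketch with a made-up function name; it assumes IEEE 754 `float`/`double` and the default rounding mode, and the out-of-range case relies on the IEEE behavior of overflowing to infinity rather than on anything the C++ standard guarantees.

```cpp
#include <cmath>
#include <limits>

// Assumption: both types follow IEEE 754 (IEC 60559).
static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 float and double");

// Models `fpext (fptrunc x)` for double -> float -> double.
double round_trip_via_float(double x) {
  float narrowed = static_cast<float>(x);  // fptrunc: may round or overflow
  return static_cast<double>(narrowed);    // fpext: always exact
}
```

Values exactly representable as `float` (such as `0.5`) survive unchanged, `0.1` comes back as a different `double`, and `1e300` comes back as infinity. Folding the pair to `x` drops both the rounding and the overflow, which is why the patch checks `ninf` in addition to `contract`.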
Stepping back a touch: I've increasingly come to the opinion that FMA formation via "combine add/mul into fma if some flag is present" isn't the best way to tackle optimization. It's better to instead have a "fast_fma" primitive, which does fma if there's a hardware instruction for it, and add/mul if there isn't. This is ultimately something that requires source changes, even language changes, so we may be screwed out of a path for this for C/C++, though.
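As a rough illustration of what such a primitive could look like at the source level today, here is a sketch built on the standard `FP_FAST_FMA` macro from `<cmath>`. The name `fast_fma` is hypothetical and not an existing LLVM or libm API; a real primitive would presumably be resolved by the backend per target rather than by a preprocessor check.

```cpp
#include <cmath>

// Hypothetical helper, for illustration only.
// FP_FAST_FMA is defined by the implementation when std::fma on double is
// generally as fast as, or faster than, a separate multiply and add.
inline double fast_fma(double a, double b, double c) {
#ifdef FP_FAST_FMA
  return std::fma(a, b, c);  // fused: single rounding
#else
  return a * b + c;          // separate multiply and add
#endif
}
```

The point of the design is that the programmer opts in at the call site, so the choice between fused and unfused evaluation is explicit in the source rather than inferred from a flag.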
If people want it, I can bring up this topic to the CFP study group.
https://github.com/llvm/llvm-project/pull/115027