[PATCH] D137811: InstCombine: Port amdgcn.class intrinsic combines to is.fpclass
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 30 09:01:09 PST 2022
arsenm added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:847
+ // fp_class (nnan x), qnan|snan|other -> fp_class (nnan x), other
+ if ((Mask & fcNan) && isKnownNeverNaN(Src0, &getTargetLibraryInfo())) {
+ return replaceOperand(II, 1,
----------------
foad wrote:
> jyknight wrote:
> > These removal of Mask bits should come before the `Mask == $X` tests, shouldn't they?
> I assume that this function will be called again to revisit the modified instrinsic.
Right, it's revisited. I don't expect the bit removal part to ever actually happen
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:856
+ // Clamp mask to used bits
+ if ((Mask & fcAllFlags) != Mask) {
+ CallInst *NewCall = Builder.CreateCall(
----------------
jyknight wrote:
> Why are unknown bits even accepted? ISTM it should be an error in Verifier::visitIntrinsicCall to pass invalid bits.
I don't know. Really this should be an i10 argument. I've been debating whether to add a verifier check, or make it an i10. I'd prefer to do that separately and clean up the bits here for now.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:824
+ // Equivalent of isnan. Replace with standard fcmp.
+ Value *FCmp = Builder.CreateFCmpUNO(Src0, Src0);
+ FCmp->takeName(&II);
----------------
jyknight wrote:
> arsenm wrote:
> > sepavloff wrote:
> > > Is it profitable to make such replacement early? Is there any advantage, at least hypothetical?
> > I think an fcmp should be more canonical and better handled by existing optimizations, than class which is handled ~nowhere
> This seems OK.
>
> I'm not sure it's the //best// choice -- if a CPU actually has an fpclassify instruction, is it really a good idea to canonicalize in generic code to fcmp? But I think that's fine to revisit later if it becomes a problem.
I've been thinking if we can do a classify in <= 2 IR instructions, fcmp+fabs is probably a better canonical form. If a class pattern is 3-4+ instructions, the class is probably better. FCmp + fneg + fabs are always going to be more broadly understood. Fcmp also supports fast math flags, unlike class (I guess we could fix that though)
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D137811/new/
https://reviews.llvm.org/D137811
More information about the llvm-commits
mailing list