[PATCH] D84056: [FPEnv] Don't transform FSUB(-0, X) -> FNEG(X) in SelectionDAGBuilder.
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 21 09:06:02 PDT 2020
arsenm added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fmed3.ll:39
%med3 = call float @llvm.amdgcn.fmed3.f32(float %src0, float %src1, float %src2)
%neg.med3 = fsub float -0.0, %med3
%med3.user = fmul float %med3, 4.0
----------------
cameron.mcinally wrote:
> cameron.mcinally wrote:
> > cameron.mcinally wrote:
> > > arsenm wrote:
> > > > This is a code size regression. We probably need
> > > >
> > > > This should just change to use fneg. We could probably consider the fsub case in performFNegCombine though
> > > > This should just change to use fneg.
> > >
> > > Are you saying the IR in this test should just be updated to use FNEG?
> > >
> > > I'll look to see if there's a clean way to combine FSUB with the FMED...
> > This problem is really sticky (and maybe a design flaw in DAGCombine). The initial DAGs for pre/post-Diff are largely the same, except for the FSUB/FNEG node.
> >
> > The problem arrises in visitFSUB(...), when the FSUB(-0,X) is transformed to FNEG(X). This combine, of course, removes the FSUB(-0,X) from its current position in the Worklist, and the new FNEG(X) is placed at the end of the Worklist.
> >
> > This new Worklist order is pretty unfortunate. The two transforms that are needed to fold the FNEG into the FMED3 are now out of order. (If DAGCombine ran one more time, it looks like we'd correctly get the fold.)
> Actually, this is a bug in `AMDGPUTargetLowering::performFNegCombine(...)`. Under the FMED3 case, we have this code:
>
> ```
> if (Res.getOpcode() != AMDGPUISD::FMED3)
> return SDValue(); // Op got folded away.
> if (!Op.hasOneUse())
> DAG.ReplaceAllUsesWith(Op, DAG.getNode(ISD::FNEG, SL, VT, Res));
> return Res;
> ```
>
> That's a problem. We're using RAUW to insert the new user(s) of Res. But we return Res, not the new user(s), to DAGCombine.
>
> DAGCombine will then add the users of Res back to the Worklist, but not the users of the new FNEG. So we miss the new FNEG combine opportunity.
>
> More details to come tomorrow...
Yes, the IR should be changed. Theoretically the fsub case would be another test and a different optimization
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84056/new/
https://reviews.llvm.org/D84056
More information about the llvm-commits
mailing list