[PATCH] D84056: [FPEnv] Don't transform FSUB(-0, X) -> FNEG(X) in SelectionDAGBuilder.

Tue Jul 21 11:18:41 PDT 2020

cameron.mcinally added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:3796-3797
     SDValue Res = DAG.getNode(AMDGPUISD::FMED3, SL, VT, Ops, N0->getFlags());
     if (Res.getOpcode() != AMDGPUISD::FMED3)
       return SDValue(); // Op got folded away.
+
----------------
arsenm wrote:
> I don't actually know why this would ever happen?
Came from D50706. The IR for this particular problem is:

```
  %med3 = call float @llvm.amdgcn.fmed3.f32(float %src0, float %src1, float %src2)
  %neg.med3 = fsub float -0.0, %med3
  %med3.user = fmul float %med3, 4.0
```

The intention is that the FNEG is sunk into the FMED3 operands and the FMUL absorbs it by negating the constant.

Seems like a pretty specific peep to me, but I'm no expert on this instruction...

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9274-9275
+  case ISD::FSUB: {
+    // FSUB(-0.0,X) can be lowered or combined as a bit operation.
+    // Need to check its input recursively to handle.
+    ConstantFPSDNode *N0C = isConstOrConstSplatFP(Op.getOperand(0), true);
----------------
arsenm wrote:
> cameron.mcinally wrote:
> > cameron.mcinally wrote:
> > > arsenm wrote:
> > > > I thought the point was this cannot be lowered this way?
> > > This isn't as bad as it seems, I think. This is `isCanonicalized(...)`, so we're just saying the the FSUB(-0,X) *might* not flush denormals (if DAGCombine wants to change it to an FNEG(X) later). 
> > > 
> > > We could check the DenormalMode here though. That would let us know what will happen to the FSUB(-0,X) in DAGCombine. Assuming it's not a can of worms, I could do that. If it is a can-of-worms, might be better left for separate patch. It should be functional with this patch, just not optimal.
> > > 
> > > 
> > > We could check the DenormalMode here though. 
> > 
> > Guarding this code based on denormal mode is easy enough, but it affects the assembly in a number of tests where denormals are enabled. That will make this patch harder to review. It might be better to optimize in a stand-alone Diff.
> > 
> > Thoughts on this?
> I think checking for denormals isn't quite correct; canonicalized doesn't just mean denormals are flushed. It also means at minimum signaling nans are quieted
That's true, but there is more nuance here...

For the general FSUB case, the result will be canonicalized. No problem there.

For the FSUB(-0,X) case, the result may or may not be canonicalized, depending on if the FSUB is replaced with a FNEG or not. And DAGCombine will only do that FNEG transform if in a particular denormal mode.

So, as long as both this check and the decision to transform FSUB(-0,X)->FNEG(X) are in lockstep, it's not really a problem. But, clearly, this is somewhat brittle and error prone.

Being a little pessimistic in order to more forward isn't the worst thing, assuming we don't regress too much. Figuring out a good solution to this subproblem will be easier (for me at least) if we can isolate it from the larger project.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84056/new/

https://reviews.llvm.org/D84056