[PATCH] D84056: [FPEnv] Don't transform FSUB(-0, X) -> FNEG(X) in SelectionDAGBuilder.

Cameron McInally via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 22 14:05:39 PDT 2020


cameron.mcinally added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9274-9275
+  case ISD::FSUB: {
+    // FSUB(-0.0,X) can be lowered or combined as a bit operation.
+    // Need to check its input recursively to handle.
+    ConstantFPSDNode *N0C = isConstOrConstSplatFP(Op.getOperand(0), true);
----------------
cameron.mcinally wrote:
> arsenm wrote:
> > cameron.mcinally wrote:
> > > cameron.mcinally wrote:
> > > > arsenm wrote:
> > > > > I thought the point was this cannot be lowered this way?
> > > > This isn't as bad as it seems, I think. This is `isCanonicalized(...)`, so we're just saying the the FSUB(-0,X) *might* not flush denormals (if DAGCombine wants to change it to an FNEG(X) later). 
> > > > 
> > > > We could check the DenormalMode here though. That would let us know what will happen to the FSUB(-0,X) in DAGCombine. Assuming it's not a can of worms, I could do that. If it is a can-of-worms, might be better left for separate patch. It should be functional with this patch, just not optimal.
> > > > 
> > > > 
> > > > We could check the DenormalMode here though. 
> > > 
> > > Guarding this code based on denormal mode is easy enough, but it affects the assembly in a number of tests where denormals are enabled. That will make this patch harder to review. It might be better to optimize in a stand-alone Diff.
> > > 
> > > Thoughts on this?
> > I think checking for denormals isn't quite correct; canonicalized doesn't just mean denormals are flushed. It also means at minimum signaling nans are quieted
> That's true, but there is more nuance here...
> 
> For the general FSUB case, the result will be canonicalized. No problem there.
> 
> For the FSUB(-0,X) case, the result may or may not be canonicalized, depending on if the FSUB is replaced with a FNEG or not. And DAGCombine will only do that FNEG transform if in a particular denormal mode.
> 
> So, as long as both this check and the decision to transform FSUB(-0,X)->FNEG(X) are in lockstep, it's not really a problem. But, clearly, this is somewhat brittle and error prone.
> 
> Being a little pessimistic in order to more forward isn't the worst thing, assuming we don't regress too much. Figuring out a good solution to this subproblem will be easier (for me at least) if we can isolate it from the larger project.
This is the test that's tripping:

```
; GCN-LABEL: {{^}}v_test_canonicalize_fneg_fabs_var_f32:
; GFX678: v_mul_f32_e64 [[REG:v[0-9]+]], -1.0, |{{v[0-9]+}}|
; GFX9: v_max_f32_e64 [[REG:v[0-9]+]], -|{{v[0-9]+}}|, -|{{v[0-9]+}}|
; GCN: {{flat|global}}_store_dword v{{\[[0-9]+:[0-9]+\]}}, [[REG]]
define amdgpu_kernel void @v_test_canonicalize_fneg_fabs_var_f32(float addrspace(1)* %out) #1 {
  %val = load float, float addrspace(1)* %out
  %val.fabs = call float @llvm.fabs.f32(float %val)
  %val.fabs.fneg = fsub float -0.0, %val.fabs
  %canonicalized = call float @llvm.canonicalize.f32(float %val.fabs.fneg)
  store float %canonicalized, float addrspace(1)* %out
  ret void
}

attributes #1 = { nounwind "denormal-fp-math-f32"="preserve-sign,preserve-sign" }
```

We need a way for `isCanonicalized(...)` to return that an FSUB(-0,X) might not canonicalize if it is combined into an FNEG. Otherwise, the `@llvm.canonicalize.f32` will be removed and we end up with:

```
        flat_load_dword v2, v[0:1]
        s_waitcnt vmcnt(0) lgkmcnt(0)
        v_or_b32_e32 v2, 0x80000000, v2
        flat_store_dword v[0:1], v2
```

Maybe we need a `isFSUBtoFNEGLegal(...)` helper function? That way the check is uniform across all uses?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84056/new/

https://reviews.llvm.org/D84056





More information about the llvm-commits mailing list