[PATCH] D80702: [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics

Thu May 28 08:07:42 PDT 2020

arsenm added inline comments.

================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1918
+        auto NativeFP = IntrinsicID == Intrinsic::amdgcn_cos ? cos : sin;
+        return ConstantFoldFP(NativeFP, V * 2 * M_PI, Ty);
     }
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > This folds amdgcn.sin(1.0) to a very small value that is not exactly 0.0. Should I add an explicit check for all of the quarter-integer values that should fold to exactly -1.0 or 0.0 or +1.0 ?
> > I've never been sure what exactly the policy should be for folding these sorts of intrinsics. I think nobody is happy with relying on the host libm call to begin with, plus the hardware instructions don't give the exact same results. We're already constant folding more precise results for rcp than the hardware instruction gives though (which is probably more useful in general).
> > 
> > I'm not sure this is how this will work going forward, but you could also consider strictfp on the call site (it may happen already, but you should add a test where certain folds are skipped)
> > 
> > Also you should use numbers::pi rather than relying on M_PI (and I strongly prefer using an explicit 2.0 rather than 2)
> > you could also consider strictfp on the call site
> 
> I thought:
> - strictfp is to do with whether certain math functions are guaranteed to respect the ambient rounding mode etc
> - the definition of amdgcn.cos is "do what the hardware instruction does"
> - the hardware instruction doesn't behave differently depending on the rounding mode
> - so there's no need to worry about strictfp
> But perhaps I'm wrong about some or all of this.
I'm pretty sure the nan quieting behavior will change based on the ieee_mode bit. AMDGPU has a wider set of FP environment properties. Plus, these support exceptions anyway

================
Comment at: llvm/test/Analysis/ConstantFolding/AMDGPU/sin.ll:174
+  store volatile double %ninf, double* %p
+  %nan = call double @llvm.amdgcn.sin.f64(double 0x7FF8000000000000) ; nan
+  store volatile double %nan, double* %p
----------------
foad wrote:
> arsenm wrote:
> > Also should test snan -> qnan (this also won't happen for !ieee_mode, so this case should depend on strictfp also)
> All of the FP constant folding code is skipped for infinities and nans so there's not much to check. (I didn't really mean to add any of these inf and nan checks here, just carelessly cut n pasted from another test.)
That would be a bug, since I would expect them to fold too

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80702/new/

https://reviews.llvm.org/D80702