[PATCH] D80702: [AMDGPU] Fold llvm.amdgcn.cos and llvm.amdgcn.sin intrinsics

Thu May 28 07:02:34 PDT 2020

arsenm added inline comments.

================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1918
+        auto NativeFP = IntrinsicID == Intrinsic::amdgcn_cos ? cos : sin;
+        return ConstantFoldFP(NativeFP, V * 2 * M_PI, Ty);
     }
----------------
foad wrote:
> This folds amdgcn.sin(1.0) to a very small value that is not exactly 0.0. Should I add an explicit check for all of the quarter-integer values that should fold to exactly -1.0 or 0.0 or +1.0 ?
I've never been sure what exactly the policy should be for folding these sorts of intrinsics. I think nobody is happy with relying on the host libm call to begin with, plus the hardware instructions don't give the exact same results. We're already constant folding more precise results for rcp than the hardware instruction gives though (which is probably more useful in general).

I'm not sure this is how this will work going forward, but you could also consider strictfp on the call site (it may happen already, but you should add a test where certain folds are skipped)

Also you should use numbers::pi rather than relying on M_PI (and I strongly prefer using an explicit 2.0 rather than 2)

================
Comment at: llvm/test/Analysis/ConstantFolding/AMDGPU/sin.ll:174
+  store volatile double %ninf, double* %p
+  %nan = call double @llvm.amdgcn.sin.f64(double 0x7FF8000000000000) ; nan
+  store volatile double %nan, double* %p
----------------
Also should test snan -> qnan (this also won't happen for !ieee_mode, so this case should depend on strictfp also)

================
Comment at: llvm/test/Analysis/ConstantFolding/AMDGPU/sin.ll:177
+  ret void
+}
----------------
Add some cases where strictfp calls are skipped

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80702/new/

https://reviews.llvm.org/D80702