[PATCH] D101759: [PowerPC] Scalar IBM MASS library conversion pass

Tue May 25 12:42:23 PDT 2021

bmahjour added inline comments.

================
Comment at: llvm/lib/Target/PowerPC/PPCGenScalarMASSEntries.cpp:70
+/// function
+bool PPCGenScalarMASSEntries::isCandidateSafeToLower(const CallInst &CI) const {
+  return CI.isFast();
----------------
There should be a todo comment to handle non-finite entries using fewer fast-math flags.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1361
   setLibcallName(RTLIB::FMA_F128, "fmaf128");
+  if (TM.Options.PPCGenScalarMASSEntries && TM.Options.UnsafeFPMath) {
+    setLibcallName(RTLIB::COS_F64, "__xl_cos_finite");
----------------
masoud.ataei wrote:
> bmahjour wrote:
> > why are these being handled here instead of `PPCGenScalarMASSEntries.cpp`?
> We are not handling llvm intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block any type of existing optimizations (like pow(x,0.5) --> sqrt(x)) and future optimizations (like https://reviews.llvm.org/D94543 ?).
I see, could you please put a comment in the code to explain that? Alternatively you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:1361
   setLibcallName(RTLIB::FMA_F128, "fmaf128");
+  if (TM.Options.PPCGenScalarMASSEntries && TM.Options.UnsafeFPMath) {
+    setLibcallName(RTLIB::COS_F64, "__xl_cos_finite");
----------------
bmahjour wrote:
> masoud.ataei wrote:
> > bmahjour wrote:
> > > why are these being handled here instead of `PPCGenScalarMASSEntries.cpp`?
> > We are not handling llvm intrinsics in `PPCGenScalarMASSEntries.cpp` because we don't want to block any type of existing optimizations (like pow(x,0.5) --> sqrt(x)) and future optimizations (like https://reviews.llvm.org/D94543 ?).
> I see, could you please put a comment in the code to explain that? Alternatively you can put the comment at the top of `llvm/include/llvm/Analysis/ScalarFuncs.def`.
Instead of `TM.Options.UnsafeFPMath` we should test for the individual fast-math flags that are required for safety. Checking for "unsafe-fp-math" has a few drawbacks:
1. To make clang enable that flag it is necessary but not enough to specify `-funsafe-math-optimizations`! You'd have to specify `-fno-math-errno` as well.
2. Clang sets the "unsafe-fp-math" flag when all four of `-fno-math-errno  -fassociative-math -freciprocal-math  -fno-signed-zeros` are specified, regardless of other flags... For example this command does the conversion to the _finite calls despite the user request to honor NaNs. `clang t.c -c -O3 -fno-math-errno  -fassociative-math -freciprocal-math  -fno-signed-zeros -fhonor-nans`

Even if the clang inconsistencies/issues are resolved, it would still be better to check for the individual flags for finer control and for consistency with other front-ends.

================
Comment at: llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll:1
+; RUN: llc -enable-ppc-gen-scalar-mass -mtriple=powerpc-ibm-aix-xcoff < %s | FileCheck -check-prefixes CHECK-ALL,CHECK-LWR %s
+
----------------
why not just use the default `CHECK` prefix? `CHECK-ALL` and `CHECK-LWR` don't distinguish anything based on this run command.

================
Comment at: llvm/test/CodeGen/PowerPC/lower-intrinsics-mass-aix.ll:19
+; CHECK-ALL-LABEL: cosf_f32
+; CHECK-DFLT: bl __xl_cosf_finite
+; CHECK-LWR-NOT: bl __xl_cosf_finite
----------------
CHECK-DFLT is not in the list of prefixes defined.

================
Comment at: llvm/test/CodeGen/PowerPC/lower-intrinsics-nofast-mass.ll:147
+}
+attributes #1 = { "unsafe-fp-math"="true" }
----------------
Remove this line, `#1` is unused.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101759/new/

https://reviews.llvm.org/D101759