[PATCH] D62993: [PowerPC] Emit scalar min/max instructions with unsafe fp math

Fri Oct 25 16:12:59 PDT 2019

nemanjai marked 6 inline comments as done.
nemanjai added inline comments.

================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:1304
   case PPCISD::FSEL:            return "PPCISD::FSEL";
+  case PPCISD::XSMAXCDP:        return "PPCISD::XSMAXCPD";
+  case PPCISD::XSMINCDP:        return "PPCISD::XSMINCDP";
----------------
jsji wrote:
> Typo? `XSMAXCDP` not `XSMAXCPD`.
Oops. Thanks, I'l fix it.

================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:7210
   SDNodeFlags Flags;
   Flags.setNoInfs(true);
   Flags.setNoNaNs(true);
----------------
jsji wrote:
> Although this won't have problem right now, because we always return before using `Flags`.
> I think it would be better to move this after the new code, to avoid a potential trap for future programmers.
Fair enough, I'll move this down below the min/max switch.

================
Comment at: lib/Target/PowerPC/PPCISelLowering.cpp:7223
+    switch (CC) {
+    default: return Op;
+    case ISD::SETOGT:
----------------
jsji wrote:
> We will lose some opportunities here, eg: with `-mcpu=pwr9 --enable-no-nans-fp-math --enable-no-signed-zeros-fp-math`?
> We will catch new opportunities for max/min, but will give up lowering all the other CC?
That is a really good point, this will prevent us from generating the `fsel` on P9. I'll fix it up.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:1290
+            (f64 (XSMINDP $A, $B))>;
+  def : Pat<(f64 (fminnum_ieee f64:$A, (fcanonicalize f64:$B))),
+            (f64 (XSMINDP $A, $B))>;
----------------
jsji wrote:
> Nit: The pattern order here is not consistent with above: instead of having one min, one max, we start to put all min together, hen max.  It won't have problem, but would be better to read/check if we make it consistent.
I agree 100% that it's nicer to keep it organized and consistent. Will update, thank you.

================
Comment at: test/CodeGen/PowerPC/ctr-minmaxnum.ll:69
 ; QPX: mtctr
-; QPX-NOT: bl fminf
+; QPX-NOT: xsmindp
 ; QPX: blr
----------------
jsji wrote:
> QPX should NOT be affected, so shouldn't change here?
Oops, overzealous search-and-replace...

================
Comment at: test/CodeGen/PowerPC/scalar-min-max.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr8 -ppc-asm-full-reg-names --enable-unsafe-fp-math \
+; RUN:   -verify-machineinstrs --enable-no-signed-zeros-fp-math \
----------------
jsji wrote:
> I would be more interested to see `-mcpu=pwr9` with `--enable-no-nans-fp-math` instead of `-mcpu=pwr8`. :)
I think it is good to show that we get the same instructions with fast math on both P8 and P9.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62993/new/

https://reviews.llvm.org/D62993