[PATCH] D54783: [Power9] suboptimal vec_abs for some cases

Thu Dec 13 10:25:40 PST 2018

jsji requested changes to this revision.
jsji added a comment.
This revision now requires changes to proceed.

I like the idea here overall, but looks like we are squeezing several similar and related work into one patch. I would prefer we split them into different patches. Thanks!

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:657

+    if (!Subtarget.hasP8Altivec())
+      setOperationAction(ISD::ABS, MVT::v2i64, Expand);
----------------
Can we add some comment here for why we need to handle v2i64 differently? due to current subtarget doesn't support smax v2i64?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9572
+
+  // SMAX patch hasn't landed yet, so use intrinsic first
+  // TODO: Should use SMAX directly once SMAX patch landed
----------------
Which SMAX patch? can you be specific? Has it landed now?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:13267
+  case ISD::VSELECT: 
+    return combineVSelect(N, DCI);
   }
----------------
Although combineVSelect is also for VABSD, this looks like another independent optimization, can we split this into another patch?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14526
+
+SDValue PPCTargetLowering::combineABS(SDNode *N, DAGCombinerInfo &DCI) const {
+  assert((N->getOpcode() == ISD::ABS) && "Need ABS node here");
----------------
Can we add some comments before this function, to show what it will transform from & to?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14562
+
+SDValue PPCTargetLowering::combineVSelect(SDNode *N,
+                                          DAGCombinerInfo &DCI) const {
----------------
Can we add some comments before this function, to show what it will transform from & to?

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:381
+      /// operand #2 constant i32 0 or 1, to indicate whether needs to patch
+      /// the most significant bit for signed i32
+      ABSD,
----------------
Can we have some example code to show why we need to patch the bit, similar to old code for ISD::ABS

```   
// For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000).
...
```

================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:382
+      /// the most significant bit for signed i32
+      ABSD,
+
----------------
Since this node will only handle vector types, maybe we should rename it to `VABSD`?

================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:4048
+
+  def : Pat<(v4i32 (PPCabsd v4i32:$A, v4i32:$B, (i32 1))),
+            (v4i32 (VABSDUW (XVNEGSP $A), (XVNEGSP $B)))>;
----------------
Can we add comment here for the last operand

================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:1
-; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr9 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-PWR8 -implicit-check-not vabsdu
+; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names -verify-machineinstrs | FileCheck %s
+; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr9 -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names -verify-machineinstrs | FileCheck %s
----------------
Can we add -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names  in a NFC patch before this?

================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:5

-; Function Attrs: nounwind readnone
 define <4 x i32> @simple_absv_32(<4 x i32> %a) local_unnamed_addr {
----------------
Why we remove Function Attrs here? If this is no relevant clean up work, please do it in a NFC patch before this. Thanks.

================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:103
+
+define <8 x i16> @sub_absv_16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr {
+entry:
----------------
This can be in the new patch for combineVSelect.

================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:470

+; To verify vabsdu* exploitation for ucmp + sub + select sequence
+
----------------
This can be in new combineVselect patch as well.

================
Comment at: llvm/test/CodeGen/PowerPC/pre-inc-disable.ll:22
 ; CHECK:    xxpermdi v5, f0, f0, 2
-; CHECK:    vperm v0, v4, v5, v2
-; CHECK:    vperm v5, v5, v4, v3
-; CHECK:    xvnegsp v5, v5
-; CHECK:    xvnegsp v0, v0
+; CHECK-DAG: vperm v[[VR1:[0-9]+]], v4, v5, v2
+; CHECK-DAG: vperm v[[VR2:[0-9]+]], v5, v4, v3
----------------
are these `CHECK` -> `CHECK-DAG` relevant to this change? Or just non-relevant NFC change?If so , please commit before this patch. Thanks.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54783/new/

https://reviews.llvm.org/D54783