[PATCH] D54783: [Power9] suboptimal vec_abs for some cases
Jinsong Ji via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 13 10:25:40 PST 2018
jsji requested changes to this revision.
jsji added a comment.
This revision now requires changes to proceed.
I like the idea here overall, but looks like we are squeezing several similar and related work into one patch. I would prefer we split them into different patches. Thanks!
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:657
+ if (!Subtarget.hasP8Altivec())
+ setOperationAction(ISD::ABS, MVT::v2i64, Expand);
----------------
Can we add some comment here for why we need to handle v2i64 differently? due to current subtarget doesn't support smax v2i64?
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9572
+
+ // SMAX patch hasn't landed yet, so use intrinsic first
+ // TODO: Should use SMAX directly once SMAX patch landed
----------------
Which SMAX patch? can you be specific? Has it landed now?
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:13267
+ case ISD::VSELECT:
+ return combineVSelect(N, DCI);
}
----------------
Although combineVSelect is also for VABSD, this looks like another independent optimization, can we split this into another patch?
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14526
+
+SDValue PPCTargetLowering::combineABS(SDNode *N, DAGCombinerInfo &DCI) const {
+ assert((N->getOpcode() == ISD::ABS) && "Need ABS node here");
----------------
Can we add some comments before this function, to show what it will transform from & to?
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:14562
+
+SDValue PPCTargetLowering::combineVSelect(SDNode *N,
+ DAGCombinerInfo &DCI) const {
----------------
Can we add some comments before this function, to show what it will transform from & to?
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:381
+ /// operand #2 constant i32 0 or 1, to indicate whether needs to patch
+ /// the most significant bit for signed i32
+ ABSD,
----------------
Can we have some example code to show why we need to patch the bit, similar to old code for ISD::ABS
```
// For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000).
...
```
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.h:382
+ /// the most significant bit for signed i32
+ ABSD,
+
----------------
Since this node will only handle vector types, maybe we should rename it to `VABSD`?
================
Comment at: llvm/lib/Target/PowerPC/PPCInstrVSX.td:4048
+
+ def : Pat<(v4i32 (PPCabsd v4i32:$A, v4i32:$B, (i32 1))),
+ (v4i32 (VABSDUW (XVNEGSP $A), (XVNEGSP $B)))>;
----------------
Can we add comment here for the last operand
================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:1
-; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr9 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-PWR8 -implicit-check-not vabsdu
+; RUN: llc < %s -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr9 -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names -verify-machineinstrs | FileCheck %s
+; RUN: llc < %s -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr9 -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names -verify-machineinstrs | FileCheck %s
----------------
Can we add -ppc-vsr-nums-as-vr -ppc-asm-full-reg-names in a NFC patch before this?
================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:5
-; Function Attrs: nounwind readnone
define <4 x i32> @simple_absv_32(<4 x i32> %a) local_unnamed_addr {
----------------
Why we remove Function Attrs here? If this is no relevant clean up work, please do it in a NFC patch before this. Thanks.
================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:103
+
+define <8 x i16> @sub_absv_16(<8 x i16> %a, <8 x i16> %b) local_unnamed_addr {
+entry:
----------------
This can be in the new patch for combineVSelect.
================
Comment at: llvm/test/CodeGen/PowerPC/ppc64-P9-vabsd.ll:470
+; To verify vabsdu* exploitation for ucmp + sub + select sequence
+
----------------
This can be in new combineVselect patch as well.
================
Comment at: llvm/test/CodeGen/PowerPC/pre-inc-disable.ll:22
; CHECK: xxpermdi v5, f0, f0, 2
-; CHECK: vperm v0, v4, v5, v2
-; CHECK: vperm v5, v5, v4, v3
-; CHECK: xvnegsp v5, v5
-; CHECK: xvnegsp v0, v0
+; CHECK-DAG: vperm v[[VR1:[0-9]+]], v4, v5, v2
+; CHECK-DAG: vperm v[[VR2:[0-9]+]], v5, v4, v3
----------------
are these `CHECK` -> `CHECK-DAG` relevant to this change? Or just non-relevant NFC change?If so , please commit before this patch. Thanks.
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D54783/new/
https://reviews.llvm.org/D54783
More information about the llvm-commits
mailing list