[PATCH] D45522: [PowerPC] fix incorrect vectorization of abs() on POWER9
Hiroshi Inoue via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 18 00:39:54 PDT 2018
inouehrs added inline comments.
================
Comment at: lib/Target/PowerPC/PPCISelDAGToDAG.cpp:4805
+ if (N->getOperand(0).getOpcode() == ISD::SUB) {
+ if (N->getOperand(0)->getOperand(0).getOpcode() == ISD::ZERO_EXTEND &&
+ N->getOperand(0)->getOperand(1).getOpcode() == ISD::ZERO_EXTEND)
----------------
nemanjai wrote:
> Shouldn't this check `ZERO_EXTEND_VECTOR_INREG` as well? Or is that a node we can't have this late?
As I understand it, `ZERO_EXTEND_VECTOR_INREG` is created in the legalize phase, while this code runs in the initial selection phase, so I do not expect it to encounter a `ZERO_EXTEND_VECTOR_INREG` node here.
================
Comment at: lib/Target/PowerPC/PPCISelDAGToDAG.cpp:4810
+
+ if (VecVT == MVT::v4i32) {
+ AbsOpcode = PPC::VABSDUW;
----------------
nemanjai wrote:
> It seems that for the `v4i32` type, we should be able to just use `xvnegsp` rather than loading the immediate, moving and adding.
Do you know whether it is safe to use a floating-point instruction on integer data when the bit pattern is that of a NaN or Inf?
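To make the concern concrete, here is a minimal scalar sketch of which 32-bit integer lane values would look like a NaN or Inf if interpreted as IEEE-754 single precision, and what a plain integer sign-bit flip does to them. The helper names are hypothetical and for illustration only. The sketch does not say how `xvnegsp` itself treats such inputs; that still needs to be checked against the ISA.

```cpp
#include <cassert>
#include <cstdint>

// All helpers below are hypothetical, scalar stand-ins for one v4i32 lane.

// IEEE-754 single precision: exponent field is bits 30..23, fraction is bits 22..0.
static bool exponentAllOnes(uint32_t bits) {
  return (bits & 0x7F800000u) == 0x7F800000u;
}

// NaN: exponent all ones and a non-zero fraction.
static bool isNaNPattern(uint32_t bits) {
  return exponentAllOnes(bits) && (bits & 0x007FFFFFu) != 0;
}

// Inf: exponent all ones and a zero fraction.
static bool isInfPattern(uint32_t bits) {
  return exponentAllOnes(bits) && (bits & 0x007FFFFFu) == 0;
}

// The result the integer code needs: only the top bit changes.
static uint32_t flipSignBit(uint32_t bits) {
  return bits ^ 0x80000000u;
}

// Small self-check of the examples discussed in the text.
static void selfCheck() {
  assert(isInfPattern(0x7F800000u));                  // integer 2139095040
  assert(isNaNPattern(0x7FC00000u));                  // integer 2143289344
  assert(flipSignBit(0x7FC00000u) == 0xFFC00000u);    // payload bits untouched
}
```

Any replacement instruction would have to produce exactly `flipSignBit(x)` for inputs such as `0x7FC00000` and `0x7F800000`, with no quieting or canonicalization of the remaining bits.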
================
Comment at: lib/Target/PowerPC/PPCISelDAGToDAG.cpp:4828
+ }
+ else if (VecVT == MVT::v16i8) {
+ AbsOpcode = PPC::VABSDUB;
----------------
nemanjai wrote:
> We should just be able to do something like:
> ```
> xxspltib 35, 128 # Mask
> vxor 0, 3, 2 # Flip sign
> vabsduh ... # The actual absdiff
> ```
Good catch. I will update the patch to use `xxspltib`.
The VSX splat-immediate instruction takes an 8-bit immediate, while the older VMX splat-immediate instructions support only 5 bits.
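For reference, a scalar sketch of the idea behind the suggested sequence for byte lanes: flipping the sign bit of both operands with the mask 128 maps signed order onto unsigned order, so an unsigned absolute difference then gives the signed result. The function names are hypothetical, and the 5-bit range check assumes the VMX immediate is a signed field (-16..15), so the mask 0x80 cannot be encoded there.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one byte lane of an unsigned absolute difference.
static uint8_t absDiffUnsigned(uint8_t a, uint8_t b) {
  return a > b ? static_cast<uint8_t>(a - b) : static_cast<uint8_t>(b - a);
}

// Signed |a - b| via sign-bit flip: XOR both inputs with 0x80 (the splatted
// mask), then take the unsigned absolute difference.
static uint8_t absDiffSignedViaFlip(int8_t a, int8_t b) {
  const uint8_t mask = 0x80;  // 128
  const uint8_t fa = static_cast<uint8_t>(static_cast<uint8_t>(a) ^ mask);
  const uint8_t fb = static_cast<uint8_t>(static_cast<uint8_t>(b) ^ mask);
  return absDiffUnsigned(fa, fb);
}

// Reference: compute |a - b| in a wider type, then truncate to 8 bits.
static uint8_t absDiffSignedReference(int8_t a, int8_t b) {
  const int d = static_cast<int>(a) - static_cast<int>(b);
  return static_cast<uint8_t>(d < 0 ? -d : d);
}

// Exhaustively compare both versions over all 256 x 256 input pairs.
static bool flipMatchesReferenceForAllBytes() {
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      if (absDiffSignedViaFlip(static_cast<int8_t>(a), static_cast<int8_t>(b)) !=
          absDiffSignedReference(static_cast<int8_t>(a), static_cast<int8_t>(b)))
        return false;
  return true;
}

// Assumption: the 5-bit splat immediate is signed, so it covers -16..15.
static bool fitsSigned5(int v) { return v >= -16 && v <= 15; }
```

Under that assumption the mask fits neither as 128 nor as its signed-byte reading -128, which is why an 8-bit immediate is needed to splat it in one instruction.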
https://reviews.llvm.org/D45522