[PATCH] D81076: [PowerPC] Custom lower rotl v1i128 to vector_shuffle.
EsmeYi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 8 00:30:06 PDT 2020
Esme marked 7 inline comments as done.
Esme added inline comments.
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9622
+ if (Op.getValueType() != MVT::v1i128)
+ return SDValue();
+ SDLoc dl(Op);
----------------
shawnl wrote:
> steven.zhang wrote:
> > It will have problem if return SDValue() when lower the ROTL. I would change it as assertion.
> `assert(Op.getValueType() == MVT::v1i128 && "unexpected MVP type")`
Thx~ assertions have been added.
================
Comment at: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9624
+ SDLoc dl(Op);
+ SDValue N0 = Op.getOperand(0);
+ SDValue N1 = peekThroughBitcasts(Op.getOperand(1));
----------------
lkail wrote:
> I suppose we still need to peekThroughBitcasts Op0.
Thanks for reminding.
================
Comment at: llvm/test/CodeGen/PowerPC/pr45628.ll:142
; CHECK-NOVSX: # %bb.0: # %entry
-; CHECK-NOVSX-NEXT: addis r3, r2, .LCPI7_0 at toc@ha
-; CHECK-NOVSX-NEXT: addi r3, r3, .LCPI7_0 at toc@l
-; CHECK-NOVSX-NEXT: lvx v3, 0, r3
-; CHECK-NOVSX-NEXT: addis r3, r2, .LCPI7_1 at toc@ha
-; CHECK-NOVSX-NEXT: addi r3, r3, .LCPI7_1 at toc@l
-; CHECK-NOVSX-NEXT: vslo v4, v2, v3
-; CHECK-NOVSX-NEXT: vspltb v3, v3, 15
-; CHECK-NOVSX-NEXT: vsl v3, v4, v3
-; CHECK-NOVSX-NEXT: lvx v4, 0, r3
-; CHECK-NOVSX-NEXT: vsro v2, v2, v4
-; CHECK-NOVSX-NEXT: vspltb v4, v4, 15
-; CHECK-NOVSX-NEXT: vsr v2, v2, v4
-; CHECK-NOVSX-NEXT: vor v2, v3, v2
+; CHECK-NOVSX-NEXT: addi r3, r1, -32
+; CHECK-NOVSX-NEXT: stvx v2, 0, r3
----------------
shawnl wrote:
> steven.zhang wrote:
> > lkail wrote:
> > > New sequence looks getting more memory ops than original one.
> > I believe they are using to move the VSR to two GPR. And yes, we need to take a more look to see if there is any better way to handle it.
> Is there any way to only lower to scalar if the value does not need to be in a vector register like it does here? Also, why does the vector code here look better than the version in my PR?:
> ```
>
> swap_with_shift: # @swap_with_shift
> xxspltd 35, 34, 1
> xxswapd 34, 34
> xxlxor 0, 0, 0 <=== the version on left does not need this xor
> xxpermdi 35, 35, 0, 1
> xxpermdi 34, 0, 34, 1
> xxlor 34, 35, 34
> blr
> ```
> Is there any way to only lower to scalar if the value does not need to be in a vector register like it does here?
I will take a look into it, thx.
> why does the vector code here look better than the version in my PR?
Because the llc option -mcpu=pwr9 was added.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81076/new/
https://reviews.llvm.org/D81076
More information about the llvm-commits
mailing list