[PATCH] D148347: [AArch64] Handle vector with two different values

JinGu Kang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 5 07:24:48 PDT 2023

jaykang10 added inline comments.

Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:12545
+    //    t30: v8i8 = AArch64ISD::DUP t26
+    //  t31: v8i8 = vector_shuffle<0,0,0,0,8,8,8,8> t28, t30
+    if (NumElts >= 8) {
dmgreen wrote:
> jaykang10 wrote:
> > dmgreen wrote:
> > > Would it help if this mask is `<0,1,2,3,12,13,14,15>` or `<0,1,2,3,8,9,10,11>`? I'm not sure it would help at the moment, but this case with i8's could use 's' lane inserts to avoid the tbl. It wouldn't help in general though.
> > Thanks for the comment.
> > This code only handles vectors with two different values, so cases with the masks <0,1,2,3,12,13,14,15> and <0,1,2,3,8,9,10,11> will not reach this code.
> The idea is that the two original values have been dup'd to all elements of DUP1 and DUP2. So the value in lane 0 should be the same as in lanes 1..7, and the value in lane 8 (lane 0 of DUP2) will be the same as in lanes 9..15. So we can choose any element in those vectors. And if we pick <0,1,2,3> there is a chance to convert that to an 'S' reg lane move without requiring the tbl. It looks like that might already happen if the lanes were sequential.
> Having said that, if the type is a float then the value should already be in lane 0, and the DUP's become unnecessary. It should be able to just use a tbl directly.
Ah, I understand what you mean now.
Yep, we can choose any element for the mask because the values in the lanes are the same.
Let me check which mask is lowered well in `LowerVECTOR_SHUFFLE`.
Thanks for the good comment.
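
To make the mask-equivalence argument concrete, here is a minimal standalone sketch. It is not LLVM code: `V8`, `Mask`, `splat`, `shuffle`, and `masksAgreeOnSplats` are hypothetical names used only for illustration, and the model assumes the usual two-input shuffle convention where mask indices 0..7 select from the first operand and 8..15 from the second.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a v8i8 value and an 8-entry shuffle mask.
using V8 = std::array<std::uint8_t, 8>;
using Mask = std::array<int, 8>;

// Models a DUP: every lane holds the same scalar.
V8 splat(std::uint8_t v) {
  V8 r{};
  r.fill(v);
  return r;
}

// Models a two-input vector_shuffle: indices 0..7 read from a,
// indices 8..15 read from b.
V8 shuffle(const V8 &a, const V8 &b, const Mask &mask) {
  V8 r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    int m = mask[i];
    assert(m >= 0 && m < 16 && "mask index out of range");
    r[i] = m < 8 ? a[m] : b[m - 8];
  }
  return r;
}

// The three masks discussed in the review.
const Mask kSplatMask = {0, 0, 0, 0, 8, 8, 8, 8};
const Mask kSeqLowMask = {0, 1, 2, 3, 8, 9, 10, 11};
const Mask kSeqHighMask = {0, 1, 2, 3, 12, 13, 14, 15};

// When both inputs are splats, all three masks give the same result.
bool masksAgreeOnSplats(std::uint8_t x, std::uint8_t y) {
  V8 dup1 = splat(x), dup2 = splat(y);
  V8 r0 = shuffle(dup1, dup2, kSplatMask);
  V8 r1 = shuffle(dup1, dup2, kSeqLowMask);
  V8 r2 = shuffle(dup1, dup2, kSeqHighMask);
  return r0 == r1 && r1 == r2;
}
```

Since the results are identical for splat inputs, the lowering would be free to pick whichever mask is matched best later on. Whether the sequential masks actually avoid the tbl still needs to be checked against `LowerVECTOR_SHUFFLE`.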

  rG LLVM Github Monorepo