[PATCH] D113376: [AArch64][SVE] Lower shuffles to permute instructions: zip1/2, uzp1/2, trn1/2

weiwei via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 9 00:55:02 PST 2021


wwei added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:9678-9681
+  if (isZIPMask(M, VT, WhichResult)) {
+    unsigned Opc = (WhichResult == 0) ? AArch64ISD::ZIP1 : AArch64ISD::ZIP2;
+    return DAG.getNode(Opc, dl, V1.getValueType(), V1, V2);
+  }
----------------
paulwalker-arm wrote:
> wwei wrote:
> > paulwalker-arm wrote:
> > > I don't believe this logic is universally safe.  The use of ZIP2 specifically relies on knowing the exact size of the register to know which indices represent the "high half" of a vector register.  This is something that is only known when `sve-vector-bits-min==sve-vector-bits-max`.
> > > 
> > > I also suspect functions like isZIPMask were written when only 64 and 128 bit legal vectors existed.  I doubt the logic holds for the case when a vector is legal but not the exact size of the target vector register, as is the case with the SVE fixed length support.
> > > 
> > > I have not looked into the extent at which the other shuffles are affected but I suspect each have their own complexity.
> > Thanks, I get it. It seems that there are potential problems for the fixed-length shuffle vector lowering, include the patch https://reviews.llvm.org/D105289. I will update a new revision later to fix it by adding the check: `sve-vector-bits-min==sve-vector-bits-max`
> I believe that patch is ok given the restrictions it imposes.  It is fine using the `is###Mask` functions as a way to identify a named shuffle.  What is unsafe is assuming `is###Mask` means the `###` instruction can be used directly when converted to scalable vectors.
> 
> So taking D105289 you can see that is uses `isEXTMask` to determine the type of shuffle but then the actually lowering code only handles a specific case and that case does not emit an `EXT` instruction as that would have fallen into the same trap.
Yeah, you are right. The logic in D105289  is correct, using `EXTRACT_VECTOR_ELT` and `INSR` instructions to match a specific `EXT` pattern, it's a smart solution.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113376/new/

https://reviews.llvm.org/D113376



More information about the llvm-commits mailing list