[PATCH] D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask.

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 19 11:46:50 PDT 2023


dmgreen added a comment.

Hi - I like the idea of generating tbl instructions from non-constant masks. Does the patch need an update to be based against main?



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10218
+  // directly to TBL1.
+  if (VT.getSimpleVT() != MVT::v16i8 && VT.getSimpleVT() != MVT::v8i8)
     return SDValue();
----------------
This can test against the EVT directly: `VT != MVT::v16i8 && .. `.
Other types could be handled too, by breaking the larger type up into i8 lanes. i.e a i32 would add `[idx*4,idx*4+1,idx*4+2,idx*4+3]` for example, if you get my meaning.
I'm not sure if that is worth adding to this patch, or handling in another but it would be good to keep in mind.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10241
 
     ConstantSDNode *ConstantNode = cast<ConstantSDNode>(Operand.getOperand(1));
     AndMaskConstants.push_back(SDValue(ConstantNode, 0));
----------------
This doesn't need the cast any more.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10258
 
     if (MaskSource.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
       return SDValue();
----------------
Should something check that the inner extract element (MaskSource) has an index equal to i?


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10279
+    SourceVec =
+        DAG.getNode(ISD::CONCAT_VECTORS, dl, MVT::v16i8, SourceVec, SourceVec);
+
----------------
You can concat with undef if all the needed elements are in the bottom half.


================
Comment at: llvm/test/CodeGen/AArch64/neon-shuffle-vector-tbl.ll:12
 
   %masked_mask = and <16 x i8> %mask, <i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15, i8 15>
   %1 = extractelement <16 x i8> %masked_mask, i64 0
----------------
It would be good to have a test case without this And mask, just using %mask directly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146212/new/

https://reviews.llvm.org/D146212



More information about the llvm-commits mailing list