[PATCH] D88742: [AArch64] Identify SAD pattern for v16i8 type

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 9 01:17:57 PDT 2020


dmgreen added a comment.

> Currently the patterns written are for v16i8 to v16i16 and v16i8 to v16i32 types. I am not sure what are the other types needed to support along with this as there are **at least 24 more patterns** if we consider signed and floating points excluding the vabdl ones which are already in td files (Reference : https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics?search=vabd).
>
> There are 12 type pair possible for unsigned types alone:
>
> 1. v8i8  to i16,i32,i64 vector types with same number of elements
> 2. v16i8 to i16,i32,i64 vector types
> 3. v4i16 to i32,i64 vector types
> 4. v8i16 to i32, i64 vector types
> 5. v2i32 to v2i64 vector type
> 6. v4i32 to v4i64 vector type
>
> I have handled the current patterns to solve the bug mentioned in the ticket. Which other types do you have in mind?

If we are adding UABD/SABD pattern matching I would expect us to generalize it to support for all the types that instruction supports. You can ignore UABDL though, I agree that is a separate issue (and seems to be handled elsewhere).

The "to" type isn't very important - it just needs to be big enough to make the results equivalent. I think in this case that's only 1 or 2 bits, and it could even not be a power 2 (although that usually isn't very important to support).  The "from" type would be i8, i16 or i32, handling either 64 or 128bit vectors. I think if we effectively ignore the "to" type (because it gets handled by the ZERO_EXTEND), what you have here is already pretty close.



================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:11051
+// This is useful as it is the input into a SAD pattern.
+static bool detectZextAbsDiff(const SDValue &Abs, SDValue &Op0, SDValue &Op1) {
+  SDValue AbsOp1 = Abs->getOperand(0);
----------------
You might want to simplify this and keep it in a single performABSCombine function. 


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.h:225
+  // Absolute difference
+  UABD,
+  SABD,
----------------
Can you move this past SRHADD? SHADD and SRHADD deserve to live next to each other.


================
Comment at: llvm/test/CodeGen/AArch64/arm64-vabs.ll:165
+; CHECK: uabd.16b
+  %aload = load <16 x i8>, <16 x i8>* %a, align 1
+  %bload = load <16 x i8>, <16 x i8>* %b, align 1
----------------
The tests needn't do this load if you pass the value as a function argument directly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88742/new/

https://reviews.llvm.org/D88742



More information about the llvm-commits mailing list