[PATCH] D125233: [AArch64][SVE] Convert SRSHL to LSL when the fed from an ABS intrinsic

Tue May 10 08:37:33 PDT 2022

david-arm added a comment.

Hi @bsmith, this looks like a sensible optimisation! I suppose we can also do something similar when the input is an `and` too? i.e.

  %mask = tail call <vscale x 8 x i16> @llvm.aarch64.sve.dup.x.nxv8i16(i16 0x7FFF)
  %and = tail call <vscale x 8 x i16> @llvm.aarch64.sve.and.nxv8i16(<vscale x 8 x i1> %pg, <vscale x 8 x i16> %a, <vscale x 8 x i16> %mask)
  %splat = tail call <vscale x 8 x i16> @llvm.aarch64.sve.dup.x.nxv8i16(i16 2)
  %shr = tail call <vscale x 8 x i16> @llvm.aarch64.sve.srshl.nxv8i16(<vscale x 8 x i1> %pg, <vscale x 8 x i16> %and, <vscale x 8 x i16> %splat)

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:1230
+
+  // Convert SQSHL into the simpler LSL intrinsic when fed by an ABS intrinsic.
+  Value *AbsPred, *MergedValue;
----------------
Do you mean SRSHL instead of SQSHL in all the comments?

================
Comment at: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-abs-srshl.ll:42
+}
+
+define <vscale x 8 x i16> @srshl_abs_negative_merge(<vscale x 8 x i16> %a, <vscale x 8 x i1> %pg, <vscale x 8 x i1> %pg2) #0 {
----------------
nit: Maybe it's worth splitting the tests out into Positive and Negative tests, i.e. top half positive, bottom half negative? I think that makes it a bit easier to see what's going on.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125233/new/

https://reviews.llvm.org/D125233