[PATCH] D62308: [AArch64] support neon_sshl and neon_ushl in performIntrinsicCombine.

Adam Nemet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 17 22:21:55 PDT 2019


anemet accepted this revision.
anemet added a comment.
This revision is now accepted and ready to land.

Some minor test questions/suggestions.  Feel free to commit after addressing.



================
Comment at: llvm/test/CodeGen/AArch64/arm64-vshift.ll:1204
+  %tmp1 = load <8 x i8>, <8 x i8>* %A
+  %tmp2 = zext <8 x i8> %tmp1 to <8 x i16>
+  %tmp3 = call <8 x i16> @llvm.aarch64.neon.ushl.v8i16(<8 x i16> %tmp2, <8 x i16> <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>)
----------------
I don't see any negative tests for the case where we zero-extend to something other than the next wider type; a sketch of what I have in mind is below.
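Something along these lines, perhaps (the function name and the chosen widths are only my sketch, not taken from the patch; the CHECK lines would need to be filled in from the actual codegen):

  declare <4 x i32> @llvm.aarch64.neon.ushl.v4i32(<4 x i32>, <4 x i32>) nounwind readnone

  ; zero-extend skips the next wider type (i8 -> i32), so the ushll fold should not fire
  define <4 x i32> @neon.ushll4s_wide_ext_negative(<4 x i8>* %A) nounwind {
  ;CHECK-LABEL: neon.ushll4s_wide_ext_negative
    %tmp1 = load <4 x i8>, <4 x i8>* %A
    %tmp2 = zext <4 x i8> %tmp1 to <4 x i32>
    %tmp3 = call <4 x i32> @llvm.aarch64.neon.ushl.v4i32(<4 x i32> %tmp2, <4 x i32> <i32 1, i32 1, i32 1, i32 1>)
    ret <4 x i32> %tmp3
  }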


================
Comment at: llvm/test/CodeGen/AArch64/arm64-vshift.ll:1292
+
+define <16 x i8> @neon.sshll16b_constant_shift(<16 x i8>* %A) nounwind {
+;CHECK-LABEL: neon.sshll16b_constant_shift
----------------
Technically this is not sshll (long): the input and result are both <16 x i8>, so there is no widening going on here.


================
Comment at: llvm/test/CodeGen/AArch64/arm64-vshift.ll:1318
+
+; FIXME: unnecessary sshll.4s v0, v0, #0?
+define <4 x i32> @neon.sshll4s_neg_constant_shift(<4 x i16>* %A) nounwind {
----------------
Isn't it needed for the extension (sshll #0 is effectively the sign-extend)?


================
Comment at: llvm/test/CodeGen/AArch64/arm64-vshift.ll:1333
+;CHECK-LABEL: neon.sshll4s_constant_fold
+;CHECK: shl.4s v0, {{v[0-9]+}}, #1
+        %tmp3 = call <4 x i32> @llvm.aarch64.neon.sshl.v4i32(<4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32> <i32 1, i32 1, i32 1, i32 1>)
----------------
I think we should also have other shl tests (a non-foldable .4s case and perhaps some other sizes); a rough sketch of the .4s case is below.
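Something like this (the function name and the expected shl line are just my guess at the lowering; the sshl.v4i32 declare should already be in the file):

  define <4 x i32> @neon.sshl4s_no_fold(<4 x i32>* %A) nounwind {
  ;CHECK-LABEL: neon.sshl4s_no_fold
  ;CHECK: shl.4s v0, {{v[0-9]+}}, #1
    ; the shifted value is loaded rather than constant, so no constant folding is possible
    ; and only the shift-by-splat-constant combine should apply
    %tmp1 = load <4 x i32>, <4 x i32>* %A
    %tmp2 = call <4 x i32> @llvm.aarch64.neon.sshl.v4i32(<4 x i32> %tmp1, <4 x i32> <i32 1, i32 1, i32 1, i32 1>)
    ret <4 x i32> %tmp2
  }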


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62308/new/

https://reviews.llvm.org/D62308




