[PATCH] D91255: [AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup))

Wed Dec 2 07:48:22 PST 2020

NickGuy added inline comments.

================
Comment at: llvm/test/CodeGen/AArch64/aarch64-matrix-smull.ll:4
+
+define void @matrix_mul_signed(i32 %N, i32* nocapture %C, i16* nocapture readonly %A, i16 %val) {
+; CHECK-LABEL: matrix_mul_signed:
----------------
dmgreen wrote:
> I would expect tests that look something like this (I took this one from MVE and edited it):
> ```
> define <4 x i32> @vdup_i16(i16 %src) {
> ; CHECK-LABEL: vdup_i16:
> ; CHECK:       @ %bb.0: @ %entry
> ; CHECK-NEXT:    vdup.16 q0, r0
> ; CHECK-NEXT:    bx lr
> entry:
>   %0 = insertelement <4 x i16> undef, i16 %src, i32 0
>   %x = shufflevector <4 x i16> %0, <4 x i16> undef, <4 x i32> zeroinitializer
>   %out = sext <4 x i16> %x to <4 x i32>
>   ret <4 x i32> %out
> }
> ```
> 
> But for all types and sizes that this transform supports, which seems to be a lot at the moment. Don't forget scalable types too.
> 
> Having tests that show that mul(sext(...), dup(sext(...))) is also folded sounds useful too, and they can hopefully be equally small.
I've added tests covering the different types and sizes (all generated, so the IR is identical apart from the types). I've left out scalable types for now, as I ran into some issues when testing them. I plan to make progress with the fixed-width types first and revisit scalable types later (unless it turns out that I'm only missing one line somewhere to support them).
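
As a rough illustration of what "generated" could mean here, the sketch below stamps out one IR function per extend kind and element type from a single template. It is not the script used for this patch: the function names, the `TYPES` table (128-bit fixed-width results only), and the exact IR shape are assumptions chosen to match the mul(sext/zext(...), dup(sext/zext(...))) pattern discussed above. CHECK lines are left out on purpose; they would normally be filled in afterwards with `llvm/utils/update_llc_test_checks.py`.

```python
# Hypothetical test generator: same IR body for every case, only the types change.
# (narrow element type, wide element type, lane count) -- assumed 128-bit results.
TYPES = [("i8", "i16", 8), ("i16", "i32", 4), ("i32", "i64", 2)]
EXTENDS = ["sext", "zext"]

TEMPLATE = """define <{n} x {wide}> @{ext}_dup_mul_{narrow}(<{n} x {narrow}> %a, {narrow} %val) {{
entry:
  %ext = {ext} {narrow} %val to {wide}
  %ins = insertelement <{n} x {wide}> undef, {wide} %ext, i32 0
  %splat = shufflevector <{n} x {wide}> %ins, <{n} x {wide}> undef, <{n} x i32> zeroinitializer
  %lhs = {ext} <{n} x {narrow}> %a to <{n} x {wide}>
  %mul = mul <{n} x {wide}> %lhs, %splat
  ret <{n} x {wide}> %mul
}}
"""


def generate_tests():
    """Return the IR text for every (extend, type) combination."""
    functions = []
    for ext in EXTENDS:
        for narrow, wide, n in TYPES:
            functions.append(
                TEMPLATE.format(ext=ext, narrow=narrow, wide=wide, n=n)
            )
    return "\n".join(functions)


if __name__ == "__main__":
    print(generate_tests())
```

Keeping the body identical across cases makes it easy to see at a glance which type combinations the fold does or does not fire for once the CHECK lines are generated.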

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91255/new/

https://reviews.llvm.org/D91255