[PATCH] D91255: [AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup))
Nicholas Guy via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 2 07:48:22 PST 2020
NickGuy added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/aarch64-matrix-smull.ll:4
+
+define void @matrix_mul_signed(i32 %N, i32* nocapture %C, i16* nocapture readonly %A, i16 %val) {
+; CHECK-LABEL: matrix_mul_signed:
----------------
dmgreen wrote:
> I would expect tests that look something like this (I took this one from MVE and edited it):
> ```
> define <4 x i32> @vdup_i16(i16 %src) {
> ; CHECK-LABEL: vdup_i16:
> ; CHECK: @ %bb.0: @ %entry
> ; CHECK-NEXT: vdup.16 q0, r0
> ; CHECK-NEXT: bx lr
> entry:
> %0 = insertelement <4 x i16> undef, i16 %src, i32 0
> %x = shufflevector <4 x i16> %0, <4 x i16> undef, <4 x i32> zeroinitializer
> %out = sext <4 x i16> %x to <4 x i32>
> ret <4 x i32> %out
> }
> ```
>
> But for all types and sizes that this transform supports, which seems to be a lot at the moment. Don't forget scalable types too.
>
> Having tests that show that mul(sext(...), dup(sext(...))) are also folded sounds useful too, but they can hopefully be equally small.
I've added tests covering different types and sizes (all generated, so the IR is identical apart from the types). I've omitted scalable types for now, as I ran into some issues when testing them. I plan to make progress with the fixed-width types first, then revisit scalable types later (unless it turns out I'm just missing one line somewhere to support them).
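As a rough illustration of what "generated" means here (this is not the script used for the patch), a minimal Python sketch could stamp out one function per type combination from a single IR template. The template body, the `dup_*` function names, the `generate_tests` helper and the list of type combinations are all assumptions made for illustration; CHECK lines are left out and would normally be filled in by `update_llc_test_checks.py`.

```python
# Hypothetical generator: one IR function per (extension, type) combination.
# The template and the CASES list are illustrative assumptions, not the
# contents of the actual test file in D91255.
TEMPLATE = """define <{n} x i{wide}> @dup_{ext}_v{n}i{narrow}(i{narrow} %src, <{n} x i{narrow}> %b) {{
entry:
  %ext = {ext} i{narrow} %src to i{wide}
  %ins = insertelement <{n} x i{wide}> undef, i{wide} %ext, i32 0
  %splat = shufflevector <{n} x i{wide}> %ins, <{n} x i{wide}> undef, <{n} x i32> zeroinitializer
  %bext = {ext} <{n} x i{narrow}> %b to <{n} x i{wide}>
  %out = mul <{n} x i{wide}> %splat, %bext
  ret <{n} x i{wide}> %out
}}
"""

# (lanes, narrow element bits, wide element bits); assumed fixed-width cases.
CASES = [(8, 8, 16), (4, 16, 32), (2, 32, 64)]


def generate_tests():
    """Return one IR function string per extension kind and type case."""
    return [
        TEMPLATE.format(ext=ext, n=n, narrow=narrow, wide=wide)
        for ext in ("sext", "zext")
        for n, narrow, wide in CASES
    ]


if __name__ == "__main__":
    print("\n".join(generate_tests()))
```

Each generated function extends the scalar before the splat, i.e. the mul(dup(sext/zext)) shape from the patch title, so the bodies differ only in their types and extension kind.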
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D91255/new/
https://reviews.llvm.org/D91255