[llvm] [AArch64] recognise trn1/trn2 with flipped operands (PR #169858)
Philip Ginsbach-Chen via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 2 14:27:27 PST 2025
================
@@ -246,6 +246,87 @@ define <4 x float> @vtrnQf(ptr %A, ptr %B) nounwind {
ret <4 x float> %tmp5
}
+define <8 x i8> @vtrni8_8first(ptr %A, ptr %B) nounwind {
+; CHECKLE-LABEL: vtrni8_8first:
+; CHECKLE: // %bb.0:
+; CHECKLE-NEXT: ldr d0, [x0]
+; CHECKLE-NEXT: ldr d1, [x1]
+; CHECKLE-NEXT: trn1 v2.8b, v1.8b, v0.8b
+; CHECKLE-NEXT: trn2 v0.8b, v0.8b, v1.8b
+; CHECKLE-NEXT: add v0.8b, v2.8b, v0.8b
+; CHECKLE-NEXT: ret
+;
+; CHECKBE-LABEL: vtrni8_8first:
+; CHECKBE: // %bb.0:
+; CHECKBE-NEXT: ld1 { v0.8b }, [x0]
+; CHECKBE-NEXT: ld1 { v1.8b }, [x1]
+; CHECKBE-NEXT: trn1 v2.8b, v1.8b, v0.8b
+; CHECKBE-NEXT: trn2 v0.8b, v0.8b, v1.8b
+; CHECKBE-NEXT: add v0.8b, v2.8b, v0.8b
+; CHECKBE-NEXT: rev64 v0.8b, v0.8b
+; CHECKBE-NEXT: ret
+ %tmp1 = load <8 x i8>, ptr %A
+ %tmp2 = load <8 x i8>, ptr %B
----------------
ginsbach wrote:
I was trying to match the surrounding test cases as closely as possible; all of them seem to use load instructions and have slightly broken formatting. I have gladly simplified and reformatted the new tests in commit 2.
https://github.com/llvm/llvm-project/pull/169858
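
For readers unfamiliar with the pattern under test, here is a minimal Python sketch of what "trn1/trn2 with flipped operands" means in terms of shuffle masks. It is only an illustration and not the code from the PR: the helper names `trn_mask` and `classify_trn` are hypothetical, and undef mask elements are deliberately not handled. It assumes the usual convention that a two-input shuffle mask indexes the first operand with 0..N-1 and the second with N..2N-1, so a mask starting with 8 on an `<8 x i8>` shuffle (as the test name `vtrni8_8first` suggests) reads the second operand first.

```python
def trn_mask(num_elts, which, flipped=False):
    """Build the shuffle mask equivalent to trn1 (which=0) or trn2 (which=1).

    With flipped=True, each pair takes its lane from the second shuffle
    operand first, which corresponds to swapping the trn source registers.
    """
    first, second = (num_elts, 0) if flipped else (0, num_elts)
    mask = []
    for i in range(0, num_elts, 2):
        mask.append(first + i + which)   # even result lane
        mask.append(second + i + which)  # odd result lane
    return mask


def classify_trn(mask):
    """Return (instruction name, operands flipped?) or None if no match.

    Hypothetical helper: compares against all four candidate masks and
    does not accept undef (-1) elements.
    """
    num_elts = len(mask)
    for which in (0, 1):
        for flipped in (False, True):
            if mask == trn_mask(num_elts, which, flipped):
                return ("trn1" if which == 0 else "trn2", flipped)
    return None
```

Under these assumptions, `classify_trn([8, 0, 10, 2, 12, 4, 14, 6])` reports a `trn1` with flipped operands, which is consistent with the `trn1 v2.8b, v1.8b, v0.8b` line in the CHECK output above.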