[llvm] Subject: [PATCH] [AArch64ISelLowering] Optimize rounding shift and saturation truncation (PR #74325)
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 14 05:56:27 PST 2023
================
@@ -95,9 +95,9 @@ entry:
define <16 x i8> @rshrn_v16i16_8(<16 x i16> %a) {
; CHECK-LABEL: rshrn_v16i16_8:
; CHECK: // %bb.0: // %entry
-; CHECK-NEXT: movi v2.2d, #0000000000000000
-; CHECK-NEXT: raddhn v0.8b, v0.8h, v2.8h
-; CHECK-NEXT: raddhn2 v0.16b, v1.8h, v2.8h
+; CHECK-NEXT: urshr v1.8h, v1.8h, #8
----------------
david-arm wrote:
On the surface this looks worse than before. On neoverse-v1, raddhn has a latency of 2 and a throughput of 4, whereas urshr has a latency of 4 and a throughput of 2, so I think the original code would likely be faster. Is there an easy way of keeping the old version here?
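
For reference, a minimal scalar sketch of why the two sequences should compute the same bytes, so the question is only about cost and not correctness. It models one 16-bit lane in Python. The helper names are hypothetical, the semantics are my reading of the instruction descriptions (raddhn wraps the 16-bit sum and keeps the high half, urshr rounds without wrapping before the shift), and the final narrowing step after urshr is an assumption, since the hunk above does not show the rest of the new sequence.

```python
MASK16 = 0xFFFF  # one 16-bit source lane


def raddhn_lane(a, b):
    """Model of raddhn on one lane: wrapping 16-bit add with rounding
    constant 0x80, then keep the high 8 bits."""
    total = (a + b + 0x80) & MASK16
    return total >> 8


def urshr_lane(a, shift):
    """Model of urshr on one lane: add the rounding constant at full
    precision, shift right, then store the low 16 bits."""
    return ((a + (1 << (shift - 1))) >> shift) & MASK16


def narrow_lane(x):
    """Assumed truncating narrow from 16 to 8 bits (not shown in the hunk)."""
    return x & 0xFF


def old_lowering(a):
    # movi #0 + raddhn: the second operand is the zero vector
    return raddhn_lane(a, 0)


def new_lowering(a):
    # urshr #8 followed by the assumed narrowing step
    return narrow_lane(urshr_lane(a, 8))


# Exhaustive check over every possible 16-bit lane value.
mismatches = [a for a in range(1 << 16) if old_lowering(a) != new_lowering(a)]
print(len(mismatches))
```

The only interesting inputs are 0xFF80 and above, where the rounded sum overflows 16 bits: raddhn wraps to 0 before taking the high half, while urshr yields 0x100 and the narrow drops the top bit, so both end up as 0.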
https://github.com/llvm/llvm-project/pull/74325