[llvm] 06a4c85 - Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64.
Owen Anderson via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 19 09:54:15 PDT 2021
Author: Owen Anderson
Date: 2021-08-19T16:54:07Z
New Revision: 06a4c858901d72389e01fbdd0f83b03c74d56831
URL: https://github.com/llvm/llvm-project/commit/06a4c858901d72389e01fbdd0f83b03c74d56831
DIFF: https://github.com/llvm/llvm-project/commit/06a4c858901d72389e01fbdd0f83b03c74d56831.diff
LOG: Use v16i8 rather than v2i64 as the VT for memset expansion on AArch64.
This allows the instruction selector to realize that it can directly
broadcast the low byte of the memset value, rather than replicating
it to a 64-bit GPR before broadcasting.
This fixes PR50985.
Differential Revision: https://reviews.llvm.org/D108354
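
Illustration (not part of the commit): a minimal, target-independent C++ sketch of why the two value types fill memory with the same pattern, and why the v2i64 path needs extra scalar work first. The function names splat_via_v2i64, splat_via_v16i8, and paths_agree are hypothetical, and the mask-and-multiply replication is an assumption about how the byte gets widened to 64 bits; the commit only says the value was replicated into a 64-bit GPR before broadcasting.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Vec128 = std::array<std::uint8_t, 16>;  // models one 128-bit NEON register

// Hypothetical model of the old v2i64 path: the byte is first replicated
// across a 64-bit scalar, and that scalar is then copied into both lanes.
Vec128 splat_via_v2i64(std::uint32_t value) {
  std::uint64_t byte = value & 0xFFu;                   // isolate the low byte
  std::uint64_t wide = byte * 0x0101010101010101ULL;    // replicate into 8 bytes
  Vec128 out{};
  std::memcpy(out.data(), &wide, sizeof wide);          // lane 0
  std::memcpy(out.data() + 8, &wide, sizeof wide);      // lane 1
  return out;
}

// Hypothetical model of the new v16i8 path: the low byte is broadcast
// directly into all 16 byte lanes, with no scalar preprocessing.
Vec128 splat_via_v16i8(std::uint32_t value) {
  Vec128 out{};
  out.fill(static_cast<std::uint8_t>(value));
  return out;
}

// Both paths produce the same 16 bytes for any input value.
bool paths_agree(std::uint32_t value) {
  return splat_via_v2i64(value) == splat_via_v16i8(value);
}
```

Since the stored bytes are identical either way, choosing v16i8 only changes how much work is needed to build the vector, which is what the test below checks by forbidding an `and` before the `dup`.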
Added:
llvm/test/CodeGen/AArch64/memset.ll
Modified:
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 0e5e70475503..2ec9f84f48ec 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -12091,8 +12091,8 @@ EVT AArch64TargetLowering::getOptimalMemOpType(
};
if (CanUseNEON && Op.isMemset() && !IsSmallMemset &&
- AlignmentIsAcceptable(MVT::v2i64, Align(16)))
- return MVT::v2i64;
+ AlignmentIsAcceptable(MVT::v16i8, Align(16)))
+ return MVT::v16i8;
if (CanUseFP && !IsSmallMemset && AlignmentIsAcceptable(MVT::f128, Align(16)))
return MVT::f128;
if (Op.size() >= 8 && AlignmentIsAcceptable(MVT::i64, Align(8)))
diff --git a/llvm/test/CodeGen/AArch64/memset.ll b/llvm/test/CodeGen/AArch64/memset.ll
new file mode 100644
index 000000000000..4d1d2241c05a
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/memset.ll
@@ -0,0 +1,18 @@
+; RUN: llc < %s | FileCheck %s
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
+; CHECK: memset_call:
+; CHECK-NOT: and
+; CHECK: dup
+; CHECK-NEXT: stp
+; CHECK-NEXT: stp
+; CHECK-NEXT: ret
+define void @memset_call(i8* %0, i32 %1) {
+ %3 = trunc i32 %1 to i8
+ call void @llvm.memset.p0i8.i64(i8* %0, i8 %3, i64 64, i1 false)
+ ret void
+}
+
+declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1 immarg)
+