[llvm] 816663a - [SVE] In LoopIdiomRecognize::isLegalStore bail out for scalable vectors
David Sherwood via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 14 03:29:16 PDT 2020
Author: David Sherwood
Date: 2020-09-14T11:28:31+01:00
New Revision: 816663adb5f1362597c9b2947586e0847c5cdf9b
URL: https://github.com/llvm/llvm-project/commit/816663adb5f1362597c9b2947586e0847c5cdf9b
DIFF: https://github.com/llvm/llvm-project/commit/816663adb5f1362597c9b2947586e0847c5cdf9b.diff
LOG: [SVE] In LoopIdiomRecognize::isLegalStore bail out for scalable vectors
The function LoopIdiomRecognize::isLegalStore looks for stores in loops
that could be transformed into memset or memcpy. However, the algorithm
currently requires that we know how big the store is at runtime, i.e.
that the store size will not overflow an unsigned integer. For scalable
vectors we cannot guarantee this so I have changed the code to bail out
for now. In addition, even if we add a way to query the maximum value of
vscale in the future, we will still need to update the algorithm to cope with
non-constant strides. The additional cost associated with calculating
the memset and memcpy arguments will need to be taken into account as
well.
This patch also fixes up an implicit TypeSize -> uint64_t cast,
thereby removing a warning. I've added tests showing a fixed-width
vector loop being transformed into memcpy and a scalable vector
loop remaining unchanged:
Transforms/LoopIdiom/memcpy-vectors.ll
Differential Revision: https://reviews.llvm.org/D87439
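The essence of the new guard can be sketched outside of LLVM. The struct below is a simplified, hypothetical stand-in for llvm::TypeSize (just a minimum bit-width plus a scalable flag), and isStoreSizeLegal mirrors the condition added to isLegalStore: reject a store whose size is scalable (unknown until runtime), not a whole number of bytes, or too large to fit in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for llvm::TypeSize: a known-minimum size in bits plus
// a flag saying whether the size is scaled by the runtime value of vscale
// (i.e. the type is a scalable vector such as <vscale x 2 x i64>).
struct SimpleTypeSize {
  uint64_t MinBits;
  bool Scalable;

  bool isScalable() const { return Scalable; }
  uint64_t getFixedSize() const {
    assert(!Scalable && "fixed size of a scalable type is unknown");
    return MinBits;
  }
};

// Mirrors the guard in LoopIdiomRecognize::isLegalStore: bail out for
// scalable sizes, sizes that are not a multiple of 8 bits, and sizes
// that would overflow an unsigned 32-bit byte count.
bool isStoreSizeLegal(SimpleTypeSize SizeInBits) {
  if (SizeInBits.isScalable() || (SizeInBits.getFixedSize() & 7) ||
      (SizeInBits.getFixedSize() >> 32) != 0)
    return false;
  return true;
}
```

Because || short-circuits, getFixedSize() is never queried on a scalable size, which is exactly why the isScalable() test must come first.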
Added:
llvm/test/Transforms/LoopIdiom/memcpy-vectors.ll
Modified:
llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 011d6f487742..147ccc939ac9 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -468,8 +468,11 @@ LoopIdiomRecognize::isLegalStore(StoreInst *SI) {
Value *StorePtr = SI->getPointerOperand();
// Reject stores that are so large that they overflow an unsigned.
- uint64_t SizeInBits = DL->getTypeSizeInBits(StoredVal->getType());
- if ((SizeInBits & 7) || (SizeInBits >> 32) != 0)
+ // When storing out scalable vectors we bail out for now, since the code
+ // below currently only works for constant strides.
+ TypeSize SizeInBits = DL->getTypeSizeInBits(StoredVal->getType());
+ if (SizeInBits.isScalable() || (SizeInBits.getFixedSize() & 7) ||
+ (SizeInBits.getFixedSize() >> 32) != 0)
return LegalStoreKind::None;
// See if the pointer expression is an AddRec like {base,+,1} on the current
diff --git a/llvm/test/Transforms/LoopIdiom/memcpy-vectors.ll b/llvm/test/Transforms/LoopIdiom/memcpy-vectors.ll
new file mode 100644
index 000000000000..b4445c70cb57
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/memcpy-vectors.ll
@@ -0,0 +1,53 @@
+; RUN: opt -loop-idiom -S <%s 2>%t | FileCheck %s
+; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
+
+; If this check fails please read test/CodeGen/AArch64/README for instructions on how to resolve it.
+; WARN-NOT: warning
+
+define void @memcpy_fixed_vec(i64* noalias %a, i64* noalias %b) local_unnamed_addr #1 {
+; CHECK-LABEL: @memcpy_fixed_vec(
+; CHECK: entry:
+; CHECK: memcpy
+; CHECK: vector.body
+entry:
+ br label %vector.body
+
+vector.body: ; preds = %vector.body, %entry
+ %index = phi i64 [ 0, %entry ], [ %index.next, %vector.body ]
+ %0 = getelementptr inbounds i64, i64* %a, i64 %index
+ %1 = bitcast i64* %0 to <2 x i64>*
+ %wide.load = load <2 x i64>, <2 x i64>* %1, align 8
+ %2 = getelementptr inbounds i64, i64* %b, i64 %index
+ %3 = bitcast i64* %2 to <2 x i64>*
+ store <2 x i64> %wide.load, <2 x i64>* %3, align 8
+ %index.next = add nuw nsw i64 %index, 2
+ %4 = icmp eq i64 %index.next, 1024
+ br i1 %4, label %for.cond.cleanup, label %vector.body
+
+for.cond.cleanup: ; preds = %vector.body
+ ret void
+}
+
+define void @memcpy_scalable_vec(i64* noalias %a, i64* noalias %b) local_unnamed_addr #1 {
+; CHECK-LABEL: @memcpy_scalable_vec(
+; CHECK: entry:
+; CHECK-NOT: memcpy
+; CHECK: vector.body
+entry:
+ br label %vector.body
+
+vector.body: ; preds = %vector.body, %entry
+ %index = phi i64 [ 0, %entry ], [ %index.next, %vector.body ]
+ %0 = bitcast i64* %a to <vscale x 2 x i64>*
+ %1 = getelementptr inbounds <vscale x 2 x i64>, <vscale x 2 x i64>* %0, i64 %index
+ %wide.load = load <vscale x 2 x i64>, <vscale x 2 x i64>* %1, align 16
+ %2 = bitcast i64* %b to <vscale x 2 x i64>*
+ %3 = getelementptr inbounds <vscale x 2 x i64>, <vscale x 2 x i64>* %2, i64 %index
+ store <vscale x 2 x i64> %wide.load, <vscale x 2 x i64>* %3, align 16
+ %index.next = add nuw nsw i64 %index, 1
+ %4 = icmp eq i64 %index.next, 1024
+ br i1 %4, label %for.cond.cleanup, label %vector.body
+
+for.cond.cleanup: ; preds = %vector.body
+ ret void
+}