[PATCH] D98054: [LoopVectorize][SVE] Fix crash when vectorising FP negation
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 5 09:21:46 PST 2021
david-arm created this revision.
david-arm added reviewers: sdesmalen, peterwaller-arm, craig.topper.
Herald added subscribers: psnobl, hiraditya, kristof.beyls, tschuett.
Herald added a reviewer: efriedma.
david-arm requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.
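This patch fixes a crash encountered when vectorising the following loop
using scalable vectors:

  void foo(float *dst, float *src, long long n) {
    for (long long i = 0; i < n; i++)
      dst[i] = -src[i];
  }

The cost model for Instruction::FNeg asserted that the VF was not scalable
on every query, even though the VF is only read when the instruction is
scalarised after vectorisation. The assert now guards that path only.
I've added a test to

  Transforms/LoopVectorize/AArch64/sve-basic-vec.ll

and cleaned up the other tests in the same file.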
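
To illustrate the change in the cost model, here is a minimal standalone sketch of the multiplier logic before and after the patch. It is not the LLVM code: `ElementCount` below is a simplified, hypothetical stand-in for `llvm::ElementCount`, the `ScalarAfterVec` flag stands in for the result of `isScalarAfterVectorization(I, VF)`, the function names are made up, and an exception replaces the `assert` so that both paths can be exercised without aborting.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for llvm::ElementCount: a known minimum lane count
// plus a flag saying whether the real count is a runtime multiple of it.
struct ElementCount {
  unsigned MinVal;
  bool Scalable;
  bool isScalable() const { return Scalable; }
  unsigned getKnownMinValue() const { return MinVal; }
};

// Pre-patch shape: the scalable check runs for every FNeg cost query.
// (Throws where the real code asserts; an assumption made for testability.)
unsigned fnegMultiplierBefore(ElementCount VF, bool ScalarAfterVec) {
  if (VF.isScalable())
    throw std::logic_error("VF is assumed to be non scalable.");
  return ScalarAfterVec ? VF.getKnownMinValue() : 1;
}

// Post-patch shape: the check runs only when the instruction is scalarised,
// which is the only case where the lane count is actually read.
unsigned fnegMultiplierAfter(ElementCount VF, bool ScalarAfterVec) {
  unsigned N = 1;
  if (ScalarAfterVec) {
    if (VF.isScalable())
      throw std::logic_error("VF is assumed to be non scalable.");
    N = VF.getKnownMinValue();
  }
  return N;
}

// Helper: reports whether a multiplier function rejects the given query.
bool rejects(unsigned (*Fn)(ElementCount, bool), ElementCount VF,
             bool ScalarAfterVec) {
  try {
    Fn(VF, ScalarAfterVec);
  } catch (const std::logic_error &) {
    return true;
  }
  return false;
}

// Usage example covering the cases discussed in the patch description.
void selfCheck() {
  const ElementCount Fixed4{4, false};
  const ElementCount Scalable4{4, true};
  // A widened fneg with a scalable VF tripped the old check ...
  assert(rejects(fnegMultiplierBefore, Scalable4, false));
  // ... and is accepted after the patch, with a multiplier of 1.
  assert(!rejects(fnegMultiplierAfter, Scalable4, false));
  assert(fnegMultiplierAfter(Scalable4, false) == 1);
  // Fixed-width behaviour is unchanged.
  assert(fnegMultiplierBefore(Fixed4, true) == 4);
  assert(fnegMultiplierAfter(Fixed4, true) == 4);
  // Scalarising with a scalable VF is still rejected.
  assert(rejects(fnegMultiplierAfter, Scalable4, true));
}
```

In this sketch the only behavioural difference is the widened, scalable case, which is the one the new `fneg_f32` test exercises.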
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D98054
Files:
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
Index: llvm/test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
===================================================================
--- llvm/test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
+++ llvm/test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
@@ -18,8 +18,7 @@
; CHECK: store <vscale x 4 x i32> [[TMP2]], <vscale x 4 x i32>* {{.*}}, align 4
;
entry:
- %cmp7 = icmp sgt i64 %n, 0
- br i1 %cmp7, label %for.body, label %for.end
+ br label %for.body
for.body: ; preds = %entry, %for.body
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
@@ -50,8 +49,7 @@
; CHECK: store <vscale x 4 x float> [[TMP2]], <vscale x 4 x float>* {{.*}}, align 4
entry:
- %cmp8 = icmp sgt i64 %n, 0
- br i1 %cmp8, label %for.body, label %for.end
+ br label %for.body
for.body: ; preds = %entry, %for.body
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
@@ -63,7 +61,33 @@
store float %conv, float* %arrayidx3, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond.not = icmp eq i64 %indvars.iv.next, %n
- br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !6
+ br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0
+
+for.end: ; preds = %for.body, %entry
+ ret void
+}
+
+define void @fneg_f32(float* noalias nocapture %a, float* noalias nocapture readonly %b, i64 %n) {
+; CHECK-LABEL: @fneg_f32(
+; CHECK-NEXT: entry:
+; CHECK: vector.body:
+; CHECK: [[WIDE_LOAD:%.*]] = load <vscale x 4 x float>, <vscale x 4 x float>* {{.*}}, align 4
+; CHECK-NEXT: [[TMP1:%.*]] = fneg <vscale x 4 x float> [[WIDE_LOAD]]
+; CHECK: store <vscale x 4 x float> [[TMP1]], <vscale x 4 x float>* {{.*}}, align 4
+
+entry:
+ br label %for.body
+
+for.body: ; preds = %entry, %for.body
+ %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+ %arrayidx = getelementptr inbounds float, float* %b, i64 %indvars.iv
+ %0 = load float, float* %arrayidx, align 4
+ %fneg = fneg float %0
+ %arrayidx3 = getelementptr inbounds float, float* %a, i64 %indvars.iv
+ store float %fneg, float* %arrayidx3, align 4
+ %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+ %exitcond.not = icmp eq i64 %indvars.iv.next, %n
+ br i1 %exitcond.not, label %for.end, label %for.body, !llvm.loop !0
for.end: ; preds = %for.body, %entry
ret void
@@ -75,4 +99,3 @@
!3 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
!4 = !{!"llvm.loop.interleave.count", i32 1}
!5 = !{!"llvm.loop.vectorize.enable", i1 true}
-!6 = distinct !{!6, !1, !2, !3, !4, !5}
Index: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
===================================================================
--- llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7381,8 +7381,11 @@
Op2VK, TargetTransformInfo::OP_None, Op2VP, Operands, I);
}
case Instruction::FNeg: {
- assert(!VF.isScalable() && "VF is assumed to be non scalable.");
- unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1;
+ unsigned N = 1;
+ if (isScalarAfterVectorization(I, VF)) {
+ assert(!VF.isScalable() && "VF is assumed to be non scalable.");
+ N = VF.getKnownMinValue();
+ }
return N * TTI.getArithmeticInstrCost(
I->getOpcode(), VectorTy, CostKind,
TargetTransformInfo::OK_AnyValue,