[llvm] [LV] Use ICMP_UGE for BranchOnCount when VF is scalable (PR #102575)
Pengcheng Wang via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 19 00:58:11 PDT 2024
wangpc-pp wrote:
> Why can't this be fixed in SCEV?
>
> > Post it here to see if it is the right way to fix this issue in LV, because I can't fix it in SCEV.
>
> could you elaborate on why this can't be fixed in SCEV?
Yes. IIRC, it's because `ScalarEvolution::computeExitLimitFromICmp` can't derive the right exit limit for the `ICMP_NE` and `ICMP_EQ` cases when the exit condition compares against a scalable value:
https://github.com/llvm/llvm-project/blob/985d64b03accbed8500a85372d716367d89b61be/llvm/lib/Analysis/ScalarEvolution.cpp#L9182-L9215
Before this patch:
```llvm
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %9 = getelementptr inbounds float, ptr %b, i64 %index
  %wide.load = load <vscale x 4 x float>, ptr %9, align 4, !tbaa !9
  %10 = fadd <vscale x 4 x float> %wide.load, shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 1.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer)
  %11 = getelementptr inbounds float, ptr %a, i64 %index
  store <vscale x 4 x float> %10, ptr %11, align 4, !tbaa !9
  %index.next = add nuw i64 %index, %8
  %12 = icmp eq i64 %index.next, %n.vec
  br i1 %12, label %middle.block, label %vector.body, !llvm.loop !13
```
```
Loop %for.body: backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: constant max backedge-taken count is i64 -1
Loop %for.body: symbolic max backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: Trip multiple is 1
Loop %vector.body: backedge-taken count is (((-4 * vscale)<nsw> + %n.vec) /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: constant max backedge-taken count is i64 2305843009213693951
Loop %vector.body: symbolic max backedge-taken count is (((-4 * vscale)<nsw> + %n.vec) /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: Trip multiple is 1
```
After this patch:
```llvm
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %9 = getelementptr inbounds float, ptr %b, i64 %index
  %wide.load = load <vscale x 4 x float>, ptr %9, align 4, !tbaa !9
  %10 = fadd <vscale x 4 x float> %wide.load, shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 1.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer)
  %11 = getelementptr inbounds float, ptr %a, i64 %index
  store <vscale x 4 x float> %10, ptr %11, align 4, !tbaa !9
  %index.next = add nuw i64 %index, %8
  %.not = icmp ult i64 %index.next, %n.vec
  br i1 %.not, label %vector.body, label %middle.block, !llvm.loop !13
```
```
Loop %for.body: backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: constant max backedge-taken count is i64 -1
Loop %for.body: symbolic max backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: Trip multiple is 1
Loop %vector.body: backedge-taken count is ((-1 + ((4 * vscale)<nuw><nsw> umax %n.vec))<nsw> /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: constant max backedge-taken count is i64 268435455
Loop %vector.body: symbolic max backedge-taken count is ((-1 + ((4 * vscale)<nuw><nsw> umax %n.vec))<nsw> /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: Trip multiple is 1
```
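To make the two `%vector.body` backedge-taken count expressions easier to compare, here is a small Python sketch that evaluates both with 64-bit wraparound. It is only an illustration of the arithmetic in the dumps above, not of how SCEV computes anything; the function names and the sample inputs are my own, and it assumes `%8` equals `4 * vscale`.

```python
MASK64 = (1 << 64) - 1  # model i64 wraparound arithmetic


def btc_before(n_vec: int, vscale: int) -> int:
    """((-4 * vscale) + n_vec) /u (4 * vscale), the `icmp eq` form."""
    step = 4 * vscale
    return ((n_vec - step) & MASK64) // step


def btc_after(n_vec: int, vscale: int) -> int:
    """(-1 + ((4 * vscale) umax n_vec)) /u (4 * vscale), the `icmp ult` form."""
    step = 4 * vscale
    return ((max(step, n_vec) - 1) & MASK64) // step


def simulate_backedges(n_vec: int, vscale: int) -> int:
    """Run the `icmp ult` loop shape: bump the index by 4 * vscale and
    count how many times the backedge is taken before index >= n_vec."""
    step = 4 * vscale
    index, backedges = 0, 0
    while True:
        index = (index + step) & MASK64
        if index >= n_vec:
            return backedges
        backedges += 1


# When n_vec is a non-zero multiple of the step, both forms agree.
for vscale in (1, 2, 4, 8):
    step = 4 * vscale
    for k in range(1, 6):
        n_vec = k * step
        assert btc_before(n_vec, vscale) == btc_after(n_vec, vscale) == k - 1
        assert simulate_backedges(n_vec, vscale) == k - 1

# Hypothetical degenerate input: the `eq` form wraps, the `ult` form does not.
print(btc_before(0, 2), btc_after(0, 2))
```

For the hypothetical input `n_vec = 0`, `vscale = 2`, the `eq` form wraps around to 2305843009213693951 while the `ult` form gives 0. That value matches the constant max printed before the patch, but whether this is how SCEV arrives at that bound is an assumption on my part.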
https://github.com/llvm/llvm-project/pull/102575