[llvm] [LV] Use ICMP_UGE for BranchOnCount when VF is scalable (PR #102575)

Pengcheng Wang via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 19 00:58:11 PDT 2024


wangpc-pp wrote:

> Why can't this be fixed in SCEV?
> 
> > Post it here to see if it is the right way to fix this issue in LV, because I can't fix it in SCEV.
> 
> Could you elaborate on why this can't be fixed in SCEV?

Yes. IIRC, it's because `ScalarEvolution::computeExitLimitFromICmp` can't compute the right exit limit for the `ICMP_NE` and `ICMP_EQ` cases when the exit condition compares against a scalable value:
https://github.com/llvm/llvm-project/blob/985d64b03accbed8500a85372d716367d89b61be/llvm/lib/Analysis/ScalarEvolution.cpp#L9182-L9215

Before this patch:
```llvm
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %9 = getelementptr inbounds float, ptr %b, i64 %index
  %wide.load = load <vscale x 4 x float>, ptr %9, align 4, !tbaa !9
  %10 = fadd <vscale x 4 x float> %wide.load, shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 1.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer)
  %11 = getelementptr inbounds float, ptr %a, i64 %index
  store <vscale x 4 x float> %10, ptr %11, align 4, !tbaa !9
  %index.next = add nuw i64 %index, %8
  %12 = icmp eq i64 %index.next, %n.vec
  br i1 %12, label %middle.block, label %vector.body, !llvm.loop !13
```
```
Loop %for.body: backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: constant max backedge-taken count is i64 -1
Loop %for.body: symbolic max backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: Trip multiple is 1
Loop %vector.body: backedge-taken count is (((-4 * vscale)<nsw> + %n.vec) /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: constant max backedge-taken count is i64 2305843009213693951
Loop %vector.body: symbolic max backedge-taken count is (((-4 * vscale)<nsw> + %n.vec) /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: Trip multiple is 1
```

After this patch:
```llvm
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %9 = getelementptr inbounds float, ptr %b, i64 %index
  %wide.load = load <vscale x 4 x float>, ptr %9, align 4, !tbaa !9
  %10 = fadd <vscale x 4 x float> %wide.load, shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 1.000000e+00, i64 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer)
  %11 = getelementptr inbounds float, ptr %a, i64 %index
  store <vscale x 4 x float> %10, ptr %11, align 4, !tbaa !9
  %index.next = add nuw i64 %index, %8
  %.not = icmp ult i64 %index.next, %n.vec
  br i1 %.not, label %vector.body, label %middle.block, !llvm.loop !13
```
```
Loop %for.body: backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: constant max backedge-taken count is i64 -1
Loop %for.body: symbolic max backedge-taken count is (-1 + (zext i32 %n to i64) + (-1 * %indvars.iv.ph)<nsw>)
Loop %for.body: Trip multiple is 1
Loop %vector.body: backedge-taken count is ((-1 + ((4 * vscale)<nuw><nsw> umax %n.vec))<nsw> /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: constant max backedge-taken count is i64 268435455
Loop %vector.body: symbolic max backedge-taken count is ((-1 + ((4 * vscale)<nuw><nsw> umax %n.vec))<nsw> /u (4 * vscale)<nuw><nsw>)
Loop %vector.body: Trip multiple is 1
```
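
To sanity-check the two `%vector.body` backedge-taken count expressions above, here is a minimal Python sketch. It is not part of the patch. It assumes `%n.vec` is a positive multiple of the step `4 * vscale`, which is my reading of how the vector loop is entered, and the helper names (`simulate_backedges`, `btc_eq_form`, `btc_ult_form`, `check`) are made up for illustration. It simulates the loop with an equality exit and with an unsigned greater-or-equal exit, then compares both against the closed forms that SCEV prints.

```python
def simulate_backedges(n_vec, step, exit_when):
    """Run the vector loop shape from the IR and count taken backedges.

    index starts at 0, is bumped by step, and the loop exits when
    exit_when(index_next, n_vec) holds.
    """
    index = 0
    taken = 0
    while True:
        index += step
        if exit_when(index, n_vec):
            return taken
        taken += 1


def btc_eq_form(n_vec, step):
    # Expression printed before the patch: ((-step + n_vec) /u step).
    # Assumes n_vec >= step, so the subtraction does not wrap.
    return (n_vec - step) // step


def btc_ult_form(n_vec, step):
    # Expression printed after the patch: ((-1 + (step umax n_vec)) /u step).
    return (max(step, n_vec) - 1) // step


def check(vscale, lanes=4, max_chunks=6):
    """Compare simulation and closed forms for n_vec = k * step, k = 1..max_chunks."""
    step = lanes * vscale
    rows = []
    for k in range(1, max_chunks + 1):
        n_vec = k * step
        eq = simulate_backedges(n_vec, step, lambda i, n: i == n)
        uge = simulate_backedges(n_vec, step, lambda i, n: i >= n)
        rows.append((n_vec, eq, uge, btc_eq_form(n_vec, step), btc_ult_form(n_vec, step)))
    return rows


if __name__ == "__main__":
    for vscale in (1, 2, 4):
        for row in check(vscale):
            print(vscale, row)
```

Under that assumption, both exit conditions and both expressions give the same count, so the symbolic result is unchanged and only the derived constant bound differs. The printed constant max backedge-taken counts are `2**61 - 1` before the patch and `2**28 - 1` after it.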

https://github.com/llvm/llvm-project/pull/102575

