[llvm] [LoopVectorize] LLVM fails to vectorise loops with multi-bool variables (PR #89226)

David Sherwood via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 14 01:37:07 PDT 2024


================
@@ -47,3 +43,47 @@ for.cond.cleanup.loopexit:                        ; preds = %for.body
 for.cond.cleanup:                                 ; preds = %for.cond.cleanup.loopexit, %entry
   ret void
 }
+
+define i32 @multi_user_cmp(ptr readonly %a, i32 noundef %n) {
+; CHECK: LV: Found an estimated cost of 0 for VF 16 For instruction:   %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+; CHECK-NEXT: LV: Found an estimated cost of 0 for VF 16 For instruction:   %all.0.off010 = phi i1 [ true, %entry ], [ %all.0.off0., %for.body ]
+; CHECK-NEXT: LV: Found an estimated cost of 0 for VF 16 For instruction:   %any.0.off09 = phi i1 [ false, %entry ], [ %.any.0.off0, %for.body ]
+; CHECK-NEXT: LV: Found an estimated cost of 0 for VF 16 For instruction:   %arrayidx = getelementptr inbounds float, ptr %a, i64 %indvars.iv
+; CHECK-NEXT: LV: Found an estimated cost of 4 for VF 16 For instruction:   %load1 = load float, ptr %arrayidx, align 4
+; CHECK-NEXT: LV: Found an estimated cost of 4 for VF 16 For instruction:   %cmp1 = fcmp olt float %load1, 0.000000e+00
+; CHECK-NEXT: LV: Found an estimated cost of 1 for VF 16 For instruction:   %.any.0.off0 = select i1 %cmp1, i1 true, i1 %any.0.off09
+; CHECK-NEXT: LV: Found an estimated cost of 1 for VF 16 For instruction:   %all.0.off0. = select i1 %cmp1, i1 %all.0.off010, i1 false
+; CHECK-NEXT: LV: Found an estimated cost of 1 for VF 16 For instruction:   %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+; CHECK-NEXT: LV: Found an estimated cost of 1 for VF 16 For instruction:   %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+; CHECK-NEXT: LV: Found an estimated cost of 0 for VF 16 For instruction:   br i1 %exitcond.not, label %exit, label %for.body
+; CHECK-NEXT: LV: Vector loop of width 16 costs: 0.
+entry:
+  %wide.trip.count = zext nneg i32 %n to i64
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+  %all.0.off010 = phi i1 [ true, %entry ], [ %all.0.off0., %for.body ]
+  %any.0.off09 = phi i1 [ false, %entry ], [ %.any.0.off0, %for.body ]
+  %arrayidx = getelementptr inbounds float, ptr %a, i64 %indvars.iv
+  %load1 = load float, ptr %arrayidx, align 4
+  %cmp1 = fcmp olt float %load1, 0.000000e+00
+  %.any.0.off0 = select i1 %cmp1, i1 true, i1 %any.0.off09
+  %all.0.off0. = select i1 %cmp1, i1 %all.0.off010, i1 false
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
+  br i1 %exitcond.not, label %exit, label %for.body
+
+exit:
+  %0 = select i1 %.any.0.off0, i32 2, i32 3
+  %1 = select i1 %all.0.off0., i32 1, i32 %0
+  ret i32 %1
+}
+
+; CHECK-LABEL: define void @selects_1(
----------------
david-arm wrote:

I see why you had to move these CHECK lines. Perhaps you don't really need them at all? Since you're collecting the debug output from the vectoriser anyway, you can simply CHECK for the chosen VF, i.e.

```
LV: Found an estimated cost of 1 for VF 16 For instruction:   %.any.0.off0 = select i1 %cmp1, i1 true, i1 %any.0.off09
LV: Found an estimated cost of 1 for VF 16 For instruction:   %all.0.off0. = select i1 %cmp1, i1 %all.0.off010, i1 false
...
LV: Selecting VF: 16
```

This way you don't need to check any lines of IR, which might simplify the test.
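As a rough sketch of what that could look like in the test file, assuming the RUN line already pipes the vectoriser's debug output into FileCheck (the exact prefixes and ordering here are illustrative, not taken from the patch):

```
; Only the debug-output lines for the two selects and the chosen VF are checked.
; Plain CHECK (not CHECK-NEXT) is used because other cost lines appear in between.
; CHECK: LV: Found an estimated cost of 1 for VF 16 For instruction:   %.any.0.off0 = select i1 %cmp1, i1 true, i1 %any.0.off09
; CHECK: LV: Found an estimated cost of 1 for VF 16 For instruction:   %all.0.off0. = select i1 %cmp1, i1 %all.0.off010, i1 false
; CHECK: LV: Selecting VF: 16
```

With checks like these, the IR-level `CHECK-LABEL` lines for the function bodies would no longer be needed for this test.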

https://github.com/llvm/llvm-project/pull/89226

