[PATCH] D106646: [LoopVectorize] Don't interleave scalar ordered reductions for inner loops

Mon Jul 26 07:00:13 PDT 2021

sdesmalen added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:6474
+    // set the limit to 2, and for ordered reductions it's best to disable
+    // interleaving entirely.
     if (HasReductions && TheLoop->getLoopDepth() > 1) {
----------------
I don't think I fully understand //why// disabling interleaving is more profitable than having it enabled when VF=1, but I take it you found empirically that having UF>1 when VF=1 leads to regressions when strict reductions are enabled.

This means that, with this patch, enabling strict reductions by default will no longer lead to regressions, whereas without strict reductions enabled this loop would not have been vectorized or interleaved in the first place. So this purely limits the scope of strict reductions to avoid regressions.

That approach sounds sensible to me.
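
One possible reading of why interleaving does not pay off at VF=1 (this is an assumption, not something established in this review): an ordered reduction has to perform its floating-point adds in the original order, so an interleaved scalar loop still carries a single serial chain of adds, whereas a reassociated reduction can keep independent partial sums. The standalone C++ sketch below illustrates the difference. The function names are hypothetical and none of this is LoopVectorize code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict (in-order) reduction: every add depends on the previous one.
float orderedSum(const std::vector<float> &v) {
  float acc = 0.0f;
  for (float x : v)
    acc += x;
  return acc;
}

// Unrolled by 2 while preserving the add order. The second add still
// consumes the result of the first, so the dependency chain is as long
// as in orderedSum.
float orderedSumUF2(const std::vector<float> &v) {
  float acc = 0.0f;
  std::size_t i = 0;
  for (; i + 1 < v.size(); i += 2) {
    acc += v[i];
    acc += v[i + 1];
  }
  for (; i < v.size(); ++i)
    acc += v[i];
  return acc;
}

// Unrolled by 2 with two independent partial sums. This shortens the
// dependency chain but changes the add order, so it is only valid when
// reassociation is allowed.
float reassociatedSumUF2(const std::vector<float> &v) {
  float acc0 = 0.0f, acc1 = 0.0f;
  std::size_t i = 0;
  for (; i + 1 < v.size(); i += 2) {
    acc0 += v[i];
    acc1 += v[i + 1];
  }
  for (; i < v.size(); ++i)
    acc0 += v[i];
  return acc0 + acc1;
}

// Self-check on an input where rounding makes the add order observable.
void selfCheck() {
  const std::vector<float> v = {1e8f, 1.0f, -1e8f, 1.0f};
  assert(orderedSum(v) == orderedSumUF2(v));
  assert(orderedSum(v) != reassociatedSumUF2(v));
}
```

Only `reassociatedSumUF2` exposes extra parallelism for the reduction itself, and it is the one variant a strict reduction rules out, which would be consistent with the regressions seen when interleaving at VF=1.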

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll:6
+
+; CHECK-DEBUG: LV: Not interleaving scalar ordered reductions.
+
----------------
Does this test need `REQUIRES: asserts`?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/strict-fadd-vf1.ll:15
+  %0 = shl nuw i64 %M, 2
+  call void @llvm.memset.p0i8.i64(i8* align 4 %dst27, i8 0, i64 %0, i1 false)
+  br label %for.body.us
----------------
Is this needed for the test?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106646/new/

https://reviews.llvm.org/D106646