[PATCH] D101836: [LoopVectorize] Enable strict reductions when allowReordering() returns false

Kerry McLaughlin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 17 06:04:37 PDT 2021


kmclaughlin added inline comments.


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:7
 define float @fadd_strict(float* noalias nocapture readonly %a, i64 %n) {
-; CHECK-LABEL: @fadd_strict
-; CHECK: vector.body:
-; CHECK: %[[VEC_PHI:.*]] = phi float [ 0.000000e+00, %vector.ph ], [ %[[RDX:.*]], %vector.body ]
-; CHECK: %[[LOAD:.*]] = load <vscale x 8 x float>, <vscale x 8 x float>*
-; CHECK: %[[RDX]] = call float @llvm.vector.reduce.fadd.nxv8f32(float %[[VEC_PHI]], <vscale x 8 x float> %[[LOAD]])
-; CHECK: for.end
-; CHECK: %[[PHI:.*]] = phi float [ %[[SCALAR:.*]], %for.body ], [ %[[RDX]], %middle.block ]
-; CHECK: ret float %[[PHI]]
+; CHECK-VF8UF1-LABEL: @fadd_strict
+; CHECK-VF8UF1: vector.body:
----------------
sdesmalen wrote:
> Should all test functions have check lines for all VF8UF1, VF8UF4, etc.? Conversely, is it sufficient to just pass the interleave-count hint (not the vector width) via metadata and have 1 RUN line for VF8UF1, VF8UF4, VF4UF1?
> 
> Which also makes me wonder, what is the additional value of having both VF8UF1 and VF4UF1?
I think it should be sufficient to pass the interleave count via metadata. I've changed VF4UF1 to VF8UF1, as there was no additional benefit in having both. Similarly, I've changed VF4UF1 in the strict-fadd.ll test as well, to reduce the number of RUN lines.
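
For illustration, a minimal sketch of what passing the interleave count via loop metadata could look like for one test function. The function name @fadd_strict_uf4 and the count of 4 are assumptions made for this example and are not taken from the patch; the typed-pointer syntax follows the test quoted above. Only the interleave count is hinted here, so the vector width is still left to the RUN line.

```llvm
; Hypothetical reduced test function: the interleave count is requested
; through !llvm.loop metadata instead of a dedicated RUN line.
define float @fadd_strict_uf4(float* noalias nocapture readonly %a, i64 %n) {
entry:
  br label %for.body

for.body:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
  %sum = phi float [ 0.000000e+00, %entry ], [ %add, %for.body ]
  %arrayidx = getelementptr inbounds float, float* %a, i64 %iv
  %0 = load float, float* %arrayidx, align 4
  ; No fast-math flags, so the reduction must stay in order (strict).
  %add = fadd float %0, %sum
  %iv.next = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv.next, %n
  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

for.end:
  ret float %add
}

; Loop hint: interleave by 4 (assumed value for this sketch).
!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.interleave.count", i32 4}
```

With a hint like this on each loop, a single RUN line could in principle cover several interleave counts, one per function, rather than needing one RUN line per VF/UF combination.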


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101836/new/

https://reviews.llvm.org/D101836


