[PATCH] D100385: [NFC] Add tests for scalable vectorization of loops with in-order reductions

Thu Apr 15 08:35:37 PDT 2021

david-arm added inline comments.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:40
+; CHECK: %[[RDX3:.*]] = call float @llvm.vector.reduce.fadd.nxv8f32(float %[[RDX2]], <vscale x 8 x float> %[[LOAD3]])
+; CHECK: %[[RDX4:.*]] = call float @llvm.vector.reduce.fadd.nxv8f32(float %[[RDX3]], <vscale x 8 x float> %[[LOAD4]])
+; CHECK: for.end
----------------
I think this should just be `[[RDX4]]`

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:69
+; CHECK: %[[STEPVEC1:.*]] = call <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
+; CHECK: %[[STEP_SHL:.*]] = shl <vscale x 4 x i64> %7, shufflevector (<vscale x 4 x i64> insertelement (<vscale x 4 x i64> undef, i64 1, i32 0), <vscale x 4 x i64> undef, <vscale x 4 x i32> zeroinitializer)
+; CHECK: vector.body
----------------
nit: Should this be `[[STEPVEC1]]` instead of `%7`?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:71
+; CHECK: vector.body
+; CHECK: %[[VEC_PHI1:.*]] = phi float [ %[[LOAD2]], %vector.ph ], [ %[[RDX2:.*]], %vector.body ]
+; CHECK: %[[VEC_PHI2:.*]] = phi float [ %[[LOAD1]], %vector.ph ], [ %[[RDX1:.*]], %vector.body ]
----------------
Perhaps this should be named `VEC_PHI2` to match the `LOAD2` and same for the PHI below?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:73
+; CHECK: %[[VEC_PHI2:.*]] = phi float [ %[[LOAD1]], %vector.ph ], [ %[[RDX1:.*]], %vector.body ]
+; CHECK: %[[VEC_IND:.*]] = phi <vscale x 4 x i64> [ %[[STEP_SHL]], %vector.ph ], [ %[[VEC_IND_NEXT:.*]], %vector.body ]
+; CHECK: %[[GEP1:.*]] = getelementptr inbounds float, float* %b, <vscale x 4 x i64> %[[VEC_IND]]
----------------
nit: I think you can just use `{{.*}}` here instead of `[[VEC_IND_NEXT:.*]]` since you don't reference the variable later on.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:119
+; CHECK: for.end.loopexit
+; CHECK: %[[EXIT_PHI:.*]] = phi float [ %[[SCALAR:.*]], %for.body ], [ %[[RDX]], %middle.block ]
+; CHECK: for.end
----------------
nit: `SCALAR` isn't used below I think so this can just be `{{.*}}`

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:154
+; CHECK: %[[MASKED_LOAD:.*]] = call <vscale x 4 x float> @llvm.masked.load.nxv4f32.p0nxv4f32(<vscale x 4 x float>* %8, i32 4, <vscale x 4 x i1> %[[FCMP]], <vscale x 4 x float> poison)
+; CHECK: %[[PRED_PHI:.*]] = select <vscale x 4 x i1> %[[FCMP]], <vscale x 4 x float> %[[MASKED_LOAD]], <vscale x 4 x float> shufflevector (<vscale x 4 x float> insertelement (<vscale x 4 x float> poison, float 3.000000e+00, i32 0), <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer)
+; CHECK: %[[RDX]] = call float @llvm.vector.reduce.fadd.nxv4f32(float %[[VEC_PHI]], <vscale x 4 x float> %[[PRED_PHI]])
----------------
nit: Perhaps better named as `SELECTED_VALS` or something like that, since this isn't really a PHI?

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll:206
+; CHECK: middle.block
+; CHECK: %[[RDX:.*]] = call float @llvm.vector.reduce.fadd.nxv8f32(float -0.000000e+00, <vscale x 8 x float> %[[VEC_FADD2]])
+; CHECK: for.body
----------------
Ok, it looks like this test is actually falling back on a non-strict implementation that reorders FP operations. This happens in this case because we are using hints and allowsReordering always return true for hints.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100385/new/

https://reviews.llvm.org/D100385