[PATCH] D95363: [SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 5 03:32:37 PST 2021


david-arm added inline comments.


================
Comment at: llvm/include/llvm/IR/IRBuilder.h:2405
 
+  Value *CreateVectorReverse(Value *V, const Twine &Name = "") {
+    auto Ty = cast<VectorType>(V->getType());
----------------
nit: This function looks a bit too complex to live in a header - maybe it would be better to define it in the .cpp file?
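Something along these lines, i.e. keep only the declaration in IRBuilder.h and move the body into llvm/lib/IR/IRBuilder.cpp. This is just a rough sketch - I'm guessing at the body from the patch description (scalable vectors lowered to the experimental.vector.reverse intrinsic, fixed-width vectors to a shufflevector), so adjust it to whatever the function actually does:

  // llvm/include/llvm/IR/IRBuilder.h: declaration only.
  Value *CreateVectorReverse(Value *V, const Twine &Name = "");

  // llvm/lib/IR/IRBuilder.cpp: out-of-line definition.
  Value *IRBuilderBase::CreateVectorReverse(Value *V, const Twine &Name) {
    auto *Ty = cast<VectorType>(V->getType());
    if (isa<ScalableVectorType>(Ty)) {
      // Scalable vectors cannot be reversed with a constant shuffle mask,
      // so emit a call to the @llvm.experimental.vector.reverse intrinsic.
      Module *M = BB->getParent()->getParent();
      Function *F = Intrinsic::getDeclaration(
          M, Intrinsic::experimental_vector_reverse, Ty);
      return Insert(CallInst::Create(F, V), Name);
    }
    // Fixed-width vectors: reverse the lanes with a shufflevector.
    int NumElts = Ty->getElementCount().getKnownMinValue();
    SmallVector<int, 8> ShuffleMask;
    for (int I = 0; I < NumElts; ++I)
      ShuffleMask.push_back(NumElts - I - 1);
    return CreateShuffleVector(V, ShuffleMask, Name);
  }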


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1154
 
+/// Return the runtime value for VF.
+Value *getRuntimeVF(IRBuilder<> &B, Type *Ty, ElementCount VF) {
----------------
nit: One of my patches that adds this function has already landed, so I think you'll need to remove this copy when committing.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2826
+          Builder.CreateMul(Builder.getInt32(-Part),
+                            getRuntimeVF(Builder, Builder.getInt32Ty(), VF));
+      // LastLane = 1 - (/*getRunTimeVF*/ VScale * VF.getKnownMinValue())
----------------
nit: Maybe instead of calling getRuntimeVF twice you could call it once and assign the result to a variable?
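For example (untested sketch; the variable names are just suggestions):

  // RunTimeVF = VScale * VF.getKnownMinValue() (VScale is 1 for fixed-width).
  Value *RunTimeVF = getRuntimeVF(Builder, Builder.getInt32Ty(), VF);
  // NumElt = -Part * RunTimeVF
  Value *NumElt = Builder.CreateMul(Builder.getInt32(-Part), RunTimeVF);
  // LastLane = 1 - RunTimeVF
  Value *LastLane = Builder.CreateSub(Builder.getInt32(1), RunTimeVF);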


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll:24
+
+define void @vector_reverse_mask_nxv2i1(double* %a, double* %cond, i64 %N) #0 {
+; CHECK-LABEL: vector.body:
----------------
nit: Perhaps better renamed to `nxv4i1`, since that's the type that actually appears in the generated code?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll:24
+
+define void @vector_reverse_mask_v2i1(double* %a, double* %cond, i64 %N) #0 {
+; CHECK-LABEL: vector.body:
----------------
nit: Perhaps better renamed to `v4i1` since the CHECK lines have that type?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll:27
+; CHECK: %[[REVERSE6:.*]] = shufflevector <4 x i1> %{{.*}}, <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
+; CHECK: %[[WIDEMSKLOAD:.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull %{{.*}}, i32 8, <4 x i1> %[[REVERSE6]], <4 x double> poison)
+
----------------
This seems a bit less well tested than the SVE version - perhaps it's worth adding CHECK lines for the store too? I assume the store will reuse the same mask?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll:29
+; CHECK-NEXT: %[[CAST:.*]] = bitcast double* %[[GEP4]] to <8 x double>*
+; CHECK-NEXT:  store <8 x double> %reverse6, <8 x double>* %[[CAST]], align 8
+
----------------
nit: Should this be %[[REVERSE6]]?


================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll:64
+; CHECK-NEXT: %[[CAST1:.*]] = bitcast i64* %[[GEP4]] to <8 x i64>*
+; CHECK-NEXT:  store <8 x i64> %reverse6, <8 x i64>* %[[CAST1]], align 8
+
----------------
nit: Same here - should this be %[[REVERSE6]]?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95363/new/

https://reviews.llvm.org/D95363


