[PATCH] D95363: [SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse

Mon Mar 8 05:37:42 PST 2021

CarolineConcatto added a comment.

Hey @david-arm sorry for the nit in the tests.
Usually happens when I do copy and paste many times.
I've updated the patch and rebased it with the main.

Carol

================
Comment at: llvm/include/llvm/IR/IRBuilder.h:2405

+  Value *CreateVectorReverse(Value *V, const Twine &Name = "") {
+    auto Ty = cast<VectorType>(V->getType());
----------------
david-arm wrote:
> nit: This function looks a bit too complex to live in a header - maybe better defined in the .cpp file?
Thank you David.
I hadn't noticed that this could be also implemented in IRBuilder.cpp.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1154

+/// Return the runtime value for VF.
+Value *getRuntimeVF(IRBuilder<> &B, Type *Ty, ElementCount VF) {
----------------
david-arm wrote:
> nit: One of my patches has already landed with this function so I think you'll need to remove it when committing.
Wonderfull! Good to know!

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2826
+          Builder.CreateMul(Builder.getInt32(-Part),
+                            getRuntimeVF(Builder, Builder.getInt32Ty(), VF));
+      // LastLane = 1 - (/*getRunTimeVF*/ VScale * VF.getKnownMinValue())
----------------
david-arm wrote:
> nit: Maybe instead of calling getRuntimeVF twice you can call it once and assign it to a variable?
Good idea David, so I can reduce code duplication

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll:24
+
+define void @vector_reverse_mask_nxv2i1(double* %a, double* %cond, i64 %N) #0 {
+; CHECK-LABEL: vector.body:
----------------
david-arm wrote:
> nit: Perhaps better renamed to `nxv4i1` since that's what the generated code looks like?
That is what happens when I do C&P. I've creates tests for all masks sizes, but only submitted one size.
I was wondering, I did not add tests for all mask sizes here. 
Do you think it is valid? 
I guess that one size fits all, if one mask size does not break it means that the rest will not break too.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll:27
+; CHECK: %[[REVERSE6:.*]] = shufflevector <4 x i1> %{{.*}}, <4 x i1> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
+; CHECK: %[[WIDEMSKLOAD:.*]] = call <4 x double> @llvm.masked.load.v4f64.p0v4f64(<4 x double>* nonnull %{{.*}}, i32 8, <4 x i1> %[[REVERSE6]], <4 x double> poison)
+
----------------
david-arm wrote:
> This seems a bit less well tested than the sve version - perhaps it's worth adding CHECK lines for the store too? I assume the store will reuse the same mask?
Ok, I think it is valid to have similar check's for scalable and fixed vectors

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll:29
+; CHECK-NEXT: %[[CAST:.*]] = bitcast double* %[[GEP4]] to <8 x double>*
+; CHECK-NEXT:  store <8 x double> %reverse6, <8 x double>* %[[CAST]], align 8
+
----------------
david-arm wrote:
> nit: %[[REVERSE6]]?
Good catch! 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95363/new/

https://reviews.llvm.org/D95363