[PATCH] D105199: [LoopVectorize] Fix scalable vector crash in VPReplicateRecipe::execute

Mon Jul 5 01:07:47 PDT 2021

david-arm added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3069

   // End if-block.
   if (IfPredicateInstr)
----------------
CarolineConcatto wrote:
> Do we need this test also for scalable recipe  in scalarizeInstructions 
> 
> ```
>   if (IfPredicateInstr)
> ```
> ?
Good question! So in VPRecipeBuilder::handleReplication I have explicitly asserted that we should never create this recipe with predication, because I think the only cases we'd hit are divides (which we don't support) or loads/stores. For the latter we'll widen the instruction using masked load/store intrinsics and shouldn't use this recipe I think.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8964-8965
+  bool IsScalable = Range.Start.isScalable();
+  assert(IsScalable == Range.End.isScalable() &&
+         "VFRange contains mixture of scalable and fixed-width VFs!");
+  VPRecipeBase *Recipe;
----------------
CarolineConcatto wrote:
> I believe you don't need the assert
> See lib/Transforms/Vectorize/VPlan.h
> struct VFRange;
Ah I see now - good spot thank you! I hadn't realised we already asserted this in the VFRange constructor.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8966
+         "VFRange contains mixture of scalable and fixed-width VFs!");
+  VPRecipeBase *Recipe;
+  if (IsScalable && !IsUniform) {
----------------
CarolineConcatto wrote:
> Why do we need R and Recipe here?
The problem here is caused by complicated multiple inheritance, i.e.

  class VPReplicateRecipe : public VPRecipeBase, public VPValue

and the fact the functions below take pointers to only one of the inherited base classes, i.e.

  void setRecipe(Instruction *I, VPRecipeBase *R) {

  void addVPValue(Value *V, VPValue *VPV)

In order to have a single common block here I would have to restructure the recipes to have the same common base, i.e.

  class VPReplicateBaseRecipe : public VPRecipeBase, public VPValue

  class VPScalableReplicateRecipe : public VPReplicateBaseRecipe {

  class VPReplicateRecipe : public VPReplicateBaseRecipe {

This would mean I could then write:

  VPReplicateBaseRecipe *Recipe;
  if (IsScalable && !IsUniform)
    Recipe = new VPScalableReplicateRecipe(I, Plan->mapToVPValues(I->operands()));
  else
    Recipe = new VPReplicateRecipe(I, Plan->mapToVPValues(I->operands()),
                                    IsUniform, IsPredicated);
  setRecipe(I, R);
  Plan->addVPValue(I, R);

However, I wasn't sure which was more acceptable here - rewrite the class structures to be more complex or simply have two blocks of code here?

Alternatively, if anyone knows some magic C++ that allows me to write a common block with rewriting the class structure that would be great too! I could well be missing some trick here.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8967
+  VPRecipeBase *Recipe;
+  if (IsScalable && !IsUniform) {
+    auto *R =
----------------
CarolineConcatto wrote:
> Should we test if this is not predicated instruction here?
> I don't see any test for that in the recipe.
Ah you're right! I added an assert in the for loop below, but it's not quite the same thing. I'll add one here too.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-vpreplicate.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -loop-vectorize -scalable-vectorization=on -S | FileCheck %s
+
----------------
CarolineConcatto wrote:
> Is it possible to use that with for all architectures? 
> Or only  for now for AArch64?
Possibly so? I thought I might have tried that when I first started the patch and hit issues, but I can try again.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-vpreplicate.ll:136
+!3 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
+!4 = !{ !5 }
+!5 = distinct !{ !5, !6 }
----------------
CarolineConcatto wrote:
> Why do you need 4,5,6 and 7?
This is needed for one of the loops above that contains metadata:

  tail call void @llvm.experimental.noalias.scope.decl(metadata !4)

This line is explicitly testing one of the code paths in the new scalarizeInstruction() function.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105199/new/

https://reviews.llvm.org/D105199