[PATCH] D111846: [LV] Drop integer poison-generating flags from instructions that need predication

Sun Nov 7 01:16:09 PDT 2021

fhahn added a comment.

Thanks for the update! I think the description/title would need a minor update, as they still talk about 'integer' poison generating flags.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:888
+  /// after vectorization, even when their operands are not poison.
+  SmallPtrSet<VPReplicateRecipe *, 16> MayGeneratePoisonRecipes;
 };
----------------
This is only used during codegen, right? In that case, it would probably be better to just add it to VPTransformState instead. Otherwise it just adds a little bit of additional coupling between ILV and codegen, making it harder to separate in the future.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1223
+            State.VF.isVector() &&
+            Legal->blockNeedsPredication(Instr->getParent()) &&
+            cast<Operator>(Instr)->hasPoisonGeneratingFlags())
----------------
Is there a way to avoid querying information from legal here? 

As we are interested in any operation contributing to an address computation of a masked memory operation, could we only seed the worklist with addresses of masked operations instead (both VPWidenMemoryInstructionRecipe & VPInterleaveRecipe have `hasMask`/`getMask` helpers)? Then we would not need to check legal here, right? 

Checking `Instrs` location here might not be sufficient in any case I think. There could be cases where `Instr` is in a block that does not need predication, but feeds a memory instruction in a block that needs predication. In those case, I think we should also drop the flags from `Instr`. In any case, it might be good to add such a test case.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1229
+      // Add new definitions to the worklist.
+      for (VPValue *operand : cast<VPRecipeBase>(CurDef)->operands())
+        if (VPDef *OpDef = operand->getDef())
----------------
Do we in some cases also need to drop flags from recipes other then VPReplicateRecipe?

If we need to drop flags from recipes in unpredicted blocks (see comment above). then we could have cases where a vector operation (e.g. VPWidenRecipe) feeds the address computation (the first lane will get extracted). An example could be

```
define void @drop_vector(float* noalias nocapture readonly %input,
                                 float* %output, i64* noalias %ints) local_unnamed_addr #0 {
entry:
  br label %loop.header

loop.header:
  %iv = phi i64 [ 0, %entry ], [ %iv.inc, %if.end ]
  %i23 = icmp eq i64 %iv, 0
  %gep = getelementptr inbounds i64, i64* %ints, i64 %iv
  %lv = load i64, i64* %gep
  %i27 = sub nuw nsw i64 %iv, 1
  %r = add i64 %lv, %i27
  store i64 %r, i64* %gep
  br i1 %i23, label %if.end, label %if.then

if.then:
  %i29 = getelementptr inbounds float, float* %input, i64 %i27
  %i30 = load float, float* %i29, align 4, !invariant.load !0
  br label %if.end

if.end:
  %i34 = phi float [ 0.000000e+00, %loop.header ], [ %i30, %if.then ]
  %i35 = getelementptr inbounds float, float* %output, i64 %iv
  store float %i34, float* %i35, align 4
  %iv.inc = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv.inc, 4
  br i1 %exitcond, label %loop.exit, label %loop.header

loop.exit:
  ret void
}
attributes #0 = { noinline nounwind uwtable "target-features"="+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512vl" }

!0 = !{}

```

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111846/new/

https://reviews.llvm.org/D111846