[PATCH] D111846: [LV] Drop integer poison-generating flags from instructions that need predication

Sun Nov 7 03:00:51 PST 2021

dcaballe added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1229
+      // Add new definitions to the worklist.
+      for (VPValue *operand : cast<VPRecipeBase>(CurDef)->operands())
+        if (VPDef *OpDef = operand->getDef())
----------------
dcaballe wrote:
> fhahn wrote:
> > Do we in some cases also need to drop flags from recipes other then VPReplicateRecipe?
> > 
> > If we need to drop flags from recipes in unpredicted blocks (see comment above). then we could have cases where a vector operation (e.g. VPWidenRecipe) feeds the address computation (the first lane will get extracted). An example could be
> > 
> > ```
> > define void @drop_vector(float* noalias nocapture readonly %input,
> >                                  float* %output, i64* noalias %ints) local_unnamed_addr #0 {
> > entry:
> >   br label %loop.header
> > 
> > loop.header:
> >   %iv = phi i64 [ 0, %entry ], [ %iv.inc, %if.end ]
> >   %i23 = icmp eq i64 %iv, 0
> >   %gep = getelementptr inbounds i64, i64* %ints, i64 %iv
> >   %lv = load i64, i64* %gep
> >   %i27 = sub nuw nsw i64 %iv, 1
> >   %r = add i64 %lv, %i27
> >   store i64 %r, i64* %gep
> >   br i1 %i23, label %if.end, label %if.then
> > 
> > if.then:
> >   %i29 = getelementptr inbounds float, float* %input, i64 %i27
> >   %i30 = load float, float* %i29, align 4, !invariant.load !0
> >   br label %if.end
> > 
> > if.end:
> >   %i34 = phi float [ 0.000000e+00, %loop.header ], [ %i30, %if.then ]
> >   %i35 = getelementptr inbounds float, float* %output, i64 %iv
> >   store float %i34, float* %i35, align 4
> >   %iv.inc = add nuw nsw i64 %iv, 1
> >   %exitcond = icmp eq i64 %iv.inc, 4
> >   br i1 %exitcond, label %loop.exit, label %loop.header
> > 
> > loop.exit:
> >   ret void
> > }
> > attributes #0 = { noinline nounwind uwtable "target-features"="+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512vl" }
> > 
> > !0 = !{}
> > 
> > ```
> I assumed that address computation would be uniform for those cases and left scalar, wouldn't it?. I'll give it a try but there has been some back and forth wrt dropping the flags from vector instructions already. Based on previous feedback, I'm not sure if we should drop the flags from them when there are lanes that won't generate poison. I would need some clarification before proceeding with those changes (@nlopes). 
Looking closely at your example, the poison value generated by 'sub' in the first iteration of the loop is actually used in the scalar version of the code (it is stored). For that reason, my impression is that having the `nuw` flag in the input `sub` would be incorrect, isn't that the case?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111846/new/

https://reviews.llvm.org/D111846