[PATCH] D111846: [LV] Drop integer poison-generating flags from instructions that need predication

Sun Nov 7 12:00:07 PST 2021

nlopes added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1229
+      // Add new definitions to the worklist.
+      for (VPValue *operand : cast<VPRecipeBase>(CurDef)->operands())
+        if (VPDef *OpDef = operand->getDef())
----------------
dcaballe wrote:
> dcaballe wrote:
> > fhahn wrote:
> > > Do we in some cases also need to drop flags from recipes other then VPReplicateRecipe?
> > > 
> > > If we need to drop flags from recipes in unpredicted blocks (see comment above). then we could have cases where a vector operation (e.g. VPWidenRecipe) feeds the address computation (the first lane will get extracted). An example could be
> > > 
> > > ```
> > > define void @drop_vector(float* noalias nocapture readonly %input,
> > >                                  float* %output, i64* noalias %ints) local_unnamed_addr #0 {
> > > entry:
> > >   br label %loop.header
> > > 
> > > loop.header:
> > >   %iv = phi i64 [ 0, %entry ], [ %iv.inc, %if.end ]
> > >   %i23 = icmp eq i64 %iv, 0
> > >   %gep = getelementptr inbounds i64, i64* %ints, i64 %iv
> > >   %lv = load i64, i64* %gep
> > >   %i27 = sub nuw nsw i64 %iv, 1
> > >   %r = add i64 %lv, %i27
> > >   store i64 %r, i64* %gep
> > >   br i1 %i23, label %if.end, label %if.then
> > > 
> > > if.then:
> > >   %i29 = getelementptr inbounds float, float* %input, i64 %i27
> > >   %i30 = load float, float* %i29, align 4, !invariant.load !0
> > >   br label %if.end
> > > 
> > > if.end:
> > >   %i34 = phi float [ 0.000000e+00, %loop.header ], [ %i30, %if.then ]
> > >   %i35 = getelementptr inbounds float, float* %output, i64 %iv
> > >   store float %i34, float* %i35, align 4
> > >   %iv.inc = add nuw nsw i64 %iv, 1
> > >   %exitcond = icmp eq i64 %iv.inc, 4
> > >   br i1 %exitcond, label %loop.exit, label %loop.header
> > > 
> > > loop.exit:
> > >   ret void
> > > }
> > > attributes #0 = { noinline nounwind uwtable "target-features"="+avx512bw,+avx512cd,+avx512dq,+avx512f,+avx512vl" }
> > > 
> > > !0 = !{}
> > > 
> > > ```
> > I assumed that address computation would be uniform for those cases and left scalar, wouldn't it?. I'll give it a try but there has been some back and forth wrt dropping the flags from vector instructions already. Based on previous feedback, I'm not sure if we should drop the flags from them when there are lanes that won't generate poison. I would need some clarification before proceeding with those changes (@nlopes). 
> Looking closely at your example, the poison value generated by 'sub' in the first iteration of the loop is actually used in the scalar version of the code (it is stored). For that reason, my impression is that having the `nuw` flag in the input `sub` would be incorrect, isn't that the case?
> 
I mentioned this issue before: what happens if a value was already poison in the first place?

I suggest you commit this patch first, as it addresses half of the problem, and then we can discuss what's the best way to fix the second part. Just dropping flags from even all instructions within the function isn't sufficient as you may get a poison value as input (change the example above to have `[%arg, entry ]` rather than `[ 0, entry ]` for %iv. Though you only need to patch the code for the cases where you cannot prove the code would execute in the scalar version. The right fix isn't trivial.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111846/new/

https://reviews.llvm.org/D111846