[PATCH] D101680: Change the context instruction for computeKnownBits in LoadStoreVectorizer pass

Thu May 6 21:02:42 PDT 2021

wvoquine added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:521
     KnownBits Known(BitWidth);
-    computeKnownBits(OpA, Known, DL, 0, &AC, OpA, &DT);
     APInt BitsAllowedToBeSet = Known.Zero.zext(IdxDiff.getBitWidth());
----------------
wvoquine wrote:
> wvoquine wrote:
> > arsenm wrote:
> > > wvoquine wrote:
> > > > wvoquine wrote:
> > > > > wvoquine wrote:
> > > > > > arsenm wrote:
> > > > > > > arsenm wrote:
> > > > > > > > How do we know A is the point the vectorized instruction will be inserted, and not B?
> > > > > > > e.g. can you add some tests with an assume between the two instructions to vectorize?
> > > > > > By the way, we have discussed some of these points with @rtereshin and @bogner 
> > > > > > 
> > > > > > Neither of OpA and OpB is a point of the vectorization. They both dominate the point of vectorization, but there's no information here about which one is closer to it.
> > > > > > 
> > > > > > This change only enables more cases to pass: if OpB is nullptr then it means we have bailed long ago in the function:
> > > > > > ( See this part:
> > > > > >   // At this point A could be a function parameter, i.e. not an instruction
> > > > > >   Value *ValA = OpA->getOperand(0);
> > > > > >   OpB = dyn_cast<Instruction>(OpB->getOperand(0));
> > > > > >   if (!OpB || ValA->getType() != OpB->getType())
> > > > > >     return false;
> > > > > > )
> > > > > > 
> > > > > > I just allow it not to bail if OpA appears to be nullptr.
> > > > > > 
> > > > > > There's no assumption as to which one goes at the point or closer to the point of vectorization.
> > > > > > 
> > > > > > Also without llvm.assume I couldn't differentiate the two contexts in a test.
> > > > > > 
> > > > > > And I added a test with the llvm.assume - ld_v2i8_add_different_contexts.
> > > > > > 
> > > > > > Now, sometimes OpA might be with a better context for the calculation of known bits. In such cases we might fail to vectorize.
> > > > > > 
> > > > > > If llvm.assume's are inconsistent in the two points then this ought to be an UB.
> > > > > > 
> > > > > > For an improvement in the future we might need to use the actual vectorization point as the context.
> > > > > Don't you think ld_v2i8_add_different_contexts would count as one, or do you prefer a test without a control flow in it?
> > > > > There's no assumption as to which one goes at the point or closer to the point of vectorization.
> > > > 
> > > > Having said that, if a case is a result of unrolling, in many such occasions, OpB indeed naturally goes after OpA. And that means the context is more precise
> > > > (the contexts shouldn't be inconsistent).
> > > ld_v2i8_add_different_contexts does not have an assume between the two loads, so I don't mean that
> > I see some unexpected results when I place llvm.assume between loads, or just before the first load.
> > I will have to debug a little bit, and will likely update the change.
> > 
> > A safer version could be to still use OpA if it's not nullptr and OpB if it is.
> In fact llvm.assume itself did block the transformation as it was considered to be writing to the memory.
> Similar to other passes we can ignore assume in this case though.
Now the tests `ld_v2i8_add_different_contexts1` and `ld_v2i8_add_context` have llvm.assume between the vectorized loads.
Also in `ld_v2i8_add_context1` the llvm.assume goes after both and a store coming after that, it still works with the single basic block.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101680/new/

https://reviews.llvm.org/D101680