[PATCH] D49516: [LoadStoreVectorizer] Use getMinusScev() to compute the distance between two pointers.

Thu Jul 19 13:56:44 PDT 2018

FarhanaAleen added inline comments.

================
Comment at: llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:339
   const SCEV *C = SE.getConstant(BaseDelta);
   const SCEV *X = SE.getAddExpr(PtrSCEVA, C);
   if (X == PtrSCEVB)
----------------
reames wrote:
> It really looks like the new check completely subsumes this check.  Have you tried simplify the code to remove this case?
Thanks for your comments.

Yes, this check is completely subsumed by the new check. The reason I did not remove this check because I thought performing this(const + an existing SCEV) is much cheaper than the new check (which requires to create minus scev, might have to work with more expressions) and only do this new expensive check if the result can't be achieved in a cheaper way. Also, most program execution might highly likely hit the first check.

Now, I am also thinking that it's probably just one time cost since we cache the result which should be pretty negligible. I will remove this check.

================
Comment at: llvm/trunk/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp:351
+
   // Sometimes even this doesn't work, because SCEV can't always see through
   // patterns that look like (gep (ext (add (shl X, C1), C2))). Try checking
----------------
reames wrote:
> This block may also be subsumed by the subtract.  If not, this would make a good bug against SCEV because it should be able to handle most any reasonable case here.
No, this block is not subsumed by the subtract. 

Currently, SCEV does not handle distribution around zext/sext. I am working on a patch that extends SCEV to handle this and remove this routine entirely from LoadStoreVectorizer.

================
Comment at: llvm/trunk/test/Transforms/LoadStoreVectorizer/AMDGPU/complex-index.ll:7
+
+declare double @llvm.fmuladd.f64(double, double, double)
+
----------------
reames wrote:
> You really should be able to construct a target independent test for this.  You don't need to exercise the actual vectorization, just the analysis phase.  I'm not sure the code is structured to make this easy, but if it isn't, it really should be.  (To be clear, this is not a must have, just a strong nice to have.)
Sounds good.

Repository:
  rL LLVM

https://reviews.llvm.org/D49516