[PATCH] D19984: [LV] Preserve order of dependences in interleaved accesses analysis

Wed Jun 8 09:17:41 PDT 2016

mssimpso added a comment.

Adam,

Welcome back! Thanks for following up. I replied to your comments inline.

Matt.


================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:959-960
@@ +958,4 @@
+
+  /// \brief Returns true if strided accesses \p A and \p B can be reordered.
+  bool canReorderStridedAccesses(StrideEntry *A, StrideEntry *B) const {
+
----------------
anemet wrote:
> Both A and B are not necessarily strided here.
True. I will rename this and update the comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:976-977
@@ +975,4 @@
+
+    // B precedes A in program order and is potentially the source of a
+    // dependence.
+    auto *Src = B->first;
----------------
anemet wrote:
> This requirement is part of the API so it should be in the function comment.  Also a nit, can you please flip the order.  I think that A preceding B is the more intuitive.  Let me know if you disagree.
Sounds good.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5196-5200
@@ -5106,1 +5195,7 @@
 
+      // If A and B cannot be reordered, we can't add additional instructions
+      // to A's group since this may cause illegal code motion. If B is already
+      // in a group release it.
+      if (!canReorderStridedAccesses(&*I, &*II)) {
+        if (isInterleaved(B)) {
+          InterleaveGroup *StoreGroup = getInterleaveGroup(B);
----------------
anemet wrote:
> mssimpso wrote:
> > > It would be also great to add a testcase to the earlier problem that only candidate accesses were dependency-checked.
> > 
> > I've tried constructing a testcase that would generate incorrect code for this, but I think the current limitations of LAA might actually make this unrealizable at the moment. After the memory checks, two access can be dependent only at constant distances. Thus, if one access is strided, a dependent access at a constant distance would have to be strided as well.
> > 
> > > This sounds right to me. Let me think it over before posting another update.
> > 
> > After thinking about this more carefully, I think we will miss cases with a simplification like the one you suggested. For example, if we have a set of interleaved loads followed by a set of interleaved stores that access the same location (or vice versa), giving up in the outer loop would prevent the two groups from being created.
> > I've tried constructing a testcase that would generate incorrect code for this, but I think the current limitations of LAA might actually make this unrealizable at the moment. After the memory checks, two access can be dependent only at constant distances. Thus, if one access is strided, a dependent access at a constant distance would have to be strided as well.
> 
> Can't we use larger than stride offsets/indices to emulate non-interleaved accesses?  I believe that these are currently ignored by interleaved analysis.  I mean something like:
>  
> 
> ```
> for (i = 0; i < n; i+=3) {
>    ... = A[i]
>    A[i+4] = ...
>    ... = A[i+1]
> }
> 
> ```
> And then A[i+1] shouldn't be moved across A[i+4]
> 
> I haven't tried this, it's just an idea....
> 
> 
This was my first thought as well. In this case, A[i+4] is still a stride-greater-than-one access, so it would've been collected in SrideAccesses and sent through the original analysis. If its stride is equal to anything other than that of the other accesses (e.g, one or some value greater than MaxInterleaveGroupFactor), the distance between it and the other accesses will be non-constant (some SCEV expression). If the distance isn't constant LAA will report Dependence::Unknown, and we will create the memory checks.


http://reviews.llvm.org/D19984