[PATCH] [LoopVectorize]Teach Loop Vectorizer about interleaved memory access
Adam Nemet
anemet at apple.com
Thu Jun 4 12:18:25 PDT 2015
LAA parts LGTM.
================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:858-883
@@ +857,28 @@
+ // iteration needs TypeByteSize (No need to plus the last gap size).
+ unsigned MinSizeNeeded =
+ TypeByteSize * Stride * (MinNumIter - 1) + TypeByteSize;
+
+ // It's not vectorizable if the distance is smaller than the minimum size
+ // needed for a vectorized/unrolled version.
+ //
+ // E.g. Assume one char is 1 byte in memory and one int is 4 bytes.
+ // foo(int *A) {
+ // int *B = (int *)((char *)A + 14);
+ // for (i = 0 ; i < 1024 ; i += 2)
+ // B[i] = A[i] + 1;
+ // }
+ //
+ // Two accesses in memory (stride is 2):
+ // | A[0] | | A[2] | | A[4] | | A[6] | |
+ // | B[0] | | B[2] | | B[4] |
+ //
+ // The distance is 14 in bytes from B[i] to A[i].
+ // The minimum size needed is: 4 * 2 * (MinNumIter - 1) + 4.
+ //
+ // If the minimum number of iterations is 2, it is
+ // vectorizable as the minimum size needed is 12 which is less than distance.
+ //
+ // If the minimum number of iterations is 4 (Say if a user forces the
+ // vectorization factor to be 4), the minimum size needed is 28 which is
+ // greater than distance. It is not safe.
+ if (MinSizeNeeded > Distance) {
----------------
HaoLiu wrote:
> anemet wrote:
> > HaoLiu wrote:
> > > HaoLiu wrote:
> > > > anemet wrote:
> > > > > anemet wrote:
> > > > > > I think this is correct but I wonder if the example would be less contrived if you used:
> > > > > >
> > > > > > for (i = 0; i < 1024; i+= 3)
> > > > > > A[i + 4] = A[i] + 1
> > > > > >
> > > > > > MinSizeNeeded is 4 * 3 * (2 - 1) + 4 = 16 which is equal to the distance.
> > > > > >
> > > > > > Also a nit: most or all of this comment is explaining why MinSizeNeeded is computed the way it is so the comment should be before the computation.
> > > > > MinDistanceNeeded is probably a better name.
> > > > This case is vectorizable as MinDistanceNeeded is exactly equal to the distance. If the distance is smaller (even 1 byte smaller), it cannot be vectorized.
> > > >
> > > > Actually we already have a similar case in memdep.ll:
> > > > for (i = 0; i < 1024; ++i)
> > > > A[i+2] = A[i] + 1;
> > > > The MinDistanceNeeded is 2*4 = 8. The distance is also 8.
> > > Fixed
> > I didn't quite understand what you were saying here, but it looks like you didn't change the comment, so I guess you disagree that my example is better. Your example is vectorizable if miniter is 2 and so is mine, so I don't understand your reply.
> Previously, I thought you just asked me a question about that case.
>
> Sorry for the confusion.
> Actually, your case is "indep", which returns early with "Dependence::NoDep", so it never reaches this point.
>
> That's why I use an example with dependences.
Makes sense.
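
For reference, the arithmetic discussed in this thread can be checked with a small standalone sketch. This is not the code from the patch: the helper names minDistanceNeeded and isSafeDistance are hypothetical, and the sketch only assumes the formula from the hunk above and that the access is rejected when the needed size is greater than the dependence distance.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LoopAccessAnalysis): the formula from the
// hunk above. Each of the first (MinNumIter - 1) iterations spans
// TypeByteSize * Stride bytes; the last one only needs TypeByteSize bytes.
// Assumes MinNumIter >= 1.
constexpr uint64_t minDistanceNeeded(uint64_t TypeByteSize, uint64_t Stride,
                                     uint64_t MinNumIter) {
  return TypeByteSize * Stride * (MinNumIter - 1) + TypeByteSize;
}

// Hypothetical helper: the access pair is treated as safe when the needed
// size does not exceed the distance (the hunk rejects MinSizeNeeded > Distance).
constexpr bool isSafeDistance(uint64_t Distance, uint64_t TypeByteSize,
                              uint64_t Stride, uint64_t MinNumIter) {
  return minDistanceNeeded(TypeByteSize, Stride, MinNumIter) <= Distance;
}

// Example from the patch comment: int accesses, stride 2, distance 14 bytes.
static_assert(minDistanceNeeded(4, 2, 2) == 12, "MinNumIter = 2");
static_assert(minDistanceNeeded(4, 2, 4) == 28, "MinNumIter = 4");
static_assert(isSafeDistance(14, 4, 2, 2), "12 <= 14, safe");
static_assert(!isSafeDistance(14, 4, 2, 4), "28 > 14, not safe");

// memdep.ll-style case from the thread: A[i+2] = A[i] + 1, stride 1,
// distance 8 bytes. The needed size equals the distance, which is accepted.
static_assert(isSafeDistance(8, 4, 1, 2), "8 <= 8, safe");
```

The boundary case matters here: equality between the needed size and the distance is accepted, and only a strictly smaller distance is rejected.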
http://reviews.llvm.org/D9368