[PATCH] D20789: Consecutive memory access in Loop Vectorizer - fixed and simplified
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 2 11:47:40 PDT 2016
We don't need to check the distance between op and src explicitly.
%13 is more or less %len, but it takes some edge cases into account.
Ignoring those edge cases, to determine overlap, we check (%op <= %src +
%len) && (%src <= %op + %len).
Let's assume the ranges [%op, %op + %len] and [%src, %src + %len] overlap.
If %op <= %src, then (%op <= %src + %len) obviously holds. But then, since
they overlap, %src must start somewhere in [%op, %op + %len], so (%src <=
%op + %len), and the condition we check is true.
The %src <= %op case is symmetrical.
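
To make the argument above concrete, here is a minimal C++ sketch of the two comparisons. The function name may_conflict and the parameter extent (standing in for %13) are hypothetical and chosen for illustration only. Addresses are modelled as plain integers, and the sums are assumed not to wrap.

```cpp
#include <cstdint>

// Inclusive-range overlap test mirroring %bound0 / %bound1 in the quoted IR.
// `extent` is a hypothetical stand-in for %13; addresses are plain integers.
bool may_conflict(std::uintptr_t op, std::uintptr_t src, std::uintptr_t extent) {
    bool bound0 = op <= src + extent;   // icmp ule i8* %op,  %scevgep10
    bool bound1 = src <= op + extent;   // icmp ule i8* %src, %scevgep
    return bound0 && bound1;            // %found.conflict
}
```

With extent = 8, may_conflict(100, 104, 8) is true and may_conflict(100, 200, 8) is false. Because both comparisons are "less than or equal", ranges that merely touch, such as may_conflict(100, 108, 8), are also reported as conflicting.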
As Adam said, it's possible that the actual calculation we use has an
off-by-one somewhere. I'm pretty sure my explanation above does. :-)
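
For anyone who wants to probe the edge cases by hand, the following is a line-by-line C++ transliteration of the %7 through %13 arithmetic from the quoted vector.memcheck block. The function name memcheck_offset is hypothetical. Unsigned 32-bit types are used so that add and sub wrap as i32 does, and the conversion to a signed value for the smax step assumes two's-complement behaviour (guaranteed from C++20 on).

```cpp
#include <algorithm>
#include <cstdint>

// Transliteration of %7..%13 from the quoted IR (hypothetical helper name).
std::uint64_t memcheck_offset(std::uint32_t len) {
    std::uint32_t t7 = 0xFFFFFFFFu - len;                   // %7  = sub i32 -1, %len
    std::int32_t smax =
        std::max(static_cast<std::int32_t>(t7), -9);        // %8, %smax9 (signed max)
    std::uint32_t t9 = len + static_cast<std::uint32_t>(smax);  // %9  = add %len, %smax9
    std::uint32_t t10 = t9 + 8u;                            // %10 = add %9, 8
    std::uint32_t t11 = t10 >> 3;                           // %11 = lshr %10, 3
    std::uint64_t t12 = t11;                                // %12 = zext i32 to i64
    return t12 << 3;                                        // %13 = shl %12, 3
}
```

For positive len this evaluates to 8 * ((len - 1) / 8): 0 for len from 1 to 8, 8 for len from 9 to 16, 16 for len = 17. One reading, offered as an interpretation rather than something stated in the thread, is that this is the byte offset of the last 8-byte access rather than one past the end of the range. Whether the bound should also cover the width of that final access is the sort of off-by-one question raised above.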
On Tue, Aug 2, 2016 at 11:16 AM, Demikhovsky, Elena via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Right. As you see, the distance between %op and %src is not checked.
> %bound0 is some arithmetic between %op and %len
> And
> %bound1 is some arithmetic between %src and %len
>
> The loop just for the reference:
> while (len > 0) {
>   *(reinterpret_cast<long long*>(op)) =
>       *(reinterpret_cast<const long long*>(src));
>   src += 8;
>   op += 8;
>   len -= 8;
> }
>
> - Elena
>
> >
> >mkuper added a comment.
> >
> >I tried building with r274114 (before this was reverted), and we did
> >generate a runtime alias check:
> >
> > vector.memcheck: ; preds = %min.iters.checked
> > %7 = sub i32 -1, %len
> > %8 = icmp sgt i32 %7, -9
> > %smax9 = select i1 %8, i32 %7, i32 -9
> > %9 = add i32 %len, %smax9
> > %10 = add i32 %9, 8
> > %11 = lshr i32 %10, 3
> > %12 = zext i32 %11 to i64
> > %13 = shl i64 %12, 3
> > %scevgep = getelementptr i8, i8* %op, i64 %13
> > %scevgep10 = getelementptr i8, i8* %src, i64 %13
> > %bound0 = icmp ule i8* %op, %scevgep10
> > %bound1 = icmp ule i8* %src, %scevgep
> > %found.conflict = and i1 %bound0, %bound1
> > %memcheck.conflict = and i1 %found.conflict, true
> > %cast.crd = trunc i64 %n.vec to i32
> > %14 = shl i32 %cast.crd, 3
> > %ind.end = sub i32 %len, %14
> > %15 = shl i64 %n.vec, 3
> > %ind.end12 = getelementptr i8, i8* %op, i64 %15
> > %ind.end14 = getelementptr i8, i8* %src, i64 %15
> > br i1 %memcheck.conflict, label %scalar.ph, label %vector.ph
> >
> >I'm not sure this runtime check is correct, though.
> >
> >
> >Repository:
> > rL LLVM
> >
> >https://reviews.llvm.org/D20789
> >
> >