[PATCH] D20789: Consecutive memory access in Loop Vectorizer - fixed and simplified

Zaks, Ayal via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 2 14:59:35 PDT 2016


Elena> The error occurs when op - src = 0x1f

and len = ... 32?


Mkuper> Ignoring those edge cases, to determine overlap, we check (%op <= %src + %len) && (%src <= %op + %len).

Another way to reason about this: it's equivalent to checking |%op - %src| <= %len.
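
A minimal sketch of that equivalence, for anyone who wants to convince themselves. Addresses are modelled as signed 64-bit integers so nothing wraps, and the names intervalCheck, distanceCheck, formsAgree and demoEquivalence are made up for this illustration (they are not LLVM functions):

```cpp
#include <cassert>
#include <cstdint>

// Assumption of this sketch: addresses are plain signed 64-bit integers, so
// the additions below cannot wrap for the small values used here.

// Two-sided form: (op <= src + len) && (src <= op + len).
bool intervalCheck(std::int64_t op, std::int64_t src, std::int64_t len) {
  return op <= src + len && src <= op + len;
}

// Distance form: |op - src| <= len.
bool distanceCheck(std::int64_t op, std::int64_t src, std::int64_t len) {
  const std::int64_t diff = op > src ? op - src : src - op;
  return diff <= len;
}

// Compares both forms on every (op, src, len) in a small grid.
bool formsAgree() {
  for (std::int64_t op = 0; op < 64; ++op)
    for (std::int64_t src = 0; src < 64; ++src)
      for (std::int64_t len = 0; len < 64; ++len)
        if (intervalCheck(op, src, len) != distanceCheck(op, src, len))
          return false;
  return true;
}

// Usage: both forms agree on the grid and on the distance-31 case.
void demoEquivalence() {
  assert(formsAgree());
  assert(intervalCheck(31, 0, 31) && distanceCheck(31, 0, 31));
  assert(!intervalCheck(31, 0, 24) && !distanceCheck(31, 0, 24));
}
```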


Mkuper> %13 is more or less %len, but it takes some edge cases into account.

Trying to be precise, %13 = (%len - 1) & ~0x7   (as a result of >>3 then <<3; or zero if %len <= 8).
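
To double-check that closed form, here is a rough C++ transcription of %7 through %13 from the quoted vector.memcheck block. It is only a model: the i32 adds and subs are done in uint32_t to mimic wrapping, and bound13, closedForm, matchesClosedForm and demoBound13 are names invented for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Step-by-step model of %7..%13. i32 add/sub wrap in LLVM IR, so the
// arithmetic is done in uint32_t; only the icmp sgt needs a signed view
// (the unsigned-to-signed cast assumes two's complement, as C++20 guarantees).
std::uint64_t bound13(std::int32_t len) {
  const std::uint32_t ulen = static_cast<std::uint32_t>(len);
  const std::uint32_t v7 = 0xFFFFFFFFu - ulen;               // %7 = sub i32 -1, %len
  const bool v8 = static_cast<std::int32_t>(v7) > -9;        // %8 = icmp sgt i32 %7, -9
  const std::uint32_t smax9 =
      v8 ? v7 : static_cast<std::uint32_t>(-9);              // %smax9 = select
  const std::uint32_t v9 = ulen + smax9;                     // %9 = add i32 %len, %smax9
  const std::uint32_t v10 = v9 + 8u;                         // %10 = add i32 %9, 8
  const std::uint32_t v11 = v10 >> 3;                        // %11 = lshr i32 %10, 3
  const std::uint64_t v12 = v11;                             // %12 = zext i32 %11 to i64
  return v12 << 3;                                           // %13 = shl i64 %12, 3
}

// Closed form from the text: (len - 1) & ~0x7, or zero if len <= 8.
std::uint64_t closedForm(std::int32_t len) {
  return len <= 8 ? 0 : static_cast<std::uint64_t>((len - 1) & ~0x7);
}

// True if the model and the closed form agree for every len in [lo, hi].
bool matchesClosedForm(std::int32_t lo, std::int32_t hi) {
  for (std::int64_t len = lo; len <= hi; ++len)
    if (bound13(static_cast<std::int32_t>(len)) !=
        closedForm(static_cast<std::int32_t>(len)))
      return false;
  return true;
}

// Usage: len = 32 yields 24, and only len = 33 reaches 32.
void demoBound13() {
  assert(bound13(32) == 24);
  assert(bound13(33) == 32);
  assert(matchesClosedForm(-64, 4096));
}
```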

Given that |%op - %src| = 31, this check reaches the scalar code iff %len >= 33. I.e., we vectorize for %len = 32 when we mustn't. (BTW, copying in the other direction is vectorizable, but our check is symmetric.)
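
The failing case can be sketched as follows, under two assumptions: the emitted check uses the closed form for %13 given above, and the "ground truth" is a conservative, symmetric byte-range overlap test (so it ignores the fact that one copy direction would actually be safe). All names here are hypothetical helpers, not LLVM code:

```cpp
#include <cassert>
#include <cstdint>

// Span used by the emitted check: (len - 1) & ~7, or zero if len <= 8.
std::int64_t checkedSpan(std::int64_t len) {
  return len <= 8 ? 0 : (len - 1) & ~std::int64_t{7};
}

// The runtime check as emitted; true means "conflict, take the scalar loop".
bool emittedCheckFindsConflict(std::int64_t op, std::int64_t src,
                               std::int64_t len) {
  const std::int64_t span = checkedSpan(len);
  return op <= src + span && src <= op + span;
}

// Bytes the loop really touches through each pointer: it runs while len > 0,
// moving 8 bytes per iteration, so len is effectively rounded up to 8.
std::int64_t bytesTouched(std::int64_t len) {
  return len <= 0 ? 0 : ((len + 7) / 8) * 8;
}

// Conservative ground truth: do [op, op+n) and [src, src+n) share a byte?
bool rangesOverlap(std::int64_t op, std::int64_t src, std::int64_t len) {
  const std::int64_t n = bytesTouched(len);
  return op < src + n && src < op + n;
}

// Usage: with op - src = 31 and len = 32 the ranges overlap, yet the
// emitted check reports no conflict; at len = 33 it finally does.
void demoOffByOne() {
  assert(rangesOverlap(31, 0, 32));
  assert(!emittedCheckFindsConflict(31, 0, 32));
  assert(emittedCheckFindsConflict(31, 0, 33));
}
```

In this model the last iteration reads bytes 24..31 from src while the first iteration already wrote byte 31 through op, which is exactly the byte the span of 24 fails to cover.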


Mkuper> As Adam said, it's possible that the actual calculation we use has an off-by-one somewhere. I'm pretty sure my explanation above does. :-)

Both seem to be accurate :-). Does the cast to long long* confuse us, leading us to reduce 32-1 into 24 in this case?



-----Original Message-----
From: Demikhovsky, Elena 
Sent: Tuesday, August 02, 2016 22:20
To: anemet at apple.com
Cc: reviews+D20789+public+564b514bcf1b605c at reviews.llvm.org; hfinkel at anl.gov; Zaks, Ayal <ayal.zaks at intel.com>; mzolotukhin at apple.com; wmi at google.com; silviu.baranga at arm.com; sanjoy at playingwithpointers.com; shirokiyroman at yandex.ru; mssimpso at codeaurora.org; junbuml at codeaurora.org; llvm-commits at lists.llvm.org
Subject: RE: [PATCH] D20789: Consecutive memory access in Loop Vectorizer - fixed and simplified



   >I am wondering if we have an off-by-one error with the bounds.
   >These casts look suspicious and I am wondering if we properly
   >account for the entire object accessed in the last iteration.  Elena, is
   >the overlap on the last element of the array or something?
[Demikhovsky, Elena] 
The error occurs when op - src = 0x1f

   >
   >> On Aug 2, 2016, at 11:16 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
   >>
   >> Right. As you see, the distance between %op and %src is not checked.
   >> %bound0 is some arithmetic between %op and %len, and
   >> %bound1 is some arithmetic between %src and %len.
   >>
   >> The loop, just for reference:
   >>  while (len > 0) {
   >>      *(reinterpret_cast<long long*>(op)) = *(reinterpret_cast<const long long*>(src));
   >>      src += 8;
   >>      op += 8;
   >>      len -= 8;
   >>  }
   >>
   >> -  Elena
   >>
   >>>
   >>> mkuper added a comment.
   >>>
   >>> I tried building with r274114 (before this was reverted), and we did
   >>> generate a runtime alias check:
   >>>
   >>> vector.memcheck:                                  ; preds = %min.iters.checked
   >>>   %7 = sub i32 -1, %len
   >>>   %8 = icmp sgt i32 %7, -9
   >>>   %smax9 = select i1 %8, i32 %7, i32 -9
   >>>   %9 = add i32 %len, %smax9
   >>>   %10 = add i32 %9, 8
   >>>   %11 = lshr i32 %10, 3
   >>>   %12 = zext i32 %11 to i64
   >>>   %13 = shl i64 %12, 3
   >>>   %scevgep = getelementptr i8, i8* %op, i64 %13
   >>>   %scevgep10 = getelementptr i8, i8* %src, i64 %13
   >>>   %bound0 = icmp ule i8* %op, %scevgep10
   >>>   %bound1 = icmp ule i8* %src, %scevgep
   >>>   %found.conflict = and i1 %bound0, %bound1
   >>>   %memcheck.conflict = and i1 %found.conflict, true
   >>>   %cast.crd = trunc i64 %n.vec to i32
   >>>   %14 = shl i32 %cast.crd, 3
   >>>   %ind.end = sub i32 %len, %14
   >>>   %15 = shl i64 %n.vec, 3
   >>>   %ind.end12 = getelementptr i8, i8* %op, i64 %15
   >>>   %ind.end14 = getelementptr i8, i8* %src, i64 %15
   >>>   br i1 %memcheck.conflict, label %scalar.ph, label %vector.ph
   >>>



