[PATCH] D154075: [LoopVectorize] Add pre-commit tests for D152366

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 2 04:45:34 PDT 2023


fhahn added a comment.

In D154075#4532246 <https://reviews.llvm.org/D154075#4532246>, @david-arm wrote:

>   LAA: Adding RT check for range:
>   Start: {%dst,+,(4 * (zext i32 (1 + %n) to i64))<nuw><nsw>}<%for.cond1.preheader.us> End: {((4 * (zext i32 %n to i64))<nuw><nsw> + %dst),+,(4 * (zext i32 (1 + %n) to i64))<nuw><nsw>}<%for.co
>   nd1.preheader.us>
>   LAA: Adding RT check for range:
>   Start: {%src,+,(4 * (zext i32 %n to i64))<nuw><nsw>}<%for.cond1.preheader.us> End: {((4 * (zext i32 %n to i64))<nuw><nsw> + %src),+,(4 * (zext i32 %n to i64))<nuw><nsw>}<%for.cond1.preheade
>   r.us>



> I don't see how the Start and End values correspond to actual range of addresses accessed in the inner loop?

IIUC `: {%dst,+,(4 * (zext i32 (1 + %n) to i64))<nuw><nsw>}<%for.cond1.preheader.us> `  is effectively `%dst + i * (4 *(1  + n))` where `j = 0` (lower bound) and `{((4 * (zext i32 %n to i64))<nuw><nsw> + %dst),+,(4 * (zext i32 (1 + %n) to i64))<nuw><nsw>}<%for.cond1.preheader.us>` is `%dst + n i * (4 *(1  + n))` for `j=n-1`.

Does that make sense?

In D154075#4542401 <https://reviews.llvm.org/D154075#4542401>, @david-arm wrote:

> In D154075#4531095 <https://reviews.llvm.org/D154075#4531095>, @fhahn wrote:
>
>> Taking another look, I think we need a few more tests for cases not covered by the current tests:
>
>
>
>> - inner and outer induction incremented by non-constant
>
> I've added the other tests you suggested, but I'm not sure of the value of having tests for non-constant IV increments. For the inner loop a non-constant IV increment will lead to SCEV checks to ensure we only enter the vector loop if the stride is 1, in which case it's going to be a constant anyway. For the outer loop the tests already have outer loop non-constant memory access strides, i.e. tests like this:
>
>   for (int i = m - 1; i >= 0; i--) {
>     for (int j = 0; j <= n; j++) {
>       dst[(i * stride1) + j] += src[(i * stride2) + j];
>    }
>
> where the stride for the outer loop is `stride1` so by making the outer IV increment non-constant all I'm doing is transforming an already unknown stride into another unknown stride, unless I'm missing something?

Even though we version with 1 at the moment, I think it is still valuable to have the additional test coverage in case the versioning (or something else) changes which may impact the correctness of the generated RT checks.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154075/new/

https://reviews.llvm.org/D154075



More information about the llvm-commits mailing list