[PATCH] D155049: [ScalarEvolution] Infer loop max trip count from memory accesses

Wed Jul 19 07:57:12 PDT 2023

Peakulorain added a comment.

In D155049#4510693 <https://reviews.llvm.org/D155049#4510693>, @nikic wrote:

> In D155049#4510451 <https://reviews.llvm.org/D155049#4510451>, @Peakulorain wrote:
>
>> In D155049#4509641 <https://reviews.llvm.org/D155049#4509641>, @nikic wrote:
>>
>>> Could you please explain at a high level why we need all this custom wrapping logic? Why are the no-wrap flags on the AddRec insufficient?
>>
>> I implemented the custom logic to calculate **after how many iterations** the GEP index wraps. With this value, we can tell whether the loop can become infinite.
>>
>>   for.body:
>>     %iv = phi i8 [ %inc, %for.body ], [ 0, %for.body.preheader ]
>>     %idxprom = zext i8 %iv to i64
>>     %arrayidx = getelementptr inbounds [500 x i32], [500 x i32]* %a, i64 0, i64 %idxprom
>>     store i32 0, i32* %arrayidx, align 4
>>     %inc = add i8 %iv, 1
>>     %inc_zext = zext i8 %inc to i32
>>     %cmp = icmp ult i32 %inc_zext, %len
>>     br i1 %cmp, label %for.body, label %loopexit
>>
>> If the value of `%len` (which comes from an argument) is greater than the maximum value that **i8** can represent, the loop becomes infinite, but the **store** keeps accessing a fixed, in-bounds region, so there is no UB. The custom logic ensures that the inference is still correct in this case.
>
> In such a case the pointer will be something like `((4 * (zext i8 {0,+,1}<%for.body> to i64))<nuw><nsw> + %a)<nuw>` rather than the `{%a,+,4}<nuw><%for.body>` it would be in the non-wrapping case. Why is the restriction to addrec pointers not sufficient for this case?

Thanks for your help; the above case is indeed filtered out by the constraints. But please see:

  define void @test(i32 signext %len) {...
  for.body:
    %iv = phi i8 [ %inc, %for.body ], [ 0, %for.body.preheader ]
    %idxprom = zext i8 %iv to i64
    %arrayidx = getelementptr inbounds [500 x i32], [500 x i32]* %a, i64 0, i64 %idxprom
    store i32 0, i32* %arrayidx, align 4
    %inc = add nuw nsw i8 %iv, 1
    %inc_zext = zext i8 %inc to i32
    %cmp = icmp slt i32 %inc_zext, %len
    br i1 %cmp, label %for.body, label %loopexit
    ...
  }

This case would get **{%a,+,4}<nuw><%for.body>**. In such a situation, I think it is necessary to calculate after how many iterations the index wraps. :)
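
To make the concern concrete, here is a minimal standalone sketch of the check I have in mind. It is not the code from the patch: the function names are hypothetical, and it only models an unsigned index that starts at 0, steps by 1, and is zero-extended before indexing. The bound derived from the object size is accepted only if the access would go out of bounds strictly before the narrow index wraps.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations an unsigned IV of width `bits`,
// starting at `start` and stepping by `step`, runs before it first wraps.
uint64_t iterationsUntilWrap(unsigned bits, uint64_t start, uint64_t step) {
  assert(bits >= 1 && bits < 64 && "sketch only handles widths below 64");
  assert(step > 0 && "step must be positive");
  uint64_t range = uint64_t(1) << bits;
  assert(start < range && "start must fit in the IV type");
  return (range - start + step - 1) / step; // ceil((2^bits - start) / step)
}

// Hypothetical helper: number of accesses of `accessBytes` at offsets
// 0, stride, 2*stride, ... that fit entirely inside `objectBytes`.
uint64_t inBoundsAccesses(uint64_t objectBytes, uint64_t accessBytes,
                          uint64_t strideBytes) {
  assert(accessBytes > 0 && strideBytes > 0);
  if (objectBytes < accessBytes)
    return 0;
  return (objectBytes - accessBytes) / strideBytes + 1;
}

// Returns true and sets `bound` if the object size gives a valid upper bound
// on how often the loop body can execute without UB. If the index wraps
// first, the accesses stay in bounds forever and no bound can be inferred.
bool memoryBoundedTripCount(unsigned ivBits, uint64_t objectBytes,
                            uint64_t accessBytes, uint64_t strideBytes,
                            uint64_t &bound) {
  uint64_t fits = inBoundsAccesses(objectBytes, accessBytes, strideBytes);
  uint64_t wrap = iterationsUntilWrap(ivBits, /*start=*/0, /*step=*/1);
  if (fits >= wrap)
    return false; // index wraps before the access can leave the object
  bound = fits;
  return true;
}
```

For the `[500 x i32]` array above, an i8 index wraps after 256 iterations, which is fewer than the 500 in-bounds stores, so the sketch refuses to infer a bound. With an i16 index the same array would yield a bound of 500.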

https://reviews.llvm.org/D155049