[llvm] [DA] Check monotonicity for subscripts (PR #154527)

Tue Aug 26 01:29:12 PDT 2025

kasuga-fj wrote:

> Two points:
> 
>  * monotonicity checker belongs to ScalarEvolution.[h|cpp] (there's nothing specific to DA.)
>
>  * monotonic checker is redundant with (could be built on top of) SCEV's existing range analysis.
> 
> 
> Instead of a separate monotonicity checker, enhance SCEV's existing getRange() methods to detect wrapping more accurately and integrate this into the existing DA wrapping checks.

I think these checks include some DA-specific logic, such as inferring properties from `nusw` on GEPs, similar to what LoopAccessAnalysis does. Moreover, even if there were no DA-specific checks, I don't think `getRange` is appropriate here, since we need to recursively check the monotonicity of the AddRec for each loop.
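To illustrate what I mean by "recursively", here is a toy Python model. It is not the code in this PR and not SCEV's API: `Invariant`, `AddRec`, and the `no_wrap` field are hypothetical stand-ins I made up for the illustration. The point is only that a nested AddRec has one recurrence per loop, and each of them has to be inspected, which a single range query on the outermost expression does not do.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Invariant:
    """A loop-invariant value (hypothetical stand-in for a non-AddRec SCEV)."""
    name: str


@dataclass(frozen=True)
class AddRec:
    """Toy model of {start,+,step}<loop>; `no_wrap` stands in for a no-wrap flag."""
    start: Union["AddRec", Invariant]
    step: int
    loop: str
    no_wrap: bool


def loops_that_may_wrap(expr) -> List[str]:
    """Recurse through the nested AddRecs, collecting loops whose recurrence lacks the flag."""
    if not isinstance(expr, AddRec):
        return []
    inner = loops_that_may_wrap(expr.start)
    return inner + ([] if expr.no_wrap else [expr.loop])


def is_monotonic_in_all_loops(expr) -> bool:
    """True only if every nested recurrence carries the (assumed) no-wrap flag."""
    return not loops_that_may_wrap(expr)


# {{base,+,1}<%outer>,+,4}<nw><%inner>: the inner recurrence is flagged, the outer is not.
outer = AddRec(Invariant("base"), 1, "outer", no_wrap=False)
nested = AddRec(outer, 4, "inner", no_wrap=True)
print(loops_that_may_wrap(nested))
```

In this model the outermost recurrence looks fine on its own, and the problem only shows up when the start operand is visited as well.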

> Lets suppose that `nusw` on `%add11` does not guarantee `nusw` on individual subscripts. But then you should be able to get back `%add11` using combination of wrapped/unwrapped subscripts. This is definitely not possible because `(anything + UB)=UB != %add11`. Do you agree here ?

No, I don't think that's true in DA. Delinearization can decompose the original offset into multiple subscripts that are not "equivalent" to how the offset is actually computed, which complicates the problem. Consider the following case, which I raised in https://github.com/llvm/llvm-project/issues/152566:

```llvm
; void f(char *a, unsigned long long d) {
;   if (d == UINT64_MAX)
;     for (unsigned long long i = 0; i != d; i++)
;       a[i * (d + 1)] = 42;
; }
define void @f(ptr %a, i64 %d) {
entry:
  %guard = icmp eq i64 %d, -1
  br i1 %guard, label %loop.preheader, label %exit

loop.preheader:
  %stride = add nsw i64 %d, 1  ; since %d is -1, %stride is 0
  br label %loop

loop:
  %i = phi i64 [ 0, %loop.preheader ], [ %i.next, %loop ]
  %offset = phi i64 [ 0, %loop.preheader ], [ %offset.next, %loop ]
  %idx = getelementptr inbounds i8, ptr %a, i64 %offset
  store i8 42, ptr %idx
  %i.next = add nuw i64 %i, 1
  %offset.next = add nsw nuw i64 %offset, %stride
  %cond = icmp eq i64 %i.next, %d
  br i1 %cond, label %exit, label %loop

exit:
  ret void
}
```

The `%offset` in the above case is delinearized into:

```
AccessFunction: {0,+,(1 + %d)}<nuw><nsw><%loop>
Base offset: %a
ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
ArrayRef[{0,+,1}<nuw><nsw><%loop>][{0,+,1}<nuw><nsw><%loop>]
```

Yeah, the `nsw` on each subscript is incorrect. This is a bug in SCEVDivision, and I'm currently working on it. That said, it also implies that even if the GEP has `nusw`, each subscript can still wrap introducing UB.
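As a sanity check on the arithmetic, here is a small Python sketch that models the i64 values with explicit masking. The helper names are mine and purely illustrative. It assumes the delinearized form above recomposes as `i * d + i` and that `d` is `UINT64_MAX` as the guard requires.

```python
MASK = (1 << 64) - 1          # i64 arithmetic is modulo 2^64
INT64_MAX = (1 << 63) - 1


def to_signed(x: int) -> int:
    """Reinterpret a 64-bit pattern as a signed i64."""
    x &= MASK
    return x - (1 << 64) if x > INT64_MAX else x


def stride(d: int) -> int:
    """%stride = %d + 1, evaluated modulo 2^64."""
    return (d + 1) & MASK


def linear_offset(i: int, d: int) -> int:
    """Value of %offset at iteration i: i * %stride modulo 2^64."""
    return (i * stride(d)) & MASK


def recomposed_offset(i: int, d: int) -> int:
    """ArrayRef[i][i] over ArrayDecl[UnknownSize][d], i.e. i * d + i modulo 2^64."""
    return (i * d + i) & MASK


def first_signed_wrap_iteration(d: int):
    """First i in [0, d-1] where the subscript {0,+,1} exceeds INT64_MAX, else None."""
    return (1 << 63) if d - 1 >= (1 << 63) else None


d = MASK  # the guard guarantees %d == UINT64_MAX (-1 as i64)
print(stride(d), linear_offset(1 << 63, d), recomposed_offset(1 << 63, d))
print(first_signed_wrap_iteration(d), to_signed(1 << 63))
```

Under these assumptions the real offset stays at 0 for every iteration, so the GEP itself never wraps, while the subscript `{0,+,1}` passes `INT64_MAX` at iteration 2^63, which is well inside the trip count.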

https://github.com/llvm/llvm-project/pull/154527