[llvm] [DA] Fix the check between Subscript and Size after delinearization (PR #151326)

Ryotaro Kasuga via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 7 03:58:17 PDT 2025


kasuga-fj wrote:

The more I think about it, the more confused I get. Still, unsigned feels dangerous. 

Suppose the array shape is estimated as `A[Sz0][Sz1]` and the subscripts are inferred as `A[sub0][sub1]`. The actual offset is then `sub0*Sz1 + sub1`. So, what happens if `Sz1` is negative in the signed sense? For example, if `Sz1` is UINT_MAX, `sub0*Sz1 + sub1` yields 0 whenever `sub0` equals `sub1`. However, if `sub0` and `sub1` are IVs of different loops (as in `A[i][j]`), DA probably doesn't take such cases into account. At the very least, we must ensure that the original offset doesn't wrap.
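To make the wrap-around concrete, here is a minimal C sketch. `linear_offset` is a hypothetical helper, not anything in LLVM, and it assumes a 64-bit index type, so the all-ones value is `UINT64_MAX` rather than `UINT_MAX`.

```c
#include <stdint.h>

/* Hypothetical helper (not part of LLVM): linearized offset of A[sub0][sub1]
 * when the inner dimension has size sz1. The arithmetic is done modulo 2^64,
 * mirroring wrapping i64 arithmetic. */
static uint64_t linear_offset(uint64_t sub0, uint64_t sz1, uint64_t sub1) {
    /* Unsigned overflow is well defined in C: the result wraps. */
    return sub0 * sz1 + sub1;
}
```

With `sz1 == UINT64_MAX` (that is, -1 in the signed sense), `linear_offset(k, UINT64_MAX, k)` is 0 for every `k`, so distinct subscript pairs such as `(1, 1)` and `(2, 2)` alias the same element.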

So, is such a check sufficient? Probably not. I found the following example ([godbolt](https://godbolt.org/z/GsbK19nve)):

```llvm
; void f(char *a, unsigned long long d) {
;   if (d == UINT64_MAX)
;     for (unsigned long long i = 0; i < d; i++)
;       a[i * (d + 1)] = 42;
; }
define void @f(ptr %a, i64 %d) {
entry:
  %guard = icmp eq i64 %d, -1
  %stride = add nsw i64 %d, 1
  br i1 %guard, label %loop, label %exit

loop:  ; %d being -1, %stride is 0
  %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
  %offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
  %idx = getelementptr inbounds i8, ptr %a, i64 %offset
  store i8 42, ptr %idx
  %i.next = add nuw i64 %i, 1
  %offset.next = add nsw nuw i64 %offset, %stride
  %cond = icmp eq i64 %i.next, %d
  br i1 %cond, label %exit, label %loop

exit:
  ret void
}
```

This loop stores 42 to `a[0]` on every iteration when `UINT64_MAX` is passed as `d`, so the dependency should be `[*]`, but DA currently returns `none!`.
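The zero stride can be checked with a small C sketch that records the offsets written by the first few iterations of the loop in `f`. `first_offsets` is a hypothetical helper introduced only for illustration. It stops after `n` iterations instead of running all `d` of them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store into out[0..n-1] the byte offsets that the
 * first n iterations of the loop in f() would write to, for a given d. */
static void first_offsets(uint64_t d, uint64_t *out, size_t n) {
    uint64_t stride = d + 1;   /* wraps to 0 when d == UINT64_MAX */
    uint64_t offset = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = offset;       /* corresponds to the store through %idx */
        offset += stride;      /* corresponds to %offset.next */
    }
}
```

For `d == UINT64_MAX`, every recorded offset is 0, so each iteration overwrites the same byte.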

The SCEV representation for `%offset` is `{0,+,(1 + %d)}<nuw><nsw><%loop>`. It will be delinearized as follows:

```
ArrayDecl[UnknownSize][%d] with elements of 1 bytes.
ArrayRef[{0,+,1}<nuw><nsw><%loop>][{0,+,1}<nuw><nsw><%loop>]
```

Well, this result is also suspicious; maybe `nsw` should be dropped...

Anyway, we know that the backedge-taken count is `%d - 1`, so the subscript (`{0,+,1}<nuw><nsw><%loop>`) is less than the size (`%d`) in the unsigned sense. Great, the in-range check passes, so let's proceed with the analysis. Since all subscripts are monotonic, DA will conclude that no dependency exists...
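As a rough illustration of why the unsigned interpretation lets this case through, the following C sketch compares the two readings of "subscript < size". Both predicates are hypothetical and are not DA's actual code. The sketch does not claim that a signed check alone would be sufficient. The casts assume the usual two's-complement conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: subscript is below size when both are read as unsigned. */
static bool in_range_unsigned(uint64_t sub, uint64_t size) {
    return sub < size;
}

/* Hypothetical predicate: 0 <= subscript < size when both are read as signed.
 * Converting an out-of-range uint64_t to int64_t assumes two's complement. */
static bool in_range_signed(uint64_t sub, uint64_t size) {
    return (int64_t)sub >= 0 && (int64_t)sub < (int64_t)size;
}
```

With `size == UINT64_MAX`, `in_range_unsigned` accepts every subscript the loop can produce, while `in_range_signed` rejects all of them because the size reads as -1.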

https://github.com/llvm/llvm-project/pull/151326

