[llvm] [DependenceAnalysis] Extending SIV to handle fusable loops (PR #128782)

Tue Sep 16 18:06:03 PDT 2025

kasuga-fj wrote:

What I was concerned about is, for example, the following case:

godbolt: https://godbolt.org/z/aTjxa44qx

```llvm
; for (i = 0; i < 9223372036854775806; i++) {
;   if (i < 2147483640)
;     for (j = 0; j < 2147483640; j++)
;       a[i + j * 4294967296] = 0;
;
;   for (j = 0; j < 2147483640; j++)
;     a[j * 2] = 0;
; }
;
define void @f(ptr %a) {
entry:
  br label %loop.i.header

loop.i.header:
  %i = phi i64 [ 0 , %entry ], [ %i.next, %loop.i.latch ]
  br label %loop.j.0.pr

loop.j.0.pr:
  %guard.j.0 = icmp slt i64 %i, 2147483640
  br i1 %guard.j.0, label %loop.j.0, label %loop.j.1

loop.j.0:
  %j.0 = phi i64 [ 0, %loop.j.0.pr ], [ %j.0.next, %loop.j.0 ]
  %val.0 = phi i64 [ %i, %loop.j.0.pr ], [ %val.0.next, %loop.j.0 ]
  %j.0.next = add nsw i64 %j.0, 1
  %idx.0 = getelementptr inbounds i8, ptr %a, i64 %val.0
  store i8 0, ptr %idx.0
  %val.0.next = add nsw i64 %val.0, 4294967296
  %exitcond.j.0 = icmp eq i64 %j.0.next, 2147483640
  br i1 %exitcond.j.0, label %loop.j.1, label %loop.j.0

loop.j.1:
  %j.1 = phi i64 [ 0, %loop.j.0 ], [ 0, %loop.j.0.pr ], [ %j.1.next, %loop.j.1 ]
  %val.1 = phi i64 [ 0, %loop.j.0 ], [ 0, %loop.j.0.pr ], [ %val.1.next, %loop.j.1 ]
  %idx.1 = getelementptr inbounds i8, ptr %a, i64 %val.1
  store i8 0, ptr %idx.1
  %j.1.next = add nsw i64 %j.1, 1
  %val.1.next = add nsw i64 %val.1, 2
  %exitcond.j.1 = icmp eq i64 %j.1.next, 2147483640
  br i1 %exitcond.j.1, label %loop.i.latch, label %loop.j.1

loop.i.latch:
  %i.next = add nsw i64 %i, 1
  %exitcond.i = icmp eq i64 %i.next, 9223372036854775806
  br i1 %exitcond.i, label %exit, label %loop.i.header

exit:
  ret void
}
```

The result of DA:

```
Printing analysis 'Dependence Analysis' for function 'f':
Src:  store i8 0, ptr %idx.0, align 1 --> Dst:  store i8 0, ptr %idx.0, align 1
  da analyze - none!
Src:  store i8 0, ptr %idx.0, align 1 --> Dst:  store i8 0, ptr %idx.1, align 1
  da analyze - none!
Src:  store i8 0, ptr %idx.1, align 1 --> Dst:  store i8 0, ptr %idx.1, align 1
  da analyze - consistent output [S 0]!
```

Here is the SCEV for `%val.0`:

```
 %val.0 = phi i64 [ %i, %loop.j.0.pr ], [ %val.0.next, %loop.j.0 ]
  -->  {{0,+,1}<nuw><nsw><%loop.i.header>,+,4294967296}<nuw><nsw><%loop.j.0>
```
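
As a sanity check on the ranges involved, here is a minimal sketch in plain Python. It is only integer arithmetic on the constants from the IR above, not DA's or SCEV's actual logic, and the helper name `max_val0` is made up for illustration. It suggests that the `<nsw>` flag on the `%loop.j.0` recurrence is only justified when `%i` is bounded by the guard:

```python
# Constants taken from the IR above.
I64_MAX = 2**63 - 1
STRIDE = 4294967296            # step of %val.0 in %loop.j.0
J_TRIP = 2147483640            # trip count of %loop.j.0
I_GUARD = 2147483640           # bound from %guard.j.0
I_TRIP = 9223372036854775806   # trip count of %loop.i


def max_val0(max_i):
    """Largest value %val.0 reaches for a given maximum %i (hypothetical helper)."""
    return max_i + (J_TRIP - 1) * STRIDE


# With the guard, %i is at most I_GUARD - 1 inside %loop.j.0.
fits_with_guard = max_val0(I_GUARD - 1) <= I64_MAX
# Ignoring the guard, %i ranges over the whole outer loop.
fits_without_guard = max_val0(I_TRIP - 1) <= I64_MAX

print(fits_with_guard, fits_without_guard)  # True False
```

So any reasoning that lets `%i` range over the full outer loop while trusting that recurrence would be working with values that do not fit in a signed `i64`.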

The DA result already seems incorrect. I guess the root cause is that DA currently doesn't take wrapping of the offset (`%val.0` in this case) into account. I believe this can be fixed by checking the loop guard properly.
What I'm not confident about is what happens if we assume the two loops (`%loop.j.0` and `%loop.j.1`) are fused. In this specific case, the pair would probably be classified as MIV, so it shouldn't be a problem. But what about other similar cases? Is it impossible for something like this to happen? Or am I missing something?
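
To make the suspicious `none` for the `%idx.0` / `%idx.1` pair concrete, here is a small brute-force sketch in Python. It assumes the subscripts are exactly `i + j * 4294967296` and `j * 2` as in the pseudo-code comment, and it only searches a tiny corner of the iteration space. The function names are hypothetical:

```python
STRIDE = 4294967296


def idx0(i, j):
    """Byte offset written by the store in %loop.j.0."""
    return i + j * STRIDE


def idx1(j):
    """Byte offset written by the store in %loop.j.1."""
    return 2 * j


def find_collisions(limit):
    """Return (i, j0, j1) triples with idx0(i, j0) == idx1(j1), all below limit."""
    return [
        (i, j0, j1)
        for i in range(limit)
        for j0 in range(limit)
        for j1 in range(limit)
        if idx0(i, j0) == idx1(j1)
    ]


collisions = find_collisions(3)
print(collisions)  # [(0, 0, 0), (2, 0, 1)]
```

All of these iterations satisfy the guard `i < 2147483640`, so both stores really do write the same byte (for example `a[0]` and `a[2]`), which is why I would expect an output dependence to be reported for that pair.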

(Anyway, while working on this case, I realized that this issue is probably not limited to fusion. Perhaps the analysis should always bail out when such loop guards are present...)

https://github.com/llvm/llvm-project/pull/128782