[llvm] [DependenceAnalysis] Extending SIV to handle fusable loops (PR #128782)

Tue Sep 16 08:48:25 PDT 2025

amehsan wrote:

> 
> Your argument seems correct if we are analyzing the C language (or other high-level languages). But our target is LLVM IR which does not have explicit loop constructs such as `for` or `while` statements. Thus, loop-related information (e.g., backedge-taken count) must be inferred in some way. Loop guards can be utilized for this purpose, for instance, a predicate like `N > 0` may be used to compute the backedge-taken count of the loop. From this perspective, I think it can be said that DA makes use of loop guards. The condition `i < 10` in the above example would also be treated as a loop guard. I think they are both equally loop guards and it should be impossible to distinguish information related to backedge-taken count from any other information. Therefore it's safer to think that DA might be indirectly using `i < 10` (although I don't know if that condition actually provides useful information).

I believe I have mentioned this multiple times, but the key point here is that the patch checks that the two inner loops have the same backedge-taken count. How this value is calculated is irrelevant. What is important is that neither loop will ever execute more iterations than this count implies.

> The current patch seems to be saying "let's regard the two fusable loops as the same", in this example the two j-loops. If the second j-loop is "mapped" to the first j-loop, it appears to me that this may potentially imply considering the following code rather than the original:
> 
> ```c
> for (int i = 0; i < M; i++) {
>   if (i < 10)
>     for (int j = 0; j < N; j++) {
>       A[i+10][j] = 42;
>       A[i][j] = 43;
>     }
> }
> 
> // Or, written in a form closer to LLVM IR
> for (int i = 0; i < M; i++) {
>   if (i < 10 && N > 0) {
>     int j = 0;
>     do {
>       A[i+10][j] = 42;
>       A[i][j] = 43;
>       j++;
>     } while (j < N);
>   }
> }
> ```
> 
> I've been thinking about why this happens, and I've come to feel that the root cause is trying to map different loops to the same Level. If this patch is modified so that `CommonLevels` is not changed due to fusability, then it may also address my concern.

I don't see the point of the example. If the two loops have different backedge-taken counts, then the new logic is not used. If they have the same backedge-taken count, then from DA's point of view we have something like this:

> ```c
> for (int i = 0; i < M; i++) {
>   /* there might be some code here */
>   for (int j = 0; j < N; j++) {
>     A[i+10][j] = 42;
>     A[i][j] = 43;
>   }
>   /* there might be some code here */
> }
> ```

The key point here is that the set of instances of `A[i+10][j]` and `A[i][j]` considered by DA is a superset of (or possibly equal to) the set of instances that will be executed at run time. So the DA result is always correct, but it may sometimes be conservative.
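
To make the superset argument concrete, here is a minimal sketch in plain C. It is not DA code and not part of the patch: the helper names (`collect`, `is_subset`, `count_instances`) and the sizes `M = 12`, `N = 3` are made up for illustration. It enumerates the `(i, j)` instances of the inner-loop body once with the `i < 10` guard honored and once with it ignored, and checks that the first set is contained in the second.

```c
#include <stdbool.h>

/* Illustrative sizes only; chosen so that M > 10 and the guard matters. */
enum { M = 12, N = 3 };

/* Marks every (i, j) instance of the inner-loop body that can run.
   guarded == true:  the j-loop runs only when i < 10 (run-time view).
   guarded == false: the guard is ignored (over-approximating view). */
static void collect(bool guarded, bool seen[M][N]) {
  for (int i = 0; i < M; i++) {
    if (guarded && !(i < 10))
      continue;
    for (int j = 0; j < N; j++)
      seen[i][j] = true;
  }
}

/* Returns true when every instance marked in `small` is also marked in `big`. */
static bool is_subset(bool small[M][N], bool big[M][N]) {
  for (int i = 0; i < M; i++)
    for (int j = 0; j < N; j++)
      if (small[i][j] && !big[i][j])
        return false;
  return true;
}

/* Counts the marked instances. */
static int count_instances(bool s[M][N]) {
  int n = 0;
  for (int i = 0; i < M; i++)
    for (int j = 0; j < N; j++)
      if (s[i][j])
        n++;
  return n;
}
```

Calling `collect(true, runtime)` and `collect(false, assumed)` on two zero-initialized `bool[M][N]` arrays, `is_subset(runtime, assumed)` holds while the reverse does not. Any dependence that exists among the run-time instances therefore also exists among the instances of the unguarded view, so an analysis that reasons about the larger set can only report extra dependences, never miss one.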

Regarding your first sentence in the comment: I don't mind nitpick comments. Unfortunately I fail to see a valid point so far.

https://github.com/llvm/llvm-project/pull/128782