<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/110609>110609</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] TSVC s115: compiler doesn't vectorize the loop considering an initial value of do-variable might overflow
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            flang:ir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yus3710-fj
      </td>
    </tr>
</table>

<pre>
    Flang fails to vectorize the loop in `s115` of [TSVC](https://www.netlib.org/benchmark/vectors), whereas Clang vectorizes the corresponding loop written in C.

* Fortran
```fortran
!     Fortran version
      subroutine s115 (ntimes,ld,n,ctime,dtime,a,b,c,d,e,aa,bb,cc)

      integer ntimes, ld, n, i, nl, j
      real a(n), b(n), c(n), d(n), e(n), aa(ld,n), bb(ld,n), cc(ld,n)

      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s115 ')
      do 10 j = 1,n
         do 20 i = j+1, n
            a(i) = a(i) - aa(i,j) * a(j)
  20     continue
  10  continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```
```console
$ flang-new -v -O3 -flang-experimental-integer-overflow s115.f -S -Rpass=vector
flang-new version 20.0.0git (https://github.com/llvm/llvm-project.git 2c770675ce36402b51a320ae26f369690c138dc1)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/build/bin
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Selected GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/build/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu generic -target-feature +outline-atomics -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -fversion-loops-for-stride -flang-experimental-integer-overflow -Rpass=vector -resource-dir /path/to/build/lib/clang/20 -mframe-pointer=non-leaf -O3 -o /dev/null -x f95-cpp-input s115.f
```

* C
```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];

int s115() {
  init( "s115 ");
  for (int j = 0; j < LEN2; j++) {
    for (int i = j+1; i < LEN2; i++) {
      a[i] -= aa[j][i] * a[j];
    }
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```
```console
$ clang -O3 s115.c -S -Rpass=vector
s115.c:10:4: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   10 | for (int i = j+1; i < LEN2; i++) {
      | ^
```

If `j+1` overflows, the accesses to `a(i)` and `a(j)` may overlap, which prevents vectorization.
IIRC, compilers don't have to consider this case.
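
To illustrate the overlap concern, here is a minimal C sketch (not taken from Flang or LLVM; the function names are hypothetical). It assumes a 32-bit do-variable with two's-complement wraparound and checks whether the index read as `a(j)` falls inside the range `j+1 .. n` that the inner loop writes as `a(i)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models `j + 1` for a 32-bit do-variable with
   two's-complement wraparound. Written without an actual signed overflow,
   which would be undefined behavior in C. */
int32_t wrapped_increment(int32_t j) {
    return (j == INT32_MAX) ? INT32_MIN : j + 1;
}

/* The inner loop writes a(i) for i in [start, n] and reads a(j).
   Returns true when the read index j lies inside the written range. */
bool read_index_in_write_range(int32_t j, int32_t n) {
    assert(j <= n); /* guaranteed by the outer loop `do j = 1, n` */
    int32_t start = wrapped_increment(j);
    return start <= j && j <= n;
}
```

Under this model the check is false for every `j < INT32_MAX`, because `j+1 > j`. It becomes true only for `j = n = INT32_MAX`, the single case where `j+1` wraps.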
</pre>