<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/78934>78934</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] TSVC s243: the loop should be canonicalized for alias analysis in LLVM
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            loopoptim,
            flang:ir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yus3710-fj
      </td>
    </tr>
</table>

<pre>
    Flang can't vectorize the loop in `s243` of [TSVC](https://www.netlib.org/benchmark/vectors), while Clang vectorizes the equivalent loop written in C.

```fortran
! Fortran version
      module mod
      integer ld, nloops
      parameter (ld=1000,nloops=135)
      real a(ld), b(ld), c(ld), d(ld), e(ld)
      real aa(ld,ld), bb(ld,ld), cc(ld,ld)
      interface
      subroutine dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
         integer ld, n
         real a(ld), b(ld), c(ld), d(ld), e(ld)
         real aa(ld,ld), bb(ld,ld), cc(ld,ld)
         real, value :: x
      end subroutine
      end interface
      end module

      subroutine s243 (n)
      use mod
      integer n, i

      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s243 ')
      do 10 i = 1,n-1
         a(i) = b(i) + c(i)   * d(i)
         b(i) = a(i) + d(i)   * e(i)
         a(i) = b(i) + a(i+1) * d(i)
  10  continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```

```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];

int s243() {
  init( "s243 ");
  for (int i = 0; i < LEN-1; i++) {
    a[i] = b[i] + c[i  ] * d[i];
    b[i] = a[i] + d[i  ] * e[i];
    a[i] = b[i] + a[i+1] * d[i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```

```console
$ flang-new -v -Ofast s243.f -S -Rpass=vector
flang-new version 18.0.0git (https://github.com/llvm/llvm-project.git 0e93d04001e45f39cabf0ffb5093512a7f622cc0)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -Rpass=vector -O3 -o s243.s -x f95-cpp-input s243.f
$ clang -Ofast s243.c -S -Rpass=vector
s243.c:15:3: remark: vectorized loop (vectorization width: 4, interleaved count: 1) [-Rpass=loop-vectorize]
   15 | for (int i = 0; i < LEN-1; i++) {
      | ^
```

The Fortran loop isn't vectorized because DSE (dead store elimination) doesn't remove the first store to `a(i)`.
DSE relies on BasicAA, and BasicAA can't prove that `a(i)` and `a(i+1)` don't alias each other, so DSE leaves the store in place.
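
To illustrate why the first store is dead, here is a minimal C sketch. The function names `s243_two_stores` and `s243_one_store` are made up for this illustration, and the sketch assumes `a` and `b` are distinct arrays, as they are in TSVC. The second function is a hand-written approximation of the loop with the first store to `a[i]` removed, not the output of any LLVM pass; both functions should leave identical array contents.

```c
#include <stddef.h>

/* Loop body as written in s243: a[i] is stored twice per iteration. */
void s243_two_stores(size_t n, float *a, float *b,
                     const float *c, const float *d, const float *e) {
  for (size_t i = 0; i + 1 < n; i++) {
    a[i] = b[i] + c[i] * d[i];     /* first store to a[i] */
    b[i] = a[i] + d[i] * e[i];     /* reads the value just stored */
    a[i] = b[i] + a[i + 1] * d[i]; /* overwrites the first store */
  }
}

/* Hand-written variant (an assumption about the desired result):
 * the first value is kept in a scalar, so a[i] is stored only once. */
void s243_one_store(size_t n, float *a, float *b,
                    const float *c, const float *d, const float *e) {
  for (size_t i = 0; i + 1 < n; i++) {
    float t = b[i] + c[i] * d[i];  /* value the first store would write */
    b[i] = t + d[i] * e[i];
    a[i] = b[i] + a[i + 1] * d[i]; /* the only store to a[i] */
  }
}
```

Running both functions on copies of the same input and comparing `a` and `b` element by element is a quick way to check that dropping the first store doesn't change the result.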

The cause is the shape of the LLVM IR that Flang generates. It corresponds to the following C code, which makes alias analysis harder.
(FYI: BasicAA avoids complicated analyses that could affect compilation time.)

```c
for (int i = 1; i <= n-1; i++) {
  a[i-1] = b[i-1] + c[i-1] * d[i-1];
  b[i-1] = a[i-1] + d[i-1] * e[i-1];
  a[i-1] = b[i-1] + a[i] * d[i-1];
}
```
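
For reference, the shifted loop above touches exactly the same elements as the zero-based loop in the C version of `s243`; only the induction variable is offset by one. A minimal sketch of the two shapes side by side follows. The function names are hypothetical, and `n` is passed as a parameter instead of using TSVC's `LEN`.

```c
/* Shape resembling the code above: the induction variable starts at 1
 * and every subscript carries a "- 1" adjustment. */
void s243_shifted(int n, float *a, float *b,
                  const float *c, const float *d, const float *e) {
  for (int i = 1; i <= n - 1; i++) {
    a[i - 1] = b[i - 1] + c[i - 1] * d[i - 1];
    b[i - 1] = a[i - 1] + d[i - 1] * e[i - 1];
    a[i - 1] = b[i - 1] + a[i] * d[i - 1];
  }
}

/* Zero-based shape, as in the C version of s243: the subscripts are
 * the induction variable itself, plus one "+ 1" access. */
void s243_zero_based(int n, float *a, float *b,
                     const float *c, const float *d, const float *e) {
  for (int i = 0; i < n - 1; i++) {
    a[i] = b[i] + c[i] * d[i];
    b[i] = a[i] + d[i] * e[i];
    a[i] = b[i] + a[i + 1] * d[i];
  }
}
```

Both loops run `n - 1` iterations over elements `0` through `n - 2` (reading up to `a[n - 1]`), so they should produce the same arrays for the same input.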

Conversely, the following Fortran code is vectorized without difficulty.

```fortran
      do 10 i = 0,n-2
         a(i-1) = b(i-1) + c(i-1) * d(i-1)
         b(i-1) = a(i-1) + d(i-1) * e(i-1)
         a(i-1) = b(i-1) + a(i)   * d(i-1)
  10 continue
```

```console
$ flang-new -Ofast s243_2.f -S -Rpass=vector
path/to/s243_2.f:14:7: remark: vectorized loop (vectorization width: 4, interleaved count: 1) [-Rpass=loop-vectorize]
```
</pre>