<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/79258">79258</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [Flang] TSVC s118: not vectorized because LICM doesn't work
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            flang:ir,
            vectorization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          yus3710-fj
      </td>
    </tr>
</table>

<pre>
    Flang fails to vectorize the inner loop of `s118` from [TSVC](https://www.netlib.org/benchmark/vectors), whereas Clang vectorizes the equivalent loop written in C.

```fortran
! Fortran version
      module mod
      integer ld, nloops
      parameter (ld=1000,nloops=135)
      real a(ld), b(ld), c(ld), d(ld), e(ld)
      real aa(ld,ld), bb(ld,ld), cc(ld,ld)
      interface
      subroutine dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
         integer ld, n
         real a(ld), b(ld), c(ld), d(ld), e(ld)
         real aa(ld,ld), bb(ld,ld), cc(ld,ld)
         real, value :: x
      end subroutine
      end interface
      end module

      subroutine s118 (n)
      use mod
      integer n, i, j

      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s118 ')
      do 10 i = 2,n
        do 20 j = 1,i-1
          a(i) = a(i) + bb(i,j) * a(i-j)
  20    continue
  10  continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```

```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];

int s118() {
  init( "s118 ");
  for (int i = 1; i < LEN2; i++) {
    for (int j = 0; j <= i - 1; j++) {
      a[i] += bb[j][i] * a[i-j-1];
    }
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```

```console
$ flang-new -v -Ofast s118.f -S -Rpass=vector
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 2759e47067ea286f6302adcfe93b653cfaf6f2eb)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/install/bin/flang-new" -fc1 -triple x86_64-unknown-linux-gnu -emit-obj -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu x86-64 -fstack-arrays -fversion-loops-for-stride -mframe-pointer=none -O3 -o /tmp/s118-5868cd.o -x f95-cpp-input s118.f
$ clang -Ofast s118.c -S -Rpass=vector
/path/to/s118.c:16:4: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   16 | for (int j = 0; j <= i - 1; j++) {
      | ^
```

Vectorizing the inner loop requires LICM to move the store to `a(i)` out of the loop, but LICM cannot do so because BasicAA reports that `a(i)` and `a(i-j)` may alias.
This is similar to #74262, but BasicAA does not perform complicated analyses, so I suspect this is difficult to fix in BasicAA itself.

```llvm
25: ; preds = %.lr.ph, %25
  %indvars.iv = phi i64 [ 1, %.lr.ph ], [ %indvars.iv.next, %25 ], !dbg !30
  %26 = phi float [ %.promoted, %.lr.ph ], [ %32, %25 ], !dbg !30
  %27 = mul nuw nsw i64 %indvars.iv, 1000, !dbg !30
  %gep13 = getelementptr float, ptr %invariant.gep, i64 %27, !dbg !30
  %28 = load float, ptr %gep13, align 4, !dbg !30, !tbaa !31
  %29 = sub nuw nsw i64 %indvars.iv21, %indvars.iv, !dbg !30
  %gep12 = getelementptr float, ptr getelementptr ([1000 x float], ptr @_QMmodEa, i64 -1, i64 999), i64 %29, !dbg !30
  %30 = load float, ptr %gep12, align 4, !dbg !30, !tbaa !27
  %31 = fmul fast float %30, %28, !dbg !30
  %32 = fadd fast float %31, %26, !dbg !30
  store float %32, ptr %24, align 4, !dbg !30, !tbaa !27 ;; this can be hoisted
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !33
  %exitcond.not = icmp eq i64 %indvars.iv.next, %indvars.iv21, !dbg !26
  br i1 %exitcond.not, label %._crit_edge, label %25, !dbg !26
```
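
For illustration only, here is a minimal C sketch of why moving the store is safe. In the inner loop the read index `i-j-1` only takes values in `0..i-1`, so it never refers to `a[i]`, and accumulating in a scalar gives the same result as storing on every iteration. The names `N`, `s118_store_in_loop`, `s118_store_after_loop`, and `s118_variants_agree` are hypothetical and not part of TSVC or LLVM; this is a hand-written transformation, not what LICM emits.

```c
#include <assert.h>

#define N 64 /* hypothetical small size, not the TSVC LEN2 */

/* Original form: a[i] is stored on every inner iteration. */
static void s118_store_in_loop(float a[N], float bb[N][N]) {
  for (int i = 1; i < N; i++) {
    for (int j = 0; j <= i - 1; j++) {
      a[i] += bb[j][i] * a[i - j - 1];
    }
  }
}

/* Hand-promoted form: accumulate in a scalar, store once after the loop. */
static void s118_store_after_loop(float a[N], float bb[N][N]) {
  for (int i = 1; i < N; i++) {
    float acc = a[i];
    for (int j = 0; j <= i - 1; j++) {
      /* The read index stays in 0..i-1, so it never equals i. */
      assert(i - j - 1 >= 0 && i - j - 1 < i);
      acc += bb[j][i] * a[i - j - 1];
    }
    a[i] = acc;
  }
}

/* Returns 1 if both forms produce the same array within a small tolerance. */
int s118_variants_agree(void) {
  static float a1[N], a2[N], bb[N][N];
  for (int i = 0; i < N; i++) {
    a1[i] = a2[i] = 1.0f / (float)(i + 1);
    for (int j = 0; j < N; j++) {
      bb[i][j] = 0.001f * (float)((i + j) % 7);
    }
  }
  s118_store_in_loop(a1, bb);
  s118_store_after_loop(a2, bb);
  for (int i = 0; i < N; i++) {
    float diff = a1[i] - a2[i];
    if (diff < 0.0f) diff = -diff;
    if (diff > 1e-5f * a1[i]) return 0; /* all values are positive here */
  }
  return 1;
}
```

Calling `s118_variants_agree()` should return 1. The second form is the shape the vectorizer needs: a reduction into a scalar with no store inside the inner loop.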
</pre>