<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/79258>79258</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Flang] TSVC s118: not vectorized because LICM doesn't work
</td>
</tr>
<tr>
<th>Labels</th>
<td>
flang:ir,
vectorization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
yus3710-fj
</td>
</tr>
</table>
<pre>
Flang doesn't vectorize the inner loop of `s118` from [TSVC](https://www.netlib.org/benchmark/vectors), while Clang vectorizes the equivalent loop written in C.
```fortran
! Fortran version
module mod
integer ld, nloops
parameter (ld=1000,nloops=135)
real a(ld), b(ld), c(ld), d(ld), e(ld)
real aa(ld,ld), bb(ld,ld), cc(ld,ld)
interface
subroutine dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
integer ld, n
real a(ld), b(ld), c(ld), d(ld), e(ld)
real aa(ld,ld), bb(ld,ld), cc(ld,ld)
real, value :: x
end subroutine
end interface
end module
subroutine s118 (n)
use mod
integer n, i, j
call init(ld,n,a,b,c,d,e,aa,bb,cc,'s118 ')
do 10 i = 2,n
do 20 j = 1,i-1
a(i) = a(i) + bb(i,j) * a(i-j)
20 continue
10 continue
call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
end
```
```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];
int s118() {
init( "s118 ");
for (int i = 1; i < LEN2; i++) {
for (int j = 0; j <= i - 1; j++) {
a[i] += bb[j][i] * a[i-j-1];
}
}
dummy(a, b, c, d, e, aa, bb, cc, 0.);
return 0;
}
```
```console
$ flang-new -v -Ofast s118.f -S -Rpass=vector
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 2759e47067ea286f6302adcfe93b653cfaf6f2eb)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/path/to/install/bin/flang-new" -fc1 -triple x86_64-unknown-linux-gnu -emit-obj -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu x86-64 -fstack-arrays -fversion-loops-for-stride -mframe-pointer=none -O3 -o /tmp/s118-5868cd.o -x f95-cpp-input s118.f
$ clang -Ofast s118.c -S -Rpass=vector
/path/to/s118.c:16:4: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
16 | for (int j = 0; j <= i - 1; j++) {
| ^
```
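Hoisting the store to `a(i)` out of the inner loop is necessary for vectorization, but LICM doesn't hoist it because BasicAA reports that `a(i)` and `a(i-j)` may alias. They cannot actually alias, since `j` runs from 1 to `i-1` and so `i-j` is never equal to `i`.
This is similar to #74262, but BasicAA doesn't perform complex analyses, so I suspect it would be difficult to fix this in BasicAA.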
```llvm
25: ; preds = %.lr.ph, %25
%indvars.iv = phi i64 [ 1, %.lr.ph ], [ %indvars.iv.next, %25 ], !dbg !30
%26 = phi float [ %.promoted, %.lr.ph ], [ %32, %25 ], !dbg !30
%27 = mul nuw nsw i64 %indvars.iv, 1000, !dbg !30
%gep13 = getelementptr float, ptr %invariant.gep, i64 %27, !dbg !30
%28 = load float, ptr %gep13, align 4, !dbg !30, !tbaa !31
%29 = sub nuw nsw i64 %indvars.iv21, %indvars.iv, !dbg !30
%gep12 = getelementptr float, ptr getelementptr ([1000 x float], ptr @_QMmodEa, i64 -1, i64 999), i64 %29, !dbg !30
%30 = load float, ptr %gep12, align 4, !dbg !30, !tbaa !27
%31 = fmul fast float %30, %28, !dbg !30
%32 = fadd fast float %31, %26, !dbg !30
store float %32, ptr %24, align 4, !dbg !30, !tbaa !27 ;; this can be hoisted
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !33
%exitcond.not = icmp eq i64 %indvars.iv.next, %indvars.iv21, !dbg !26
br i1 %exitcond.not, label %._crit_edge, label %25, !dbg !26
```
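For reference, here is a minimal C sketch of the hoisting that LICM would need to perform, written by hand. It is only an illustration of the transformation, not the actual TSVC code or compiler output: the size `N`, the function names, and the initial values are all hypothetical. The second variant keeps `a[i]` in a scalar and stores it once after the inner loop, which is valid because `a[i-j-1]` only reads indices `0..i-1`.
```c
#define N 64 /* hypothetical small size, not the LEN2 used by TSVC */

/* Same shape as the s118 loop nest: the store to a[i] stays in the inner loop. */
static void s118_store_in_loop(float a[N], float bb[N][N]) {
    for (int i = 1; i < N; i++) {
        for (int j = 0; j <= i - 1; j++) {
            a[i] += bb[j][i] * a[i - j - 1];
        }
    }
}

/* Hand-hoisted variant: a[i] is accumulated in a scalar and stored once.
   This is safe because a[i - j - 1] never refers to a[i]. */
static void s118_store_hoisted(float a[N], float bb[N][N]) {
    for (int i = 1; i < N; i++) {
        float acc = a[i];
        for (int j = 0; j <= i - 1; j++) {
            acc += bb[j][i] * a[i - j - 1];
        }
        a[i] = acc;
    }
}

/* Runs both variants on the same (arbitrary, hypothetical) input and
   returns 1 if every element agrees within a small tolerance. */
int s118_variants_agree(void) {
    static float a1[N], a2[N], bb[N][N];
    for (int i = 0; i < N; i++) {
        a1[i] = a2[i] = 1.0f / (float)(i + 1);
        for (int j = 0; j < N; j++) {
            bb[i][j] = 0.001f * (float)((i + j) % 7);
        }
    }
    s118_store_in_loop(a1, bb);
    s118_store_hoisted(a2, bb);
    for (int i = 0; i < N; i++) {
        float diff = a1[i] - a2[i];
        if (diff < 0.0f) diff = -diff;
        if (diff > 1e-5f) return 0;
    }
    return 1;
}
```
With the store hoisted like this, the inner loop becomes a plain reduction into `acc`, which is the form the loop vectorizer would be expected to handle.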
</pre>