<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/110611>110611</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Flang][LAA] TSVC s2101, s233: not vectorized because the extents of arrays are not constant
</td>
</tr>
<tr>
<th>Labels</th>
<td>
loopoptim,
vectorization,
flang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
yus3710-fj
</td>
</tr>
</table>
<pre>
Flang does not vectorize the loops in `s2101` and `s233` of [TSVC](https://www.netlib.org/benchmark/vectors), while Clang accepts the equivalent loops written in C as vectorizable.
(Clang doesn't actually emit vector code for these loops because its cost model finds vectorizing the strided accesses not beneficial.)
* Fortran
```fortran
! Fortran version
      subroutine s2101(ntimes,ld,n,ctime,dtime,a,b,c,d,e,aa,bb,cc)
      integer ntimes, ld, n, i, nl
      real a(n), b(n), c(n), d(n), e(n), aa(ld,n), bb(ld,n), cc(ld,n)
      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s2101')
      do 10 i = 1,n
         aa(i,i) = aa(i,i) + bb(i,i) * cc(i,i)
  10  continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```
```console
$ flang-new -v -O3 -flang-experimental-integer-overflow s2101.f -S -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector
flang-new version 20.0.0git (https://github.com/llvm/llvm-project.git 2c770675ce36402b51a320ae26f369690c138dc1)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/build/bin
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Selected GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/path/to/build/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu generic -target-feature +outline-atomics -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -fversion-loops-for-stride -flang-experimental-integer-overflow -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector -resource-dir /path/to/build/lib/clang/20 -mframe-pointer=non-leaf -O3 -o /dev/null -x f95-cpp-input s2101.f
path/to/s2101.f:9:10: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
Unsafe indirect dependence. Memory location is the same as accessed at s2101.f:9:10 [-Rpass-analysis=loop-vectorize]
path/to/s2101.f:8:7: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
```
* C
```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];
int s2101() {
  init( "s2101");
  for (int i = 0; i < LEN2; i++) {
    aa[i][i] += bb[i][i] * cc[i][i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```
```console
$ clang -O3 s2101.c -S -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector
s2101.c:9:3: remark: the cost-model indicates that vectorization is not beneficial [-Rpass-analysis=loop-vectorize]
    9 |   for (int i = 0; i < LEN2; i++) {
      |   ^
s2101.c:9:3: remark: interleaved loop (interleaved count: 2) [-Rpass=loop-vectorize]
```
In Fortran, the extents of arrays are sometimes not known at compile time. In `s2101`, the stride of `aa(i,i)` is `ld + 1` elements, which depends on the dummy argument `ld`. LAA, on the other hand, requires the pointer stride to be a compile-time constant.
I suspect this constraint is too restrictive. IIUC, it is sufficient for vectorization that the pointer stride is loop-invariant and never zero. SCEV can prove that, but LAA doesn't check it at the moment.
(This might be resolved by MLIR or the polyhedral model.)
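To make the stride argument concrete, here is a rough C analogue of the Fortran loop. It is only a sketch: the function name `s2101_runtime_ld`, the flat column-major layout, and the `restrict` qualifiers (standing in for Fortran's no-alias rule on dummy arguments) are my assumptions, not part of TSVC, and I have not checked how LAA treats this exact form.
```c
/* Hypothetical analogue of the Fortran s2101 loop: aa, bb and cc are
   ld-by-n column-major arrays stored flat, and ld is a run-time value.
   The caller must ensure ld >= n so the diagonal stays in bounds. */
void s2101_runtime_ld(float *restrict aa, const float *restrict bb,
                      const float *restrict cc, long ld, long n) {
  /* Distance in elements between consecutive diagonal entries.
     It is loop-invariant, but not a compile-time constant. */
  const long stride = ld + 1;
  for (long i = 0; i < n; i++) {
    /* Iteration i reads and writes only offset i * stride of aa.
       As long as stride is not zero, two different iterations never
       touch the same element, so there is no loop-carried dependence. */
    aa[i * stride] += bb[i * stride] * cc[i * stride];
  }
}
```
With `ld >= n >= 0`, `stride` is at least 1, which is the "loop-invariant and never zero" condition described above.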
</pre>