<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/78934>78934</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Flang] TSVC s243: the loop should be canonicalized for alias analysis in LLVM
</td>
</tr>
<tr>
<th>Labels</th>
<td>
loopoptim,
flang:ir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
yus3710-fj
</td>
</tr>
</table>
<pre>
Flang fails to vectorize the loop in `s243` of [TSVC](https://www.netlib.org/benchmark/vectors), whereas Clang vectorizes the equivalent loop written in C.
```fortran
! Fortran version
      module mod
        integer ld, nloops
        parameter (ld=1000,nloops=135)
        real a(ld), b(ld), c(ld), d(ld), e(ld)
        real aa(ld,ld), bb(ld,ld), cc(ld,ld)
        interface
          subroutine dummy(ld,n,a,b,c,d,e,aa,bb,cc,x)
            integer ld, n
            real a(ld), b(ld), c(ld), d(ld), e(ld)
            real aa(ld,ld), bb(ld,ld), cc(ld,ld)
            real, value :: x
          end subroutine
        end interface
      end module
      subroutine s243 (n)
        use mod
        integer n, i
        call init(ld,n,a,b,c,d,e,aa,bb,cc,'s243 ')
        do 10 i = 1,n-1
          a(i) = b(i) + c(i) * d(i)
          b(i) = a(i) + d(i) * e(i)
          a(i) = b(i) + a(i+1) * d(i)
   10   continue
        call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```
```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];
int s243() {
  init( "s243 ");
  for (int i = 0; i < LEN-1; i++) {
    a[i] = b[i] + c[i] * d[i];
    b[i] = a[i] + d[i] * e[i];
    a[i] = b[i] + a[i+1] * d[i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```
```console
$ flang-new -v -Ofast s243.f -S -Rpass=vector
flang-new version 18.0.0git (https://github.com/llvm/llvm-project.git 0e93d04001e45f39cabf0ffb5093512a7f622cc0)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -Rpass=vector -O3 -o s243.s -x f95-cpp-input s243.f
$ clang -Ofast s243.c -S -Rpass=vector
s243.c:15:3: remark: vectorized loop (vectorization width: 4, interleaved count: 1) [-Rpass=loop-vectorize]
15 | for (int i = 0; i < LEN-1; i++) {
| ^
```
The Fortran loop isn't vectorized because DSE (dead store elimination) doesn't remove the first store to `a(i)`.
DSE relies on BasicAA, and BasicAA can't prove that `a(i)` and `a(i+1)` don't alias, so the store is kept.
The underlying cause is that Flang generates LLVM IR equivalent to the following C code, whose `i-1` subscripts make alias analysis harder.
(FYI: BasicAA deliberately avoids complicated analyses that could increase compilation time.)
```c
for (int i = 1; i <= n-1; i++) {
  a[i-1] = b[i-1] + c[i-1] * d[i-1];
  b[i-1] = a[i-1] + d[i-1] * e[i-1];
  a[i-1] = b[i-1] + a[i] * d[i-1];
}
```
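To make the store that DSE is expected to remove concrete, here is a minimal C sketch. It compares the loop with all three stores against a hand-edited version in which the first store to `a[i]` is replaced by a temporary. The function names, the array size `N`, and the test data are assumptions made for illustration; they are not taken from TSVC or from LLVM.

```c
#include <assert.h>
#include <string.h>

#define N 64 /* illustrative size, not the TSVC value */

/* s243 loop body with all three stores, as in the C version above. */
void s243_all_stores(int n, float *a, float *b,
                     const float *c, const float *d, const float *e) {
  for (int i = 0; i < n - 1; i++) {
    a[i] = b[i] + c[i] * d[i];     /* first store to a[i] */
    b[i] = a[i] + d[i] * e[i];
    a[i] = b[i] + a[i + 1] * d[i]; /* overwrites the first store */
  }
}

/* Hand-applied DSE: the first store becomes a temporary. This is only
   valid because a, b and the load of a[i+1] do not overlap a[i]. */
void s243_first_store_removed(int n, float *a, float *b,
                              const float *c, const float *d, const float *e) {
  for (int i = 0; i < n - 1; i++) {
    float t = b[i] + c[i] * d[i];
    b[i] = t + d[i] * e[i];
    a[i] = b[i] + a[i + 1] * d[i];
  }
}

/* Small integer-valued inputs keep every intermediate result exact. */
static void fill(float *a, float *b, float *c, float *d, float *e) {
  for (int i = 0; i < N; i++) {
    a[i] = (float)(i % 5);
    b[i] = (float)(i % 3);
    c[i] = (float)(i % 7);
    d[i] = (float)(i % 4);
    e[i] = (float)(i % 6);
  }
}

/* Returns 1 if both versions leave a and b in the same state. */
int s243_dse_agrees(int n) {
  float a1[N], b1[N], a2[N], b2[N], c[N], d[N], e[N];
  assert(n >= 1 && n <= N);
  fill(a1, b1, c, d, e);
  fill(a2, b2, c, d, e);
  s243_all_stores(n, a1, b1, c, d, e);
  s243_first_store_removed(n, a2, b2, c, d, e);
  return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}
```

Calling `s243_dse_agrees(N)` from a small `main` should return 1, since the removed store is never observed once `a[i+1]` is known not to overlap `a[i]`. That non-overlap is the fact BasicAA fails to establish for the Flang-generated form.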
Conversely, the following Fortran code, in which the loop starts at 0 and the subscripts are shifted by one, is vectorized without difficulty.
```fortran
      do 10 i = 0,n-2
        a(i+1) = b(i+1) + c(i+1) * d(i+1)
        b(i+1) = a(i+1) + d(i+1) * e(i+1)
        a(i+1) = b(i+1) + a(i+2) * d(i+1)
   10 continue
```
```console
$ flang-new -Ofast s243_2.f -S -Rpass=vector
path/to/s243_2.f:14:7: remark: vectorized loop (vectorization width: 4, interleaved count: 1) [-Rpass=loop-vectorize]
```
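The canonicalization requested in the title amounts to shifting the induction variable so that the subscripts become `i` and `i+1` instead of `i-1` and `i`. The following C sketch checks that the two loop shapes compute the same values. It is an illustration under assumed names and sizes (`s243_one_based`, `s243_zero_based`, `N`), not the transformation as LLVM or Flang would implement it.

```c
#include <assert.h>
#include <string.h>

#define N 64 /* illustrative size, not the TSVC value */

/* One-based induction variable with i-1 subscripts: the shape the
   issue attributes to Flang's generated IR. */
void s243_one_based(int n, float *a, float *b,
                    const float *c, const float *d, const float *e) {
  for (int i = 1; i <= n - 1; i++) {
    a[i - 1] = b[i - 1] + c[i - 1] * d[i - 1];
    b[i - 1] = a[i - 1] + d[i - 1] * e[i - 1];
    a[i - 1] = b[i - 1] + a[i] * d[i - 1];
  }
}

/* Canonical zero-based form: same iterations, subscripts i and i+1. */
void s243_zero_based(int n, float *a, float *b,
                     const float *c, const float *d, const float *e) {
  for (int i = 0; i <= n - 2; i++) {
    a[i] = b[i] + c[i] * d[i];
    b[i] = a[i] + d[i] * e[i];
    a[i] = b[i] + a[i + 1] * d[i];
  }
}

/* Small integer-valued inputs keep every intermediate result exact. */
static void fill_inputs(float *a, float *b, float *c, float *d, float *e) {
  for (int i = 0; i < N; i++) {
    a[i] = (float)(i % 5);
    b[i] = (float)(i % 3);
    c[i] = (float)(i % 7);
    d[i] = (float)(i % 4);
    e[i] = (float)(i % 6);
  }
}

/* Returns 1 if both loop shapes leave a and b in the same state. */
int s243_shift_agrees(int n) {
  float a1[N], b1[N], a2[N], b2[N], c[N], d[N], e[N];
  assert(n >= 1 && n <= N);
  fill_inputs(a1, b1, c, d, e);
  fill_inputs(a2, b2, c, d, e);
  s243_one_based(n, a1, b1, c, d, e);
  s243_zero_based(n, a2, b2, c, d, e);
  return memcmp(a1, a2, sizeof a1) == 0 && memcmp(b1, b2, sizeof b1) == 0;
}
```

Because the shift changes only how each subscript is spelled and not which element it names, `s243_shift_agrees(N)` should return 1. The rewrite is therefore semantics-preserving, while giving alias analysis the simpler `a[i]` versus `a[i+1]` pair to reason about.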
</pre>