<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/129474>129474</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[LV] Inefficient gather/scatter address calculation for strided access
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
kinoshita-fj
</td>
</tr>
</table>
<pre>
LLVM generates inefficient code for strided array accesses in loops. Inside the loop, the address calculation is done with vector operations on the offset vector instead of by advancing the scalar base register, which degrades performance.
For example:
```c
void func(double* a, int n)
{
for (int i = 0; i < n; i++) {
a[i*5] = 1;
}
}
```
SVE
```
.LBB0_4:
add z3.d, z2.d, z0.d
mul z2.d, z2.d, #40
subs x11, x11, x9
st1d { z1.d }, p0, [x0, z2.d]
mov z2.d, z3.d
b.ne .LBB0_4
```
AVX-512
```
.LBB0_4:
vpmullq ymm4, ymm0, ymm1
kxnorw k1, k0, k0
vscatterqpd qword ptr [rdi + ymm4] {k1}, ymm2
vpaddq ymm0, ymm0, ymm3
add rdx, -4
jne .LBB0_4
```
https://godbolt.org/z/9MPnPvKG8
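
For comparison, here is a scalar C sketch of the two addressing schemes. It is a model of the idea only, not LLVM output: the vector width `VF = 4` and both function names are assumptions made for illustration. The first function mirrors the loops above (index vector updated and multiplied by the stride on every iteration); the second keeps the offsets loop-invariant and advances only the scalar base pointer, which is roughly what the report asks for.
```c
#define VF 4      /* assumed number of lanes, for illustration only */
#define STRIDE 5  /* stride from the example: a[i*5] */

/* Models the loop as currently emitted: the index vector is carried across
   iterations and multiplied by the stride inside the loop. */
void store_with_vector_index(double *a, int n)
{
    long idx[VF];
    int i = 0;
    for (int l = 0; l < VF; l++)
        idx[l] = l;
    for (; i + VF <= n; i += VF) {
        for (int l = 0; l < VF; l++) {
            a[idx[l] * STRIDE] = 1;   /* per-iteration vector multiply */
            idx[l] += VF;             /* per-iteration vector add */
        }
    }
    for (; i < n; i++)                /* scalar remainder */
        a[(long)i * STRIDE] = 1;
}

/* Models the hoped-for loop: the offsets are computed once and only the
   scalar base pointer changes inside the loop. */
void store_with_scalar_base(double *a, int n)
{
    long off[VF];
    double *base = a;
    int i = 0;
    for (int l = 0; l < VF; l++)
        off[l] = (long)l * STRIDE;    /* loop-invariant offsets */
    for (; i + VF <= n; i += VF) {
        for (int l = 0; l < VF; l++)
            base[off[l]] = 1;         /* scatter: scalar base + fixed offsets */
        base += VF * STRIDE;          /* one scalar add per iteration */
    }
    for (; i < n; i++)                /* scalar remainder */
        a[(long)i * STRIDE] = 1;
}
```
Both functions write the same elements as `func`; the second one has no vector multiply or vector add left in the loop body.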
</pre>