<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/56239>56239</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[SVE] Interleave selection problem
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
erickq
</td>
</tr>
</table>
<pre>
I was recently optimizing some software and found that armclang generates faster code than clang for the following program. Comparing the two statically, I found that armclang sets the interleave count to 2, whereas clang sets it to 1.
```
#include <cstdlib>
#include <iostream>
using namespace std;
#define Nx 4
#define Ny 200
#define Nz 200
#define INDEX3D_wDim(i, j, k, dimX, dimY, dimZ) \
((i) * (dimY) * (dimZ) + (j) * (dimZ) + (k))
void test(float *Hx, float *Hy, float *Hz, const float *Ex, const float *Ey,
const float *Ez, const float *cLx, const float *cLy,
const float *cLz, const float *cRx, const float *cRy,
const float *cRz) {
for (int i = 1; i < Nx - 1; ++i) {
for (int j = 1; j < Ny - 1; ++j) {
int ij = INDEX3D_wDim(i, j, 0, Nx, Ny, Nz);
for (int k = 1; k < Nz - 1; ++k) {
int ijk = ij + k;
int i_j1k = ijk + Nz;
int ij_k1 = ijk + Nz + 1;
float dzy = Ez[i_j1k] - Ez[ijk];
float dyz = Ey[ij_k1] - Ey[ijk];
Hx[ijk] = cLx[ijk] * Hx[ijk] + cRx[ijk] * (dyz - dzy);
}
}
}
}
```
With clang, passing `-mllvm -small-loop-cost=26` sets the interleave count to 2. However, the default value of `SmallLoopCost` is 20.
Conversely, when `SmallLoopCost` is set to 20 in armclang, it no longer sets the interleave count to 2. Note that this test case performs worse when the interleave count is 1.
My question: is the default `SmallLoopCost` too conservative? Could the default be raised to 25 or 30?
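
To illustrate why a threshold of 20 versus 26 could flip the result, here is a minimal sketch. It assumes the vectorizer interleaves a small loop by roughly the largest power of two not exceeding `SmallLoopCost / LoopCost`, capped at a target maximum. The helper names, the loop cost of 12, and the cap of 4 are hypothetical values chosen for illustration, not numbers taken from the compiler, and the real heuristic considers more factors than this model does.
```
// Simplified, hypothetical model of small-loop interleaving.
// This is NOT LLVM's actual code; names and constants are assumptions.

// Largest power of two that is <= x (returns 0 when x == 0).
constexpr unsigned powerOf2Floor(unsigned x) {
  unsigned p = 0;
  for (unsigned b = 1; b != 0 && b <= x; b *= 2) p = b;
  return p;
}

// Interleave count for a loop body of cost `loopCost`, given the
// threshold `smallLoopCost` and an assumed target cap `maxIC`.
// Loops at or above the threshold are not interleaved in this model.
constexpr unsigned smallLoopInterleave(unsigned loopCost,
                                       unsigned smallLoopCost,
                                       unsigned maxIC) {
  if (loopCost == 0 || loopCost >= smallLoopCost) return 1;
  unsigned ic = powerOf2Floor(smallLoopCost / loopCost);
  if (ic > maxIC) ic = maxIC;
  return ic < 1 ? 1 : ic;
}

// Assumed loop cost of 12: 20 / 12 == 1, while 26 / 12 == 2.
static_assert(smallLoopInterleave(12, 20, 4) == 1, "default threshold");
static_assert(smallLoopInterleave(12, 26, 4) == 2, "raised threshold");
```
In this model, any loop cost from 11 to 13 yields an interleave count of 1 at a threshold of 20 and 2 at a threshold of 26, which would be consistent with the behaviour described above if the inner loop's cost falls in that range.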
</pre>