<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/65058>65058</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[bug] Different results of big and little endian's loop vectorize testcase on clang-15.x
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Jolyon0202
</td>
</tr>
</table>
<pre>
Here is our testcase:
```
#include &lt;stdio.h&gt;
unsigned char bitmap;
unsigned segNum;
#define MIN(a, b) ((a) <= (b) ? (a) : (b))
__attribute__((noinline)) void tf(unsigned char segnum) {
    segNum = segnum;
    unsigned char seg = MIN(segNum, 4);
    for (unsigned int i = 0; i < seg; i++) {
        bitmap |= (unsigned char)(1UL << i);
    }
}
int main() {
    tf(0);
    printf("tf(0) bitmap: 0x%x\n", bitmap);
    tf(1);
    printf("tf(1) bitmap: 0x%x\n", bitmap);
    tf(2);
    printf("tf(2) bitmap: 0x%x\n", bitmap);
    tf(3);
    printf("tf(3) bitmap: 0x%x\n", bitmap);
    tf(4);
    printf("tf(4) bitmap: 0x%x\n", bitmap);
    tf(5);
    printf("tf(5) bitmap: 0x%x\n", bitmap);
    return 0;
}
```
$clang test.cpp -O2 --target=aarch64-linux-gnu -march=armv8-a && a.out
tf(0) bitmap: 0x0
tf(1) bitmap: 0x1
tf(2) bitmap: 0x3
tf(3) bitmap: 0x7
tf(4) bitmap: 0xf
tf(5) bitmap: 0xf
$clang test.cpp -O2 --target=aarch64_be-linux-gnu -march=armv8-a && a.out
tf(0) bitmap: 0x0
tf(1) bitmap: 0x0
tf(2) bitmap: 0x0
tf(3) bitmap: 0x0
tf(4) bitmap: 0x0
tf(5) bitmap: 0x0
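For reference, the values the testcase should print can be computed without the loop. This is a minimal sketch; `expected_bitmap` is a hypothetical helper that is not part of the testcase, and it relies on `bitmap` only ever accumulating bits across the calls:
```
/* Hypothetical helper: the value bitmap should hold after tf(segnum),
   given that tf is called with non-decreasing arguments as in main(). */
static unsigned char expected_bitmap(unsigned char segnum) {
    unsigned seg = segnum <= 4 ? segnum : 4;   /* same clamp as MIN(segNum, 4) */
    return (unsigned char)((1u << seg) - 1u);  /* bits 0 .. seg-1 set */
}
```
These values match the little-endian output above, so the big-endian output is the incorrect one.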
LLVM trunk does not show this bug, because it vectorizes the loop in the tf function differently. However, the underlying bug is still present in CodeGen; see https://godbolt.org/z/zcGcvezdE
The key point appears to be the two REV32 instructions: if we remove them, we get the correct result.
Alternatively, if we replace the uzp1 with uzp2, the result is also correct.
15.x produces a trunc instruction in IR, which becomes trunc + bitcast in SelectionDAG, and CodeGen then emits XTN and REV32. On little endian the sequence is: trunc -> trunc + bitcast -> XTN.
The instruction selection patterns differ:
BE: def : Pat<(v4i16 (bitconvert (v2i32 FPR64:$src))), (v4i16 (REV32v4i16 FPR64:$src))>;
LE: def : Pat<(v4i16 (bitconvert (v2i32 FPR64:$src))), (v4i16 FPR64:$src)>;
See this commit: https://github.com/llvm/llvm-project/commit/30e0e11eb4d9157e8abeea3f0f9027a00617f2bb
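To illustrate why the BE pattern needs a lane reversal at all: as far as I understand, an IR bitcast between vector types behaves as if the value were stored to memory and loaded back with the new type. The sketch below models that for v2i32 -> v4i16 with an explicit byte order, so it gives the same answer on any host. The function name is hypothetical, and it assumes unsigned int is 32 bits and unsigned short is 16 bits. It is a model of the IR-level semantics only, not a diagnosis of the bug:
```
/* Hypothetical model of "bitcast <2 x i32> to <4 x i16>" as store + load.
   Assumes 32-bit unsigned int and 16-bit unsigned short. */
static void bitcast_v2i32_to_v4i16(const unsigned int in[2],
                                   unsigned short out[4],
                                   int big_endian) {
    unsigned char mem[8];

    /* Store: write each 32-bit lane to memory in the chosen byte order. */
    for (int lane = 0; lane < 2; lane++) {
        for (int b = 0; b < 4; b++) {
            int shift = big_endian ? 8 * (3 - b) : 8 * b;
            mem[4 * lane + b] = (unsigned char)(in[lane] >> shift);
        }
    }

    /* Load: read the same bytes back as four 16-bit lanes. */
    for (int lane = 0; lane < 4; lane++) {
        unsigned int b0 = mem[2 * lane];
        unsigned int b1 = mem[2 * lane + 1];
        out[lane] = (unsigned short)(big_endian ? (b0 << 8) | b1
                                                : (b1 << 8) | b0);
    }
}
```
With the input {1, 2}, the low halves of the 32-bit lanes end up in 16-bit lanes 0 and 2 for little endian, but in lanes 1 and 3 for big endian. This may be related to why swapping uzp1 (even lanes) for uzp2 (odd lanes) changes the result.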
</pre>