<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/121306>121306</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[DAGCombiner, SystemZ] WRONG code
</td>
</tr>
<tr>
<th>Labels</th>
<td>
miscompilation,
llvm:SelectionDAG
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
JonPsson1
</td>
</tr>
</table>
<pre>
A bisect shows `31b7d43 "Extend extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C))"` to be the first bad commit, and the problem also goes away on main if this one commit is reverted.
This simple program is supposed to print 0, as both 'g' and 'b' are 0 throughout the program, so the expression assigning 'cPtr' is always false.
```
int printf(const char *, ...);
long a = 0, b = 0, c = 0, d = 0, e = 0, f = 0, g = 0;
static char Bytes[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
void foo(char) {}
int main() {
  if (a)
    for (;; d++)
      Bytes[0] = 0;
  long *ePtr = &e;
  long *cPtr = &c;
  for (int IV = 2; IV <= 7; IV++) {
    char Arg0 = Bytes[IV + 1] ^ f;
    *cPtr = ((g || (*ePtr = b)) >= Bytes[IV + 1]);
    foo(Arg0);
  }
  printf("%d\n", c);
}
```
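As a sanity check on the expected value, here is a minimal scalar sketch of the loop with 'g' and 'b' fixed at 0. The names `expected_c`, `kBytes` and `kBytesLastZero` are made up for this illustration and are not part of the test case.
```c
/* Same contents as Bytes[] in the test case. */
static const char kBytes[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
/* Hypothetical variant where the last byte reads as 0, which is what the
   miscompiled code effectively compares against. */
static const char kBytesLastZero[9] = {1, 2, 3, 4, 5, 6, 7, 8, 0};

/* Scalar reference for the loop: returns the value left in 'c'. */
static long expected_c(const char *bytes) {
  long g = 0, b = 0, e = 0, c = 0;
  for (int iv = 2; iv <= 7; iv++) {
    /* (g || (e = b)) is always 0 here, so this is 0 >= bytes[iv + 1]. */
    c = ((g || (e = b)) >= bytes[iv + 1]);
  }
  return c;
}
```
With `kBytes` the last comparison is 0 >= 9, so the result is 0. With `kBytesLastZero` it is 0 >= 0, so the result is 1.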
The LoopVectorizer creates a vectorized loop which later optimizations turn into a block of code containing vector (<2 x i8>) operations. The comparison of Bytes[8] is made by first loading it like this:
```
%scevgep = getelementptr i8, ptr @Bytes, i64 %lsr.iv.next
%scevgep26 = getelementptr i8, ptr %scevgep, i64 7
%wide.load.le = load <2 x i8>, ptr %scevgep26, align 1, !tbaa !12
%8 = extractelement <2 x i8> %wide.load.le, i64 1
%conv8.le = zext i8 %8 to i32
```
Before DAGCombiner (Type-legalized selection DAG) the nodes look like this:
```
t77: i32,ch = load<(invariant load (s16) from %ir.scevgep1, align 1, !tbaa !12), anyext from i16> t0, t64, undef:i64
t71: v8i16 = scalar_to_vector t77
t72: v16i8 = bitcast t71
t76: i32 = extract_vector_elt t72, Constant:i32<1>
```
The guilty commit performs this dag combine:
```
Combining: t76: i32 = extract_vector_elt t72, Constant:i32<1>
... into: t99: i32 = srl t77, Constant:i32<16>
```
This seems broken to me: 2 bytes are loaded, containing the final <2 x i8> vector, but then those bytes are immediately shifted out to the right. This would have worked if an i32 had in fact been loaded, but this is an extended i16 load.
If those two bytes are loaded as an integer, the order of the elements is reversed on a big-endian target, so the i32 (an extended i16 load) becomes [0089]. So the correct shift amount should be 0, to keep the last byte in its place.
If the two bytes are loaded as a vector, the order in the vector becomes <..., 9, 8>, where the 8 is element 0. Therefore element 1 contains the last byte in this case.
The result is that instead of the final '9', a '0' from the upper half of the i16->i32 extended register is used, so the comparison becomes 0 >= 0 and '*cPtr' is assigned 1, which is incorrect.
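The byte arithmetic above can be sketched in plain C. This is only a model under stated assumptions, not the DAGCombiner code: the upper 16 bits of the extended load are modeled as zero (an anyext load leaves them unspecified), and `be_shift` is a hypothetical helper showing how treating the scalar as 4 bytes wide instead of 2 would yield the shift of 16 seen in the combine output.
```c
/* The last two bytes of Bytes[], in memory order. */
static const unsigned char kLastTwo[2] = {8, 9};

/* Models an extending i16 load on a big-endian target: mem[0] is the most
   significant byte. Assumption: the upper 16 bits are zero. */
static unsigned load_be16_ext(const unsigned char *mem) {
  return ((unsigned)mem[0] << 8) | (unsigned)mem[1];
}

/* Models trunc(srl(x, shift)) down to an i8 element. */
static unsigned char extract_via_srl(unsigned x, unsigned shift) {
  return (unsigned char)(x >> shift);
}

/* Hypothetical helper: big-endian shift amount for byte element idx when
   the scalar is treated as scalar_bytes wide. */
static unsigned be_shift(unsigned scalar_bytes, unsigned idx) {
  return (scalar_bytes - 1 - idx) * 8;
}
```
Here `load_be16_ext(kLastTwo)` is 0x0809. Extracting element 1 with `be_shift(4, 1)`, i.e. 16, yields 0, while `be_shift(2, 1)`, i.e. 0, yields the expected 9.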
To reproduce:
```
clang -march=z16 -O2 wrong0.i -o a.out -w -mllvm -unroll-count=1 -mllvm -unroll-full-max-count=1
llc -mtriple=s390x-linux-gnu -mcpu=z15 -O3 ./wrong0.ll
```
[tc_dagcombiner_extract_el.tar.gz](https://github.com/user-attachments/files/18271304/tc_dagcombiner_extract_el.tar.gz)
@RKSimon @davemgreen @uweigand
</pre>