<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/61461>61461</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Clang does not consider commutativity when combining vector multiplications
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
aqjune
</td>
</tr>
</table>
<pre>
Given this code:
```
#include <arm_neon.h>
void f (uint64x2_t *__restrict__ y, uint32x2_t x[4]) {
for (int i = 0; i < 4; ++i) {
for (int j = 0; j < 4; ++j) {
y[i * 4 + j] = vmull_u32(x[i], x[j]);
}
}
}
```
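Clang trunk generates 16 umull instructions: https://godbolt.org/z/5qeE8MEG9
This is not optimal because the multiplication is commutative: x[i] * x[j] equals x[j] * x[i], so only 10 of the 16 products are distinct (4 squares plus 6 unordered pairs).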
GCC trunk considers this and generates 10 umull instructions: https://godbolt.org/z/jaxan53Gn
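
For illustration, the redundancy can also be shown at the source level. The following is only a minimal sketch, not what GCC does internally and not a proposed LLVM change. To keep it runnable on any host it does not use arm_neon.h: the types u32x2 and u64x2, the helper mull_u32 (a scalar model of vmull_u32), the counter mul_count, and the function names f_naive and f_symmetric are all hypothetical names introduced here.
```c
#include <stdint.h>

/* Hypothetical portable stand-ins for NEON's uint32x2_t and uint64x2_t. */
typedef struct { uint32_t lane[2]; } u32x2;
typedef struct { uint64_t lane[2]; } u64x2;

/* Counts how many widening multiplies were issued (illustration only). */
static int mul_count = 0;

/* Scalar model of vmull_u32: lane-wise 32x32 -> 64-bit multiply. */
static u64x2 mull_u32(u32x2 a, u32x2 b) {
  u64x2 r;
  r.lane[0] = (uint64_t)a.lane[0] * b.lane[0];
  r.lane[1] = (uint64_t)a.lane[1] * b.lane[1];
  ++mul_count;
  return r;
}

/* Same loop nest as the reported code: one multiply per (i, j) pair. */
void f_naive(u64x2 *y, const u32x2 x[4]) {
  for (int i = 0; i < 4; ++i) {
    for (int j = 0; j < 4; ++j) {
      y[i * 4 + j] = mull_u32(x[i], x[j]);
    }
  }
}

/* Exploits commutativity: compute each unordered pair once, store it twice. */
void f_symmetric(u64x2 *y, const u32x2 x[4]) {
  for (int i = 0; i < 4; ++i) {
    for (int j = i; j < 4; ++j) {
      u64x2 p = mull_u32(x[i], x[j]);
      y[i * 4 + j] = p;
      y[j * 4 + i] = p; /* same slot when i == j, so the second store is harmless */
    }
  }
}
```
With this model, f_naive issues one multiply for each of the 16 (i, j) pairs, while f_symmetric issues one for each pair with j starting at i, which matches the 10 umull instructions reported for GCC above.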
</pre>