<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/62703>62703</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Clang WebAssembly - Wrong optimization with rotates
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
HailToDodongo
</td>
</tr>
</table>
<pre>
I'm currently writing a byte-swap function in C++ targeting WebAssembly, since WASM doesn't have one built in.
I noticed that clang (with -O3) detects a byte swap (in multiple variants) and emits the following function:
```wat
(func (;240;) (type 6) (param i32) (result i32)
local.get 0
i32.const 24
i32.shl
local.get 0
i32.const 65280
i32.and
i32.const 8
i32.shl
i32.or
local.get 0
i32.const 8
i32.shr_u
i32.const 65280
i32.and
local.get 0
i32.const 24
i32.shr_u
i32.or
i32.or)
```
I wanted to try out a different, shorter solution I found that uses rotates:
```wat
(func (export "bswap") (param i32) (result i32)
(i32.or
(local.get 0)
(i32.const 0x00FF00FF)
(i32.and)
(i32.rotr (i32.const 8))
(local.get 0)
(i32.const 0xFF00FF00)
(i32.and)
(i32.rotl (i32.const 8))))
```
However, when I write the C++ version of it:
```c++
inline u32 bswap(u32 x) {
return __builtin_rotateleft32((x & 0xFF00FF00), 8)
| __builtin_rotateright32((x & 0x00FF00FF), 8);
}
```
It will still be replaced with the longer version from before.
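
For reference, here is a minimal sketch showing that the rotate-based form and the shift-and-mask sequence compute the same value, so the rewrite is correct and the concern is only code size. It assumes `u32` is a 32-bit unsigned integer. `rotl32`, `rotr32`, `bswap_rotate` and `bswap_shift` are hypothetical helper names, with the two rotate helpers standing in for the clang builtins:
```c++
// Assumption: u32 is a 32-bit unsigned integer (checked below).
using u32 = unsigned int;
static_assert(sizeof(u32) == 4, "u32 must be 32 bits wide");

// Hypothetical portable stand-ins for the clang rotate builtins; valid for 0 < n < 32.
constexpr u32 rotl32(u32 x, unsigned n) { return (x << n) | (x >> (32 - n)); }
constexpr u32 rotr32(u32 x, unsigned n) { return (x >> n) | (x << (32 - n)); }

// Rotate-based byte swap: two masks, two rotates, one OR.
constexpr u32 bswap_rotate(u32 x) {
    return rotl32(x & 0xFF00FF00u, 8) | rotr32(x & 0x00FF00FFu, 8);
}

// Shift-and-mask form, mirroring the longer wasm sequence instruction by instruction.
constexpr u32 bswap_shift(u32 x) {
    return (x << 24) | ((x & 0xFF00u) << 8) | ((x >> 8) & 0xFF00u) | (x >> 24);
}

// Both forms agree on a sample input.
static_assert(bswap_rotate(0x11223344u) == 0x44332211u, "rotate form swaps bytes");
static_assert(bswap_shift(0x11223344u) == 0x44332211u, "shift form swaps bytes");
```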
This seems to be a general issue with rotates of masked inputs, since a single rotate will also be "optimized" into a way longer version.
For example:
```c++
inline u32 bswap(u32 x) {
return __builtin_rotateleft32((x & 0xFF00FF00), 8);
}
```
Turns into:
```wat
(func (;240;) (type 6) (param i32) (result i32)
local.get 0
i32.const 65280
i32.and
i32.const 8
i32.shl
local.get 0
i32.const 24
i32.shr_u
i32.or)
```
instead of:
```wat
(func (;240;) (type 6) (param i32) (result i32)
local.get 0
i32.const -16711936
i32.and
i32.const 8
i32.rotl)
```
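
As a sanity check, here is a minimal sketch showing that the expanded sequence and the single masked rotate agree, and that `-16711936` is the signed `i32` spelling of `0xFF00FF00`. It assumes `u32` is a 32-bit unsigned integer. `rotl32`, `masked_rotl` and `masked_rotl_expanded` are hypothetical names used only for illustration:
```c++
// Assumption: u32 is a 32-bit unsigned integer (checked below).
using u32 = unsigned int;
static_assert(sizeof(u32) == 4, "u32 must be 32 bits wide");

// Hypothetical portable stand-in for __builtin_rotateleft32; valid for 0 < n < 32.
constexpr u32 rotl32(u32 x, unsigned n) { return (x << n) | (x >> (32 - n)); }

// The source-level form: mask, then rotate left by 8.
constexpr u32 masked_rotl(u32 x) { return rotl32(x & 0xFF00FF00u, 8); }

// The expanded form emitted at -O3: bits 8..15 shift up by 8, bits 24..31 wrap around to 0..7.
constexpr u32 masked_rotl_expanded(u32 x) {
    return ((x & 0xFF00u) << 8) | (x >> 24);
}

// Both forms agree on a sample input.
static_assert(masked_rotl(0x11223344u) == 0x00330011u, "mask then rotate");
static_assert(masked_rotl_expanded(0x11223344u) == 0x00330011u, "expanded form");

// The wasm constant -16711936 has the same bit pattern as 0xFF00FF00.
static_assert(static_cast<u32>(-16711936) == 0xFF00FF00u, "same i32 bit pattern");
```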
If I disable optimizations for this function with `[[clang::optnone]]`, it will keep the rotate.
Is this a bug with the optimizer?
And is there a way to keep the rotate but still optimize the rest of the function?
</pre>