<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/122380">122380</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64] Failure to fold `lsr` or `asr` into `cmp`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:AArch64,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Kmeakin
</td>
</tr>
</table>
<pre>
https://godbolt.org/z/rMMbbfMdW
Instead of performing an `lsr`/`asr` and then comparing the result against zero, the shift can be performed as part of a `cmp` against `xzr`:
```asm
src:
lsr x8, x0, #32
cmp x8, #0
cset w0, ne
ret
tgt:
cmp xzr, x0, lsr #32
cset w0, ne
ret
```
LLVM does perform this fold for shift amounts &lt;= 31, and for `lsl` it finds a different way of doing the comparison in one instruction, using `tst`. It also already performs the fold when comparing against a variable instead of `0`. I guess the fold fails for comparisons against `0` because a comparison against zero can be represented either as `cmp x0, #0` or as `cmp xzr, x0`.
</pre>