<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/82075">82075</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
target emission for ctlz doesn't account for range metadata
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
sadlerap
</td>
</tr>
</table>
<pre>
Consider this IR:
```llvm
define i32 @clz_nzu8(ptr %num) {
start:
%self = load i8, ptr %num, align 1, !range !0
%0 = tail call i8 @llvm.ctlz.i8(i8 %self, i1 true)
%_0 = zext i8 %0 to i32
ret i32 %_0
}
declare i8 @llvm.ctlz.i8(i8, i1 immarg)
!0 = !{ i8 1, i8 0 }
```
`llc --mcpu=x86-64-v3` (enables the `lzcnt` instruction) emits the following x86 assembly:
```asm
clz_nzu8: # @clz_nzu8
movzx eax, byte ptr [rdi]
lzcnt eax, eax
add eax, -24
movzx eax, al
ret
```
If we account for the range metadata (i.e. that `%self` cannot be zero), we could save an instruction:
```asm
clz_nzu8: # @clz_nzu8
movzx eax, byte ptr [rdi]
shl eax, 24
lzcnt eax, eax
ret
```
Or, as IR:
```llvm
define i32 @clz_nzu8(ptr %num) {
%self = load i8, ptr %num, align 1, !range !0
%zext = zext i8 %self to i32
%shl = shl nuw i32 %zext, 24
%ctlz = tail call i32 @llvm.ctlz.i32(i32 %shl, i1 true)
ret i32 %ctlz
}
declare i32 @llvm.ctlz.i32(i32, i1 immarg)
!0 = !{ i8 1, i8 0 }
```
This optimization isn't advantageous when the instructions emitted for `@llvm.ctlz.i*` are independent of the width of the input type, as on x86 without the `lzcnt` instruction. As far as I can tell, this means we should only do this transformation when lowering the intrinsic requires extending the input integer to a wider type (for instance, i8 on x86, or i8 and i16 on AArch64).
[Alive Proof](https://alive2.llvm.org/ce/z/xxCdeg)
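As a quick sanity check on the arithmetic, here is a minimal Python sketch (not LLVM code; the helper names `ctlz`, `current_lowering`, and `proposed_lowering` are made up for illustration). It models the two assembly sequences above and compares them against an 8-bit count-leading-zeros for every nonzero `i8` value, which is the set allowed by the `!range` metadata:

```python
def ctlz(value, bits):
    """Count leading zeros of `value` viewed as an unsigned `bits`-wide integer."""
    assert value in range(2**bits)
    return bits - value.bit_length()


def current_lowering(x):
    # Models: movzx; lzcnt eax, eax; add eax, -24; movzx eax, al
    return (ctlz(x, 32) - 24) % 256


def proposed_lowering(x):
    # Models: movzx; shl eax, 24; lzcnt eax, eax
    return ctlz((x * 2**24) % 2**32, 32)


# Every value permitted by !range !{ i8 1, i8 0 }, i.e. everything except 0.
mismatches = [
    x for x in range(1, 256)
    if not (ctlz(x, 8) == current_lowering(x) == proposed_lowering(x))
]
print("mismatches for nonzero inputs:", mismatches)

# Zero is where the shifted form diverges from an 8-bit count.
print("x = 0:", ctlz(0, 8), "vs", proposed_lowering(0))
```

In this model the two sequences agree for all 255 nonzero inputs and differ only at zero, where the shifted form counts all 32 bits instead of 8.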
</pre>