[llvm] [CGP] Despeculate ctlz/cttz with "illegal" integer types (PR #137197)
Sergei Barannikov via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 24 22:05:59 PDT 2025
================
@@ -285,30 +285,35 @@ define i32 @ctlo_i32_undef(i32 %x) {
ret i32 %tmp2
}
-define i64 @ctlo_i64(i64 %x) {
+define i64 @ctlo_i64(i64 %x) nounwind {
; X86-NOCMOV-LABEL: ctlo_i64:
; X86-NOCMOV: # %bb.0:
+; X86-NOCMOV-NEXT: pushl %esi
----------------
s-barannikov wrote:
The RHS looks bigger because of additional tail duplication (and a spill).
For a zero input it is three instructions shorter, and on the other code paths it is one instruction shorter or the same length (not counting the spill). It also avoids one high-latency(?) `bsr` on all paths.
The only disadvantage I see is that it uses an extra register, but that may not be a big deal when this is inlined into a larger function.
If that doesn't sound convincing, I can play with the heuristic (`isCheapToSpeculateCtlz`) to restore the previous behavior on 32-bit platforms for operands of 64 bits or wider. Just let me know.
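
For readers unfamiliar with the trade-off, here is a rough C++ sketch of the two shapes being compared for a 64-bit count on a 32-bit target. It is only an illustration, not the code CodeGenPrepare emits: the helper names (`clz32_nonzero`, `ctlz64_speculated`, `ctlz64_despeculated`, `ctlo64`) are hypothetical, and the loop stands in for a `bsr`-style scan whose result is undefined for zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 32-bit scan that, like `bsr`, has no
// defined result for a zero input. The caller must guard against zero.
static unsigned clz32_nonzero(uint32_t v) {
    assert(v != 0);
    unsigned n = 0;
    while ((v & 0x80000000u) == 0) {
        v <<= 1;
        ++n;
    }
    return n;
}

// "Speculated" shape: both 32-bit halves are scanned on every path and
// the final result is picked with selects.
static unsigned ctlz64_speculated(uint64_t x) {
    uint32_t hi = static_cast<uint32_t>(x >> 32);
    uint32_t lo = static_cast<uint32_t>(x);
    unsigned from_hi = hi ? clz32_nonzero(hi) : 32u;
    unsigned from_lo = lo ? clz32_nonzero(lo) : 32u;
    return hi ? from_hi : 32u + from_lo;
}

// "Despeculated" shape: branch on zero first, then run at most one scan
// on whichever half decides the answer.
static unsigned ctlz64_despeculated(uint64_t x) {
    if (x == 0)
        return 64;  // zero input: early exit, no scan at all
    uint32_t hi = static_cast<uint32_t>(x >> 32);
    if (hi != 0)
        return clz32_nonzero(hi);
    return 32u + clz32_nonzero(static_cast<uint32_t>(x));
}

// Count leading ones, as in the `ctlo_i64` test: ctlz of the complement.
static unsigned ctlo64(uint64_t x) {
    return ctlz64_despeculated(~x);
}
```

In this sketch the despeculated form runs at most one scan per path, at the cost of extra branches and duplicated return blocks, which is roughly the size-versus-latency trade-off described above.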
https://github.com/llvm/llvm-project/pull/137197