[llvm] [X86] Handle BSF/BSR "zero-input pass through" behaviour (PR #123623)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 23 02:50:40 PST 2025
================
@@ -227,9 +227,8 @@ define i64 @PR89533(<64 x i8> %a0) {
; SSE-NEXT: orl %eax, %edx
; SSE-NEXT: shlq $32, %rdx
; SSE-NEXT: orq %rcx, %rdx
-; SSE-NEXT: bsfq %rdx, %rcx
; SSE-NEXT: movl $64, %eax
-; SSE-NEXT: cmovneq %rcx, %rax
+; SSE-NEXT: rep bsfq %rdx, %rax
----------------
RKSimon wrote:
We use "REP BSF" so that it can be recognized as TZCNT on BMI-capable machines, which is a lot quicker than BSF. Codegen can use this pattern because we no longer have any EFLAGS dependency. On machines without BMI, where it still executes as a plain BSF, we just pay the trivial penalty of the extra MOV.
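
As a rough illustration of why the single `rep bsfq` can replace the old `bsfq` + `cmovneq` pair, here is a small C++ model. It is only a sketch: the function names are hypothetical, and it assumes the behaviour named in the PR title, namely that BSF leaves its destination unchanged for a zero input, and that TZCNT returns the operand width (64) for a zero input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: index of the lowest set bit. Requires x != 0.
inline uint64_t lowest_set_bit_index(uint64_t x) {
  uint64_t n = 0;
  while ((x & 1) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Model of BSF under the assumed "zero-input pass through" behaviour:
// a zero source leaves the destination register untouched.
inline uint64_t model_bsf(uint64_t src, uint64_t dst) {
  return src == 0 ? dst : lowest_set_bit_index(src);
}

// Model of 64-bit TZCNT: a zero source yields the operand width.
inline uint64_t model_tzcnt(uint64_t src) {
  return src == 0 ? 64 : lowest_set_bit_index(src);
}

// Model of the new sequence: movl $64, %eax ; rep bsfq %rdx, %rax.
// has_bmi selects how the CPU interprets the REP-prefixed encoding.
inline uint64_t model_rep_bsf_sequence(uint64_t src, bool has_bmi) {
  uint64_t dst = 64;  // the extra MOV preloads the zero-input result
  return has_bmi ? model_tzcnt(src) : model_bsf(src, dst);
}

// Both interpretations should agree for any input, zero included.
inline void check_agreement(uint64_t src) {
  assert(model_rep_bsf_sequence(src, true) ==
         model_rep_bsf_sequence(src, false));
}
```

In this model the result is the same whether the instruction runs as BSF or as TZCNT, and neither path reads flags, which is what makes the prefix safe to emit here.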
https://github.com/llvm/llvm-project/pull/123623