[llvm] [AMDGPU] Analyze REG_SEQUENCE To Remove Redundant CMP Instructions (PR #167364)

Patrick Simmons via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 10 13:55:48 PST 2025


================
@@ -2701,12 +2697,10 @@ define amdgpu_kernel void @srem_v2i64(ptr addrspace(1) %out, ptr addrspace(1) %i
 ; GCN-NEXT:    s_waitcnt vmcnt(0)
 ; GCN-NEXT:    v_readfirstlane_b32 s11, v5
 ; GCN-NEXT:    v_readfirstlane_b32 s10, v4
-; GCN-NEXT:    s_or_b64 s[6:7], s[10:11], s[8:9]
----------------
linuxrocks123 wrote:

Hi @LU-JOHN, `s6` is guaranteed to be zero, so the transformation appears to rely on the fact that the `s_or_b64 s[6:7]` instruction can set `SCC` to indicate a nonzero result only if `s7` is nonzero. That makes the `s_cmp_lg_u64` redundant, so it is eliminated. I think this is correct.
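
To make the argument concrete, here is a minimal Python sketch of the underlying identity. It is not the patch's implementation. The helper names mirror the instruction mnemonics but are hypothetical, and modelling the high half as the result of a 32-bit OR is an assumption of the sketch. It relies only on the scalar OR setting `SCC` to 1 when its result is nonzero.

```python
MASK32 = 0xFFFFFFFF


def s_or_b32(a, b):
    """Hypothetical model of a 32-bit scalar OR: returns (result, SCC)."""
    result = (a | b) & MASK32
    return result, int(result != 0)  # SCC = 1 when the result is nonzero


def reg_sequence(lo, hi):
    """Hypothetical model of REG_SEQUENCE: build a 64-bit value from two halves."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def s_cmp_lg_u64(a, b):
    """Hypothetical model of the 64-bit 'not equal' compare: returns SCC."""
    return int(a != b)


def compare_is_redundant(a_hi, b_hi):
    """True when comparing {lo = 0, hi} against 0 yields the SCC the OR already set."""
    hi, scc_from_or = s_or_b32(a_hi, b_hi)
    value = reg_sequence(0, hi)  # low half known to be zero (assumption from the comment)
    return s_cmp_lg_u64(value, 0) == scc_from_or


# Spot-check a few boundary values for the two high-half inputs.
samples = [0, 1, 0x80000000, 0xFFFFFFFF, 0x12345678]
assert all(compare_is_redundant(a, b) for a in samples for b in samples)
```

The zero low half is what makes this work: with a nonzero low half the 64-bit compare could report "not equal" even when the high half is zero, and the compare would no longer be redundant.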

Here is a comparison of master versus the branch in meld:
<img width="1234" height="449" alt="meld" src="https://github.com/user-attachments/assets/4e10ecaf-73ce-4bb9-88dd-8aed5ab69d47" />


https://github.com/llvm/llvm-project/pull/167364

