[llvm] [AMDGPU] si-peephole-sdwa: Fix cndmask vcc use for wave32 (PR #139541)
Frederik Harwath via llvm-commits
llvm-commits at lists.llvm.org
Tue May 13 01:14:44 PDT 2025
================
@@ -0,0 +1,40 @@
+; RUN: llc %s -march=amdgcn -mcpu=gfx1030 -o - 2>&1 | FileCheck %s
+
+; In this test, V_CNDMASK_B32_e64 gets converted to V_CNDMASK_B32_e32,
+; but the expected conversion to SDWA does not occur. This led to a
+; compilation error, because the use of $vcc in the resulting
+; instruction must be fixed to $vcc_lo for wave32 which only happened
+; after the full conversion to SDWA.
+
+
+; CHECK-NOT: {{.*}}V_CNDMASK_B32_e32{{.*}}$vcc
+; CHECK-NOT: {{.*}}Bad machine code: Virtual register defs don't dominate all uses
+; CHECK: {{.*}}v_cndmask_b32_e32{{.*}}vcc_lo
+
+define amdgpu_kernel void @quux(i32 %arg, i1 %arg1, i1 %arg2) {
----------------
frederik-h wrote:
This went through llvm-reduce. I cannot reduce it significantly by hand either. In particular, the control flow seems to be necessary. It doesn't need to be a kernel.
https://github.com/llvm/llvm-project/pull/139541