[PATCH] D124450: [AMDGPU] Remove hasOneUse check from scalar select pattern

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 29 07:16:00 PDT 2022


foad added a comment.

> The should be able to use SReg_32 which we do handle

I can get that to work with a patch like this: https://reviews.llvm.org/differential/diff/426045/

I'm not too happy that I had to change InstrEmitter::EmitCopyFromReg. It makes me wonder if we should be handling uniform compare+select patterns much more like a flags-based CPU does, either by gluing the s_cmp to the s_cselect, or using ISD::SELECT_CC instead of ISD::SELECT in the first place, so that it is all in one DAG node.

Also, the codegen is not particularly pretty, but maybe it can be cleaned up by tweaking SIFixSGPRCopies, which for some reason has converted the s_cmp to a v_cmp but has not converted the following s_cselect to a v_cndmask:

  diff --git a/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll b/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll
  index bff4c0c1533a..6c99d04f6410 100644
  --- a/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll
  +++ b/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll
  @@ -18,7 +18,9 @@ define i32 @f() {
   ; CHECK-NEXT:    v_cmp_ne_u32_e32 vcc_lo, 0, v0
   ; CHECK-NEXT:    s_cmpk_lg_u32 vcc_lo, 0x0
   ; CHECK-NEXT:    s_subb_u32 s4, 1, 0
  -; CHECK-NEXT:    v_cndmask_b32_e64 v0, 0, s4, vcc_lo
  +; CHECK-NEXT:    s_and_b32 s5, vcc_lo, exec_lo
  +; CHECK-NEXT:    s_cselect_b32 s4, s4, 0
  +; CHECK-NEXT:    v_mov_b32_e32 v0, s4
   ; CHECK-NEXT:    s_setpc_b64 s[30:31]
   bb:
     %i = load i32, i32 addrspace(3)* null, align 16
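
For readers comparing the two sequences in the diff, here is a minimal sketch of how they relate. It is a toy wave32 model and not LLVM code: the `Wave` struct and all function names are hypothetical, and the instruction semantics are my reading of the ISA (v_cndmask picks per lane from the vcc bit; s_and_b32 sets SCC when its result is nonzero; s_cselect_b32 picks its first operand when SCC is set).

```cpp
#include <cstdint>

// Hypothetical wave32 model: bit i of a mask corresponds to lane i.
struct Wave {
  uint32_t exec;  // active-lane mask (exec_lo)
  uint32_t vcc;   // per-lane compare result (vcc_lo)
};

// Old sequence: v_cndmask_b32_e64 v0, 0, s4, vcc_lo
// Each lane picks s4 if its own vcc bit is set, otherwise 0.
uint32_t vCndmaskLane(const Wave &w, unsigned lane, uint32_t s4) {
  return ((w.vcc >> lane) & 1u) ? s4 : 0u;
}

// New sequence: s_and_b32 s5, vcc_lo, exec_lo
//               s_cselect_b32 s4, s4, 0
//               v_mov_b32_e32 v0, s4
// One scalar decision is made and the result is copied to every lane.
uint32_t sCselectLane(const Wave &w, uint32_t s4) {
  bool scc = (w.vcc & w.exec) != 0;  // SCC = (s_and_b32 result != 0)
  return scc ? s4 : 0u;
}

// Returns true if both sequences produce the same value on every active lane.
bool sequencesAgree(const Wave &w, uint32_t s4) {
  for (unsigned lane = 0; lane < 32; ++lane) {
    if (!((w.exec >> lane) & 1u))
      continue;  // inactive lanes are not written, so they are ignored
    if (vCndmaskLane(w, lane, s4) != sCselectLane(w, s4))
      return false;
  }
  return true;
}
```

Under this model the two sequences agree whenever the compare result is uniform across the active lanes, which is the case the scalar select pattern is meant for. They can differ if the vcc bits diverge between active lanes.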


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124450/new/

https://reviews.llvm.org/D124450


