[PATCH] D144715: [AMDGPU] Use `S_BFE_U64` for uniform i1-i64 ext

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 24 03:55:24 PST 2023


foad added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/saddo.ll:32
 ; SI-NEXT:    s_xor_b64 s[4:5], s[6:7], vcc
-; SI-NEXT:    v_cndmask_b32_e64 v0, 0, 1, s[4:5]
-; SI-NEXT:    v_mov_b32_e32 v1, s11
-; SI-NEXT:    v_add_i32_e32 v0, vcc, s10, v0
-; SI-NEXT:    v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; SI-NEXT:    s_bfe_u64 s[4:5], s[4:5], 0x10000
+; SI-NEXT:    s_add_u32 s4, s10, s4
----------------
I'm not sure this is correct. The old code treated s[4:5] like a divergent boolean, with a bit for each *active* lane. The new code assumes the boolean value is in bit 0 - but will that work if lane 0 is not active?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144715/new/

https://reviews.llvm.org/D144715



More information about the llvm-commits mailing list