[PATCH] D144715: [AMDGPU] Use `S_BFE_U64` for uniform i1-i64 ext
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 24 04:23:05 PST 2023
Pierre-vh added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/saddo.ll:32
; SI-NEXT: s_xor_b64 s[4:5], s[6:7], vcc
-; SI-NEXT: v_cndmask_b32_e64 v0, 0, 1, s[4:5]
-; SI-NEXT: v_mov_b32_e32 v1, s11
-; SI-NEXT: v_add_i32_e32 v0, vcc, s10, v0
-; SI-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; SI-NEXT: s_bfe_u64 s[4:5], s[4:5], 0x10000
+; SI-NEXT: s_add_u32 s4, s10, s4
----------------
foad wrote:
> I'm not sure this is correct. The old code treated s[4:5] like a divergent boolean, with a bit for each *active* lane. The new code assumes the boolean value is in bit 0 - but will that work if lane 0 is not active?
Very good question; I have a bit of trouble following what V_CNDMASK does exactly in this case.
v0 is the destination, but what do the 0/1/s[4:5] correspond to?
This function doesn't seem to select with GlobalISel, so I can't compare against that.
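To make the question concrete, here is a rough Python model of the two instructions in the diff above. It is only a sketch under stated assumptions, not a description of what the backend guarantees: the helper names, the dict-per-lane result, and the example EXEC value are all hypothetical, and the model assumes the lane mask holds 0 for inactive lanes. `v_cndmask_b32` picks `src1` or `src0` per active lane based on that lane's bit in the mask, while `s_bfe_u64` with `0x10000` (offset 0 in bits [5:0], width 1 in bits [22:16]) only looks at bit 0.

```python
MASK64 = (1 << 64) - 1


def v_cndmask_b32(src0, src1, mask, exec_mask, wave_size=64):
    """Per-lane select: each active lane reads its own bit of `mask`.

    Returns {lane: value} for active lanes only (hypothetical representation).
    """
    result = {}
    for lane in range(wave_size):
        if (exec_mask >> lane) & 1:
            result[lane] = src1 if (mask >> lane) & 1 else src0
    return result


def s_bfe_u64(src0, src1):
    """Scalar unsigned bitfield extract.

    Offset is taken from src1[5:0], width from src1[22:16].
    """
    offset = src1 & 0x3F
    width = (src1 >> 16) & 0x7F
    if width == 0:
        return 0
    return ((src0 & MASK64) >> offset) & ((1 << width) - 1) & MASK64


# Assumed scenario: lane 0 inactive, condition true in lanes 1..3.
exec_mask = 0b1110
cond_mask = 0b1110  # assumption: inactive lanes hold 0 in the mask

per_lane = v_cndmask_b32(0, 1, cond_mask, exec_mask)
scalar = s_bfe_u64(cond_mask, 0x10000)
```

In this model the active lanes all see 1 from the select, while the bitfield extract yields 0 because bit 0 belongs to the inactive lane. Whether that situation can actually arise for a uniform i1 here is exactly the open question.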
================
Comment at: llvm/test/CodeGen/AMDGPU/usubo.ll:19
+; SI-NEXT: v_cmp_gt_u64_e32 vcc, s[0:1], v[0:1]
+; SI-NEXT: s_bfe_u64 s[6:7], vcc, 0x10000
+; SI-NEXT: s_add_u32 s6, s0, s6
----------------
arsenm wrote:
> Pierre-vh wrote:
> > This looks a bit like a regression but I'm not sure how to address it. The pattern comes from `zext (setcc)`.
> > I thought about adding a PatFrag that doesn't accept setcc operands to zext but it feels hacky.
> > Thoughts?
> Before this was a 32-bit select, so I assume this was a zext to i32 so I don't see why this zext to i64 change matters. What was the DAG here?
It was a zext to i64:
```
t41: i1 = setcc t39, t51, setugt:ch
t30: i64 = zero_extend t41
t31: i64 = add t39, t30
t37: v2i32 = bitcast t31
```
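For readers less familiar with DAG dumps, the value computed by `t41`/`t30`/`t31` can be sketched in Python as below. This is an illustrative model only: the function name is hypothetical, the node names are reused as variables for readability, and the final `bitcast` to `v2i32` is left out.

```python
MASK64 = (1 << 64) - 1


def zext_setcc_add(t39, t51):
    """Model of: add t39, (zero_extend (setcc t39, t51, setugt))."""
    t41 = (t39 & MASK64) > (t51 & MASK64)  # setcc setugt, unsigned compare -> i1
    t30 = 1 if t41 else 0                  # zero_extend i1 -> i64
    t31 = (t39 + t30) & MASK64             # i64 add, wraps on overflow
    return t31
```

The point of the sketch is that the i1 is widened to i64 before the add, so the pattern being matched is a 64-bit `zext (setcc)` rather than a 32-bit one.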
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144715/new/
https://reviews.llvm.org/D144715