[PATCH] D144715: [AMDGPU] Use `S_BFE_U64` for uniform i1-i64 ext

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 24 04:03:40 PST 2023


arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/saddo.ll:36
+; SI-NEXT:    v_mov_b32_e32 v0, s4
+; SI-NEXT:    v_mov_b32_e32 v1, s5
 ; SI-NEXT:    buffer_store_dwordx2 v[0:1], off, s[0:3], 0
----------------
Pierre-vh wrote:
> Not sure if it's a regression here. Yes there's one more instruction, but we're using more scalar instructions so isn't it beneficial in the end?
It's one more instruction because of the copy to VGPR for the store. We probably should have a rematerialize-as-VALU optimization to handle this kind of case.


================
Comment at: llvm/test/CodeGen/AMDGPU/usubo.ll:19
+; SI-NEXT:    v_cmp_gt_u64_e32 vcc, s[0:1], v[0:1]
+; SI-NEXT:    s_bfe_u64 s[6:7], vcc, 0x10000
+; SI-NEXT:    s_add_u32 s6, s0, s6
----------------
Pierre-vh wrote:
> This looks a bit like a regression but I'm not sure how to address it. The pattern comes from `zext (setcc)`.
> I thought about adding a PatFrag that doesn't accept setcc operands to zext but it feels hacky.
> Thoughts?
Before this was a 32-bit select, so I assume this was a zext to i32 so I don't see why this zext to i64 change matters. What was the DAG here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144715/new/

https://reviews.llvm.org/D144715



More information about the llvm-commits mailing list