[PATCH] D62100: [DAGCombine][X86][AMDGPU][AArch64] (srl (shl x, c1), c2) with c1 != c2 handling
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat May 18 15:29:21 PDT 2019
arsenm added inline comments.
================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.ubfe.ll:686-687
; SI-NEXT: s_waitcnt vmcnt(0)
-; SI-NEXT: v_lshlrev_b32_e32 v0, 31, v0
-; SI-NEXT: v_lshrrev_b32_e32 v0, 1, v0
+; SI-NEXT: v_lshlrev_b32_e32 v0, 30, v0
+; SI-NEXT: v_and_b32_e32 v0, 2.0, v0
; SI-NEXT: buffer_store_dword v0, off, s[4:7], 0
----------------
arsenm wrote:
> lebedev.ri wrote:
> > @arsenm will AMDGPU prefer 2 shifts or shift+mask here?
> In this particular case, they're the same. In general, two shifts are probably better, because the mask value is less likely to be an inline immediate. It seems like we have a BFE matching problem, though.
For 64-bit, shift and mask would be better.
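
For readers following the thread, here is a minimal sketch of the equivalence under discussion, modelled on plain 32-bit integers in Python. It is not the DAGCombine code from the patch. The helper names (shl_srl, shift_and_mask, f32_bits) are hypothetical and exist only for illustration. The sketch also shows why the mask in the hunk above prints as 2.0: the float 2.0 has the bit pattern 0x40000000. The general mask for c2 = 1 would be 0x7FFFFFFF; the narrower 0x40000000 is what remains once the bits known to be zero are dropped, which appears to be what happens in this test.

```python
import struct

MASK32 = 0xFFFFFFFF  # model a 32-bit register


def shl_srl(x, c1, c2):
    """(srl (shl x, c1), c2) on a 32-bit value: the two-shift form."""
    return ((x << c1) & MASK32) >> c2


def shift_and_mask(x, c1, c2):
    """One shift by the difference of the amounts, then an AND.

    The mask clears the top c2 bits, which the logical right shift
    in the two-shift form would have filled with zeros.
    """
    mask = MASK32 >> c2
    if c1 >= c2:
        return ((x << (c1 - c2)) & MASK32) & mask
    return ((x & MASK32) >> (c2 - c1)) & mask


def f32_bits(value):
    """Bit pattern of an IEEE-754 single-precision float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


# The case from the test: shl 31 then srl 1 leaves bit 0 of x in bit 30.
x = 0xDEADBEEF
two_shifts = shl_srl(x, 31, 1)
one_shift = shift_and_mask(x, 31, 1)

# After shl 30 only bits 30 and 31 can be set, so AND with 0x40000000
# gives the same result as AND with the general mask 0x7FFFFFFF.
narrow_mask = ((x << 30) & MASK32) & f32_bits(2.0)
```

Whether the mask is an inline immediate depends on the shift amounts, which is the concern raised in the quoted comment: 0x40000000 happens to be encodable as 2.0, while most masks of the form 0xFFFFFFFF >> c2 are not.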
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D62100/new/
https://reviews.llvm.org/D62100