[PATCH] D62100: [DAGCombine][X86][AMDGPU][AArch64] (srl (shl x, c1), c2) with c1 != c2 handling

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat May 18 15:29:21 PDT 2019


arsenm added inline comments.


================
Comment at: test/CodeGen/AMDGPU/llvm.amdgcn.ubfe.ll:686-687
 ; SI-NEXT:    s_waitcnt vmcnt(0)
-; SI-NEXT:    v_lshlrev_b32_e32 v0, 31, v0
-; SI-NEXT:    v_lshrrev_b32_e32 v0, 1, v0
+; SI-NEXT:    v_lshlrev_b32_e32 v0, 30, v0
+; SI-NEXT:    v_and_b32_e32 v0, 2.0, v0
 ; SI-NEXT:    buffer_store_dword v0, off, s[4:7], 0
----------------
arsenm wrote:
> lebedev.ri wrote:
> > @arsenm will AMDGPU prefer 2 shifts or shift+mask here?
> In this particular case, they're the same. In general 2 shifts is probably better. The mask value is less likely to be an inline immediate. It seems like we have a BFE matching problem though
For 64-bit, shift and mask would be better


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D62100/new/

https://reviews.llvm.org/D62100





More information about the llvm-commits mailing list