[PATCH] D55380: [AMDGPU] Shrink scalar AND, OR, XOR instructions

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 6 18:47:40 PST 2018


arsenm added inline comments.


================
Comment at: test/CodeGen/AMDGPU/andorbitset.ll:15
+; GCN: s_or_b32 s{{[0-9]+}}, s{{[0-9]+}}, 0x80000000
+; SI: s_bitset1_b32 s{{[0-9]+}}, 31
+define amdgpu_kernel void @s_set_msb(i32 addrspace(1)* %out, i32 %in) {
----------------
grahamsellers wrote:
> arsenm wrote:
> > Why only SI?
> I'm not sure what's going on, but if you supply only -march=amdgcn and no -mcpu, then it appears the register allocator doesn't end up allocating the source and destination of the s_or_b32 instruction to the same register, even though we use setRegAllocationHint. That part of the code was modeled on a similar chunk for S_MULK_I32 earlier in the function. Interestingly, the s_mulk.ll test has no RUN line that omits the CPU. I'll do the same.
You should generally just pick an explicit target with -mcpu. The default CPU is something approximating Tahiti, but it isn't quite the same.
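
As a rough sketch of what pinning the target could look like for this test (not the actual contents of andorbitset.ll): the RUN line, the choice of -mcpu=tahiti, the single SI check prefix, and the function body below are all assumptions for illustration. Only the function signature and the s_bitset1_b32 check pattern come from the diff above, and whether the check passes depends on the patch being applied.

```llvm
; Sketch only: the RUN line, the SI prefix, and the function body are
; illustrative assumptions, not copied from the patch.
; RUN: llc -march=amdgcn -mcpu=tahiti -verify-machineinstrs < %s | FileCheck -check-prefix=SI %s

; SI-LABEL: {{^}}s_set_msb:
; SI: s_bitset1_b32 s{{[0-9]+}}, 31
define amdgpu_kernel void @s_set_msb(i32 addrspace(1)* %out, i32 %in) {
  ; -2147483648 is 0x80000000 as an i32, so the OR sets only bit 31.
  %x = or i32 %in, -2147483648
  store i32 %x, i32 addrspace(1)* %out
  ret void
}
```

With an explicit CPU on every RUN line, the checks no longer depend on whatever the default subtarget happens to be. The typed pointer syntax (i32 addrspace(1)*) matches the diff; newer LLVM releases would spell it ptr addrspace(1).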


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55380/new/

https://reviews.llvm.org/D55380




