[PATCH] D55380: [AMDGPU] Shrink scalar AND, OR, XOR instructions

Thu Dec 6 13:22:53 PST 2018

grahamsellers marked 4 inline comments as done.
grahamsellers added inline comments.

================
Comment at: test/CodeGen/AMDGPU/andorbitset.ll:15
+; GCN: s_or_b32 s{{[0-9]+}}, s{{[0-9]+}}, 0x80000000
+; SI: s_bitset1_b32 s{{[0-9]+}}, 31
+define amdgpu_kernel void @s_set_msb(i32 addrspace(1)* %out, i32 %in) {
----------------
arsenm wrote:
> Why only SI?
I'm not sure what's going on, but if you supply only -march=amdgcn and no -mcpu, the register allocator doesn't appear to assign the source and destination of the s_or_b32 instruction to the same register, even though we use setRegAllocationHint. That part of the code was modeled on a similar chunk for S_MULK_I32 earlier in the function. Interestingly, every RUN line in the s_mulk.ll test specifies the CPU. I'll do the same.
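
For anyone following along, here is a rough standalone sketch of the decision being tested. It is not the code in the patch. The names (ScalarOp, Shrunk, shrinkToBitset, dstIsSrc) are made up for illustration, the AND case is assumed to map to s_bitset0_b32, and the sketch ignores inline-constant checks. The point is that s_bitset1_b32 modifies its destination in place, so the shrink is only possible when the OR's source and destination land in the same register, which is why the test output depends on what the register allocator does.

```cpp
#include <cstdint>

// Hypothetical names throughout; this only illustrates the idea.
enum class ScalarOp { And, Or };
enum class BitsetOp { Bitset0, Bitset1 };

struct Shrunk {
  bool ok;       // true if the instruction can be shrunk
  BitsetOp op;   // which bitset form to use (valid only if ok)
  unsigned bit;  // bit index operand (valid only if ok)
};

// True if exactly one bit of v is set.
inline bool isPowerOf2(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Index of the highest set bit; for a power of two, its only set bit.
inline unsigned bitIndex(uint32_t v) {
  unsigned n = 0;
  while (v >>= 1)
    ++n;
  return n;
}

// OR with a single-bit immediate sets that bit; AND with an immediate that
// has a single zero bit clears that bit. Both bitset forms update the
// destination in place, so they need dst == src (the dstIsSrc assumption).
inline Shrunk shrinkToBitset(ScalarOp op, uint32_t imm, bool dstIsSrc) {
  Shrunk no = {false, BitsetOp::Bitset1, 0};
  if (!dstIsSrc)
    return no;
  if (op == ScalarOp::Or && isPowerOf2(imm))
    return Shrunk{true, BitsetOp::Bitset1, bitIndex(imm)};
  if (op == ScalarOp::And && isPowerOf2(~imm))
    return Shrunk{true, BitsetOp::Bitset0, bitIndex(~imm)};
  return no;
}
```

With imm = 0x80000000 and matching registers this yields the bitset-1 form with bit 31, as in the SI check line. With mismatched registers it yields nothing, which corresponds to the s_or_b32 in the GCN check line.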

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55380/new/

https://reviews.llvm.org/D55380