[PATCH] D55380: [AMDGPU] Shrink scalar AND, OR, XOR instructions
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 6 18:47:40 PST 2018
arsenm added inline comments.
================
Comment at: test/CodeGen/AMDGPU/andorbitset.ll:15
+; GCN: s_or_b32 s{{[0-9]+}}, s{{[0-9]+}}, 0x80000000
+; SI: s_bitset1_b32 s{{[0-9]+}}, 31
+define amdgpu_kernel void @s_set_msb(i32 addrspace(1)* %out, i32 %in) {
----------------
grahamsellers wrote:
> arsenm wrote:
> > Why only SI?
> I'm not sure what's going on, but if you supply only -march=amdgcn and no -mcpu, the register allocator doesn't appear to allocate the source and destination of the s_or_b32 instruction to the same register, even though we use setRegAllocationHint. That part of the code was modeled on a similar chunk for S_MULK_I32 earlier in the function. Interestingly, every RUN line in the s_mulk.ll test specifies the CPU. I'll do the same.
You should generally just pick an explicit target. The default CPU approximates Tahiti but isn't quite the same.
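
For reference, the two check lines in the hunk above describe the same result: OR-ing with 0x80000000 sets only bit 31, which is what s_bitset1_b32 with operand 31 does. A minimal Python sketch of that equivalence follows. The helper names (single_bit_index, or_imm, bitset1) are hypothetical and are not taken from the patch or from LLVM; this only models the 32-bit arithmetic, not the actual shrinking code.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit scalar registers


def single_bit_index(imm, width=32):
    """Return the index of the only set bit in imm, or None if imm
    is not a single-bit mask (hypothetical helper, not LLVM code)."""
    imm &= (1 << width) - 1
    if imm == 0 or imm & (imm - 1):
        return None
    return imm.bit_length() - 1


def or_imm(src, imm):
    """Model of a 32-bit OR with an immediate operand."""
    return (src | imm) & MASK32


def bitset1(src, bit):
    """Model of setting a single bit by index in a 32-bit value."""
    return (src | (1 << bit)) & MASK32


bit = single_bit_index(0x80000000)
assert bit == 31
# Both forms give the same value for any 32-bit source.
for src in (0x0, 0x1234, 0x7FFFFFFF, 0xFFFFFFFF):
    assert or_imm(src, 0x80000000) == bitset1(src, bit)
```

The bitset form only applies when the immediate has exactly one bit set, which is why the sketch returns None for any other mask.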
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55380/new/
https://reviews.llvm.org/D55380