[PATCH] D30416: [BitfieldShrinking] Shrink Bitfields load/store when the bitfields are legal to access independently
Wei Mi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 21 16:31:58 PDT 2017
wmi added inline comments.
================
Comment at: include/llvm/Target/TargetLowering.h:1908
+ virtual bool isNarrowingExpensive(EVT /*VT1*/, EVT /*VT2*/) const {
+ return true;
+ }
----------------
efriedma wrote:
> I'm not sure I see the point of this hook. Every in-tree target has cheap i8 load/store and aligned i16 load/store operations. And we have existing hooks to check support for misaligned operations.
>
> If there's some case I'm not thinking of, please add an example to the comment.
It is because some testcase for amdgpu. Like the testcase below:
define void @s_sext_in_reg_i1_i16(i16 addrspace(1)* %out, i32 addrspace(2)* %ptr) #0 {
%ld = load i32, i32 addrspace(2)* %ptr
%in = trunc i32 %ld to i16
%shl = shl i16 %in, 15
%sext = ashr i16 %shl, 15
store i16 %sext, i16 addrspace(1)* %out
ret void
}
code with the patch:
s_load_dwordx2 s[4:5], s[0:1], 0x9
s_load_dwordx2 s[0:1], s[0:1], 0xb
s_mov_b32 s7, 0xf000
s_mov_b32 s6, -1
s_mov_b32 s2, s6
s_mov_b32 s3, s7
s_waitcnt lgkmcnt(0)
buffer_load_ushort v0, off, s[0:3], 0
s_waitcnt vmcnt(0)
v_bfe_i32 v0, v0, 0, 1
buffer_store_short v0, off, s[4:7], 0
s_endpgm
code without the patch:
s_load_dwordx2 s[4:5], s[0:1], 0x9
s_load_dwordx2 s[0:1], s[0:1], 0xb
s_mov_b32 s7, 0xf000
s_mov_b32 s6, -1
s_waitcnt lgkmcnt(0)
s_load_dword s0, s[0:1], 0x0
s_waitcnt lgkmcnt(0)
s_bfe_i32 s0, s0, 0x10000
v_mov_b32_e32 v0, s0
buffer_store_short v0, off, s[4:7], 0
s_endpgm
amdgpu codegen chooses to use buffer_load_short instead of s_load_dword and generates longer code sequence. I know almost nothing about amdgpu so I simply add the hook and only focus on the architectures I am more faimiliar with before the patch becomes in better shape and stable.
================
Comment at: lib/CodeGen/MemAccessShrinking.cpp:1014
+ else
+ CandidatesToErase.insert(OpI);
+ }
----------------
efriedma wrote:
> If an instruction has no uses and isn't trivially dead, we're never going to erase it; no point to adding it to CandidatesToErase.
An instruction which is not trivially dead for now may become dead after other instructions in CandidatesToErase are removed. That is why I want to add it to CandidatesToErase.
================
Comment at: lib/CodeGen/MemAccessShrinking.cpp:1027
+ } while (!DeadInsts.empty());
+ }
+ return Changed;
----------------
efriedma wrote:
> Should we clear CandidatesToErase here, as opposed to modifying it inside the loop?
Ah, right. Actually, I shouldn't use range based loop since the iterator will be invalidated after insertion and deletion.
Repository:
rL LLVM
https://reviews.llvm.org/D30416
More information about the llvm-commits
mailing list