[PATCH] D18162: AMDGPU: Add SIWholeQuadMode pass
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 17 09:16:26 PDT 2016
nhaehnle updated this revision to Diff 50943.
nhaehnle added a comment.
[This time with the correct --update parameter for arc]
Use isSchedulingBoundary instead of an implicit use of EXEC, which gets rid of
the target-independent modifications.
This is indeed more conservative, as you can tell from the change in
si-scheduler.ll: previously, the later scheduling passes managed to move the
initial s_wqm_b64 after the s_load_dwordx4 and s_load_dwordx8, i.e. with this
version we lose a little latency hiding.
The impact should be small, and it does make sense to land a more conservative
and robust patch initially.
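To illustrate why a scheduling boundary is more conservative than a plain
dependency, here is a rough standalone sketch. It is a toy model, not the code
in the patch: ToyInstr and splitIntoRegions are hypothetical names, and the
IsBoundary flag merely stands in for a true result from isSchedulingBoundary.
The assumption is that the scheduler only reorders instructions within a
region, and that a boundary instruction stays in place and belongs to no
region.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine instruction; this is not an LLVM type.
struct ToyInstr {
  std::string Name;
  bool IsBoundary; // models a true result from isSchedulingBoundary
};

// Split a block into the regions a scheduler could reorder independently.
// Boundary instructions are not placed in any region, so nothing can be
// moved across them in either direction.
std::vector<std::vector<std::string>>
splitIntoRegions(const std::vector<ToyInstr> &Block) {
  std::vector<std::vector<std::string>> Regions;
  std::vector<std::string> Current;
  for (const ToyInstr &I : Block) {
    if (I.IsBoundary) {
      // Close the region that ends just before the boundary.
      if (!Current.empty())
        Regions.push_back(Current);
      Current.clear();
      continue;
    }
    Current.push_back(I.Name);
  }
  // Close the trailing region, if any.
  if (!Current.empty())
    Regions.push_back(Current);
  return Regions;
}
```

With s_wqm_b64 marked as a boundary at the start of a block, the following
loads all end up in a region that begins after it, so in this model the
scheduler cannot sink s_wqm_b64 below them to hide their latency.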
http://reviews.llvm.org/D18162
Files:
lib/CodeGen/ProcessImplicitDefs.cpp
lib/CodeGen/TwoAddressInstructionPass.cpp
lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
lib/Target/AMDGPU/AMDGPUInstrInfo.h
lib/Target/AMDGPU/SIWholeQuadMode.cpp
test/CodeGen/AMDGPU/si-scheduler.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D18162.50943.patch
Type: text/x-patch
Size: 7311 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160317/0228ee19/attachment.bin>