[PATCH] D35967: [AMDGPU] Collapse adjacent SI_END_CF
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 27 16:43:04 PDT 2017
rampitec created this revision.
Herald added subscribers: t-tye, tpr, dstuttard, yaxunl, mgorny, nhaehnle, wdng, kzhuravl.
Add a pass to remove redundant S_OR_B64 instructions enabling lanes in
the exec. If two SI_END_CF (lowered as S_OR_B64) come together without any
vector instructions between them we can only keep outer SI_END_CF, given
that CFG is structured and exec bits of the outer end statement are always
not less than exec bit of the inner one.
This needs to be done before the RA to eliminate saved exec bits registers
but after register coalescer to have no vector registers copies in between
of different end cf statements.
https://reviews.llvm.org/D35967
Files:
lib/Target/AMDGPU/AMDGPU.h
lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
lib/Target/AMDGPU/CMakeLists.txt
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp
test/CodeGen/AMDGPU/collapse-endcf.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D35967.108550.patch
Type: text/x-patch
Size: 15722 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170727/c76eded6/attachment.bin>
More information about the llvm-commits
mailing list