[PATCH] D54164: [AMDGPU] Optimize S_CBRANCH_VCC[N]Z -> S_CBRANCH_EXEC[N]Z
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 6 12:14:32 PST 2018
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/SIInsertSkips.cpp:341
+ bool ReadsCond = false;
+ for (++A ; A != B ; ++A) {
+ if (A->modifiesRegister(ExecReg, TRI))
----------------
This should probably have something to avoid pointlessly scanning through the whole block. Maybe add a 4-instruction limit? The only instructions you're looking for should only really ever be at the very end.
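
To make the suggestion concrete, here is a minimal standalone sketch of a bounded backward scan. It is a toy model, not the patch's code: `ScanInstr`, `findCondDef`, and the `MaxScan` parameter are hypothetical names, and the two flags only stand in for queries such as `modifiesRegister(ExecReg, TRI)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct ScanInstr {
  bool WritesExec;  // models A->modifiesRegister(ExecReg, TRI)
  bool DefinesCond; // models the instruction that defines the branch condition
};

// Walks backwards from the branch at BranchIdx and inspects at most MaxScan
// instructions. Returns the index of the instruction defining the condition,
// or -1 if it is not found within the limit or exec is written in between.
int findCondDef(const std::vector<ScanInstr> &Block, std::size_t BranchIdx,
                unsigned MaxScan = 4) {
  unsigned Scanned = 0;
  for (std::size_t I = BranchIdx; I > 0 && Scanned < MaxScan; ++Scanned) {
    --I;
    const ScanInstr &MI = Block[I];
    if (MI.DefinesCond)
      return static_cast<int>(I);
    if (MI.WritesExec)
      return -1; // exec changed between the definition and the branch
  }
  return -1; // gave up: limit reached or start of block
}
```

The limit of 4 is just the value floated above; the point is that the cost stays constant per branch instead of growing with the block size.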
================
Comment at: lib/Target/AMDGPU/SIInsertSkips.cpp:390
+
+ if (!ReadsCond && A->registerDefIsDead(AMDGPU::SCC) &&
+ MI.killsRegister(CondReg, TRI))
----------------
Is this registerDefIsDead really what you need? I would expect to need to use LivePhysRegs and check if it's live out (which would also avoid dependence on dead flags)
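
A rough standalone sketch of the liveness check I have in mind, under the assumption that registers can be modelled as plain strings. `LiveInstr` and `isLiveAfter` are hypothetical names for illustration; in the pass itself the equivalent would be built on `LivePhysRegs` (`addLiveOuts` followed by `stepBackward` over the instructions after the definition).

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction; not an LLVM type.
struct LiveInstr {
  std::vector<std::string> Defs; // registers written
  std::vector<std::string> Uses; // registers read
};

// Returns true if Reg is live immediately after Block[Idx]. Liveness is
// computed by starting from the block's live-out set and stepping backwards
// over every instruction that follows Block[Idx], so no dead flags are needed.
bool isLiveAfter(const std::vector<LiveInstr> &Block, std::size_t Idx,
                 const std::set<std::string> &LiveOuts,
                 const std::string &Reg) {
  std::set<std::string> Live = LiveOuts;
  for (std::size_t I = Block.size(); I > Idx + 1;) {
    --I;
    // Backward step: a definition kills the register, a use makes it live.
    for (const std::string &D : Block[I].Defs)
      Live.erase(D);
    for (const std::string &U : Block[I].Uses)
      Live.insert(U);
  }
  return Live.count(Reg) != 0;
}
```

With this shape, the rewrite would only be allowed when the query for SCC after the defining instruction returns false, which also covers the case where SCC is live out of the block.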
================
Comment at: test/CodeGen/AMDGPU/insert-skip-from-vcc.mir:20
+ S_CBRANCH_VCCZ %bb.1, implicit killed $vcc
+ S_ENDPGM
+...
----------------
The S_ENDPGMs at the end are very weird
================
Comment at: test/CodeGen/AMDGPU/insert-skip-from-vcc.mir:302
+ S_ENDPGM
+...
----------------
Needs a case with scc live out
https://reviews.llvm.org/D54164