[PATCH] D41292: [AMDGPU] Fixed incorrect uniform branch condition
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 15 08:19:11 PST 2017
arsenm added a comment.
The extra clearing when moving a branch on scc was a leftover that should have been removed in r286134. I also don't understand why it would be needed in that case, since the moved condition is going to be a V_CMP* that does the right thing for inactive lanes.
I think this is backwards. We shouldn't be looking at the already lowered control flow instructions and trying to insert semantically required instructions. This pass is specifically for expanding the exec mask modifications, not general control flow. It could perhaps be renamed. I think you should partially revert r286134 unless the condition source is known to be a compare.
An optimisation pass should then try to eliminate the unnecessary ANDs, perhaps SIOptimizeExecMaskingPreRA.
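To illustrate the inactive-lane point, here is a toy C++ model of a 64-lane wavefront. It is only a sketch, not LLVM code: `LaneMask`, `vectorCompareLT` and `maskWithExec` are hypothetical names, and it assumes (as the comment above implies) that a V_CMP-style compare leaves the result bits of inactive lanes clear. Under that assumption, and-ing a compare result with exec is a no-op, while a condition mask from any other source may need the and.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: one bit per lane of a 64-wide wavefront.
using LaneMask = std::uint64_t;

// Models a V_CMP-style "less than": a lane contributes a result bit only
// if it is active in exec, so inactive lanes always read back as 0.
// (Assumption taken from the review comment, not from LLVM source.)
LaneMask vectorCompareLT(const int (&a)[64], const int (&b)[64],
                         LaneMask exec) {
  LaneMask result = 0;
  for (int lane = 0; lane < 64; ++lane) {
    const bool active = (exec >> lane) & 1;
    if (active && a[lane] < b[lane])
      result |= LaneMask(1) << lane;
  }
  return result;
}

// Models the explicit "condition & exec" that is required when the
// condition mask may carry set bits for inactive lanes.
LaneMask maskWithExec(LaneMask cond, LaneMask exec) {
  return cond & exec;
}
```

In this model, `maskWithExec(vectorCompareLT(a, b, exec), exec)` always equals the compare result itself, which is the kind of redundant and a later optimisation pass could remove. A mask that did not come from a compare has no such guarantee.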
https://reviews.llvm.org/D41292