[all-commits] [llvm/llvm-project] 50b5b3: [AMDGPU] Iterate LoweredEndCf in the reverse order
Christudasan Devadasan via All-commits
all-commits at lists.llvm.org
Wed Jan 5 21:28:00 PST 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 50b5b367c1ae72be5265f81b4dba03b3deb0c4e4
https://github.com/llvm/llvm-project/commit/50b5b367c1ae72be5265f81b4dba03b3deb0c4e4
Author: Christudasan Devadasan <Christudasan.Devadasan at amd.com>
Date: 2022-01-06 (Thu, 06 Jan 2022)
Changed paths:
M llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
M llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
Log Message:
-----------
[AMDGPU] Iterate LoweredEndCf in the reverse order
The function that optimizes the placement of exec mask
restore operations by combining blocks currently
visits the lowered END_CF pseudos in the forward
direction, because it iterates the SetVector in
insertion order.
Due to the absence of BranchFolding at -O0, the
irregularly placed BBs cause the forward traversal
to incorrectly place two unconditional branches in
certain BBs while combining them, especially when
an intervening block gets optimized away in a
subsequent iteration.
This is avoided by iterating the SetVector in reverse.
The blocks at the bottom of a function are then
optimized before those at the top.
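
The change in traversal order described above can be illustrated with a
minimal, self-contained sketch. It is not the code from
SILowerControlFlow.cpp: OrderedSet is a hypothetical stand-in for an
insertion-ordered set such as llvm::SetVector, the block names are made up,
and it assumes the entries were inserted roughly top to bottom. It shows only
the visiting order, not the block-combining optimization itself.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set (in the spirit of
// llvm::SetVector): duplicates are rejected, iteration follows insertion order.
class OrderedSet {
public:
  // Returns true if Name was newly inserted, false if it was already present.
  bool insert(const std::string &Name) {
    if (!Seen.insert(Name).second)
      return false; // Duplicate: the original position is kept.
    Order.push_back(Name);
    return true;
  }

  auto begin() const { return Order.begin(); }
  auto end() const { return Order.end(); }
  auto rbegin() const { return Order.rbegin(); }
  auto rend() const { return Order.rend(); }

private:
  std::vector<std::string> Order;       // Insertion order.
  std::unordered_set<std::string> Seen; // Membership test.
};

// Forward traversal: entries are visited in the order they were inserted.
std::vector<std::string> forwardVisit(const OrderedSet &S) {
  return std::vector<std::string>(S.begin(), S.end());
}

// Reverse traversal: the most recently inserted entries are visited first.
std::vector<std::string> reverseVisit(const OrderedSet &S) {
  return std::vector<std::string>(S.rbegin(), S.rend());
}

// Small usage check with made-up block names.
inline void demoTraversalOrder() {
  OrderedSet Blocks;
  Blocks.insert("bb.1");
  Blocks.insert("bb.4");
  Blocks.insert("bb.7");
  assert(forwardVisit(Blocks).front() == "bb.1");
  assert(reverseVisit(Blocks).front() == "bb.7");
}
```

Under the stated assumption about insertion order, the reverse traversal
reaches the entry for the bottom-most block first, which matches the order
the log message describes.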
Fixes: SWDEV-315215
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D116273