[all-commits] [llvm/llvm-project] 50b5b3: [AMDGPU] Iterate LoweredEndCf in the reverse order

Wed Jan 5 21:28:00 PST 2022

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 50b5b367c1ae72be5265f81b4dba03b3deb0c4e4
      https://github.com/llvm/llvm-project/commit/50b5b367c1ae72be5265f81b4dba03b3deb0c4e4
  Author: Christudasan Devadasan <Christudasan.Devadasan at amd.com>
  Date:   2022-01-06 (Thu, 06 Jan 2022)

  Changed paths:
    M llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
    M llvm/test/CodeGen/AMDGPU/collapse-endcf.mir

  Log Message:
  -----------
  [AMDGPU] Iterate LoweredEndCf in the reverse order

The function that combines blocks to optimally
insert the exec mask restore operations currently
visits the lowered END_CF pseudos in forward order,
because it iterates the SetVector in the order in
which the entries were inserted.

Because BranchFolding does not run at -O0, the
basic blocks are irregularly placed, and the forward
traversal can incorrectly place two unconditional
branches in certain blocks while combining them,
especially when an intervening block is optimized
away in a later iteration.

This is avoided by iterating the SetVector in
reverse, so that the blocks at the bottom of a
function are optimized before those at the top.

Fixes: SWDEV-315215

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D116273
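
The traversal-order change described in the log message can be illustrated with a small standalone sketch. It is not the code from SILowerControlFlow.cpp: OrderedSet is a hypothetical stand-in for llvm::SetVector (unique elements kept in insertion order), the block names bb.1, bb.4 and bb.7 are made up, and the sketch only shows the two visiting orders, not the block-combining logic itself.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm::SetVector: stores each element once and
// remembers the order in which elements were first inserted.
class OrderedSet {
  std::vector<std::string> Order;
  std::unordered_set<std::string> Seen;

public:
  using const_iterator = std::vector<std::string>::const_iterator;
  using const_reverse_iterator =
      std::vector<std::string>::const_reverse_iterator;

  // Returns true if the element was newly inserted, false if already present.
  bool insert(const std::string &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    return true;
  }

  const_iterator begin() const { return Order.begin(); }
  const_iterator end() const { return Order.end(); }
  const_reverse_iterator rbegin() const { return Order.rbegin(); }
  const_reverse_iterator rend() const { return Order.rend(); }
};

// Visit entries in insertion order (the traversal used before the change).
std::vector<std::string> forwardOrder(const OrderedSet &S) {
  return std::vector<std::string>(S.begin(), S.end());
}

// Visit entries in reverse insertion order (the traversal after the change),
// so the most recently recorded entries are handled first.
std::vector<std::string> reverseOrder(const OrderedSet &S) {
  return std::vector<std::string>(S.rbegin(), S.rend());
}

// Small self-check with made-up block names standing in for the blocks that
// hold lowered END_CF pseudos.
void demo() {
  OrderedSet LoweredEndCf;
  LoweredEndCf.insert("bb.1");
  LoweredEndCf.insert("bb.4");
  LoweredEndCf.insert("bb.7");

  assert(forwardOrder(LoweredEndCf).front() == "bb.1");
  assert(reverseOrder(LoweredEndCf).front() == "bb.7");
}
```

With entries recorded top to bottom, the reverse traversal reaches the last-recorded block first, which matches the log message's description of optimizing the bottom of the function before the top.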