[PATCH] D116273: [AMDGPU] Iterate LoweredEndCf in the reverse order

Fri Dec 24 13:54:02 PST 2021

cdevadas created this revision.
cdevadas added reviewers: arsenm, rampitec.
Herald added subscribers: foad, kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl.
cdevadas requested review of this revision.
Herald added subscribers: llvm-commits, wdng.
Herald added a project: LLVM.

The function that combines blocks to optimally
insert the exec mask restore operations currently
visits the lowered END_CF pseudos in the forward
direction, because it iterates the SetVector in
insertion order.

Because BranchFolding does not run at -O0, the
BBs are irregularly placed, and the forward
traversal can incorrectly place two unconditional
branches in certain BBs while combining them,
especially when an intervening block gets
optimized away in a later iteration.

This can be avoided by iterating the SetVector in
reverse order, so that the blocks at the bottom of
a function are optimized before those at the top.
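
For illustration only, the difference between the two
visiting orders can be sketched in standalone C++. This
is not the code from the patch: OrderedSet is a
hypothetical stand-in for an insertion-ordered set such
as the SetVector, and the block names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set: entries are unique
// and iteration follows the order in which they were inserted.
class OrderedSet {
public:
  using const_iterator = std::vector<std::string>::const_iterator;
  using const_reverse_iterator =
      std::vector<std::string>::const_reverse_iterator;

  // Inserts Value if it is not already present; returns true on insertion.
  bool insert(const std::string &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }

  std::size_t size() const { return Order.size(); }
  const_iterator begin() const { return Order.begin(); }
  const_iterator end() const { return Order.end(); }
  const_reverse_iterator rbegin() const { return Order.rbegin(); }
  const_reverse_iterator rend() const { return Order.rend(); }

private:
  std::vector<std::string> Order;       // insertion order
  std::unordered_set<std::string> Seen; // uniqueness check
};

// Visits entries in insertion order (top of the function first).
std::vector<std::string> visitForward(const OrderedSet &S) {
  return std::vector<std::string>(S.begin(), S.end());
}

// Visits entries in reverse insertion order (bottom of the function first).
std::vector<std::string> visitReverse(const OrderedSet &S) {
  std::vector<std::string> Visited(S.rbegin(), S.rend());
  // Sanity check: every entry is visited exactly once.
  assert(Visited.size() == S.size());
  return Visited;
}
```

If entries are recorded top to bottom, visitReverse
handles the last recorded block first, which is the
ordering the description above relies on.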

Fixes: SWDEV-315215

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D116273

Files:
  llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
  llvm/test/CodeGen/AMDGPU/collapse-endcf.mir

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D116273.396180.patch
Type: text/x-patch
Size: 8864 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20211224/201c7087/attachment-0001.bin>