[PATCH] D145329: AMDGPU: Always split blocks for si_end_cf

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 5 18:16:41 PST 2023


critson added a comment.

Unfortunately this interferes with WQM mode change insertion.
You can see this in the reordered s_or + s_and instruction pairs.
I guess this was always a risk with block splitting.
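
To illustrate why the order of such a pair matters, here is a minimal sketch on a hypothetical 4-lane mask. The mask values, the helper names, and the reading of s_or as "restore saved lanes" and s_and as "apply the live mask" are assumptions for illustration only; this is not code from the patch or from the WQM pass.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 4-lane model; real exec masks are 32 or 64 bits wide.
using Mask = std::uint8_t;

// Models "s_or exec, exec, saved": re-enable the lanes saved earlier.
constexpr Mask restoreExec(Mask exec, Mask saved) {
  return static_cast<Mask>(exec | saved);
}

// Models "s_and exec, exec, live": restrict exec to the live lanes.
constexpr Mask applyLiveMask(Mask exec, Mask live) {
  return static_cast<Mask>(exec & live);
}

// s_or first, then s_and: restored lanes are still filtered by the live mask.
constexpr Mask orThenAnd(Mask exec, Mask saved, Mask live) {
  return applyLiveMask(restoreExec(exec, saved), live);
}

// s_and first, then s_or: restored lanes bypass the live mask.
constexpr Mask andThenOr(Mask exec, Mask saved, Mask live) {
  return restoreExec(applyLiveMask(exec, live), saved);
}

// Small usage example with illustrative mask values.
inline void demoOrderMatters() {
  const Mask exec = 0x3, saved = 0xC, live = 0x6;
  assert(orThenAnd(exec, saved, live) == 0x6);
  assert(andThenOr(exec, saved, live) == 0xE);
}
```

With these values the two orders leave different lanes enabled, so swapping the pair is not a harmless scheduling change.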

It seems we need to modify the WQM pass to handle terminators that modify exec.



================
Comment at: llvm/test/CodeGen/AMDGPU/collapse-endcf.ll:264
+; GCN-O0-NEXT:  .LBB1_5: ; %bb.inner.end
+; GCN-O0-NEXT:    v_readlane_b32 s0, v1, 4
+; GCN-O0-NEXT:    v_readlane_b32 s1, v1, 5
----------------
Does this reordering fix the bug mentioned in the description?
(The exec mask is now restored before the buffer_load, rather than after.)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145329/new/

https://reviews.llvm.org/D145329


