[PATCH] D149348: RFD: Do not CSE convergent calls in different basic blocks
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 27 07:51:07 PDT 2023
foad added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/cse-convergent.ll:37
+; GCN-NEXT: s_or_saveexec_b32 s5, -1
+; GCN-NEXT: v_mov_b32_dpp v2, v3 row_xmask:1 row_mask:0xf bank_mask:0xf
+; GCN-NEXT: s_mov_b32 exec_lo, s5
----------------
This is the effect of the fix: we repeat the DPP subgroup operation over a reduced set of lanes, instead of reusing the result of the first DPP subgroup operation over all lanes.
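To illustrate why reusing the first result would be wrong, here is a minimal toy model in Python. It is only a sketch: `subgroup_sum`, the four-lane wave, and the even/odd branch condition are hypothetical stand-ins chosen for illustration, and they do not model the actual DPP `row_xmask` semantics or the IR in the test. The point is only that a convergent operation's result depends on which lanes are active when it executes.

```python
def subgroup_sum(values, active):
    """Toy convergent operation (hypothetical): each active lane receives
    the sum of the values held by all currently active lanes."""
    total = sum(v for v, on in zip(values, active) if on)
    return [total if on else None for on in active]


# Hypothetical 4-lane wave; each lane holds one value.
values = [1, 2, 3, 4]
all_lanes = [True] * len(values)

# Entry block: the operation runs with every lane active.
first = subgroup_sum(values, all_lanes)

# Divergent branch: only lanes holding an even value enter it.
branch_lanes = [v % 2 == 0 for v in values]

# Correct behaviour: re-execute the operation over the reduced lane set.
recomputed = subgroup_sum(values, branch_lanes)

# What CSE across basic blocks would effectively do: reuse the entry-block
# result inside the branch.
reused = [r if on else None for r, on in zip(first, branch_lanes)]

print("recomputed:", recomputed)
print("reused:    ", reused)
```

In this model the two lists disagree on every lane that enters the branch, which is the same kind of mismatch the repeated DPP sequence in the test output avoids.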
================
Comment at: llvm/test/Transforms/SimplifyCFG/convergent.ll:85
; SINK-NEXT: [[CMP_NOT:%.*]] = icmp eq i32 [[REM]], 0
+; SINK-NEXT: [[IDXPROM4:%.*]] = zext i32 [[TMP0]] to i64
+; SINK-NEXT: [[ARRAYIDX5:%.*]] = getelementptr inbounds i32, ptr [[Y_COERCE:%.*]], i64 [[IDXPROM4]]
----------------
This is a completely accidental hoisting improvement due to https://reviews.llvm.org/D129370#inline-1442432. Convergent calls in the "then" and "else" branches are now treated as not identical, which weirdly allows *more* hoisting than when they were considered identical.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149348/new/
https://reviews.llvm.org/D149348