[PATCH] D154205: [MachineLICM] Handle subloops
JinGu Kang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 10 08:36:56 PDT 2023
jaykang10 updated this revision to Diff 538657.
jaykang10 added subscribers: sunfish, arsenm.
jaykang10 added a comment.
Herald added subscribers: wangpc, pmatos, asb, kerbowa, aheejin, jgravelle-google, sbc100, jvesely, dschuff.
Updated test files.
For the AMDGPU target, the `SIOptimizeExecMaskingPreRA` pass fails to remove some MIR instructions after they are hoisted.
`CodeGen/AMDGPU/agpr-copy-no-free-registers.ll` has the following loops at the MIR level.
Loop at depth 1 containing: %bb.1<header>,%bb.3,%bb.5,%bb.6,%bb.7,%bb.8,%bb.11,%bb.12,%bb.4,%bb.2,%bb.9<latch><exiting>
Loop at depth 2 containing: %bb.5<header>,%bb.6,%bb.7,%bb.8,%bb.11<latch><exiting>
With this patch, the following MIR instructions are hoisted out of the inner loop, from bb.5 to bb.3.
%155:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %13:sreg_64_xexec, implicit $exec
%258:sreg_64_xexec = V_CMP_NE_U32_e64 %155:vgpr_32, %90:sreg_32, implicit $ex
After that, the `SIOptimizeExecMaskingPreRA` pass fails to optimize these MIR instructions the way it did in the original code, so the output has more instructions with this patch. I have not checked the pass in detail, but I guess it could handle this case. The other AMDGPU regressions have the same issue.
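To make the hoisting decision concrete, here is a minimal Python sketch of a toy model, not the MachineLICM implementation. The block lists come from the loop dump above. The `Loop` class, the `hoist_target` helper, the outer preheader name `bb.0`, the operand definition blocks used in the usage note, and the reading of bb.3 as the inner loop's preheader are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    header: str
    preheader: str      # assumed; the loop dump does not name preheaders
    blocks: frozenset


# Block sets copied from the loop dump; preheaders are assumptions.
INNER = Loop("bb.5", "bb.3",
             frozenset({"bb.5", "bb.6", "bb.7", "bb.8", "bb.11"}))
OUTER = Loop("bb.1", "bb.0",  # "bb.0" is a hypothetical outer preheader
             frozenset({"bb.1", "bb.3", "bb.5", "bb.6", "bb.7", "bb.8",
                        "bb.11", "bb.12", "bb.4", "bb.2", "bb.9"}))


def hoist_target(operand_def_blocks, enclosing_loops):
    """Return the preheader of the outermost loop in which every operand
    is defined outside the loop, or None if the instruction is not even
    invariant in the innermost loop.

    enclosing_loops must be ordered innermost first.
    """
    target = None
    for loop in enclosing_loops:
        if any(block in loop.blocks for block in operand_def_blocks):
            break  # an operand is defined inside this loop: stop here
        target = loop.preheader
    return target
```

In this model, an instruction in bb.5 whose operand is defined in, say, bb.2 (inside the outer loop but outside the inner one) gets `hoist_target(["bb.2"], [INNER, OUTER]) == "bb.3"`. It can leave the inner loop but not the outer one, which is the kind of subloop case discussed here.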
The relevant comment in `SIOptimizeExecMaskingPreRA`:
// Optimize sequence
// %sel = V_CNDMASK_B32_e64 0, 1, %cc
// %cmp = V_CMP_NE_U32 1, %sel
// $vcc = S_AND_B64 $exec, %cmp
// S_CBRANCH_VCC[N]Z
// =>
// $vcc = S_ANDN2_B64 $exec, %cc
// S_CBRANCH_VCC[N]Z
@arsenm If this change causes any problems for the AMDGPU target, please let me know.
For the WebAssembly target, in `CodeGen/WebAssembly/reg-stackify.ll`, the following MIR instruction is hoisted to the inner loop's preheader, which looks OK.
%3:fr64 = ADDSDrr %1:fr64(tied-def 0), %28:fr64, implicit $mxcsr
@sunfish If this change causes any problems for the WebAssembly target, please let me know.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154205/new/
https://reviews.llvm.org/D154205
Files:
llvm/lib/CodeGen/MachineLICM.cpp
llvm/test/CodeGen/AArch64/machine-licm-sub-loop.ll
llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
llvm/test/CodeGen/AMDGPU/optimize-negated-cond.ll
llvm/test/CodeGen/AMDGPU/tuple-allocation-failure.ll
llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll
llvm/test/CodeGen/WebAssembly/reg-stackify.ll