[clang] [llvm] [AMDGPU][SIInsertWaitcnt] Implement Waitcnt Expansion for Profiling (PR #169345)
Jay Foad via cfe-commits
cfe-commits at lists.llvm.org
Tue Jan 13 07:57:32 PST 2026
================
@@ -2013,13 +2092,55 @@ bool WaitcntGeneratorGFX12Plus::applyPreexistingWaitcnt(
 /// Generate S_WAIT_*CNT instructions for any required counters in \p Wait
 bool WaitcntGeneratorGFX12Plus::createNewWaitcnt(
     MachineBasicBlock &Block, MachineBasicBlock::instr_iterator It,
-    AMDGPU::Waitcnt Wait) {
+    AMDGPU::Waitcnt Wait, WaitcntBrackets *ScoreBrackets) {
   assert(ST);
   assert(!isNormalMode(MaxCounter));
   bool Modified = false;
   const DebugLoc &DL = Block.findDebugLoc(It);
+  // Helper to emit expanded waitcnt sequence for profiling.
+  auto EmitExpandedWaitcnt = [&](unsigned Outstanding, unsigned Target,
+                                 auto EmitWaitcnt) {
+    if (Outstanding > Target) {
----------------
jayfoad wrote:
You should not need this "if". The loop should work in all cases.
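
As an illustration of the point, here is a minimal standalone sketch, not the PR's actual code. It assumes the expansion emits one wait per counter value, counting down from Outstanding - 1 to Target. The names emitExpandedWaitcnt and expandedCounts are hypothetical, and the callback only records values instead of building S_WAIT_*CNT instructions. With the loop condition written as I > Target, the body runs zero times whenever Outstanding <= Target, so the surrounding "if" adds nothing.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the lambda in the diff. Emit is called once per
// counter value; here it just records the value rather than creating a
// machine instruction.
template <typename EmitFn>
void emitExpandedWaitcnt(unsigned Outstanding, unsigned Target, EmitFn Emit) {
  // The condition I > Target already covers Outstanding <= Target: the loop
  // body is skipped entirely, so no guarding "if" is required. Passing I - 1
  // inside the body (where I >= 1 is guaranteed) avoids unsigned wraparound
  // when Outstanding is 0.
  for (unsigned I = Outstanding; I > Target; --I)
    Emit(I - 1);
}

// Assumed helper for illustration: collect the emitted counter values.
std::vector<unsigned> expandedCounts(unsigned Outstanding, unsigned Target) {
  std::vector<unsigned> Counts;
  emitExpandedWaitcnt(Outstanding, Target,
                      [&](unsigned Count) { Counts.push_back(Count); });
  return Counts;
}
```

Under these assumptions, expandedCounts(3, 0) records 2, 1, 0, while expandedCounts(2, 2) and expandedCounts(0, 1) record nothing, all without a separate guard.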
https://github.com/llvm/llvm-project/pull/169345