[PATCH] D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 1 06:24:08 PDT 2022
nhaehnle added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1629
+ if (Block.getFirstTerminator() == Block.end() &&
+ isPreheaderToFlush(Block, ScoreBrackets))
----------------
bsaleil wrote:
> foad wrote:
> > It is a shame that you have to implement this in two places, for blocks with and without terminators. I'm not sure if there is a better way. Maybe generateWaitcntInstBefore could be changed to take an iterator (which is allowed to be `end()`) instead of MI, so you would not need the new function generateWaitcntBlockEnd. But that would be quite invasive.
> Yes, unfortunately I also think changing that would be too pervasive. generateWaitcntInstBefore relies a lot on the fact that MI is a valid instruction.
generateWaitcntInstBefore has two distinct halves: the first half determines the counts to be waited for based on MI, and the second half (I would say starting at the comment `// Early-out if no wait is indicated.`) is agnostic to *how* the counts were obtained.
It seems to me that it could be fairly natural to split up the function and use the bottom half of it on both paths.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D115747/new/
https://reviews.llvm.org/D115747
More information about the llvm-commits
mailing list