[PATCH] D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 1 06:24:08 PDT 2022


nhaehnle added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1629
 
+  if (Block.getFirstTerminator() == Block.end() &&
+      isPreheaderToFlush(Block, ScoreBrackets))
----------------
bsaleil wrote:
> foad wrote:
> > It is a shame that you have to implement this in two places, for blocks with and without terminators. I'm not sure if there is a better way. Maybe generateWaitcntInstBefore could be changed to take an iterator (which is allowed to be `end()`) instead of MI, so you would not need the new function generateWaitcntBlockEnd. But that would be quite invasive.
> Yes, unfortunately I also think changing that would be too pervasive. generateWaitcntInstBefore relies a lot on the fact that MI is a valid instruction.
generateWaitcntInstBefore has two distinct halves: the first half determines the counts to be waited for based on MI, and the second half (I would say starting at the comment `// Early-out if no wait is indicated.`) is agnostic to *how* the counts were obtained.

It seems to me that it would be fairly natural to split the function at that point and use its second half on both paths.
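
As a rough illustration of that split, here is a minimal, self-contained sketch. It does not use the real LLVM types or the actual code in SIInsertWaitcnts.cpp: `Instr`, `Waitcnt`, `Block`, `determineWait`, `emitWait`, `waitBefore` and `flushAtBlockEnd` are all hypothetical stand-ins, and the `NeedsVmFlush` flag is an invented placeholder for whatever the first half inspects on MI. The point is only that the second half takes an insertion position (which may be the end of the block) plus an already-computed wait, so both paths can share it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not the real LLVM classes.
struct Instr {
  std::string Opcode;
  bool NeedsVmFlush; // invented property that the "first half" inspects
};

struct Waitcnt {
  static constexpr unsigned NoWait = ~0u;
  unsigned VmCnt = NoWait;
  bool hasWait() const { return VmCnt != NoWait; }
};

using Block = std::vector<Instr>;

// First half: decide which counts to wait for. Requires a valid instruction.
Waitcnt determineWait(const Instr &MI) {
  Waitcnt Wait;
  if (MI.NeedsVmFlush)
    Wait.VmCnt = 0;
  return Wait;
}

// Second half: agnostic to how Wait was obtained.
// Pos is an insertion index and is allowed to equal BB.size() (block end).
bool emitWait(Block &BB, std::size_t Pos, const Waitcnt &Wait) {
  assert(Pos <= BB.size() && "insertion point out of range");
  // Early-out if no wait is indicated.
  if (!Wait.hasWait())
    return false;
  BB.insert(BB.begin() + static_cast<std::ptrdiff_t>(Pos),
            Instr{"s_waitcnt vmcnt(" + std::to_string(Wait.VmCnt) + ")", false});
  return true;
}

// Path 1: wait in front of an existing instruction (both halves).
bool waitBefore(Block &BB, std::size_t Pos) {
  assert(Pos < BB.size() && "need a valid instruction");
  return emitWait(BB, Pos, determineWait(BB[Pos]));
}

// Path 2: block without a terminator; only the second half is needed.
bool flushAtBlockEnd(Block &BB) {
  Waitcnt Wait;
  Wait.VmCnt = 0;
  return emitWait(BB, BB.size(), Wait);
}
```

With this shape, the preheader flush for a block without a terminator would not need its own copy of the emission logic; it would only construct the wait and hand it to the shared second half.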


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115747/new/

https://reviews.llvm.org/D115747


