[PATCH] D115747: [AMDGPU] Flush the vmcnt counter in loop preheader when necessary

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 8 03:05:42 PDT 2022


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:831-832
   MachineInstr *WaitcntVsCntInstr = nullptr;
-  for (auto II = OldWaitcntInstr.getIterator(), NextI = std::next(II);
-       &*II != MI; II = NextI, ++NextI) {
+  for (auto II = OldWaitcntInstr.getIterator(), NextI = std::next(II); II != It;
+       II = NextI, ++NextI) {
     if (II->isMetaInstruction())
----------------
Not related to your patch, but this could probably be something like:
```
  for (auto &I : make_early_inc_range(make_range(OldWaitcntInstr, It)))
```


================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1629
 
+  if (Block.getFirstTerminator() == Block.end() &&
+      isPreheaderToFlush(Block, ScoreBrackets))
----------------
nhaehnle wrote:
> bsaleil wrote:
> > nhaehnle wrote:
> > > bsaleil wrote:
> > > > foad wrote:
> > > > > It is a shame that you have to implement this in two places, for blocks with and without terminators. I'm not sure if there is a better way. Maybe generateWaitcntInstBefore could be changed to take an iterator (which is allowed to be `end()`) instead of MI, so you would not need the new function generateWaitcntBlockEnd. But that would be quite invasive.
> > > > Yes, unfortunately I also think changing that would be too pervasive. generateWaitcntInstBefore relies a lot on the fact that MI is a valid instruction.
> > > generateWaitcntInstBefore has two distinct halves: the first half determines the counts to be waited for based on MI, and the second half (I would say starting at the comment `// Early-out if no wait is indicated.`) is agnostic to *how* the counts were obtained.
> > > 
> > > It seems to me that it could be fairly natural to split up the function and use the bottom half of it on both paths.
> > I can't find the comment `// Early-out if no wait is indicated` in that function, but I assume the second part you mention starts from `if (OldWaitcntInstr) { ...`.
> > To extract that second half into a separate function, I'd need to pass an iterator for the call to `applyPreexistingWaitcnt` and the two calls to `BuildMI`. The problem is that I cannot really pass an iterator here because it may be invalidated by all these calls. I guess I could still use a lambda or something to retrieve the iterator each time we need it but I think this may be a bit overkill.
> I'm afraid I don't quite follow. MachineInstruction iterators aren't invalidated by BuildMI?
I think it would be OK to leave this cleanup for a follow-up NFC patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D115747/new/

https://reviews.llvm.org/D115747



More information about the llvm-commits mailing list