[PATCH] D124671: [AMDGPU] Do not raise wave priority beyond a specific number of VALU instructions.

Wed May 18 07:56:23 PDT 2022

kosarev added a comment.

In D124671#3521606 <https://reviews.llvm.org/D124671#3521606>, @foad wrote:

> I'm not sure how to review this. Can you explain why this heuristic makes sense intuitively? Or do you have any benchmarks to back it up? Or preferably both? :)

Right, that's the hardest part of it, isn't it. Because we don't have much use cases provided, the idea was to start with replicating more or less the same counting logic we have in the other compiler, and then adjust things using feedback from people who can give it some proper testing.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp:106
+  // that backedges/loops, branch probabilities and other details can be
+  // ignored.
   SmallVector<const MachineBasicBlock *, 16> Worklist;
----------------
foad wrote:
> If you're ignoring loops, can you described succinctly what you *are* counting? Is it something like the minimum number of VALU instructions along any path from the start of the function to the VMEM load in question?
Good point. Amended the comment.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUSetWavePriority.cpp:198-199
+    if (MachineInstr *LastVMEMLoad = MBBInfos[MBB].LastVMEMLoad) {
+      MBB->insertAfter(MachineBasicBlock::instr_iterator(LastVMEMLoad),
+                       Setprio);
+      continue;
----------------
foad wrote:
> kosarev wrote:
> > arsenm wrote:
> > > kosarev wrote:
> > > > arsenm wrote:
> > > > > Why not construct directly at the insert point?
> > > > Well, the instruction we create here is not just an auxiliary value that we coincidentally happen to use in both the cases, if that's what you mean. That instruction must be spent in all cases and we do want it be the same instruction.
> > > But you can construct it in the right place rather than constructing the instruction and then inserting after. You have the insert point you want, you can just construct it there? It's unusual to need insert/insertAfter
> > I'm not sure I see how this answers the point. Granted that constructing the instruction at the points of insertion is possible, but as I said in this case we have reasons to create it in a single place.
> I agree with Matt that it would be more normal to construct-and-insert at the same time, though I don't feel very strongly about it. I think I suggested before that you could pass the insertion point into BuildSetprioMI. In this case the insertion point would be something like `MBBInfos[MBB].LastVMEMLoad ? std::next(MBBInfos[MBB].LastVMEMLoad) : MBB->begin()`.
OK, changed.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124671/new/

https://reviews.llvm.org/D124671