[PATCH] D117544: [AMDGPU] Fix missing waitcnt issue

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 18 01:44:05 PST 2022


foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.

In D117544#3250636 <https://reviews.llvm.org/D117544#3250636>, @piotr wrote:

> In our usual compile-time tests it shows 0.056% degradation on average, worst case 0.9%.

In that case I agree we should go with your fix as it is the simplest.



================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1428
 
-    if (RegStrictDom && !OldOutOfOrder)
       StrictDom = true;
----------------
piotr wrote:
> foad wrote:
> > foad wrote:
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> > 
> > Sorry I meant `RegStrictDom && !(OldOutOfOrder && MyPending != 0)`.
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> > 
> > Sorry I meant `RegStrictDom && !(OldOutOfOrder && MyPending != 0)`.
> 
> Do you mean this:
> 
> if (RegStrictDom && !(OldOutOfOrder && MyPending == 0))
> 
> I think that would also fix the bug in the test, but I wasn't convinced that was a complete fix, so I went for a safer and simpler approach.
> Do you mean this:

I'm not sure what I meant; sorry for the noise. Now I think it *might* be possible to salvage the original optimisation by remembering whether any individual register has a wait in the merged state, when it had no wait in the original state. But I don't think it's worth the complexity.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117544/new/

https://reviews.llvm.org/D117544



More information about the llvm-commits mailing list