[PATCH] D117544: [AMDGPU] Fix missing waitcnt issue
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 18 01:44:05 PST 2022
foad accepted this revision.
foad added a comment.
This revision is now accepted and ready to land.
In D117544#3250636 <https://reviews.llvm.org/D117544#3250636>, @piotr wrote:
> In our usual compile-time tests it shows 0.056% degradation on average, worst case 0.9%.
In that case I agree we should go with your fix as it is the simplest.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1428
- if (RegStrictDom && !OldOutOfOrder)
StrictDom = true;
----------------
piotr wrote:
> foad wrote:
> > foad wrote:
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> >
> > Sorry I meant `RegStrictDom && !(OldOutOfOrder && MyPending != 0)`.
> > > As an experiment, would changing the condition to `RegStrictDom && !OldOutOfOrder && MyPending != 0` also fix the bug?
> >
> > Sorry I meant `RegStrictDom && !(OldOutOfOrder && MyPending != 0)`.
>
> Do you mean this:
>
> if (RegStrictDom && !(OldOutOfOrder && MyPending == 0))
>
> I think that would also fix the bug in the test, but I wasn't convinced that was a complete fix, so I went for a safer and simpler approach.
> Do you mean this:
I'm not sure what I meant; sorry for the noise. Now I think it *might* be possible to salvage the original optimisation by remembering whether any individual register has a wait in the merged state, when it had no wait in the original state. But I don't think it's worth the complexity.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117544/new/
https://reviews.llvm.org/D117544
More information about the llvm-commits
mailing list