[PATCH] D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions

Wed Dec 4 19:19:55 PST 2019

tstellar added a comment.

In D65961#1766760 <https://reviews.llvm.org/D65961#1766760>, @foad wrote:

> In D65961#1765026 <https://reviews.llvm.org/D65961#1765026>, @tstellar wrote:
>
> > In D65961#1763143 <https://reviews.llvm.org/D65961#1763143>, @foad wrote:
> >
> > > Hi @tstellar, I'm looking into a case where this patch slowed down a shader by 10%. Before I go too far, was this patch supposed to change the behaviour at all, or was it supposed to be purely a compile time improvement?
> >
> >
> > The intention was to not change the behavior at all.
> >
> > > In the case I'm looking at it seems to do the same amount of load merging as before, but the merged loads are inserted at different places in the basic block.
> >
> > Do you have a MIR or .ll dump of the shader I could look at ?  Also, does https://reviews.llvm.org/D65966 help?
>
>
> P8173 <https://reviews.llvm.org/P8173> is a MIR test case. See the RUN line for how to run it. I see significant differences in the placing of the merged BUFFER_LOAD instructions with/without D65961 <https://reviews.llvm.org/D65961> (or before/after it was committed).
>
> I tried applying D65966 <https://reviews.llvm.org/D65966> on top of rGe6f51713054f <https://reviews.llvm.org/rGe6f51713054f8199c61a9dc94faf3dbce1221c02> but it made no difference to the output.

You need to apply it on top of  3a8d80944b7766449e2c8784a8fb30d19a2ba16c or newer for it to have an impact.  When I do that, D65966 <https://reviews.llvm.org/D65966> does help improve the merging.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65961/new/

https://reviews.llvm.org/D65961