[PATCH] D48586: [AMDGPU] Early expansion of 32 bit udiv/urem

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 26 07:22:42 PDT 2018


rampitec added a comment.

In https://reviews.llvm.org/D48586#1143302, @arsenm wrote:

> In https://reviews.llvm.org/D48586#1143251, @rampitec wrote:
>
> > In https://reviews.llvm.org/D48586#1143243, @arsenm wrote:
> >
> > > Should we enable BypassSlowDivision or possibly merge this expansion with it?
> >
> >
> > Bypass is a separate question, as it does runtime resolution. In fact it is a questionable optimization for SIMT: it is enough to have just one thread doing the slow division to pay the overhead penalty. In any case, this is really a separate optimization.
>
>
> I believe it was specifically introduced for PTX, so apparently it is common enough in real workloads


I can still see this being a trade-off that depends on the test, and as a result it is still a separate change.
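
To make the SIMT concern concrete, here is a rough scalar model of what a runtime bypass does. It is only a sketch under my own assumptions, not the actual BypassSlowDivision code: the names bypassUDiv64 and waveExecutesSlowPath are hypothetical, and the 64-to-32 bit width check is picked purely for illustration. The second function models a wave as a loop over lanes, on the assumption that a divergent branch makes the whole wave execute every path taken by at least one lane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a runtime bypass: if both operands fit in 32 bits,
// use the narrow (fast) division, otherwise fall back to the wide (slow) one.
// tookSlowPath reports which path this "lane" needed.
inline uint64_t bypassUDiv64(uint64_t a, uint64_t b, bool &tookSlowPath) {
  assert(b != 0 && "division by zero is not modeled here");
  if (((a | b) >> 32) == 0) {
    tookSlowPath = false;
    return static_cast<uint32_t>(a) / static_cast<uint32_t>(b);
  }
  tookSlowPath = true;
  return a / b;
}

// Hypothetical model of a wave: returns true if at least one lane needs the
// slow path. Under the divergence assumption stated above, that means the
// wave as a whole pays for the slow division plus the runtime check.
inline bool waveExecutesSlowPath(const uint64_t *a, const uint64_t *b,
                                 std::size_t lanes) {
  bool anySlow = false;
  for (std::size_t i = 0; i < lanes; ++i) {
    bool slow = false;
    (void)bypassUDiv64(a[i], b[i], slow);
    anySlow |= slow;
  }
  return anySlow;
}
```

With this model a single wide operand in any lane is enough for waveExecutesSlowPath to return true, which is why the benefit depends on how uniform the operands are in a given test.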


https://reviews.llvm.org/D48586




