[PATCH] D48586: [AMDGPU] Early expansion of 32 bit udiv/urem

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jun 26 01:48:36 PDT 2018


arsenm added a comment.

In https://reviews.llvm.org/D48586#1143251, @rampitec wrote:

> In https://reviews.llvm.org/D48586#1143243, @arsenm wrote:
>
> > Should we enable BypassSlowDivision or possibly merge this expansion with it?
>
>
> Bypass is a separate question, since it does runtime resolution. In fact, it is a questionable optimization for SIMT: it is enough to have just one thread doing the slow division to incur the overhead penalty. In any case, this is really a separate optimization.


I believe BypassSlowDivision was specifically introduced for PTX, so apparently the case it handles is common enough in real workloads.
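
For reference, the runtime resolution discussed in the quoted comment can be sketched roughly as below. This is only a minimal illustration, assuming a 64-bit divide that is bypassed to a 32-bit one; the function name `udiv64_bypass` is hypothetical, and this is not the actual LLVM implementation, which rewrites IR rather than source code.

```cpp
#include <cstdint>

// Hypothetical sketch of a "bypass slow division" runtime check.
// Assumption: the slow operation is a 64-bit unsigned divide and the
// fast one is a 32-bit unsigned divide.
uint64_t udiv64_bypass(uint64_t a, uint64_t b) {
  // If neither operand has any of its upper 32 bits set, both fit in 32 bits.
  if (((a | b) >> 32) == 0) {
    // Fast path: narrow divide, result zero-extended back to 64 bits.
    return static_cast<uint32_t>(a) / static_cast<uint32_t>(b);
  }
  // Slow path: full-width divide.
  return a / b;
}
```

The branch is uniform only if every thread executing together sees small operands. If a single thread takes the slow path, the group pays for the check, the fast path, and the slow path, which is the overhead concern raised in the quoted comment.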


https://reviews.llvm.org/D48586




