[PATCH RFC 0/4] R600: Implement 64bit div/rem

Tom Stellard tom at stellard.net
Tue Apr 29 16:20:27 PDT 2014


On Fri, Apr 25, 2014 at 03:08:34PM -0400, Jan Vesely wrote:
> Hi,
> 
> I tried adding 64-bit div/rem support for r600, and this is what I came up with.
> The first patch is just a random cleanup in the 32-bit version.
> The second patch changes UDIV/UREM nodes to UDIVREM, as this does not happen
> automatically during the type legalization phase.
> The third patch implements the basic iterative division algorithm
> (loop-unrolled version), and the last one adds some optimizations that
> I could think of. I still have the original commits if you prefer to apply them
> individually. The optimizations result in about 60% fewer instructions and 40%
> fewer instruction groups.
> 
> My assumption was that people should only ever use 64-bit integers if they intend
> to use large numbers, so the additional overhead of the speculative UDIVREM32
> does not really matter (it's about 5% of the total instruction count).
> 
> A better version would have one initial runtime check and branch to either
> UDIVREM-64-by-32 or a UDIVREM-64-by-64 that does not do the initial
> speculation (and requires the divisor to be >= 2^32). This would speed up
> execution if all threads in a workgroup have the same kind of divisor
> (either all < 2^32 or all >= 2^32).
> 
> regards,
> Jan
> 
> Jan Vesely (4):
>   R600: remove unused variable
>   R600: Change UDIV/UREM to UDIVREM when legalizing types
>   R600: Implement iterative algorithm for udivrem
>   R600: optimize the UDIVREM algorithm for 64bit operands
> 
>  lib/Target/R600/AMDGPUISelLowering.cpp | 94 +++++++++++++++++++++++++++++++++-
>  1 file changed, 92 insertions(+), 2 deletions(-)
> 

I've pushed all these patches, thanks!

-Tom

> -- 
> 1.9.0
> 
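
For readers of the archive, the "basic iterative division alg" mentioned in
the cover letter can be pictured as a plain restoring shift-subtract loop.
The sketch below is only an illustration in scalar C++, not the code from
the patches: the function name udivrem64 is hypothetical, the loop is left
rolled rather than unrolled, and none of the 32-bit speculation or other
optimizations from the series are modelled. It assumes the divisor is
non-zero.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper (not from the patch series): computes quotient and
// remainder together, one quotient bit per iteration, most significant first.
// Precondition: den != 0.
std::pair<uint64_t, uint64_t> udivrem64(uint64_t num, uint64_t den) {
    uint64_t quot = 0;
    uint64_t rem = 0;
    for (int bit = 63; bit >= 0; --bit) {
        // Remember the bit that is about to be shifted out of rem; it can be
        // set only when den is larger than 2^63.
        const uint64_t carry = rem >> 63;
        // Bring the next numerator bit into the running remainder.
        rem = (rem << 1) | ((num >> bit) & 1u);
        // If the (conceptually 65-bit) remainder is at least den, subtract
        // den and set this quotient bit. The wrapped subtraction is exact
        // because the true remainder is always below 2 * den.
        if (carry != 0 || rem >= den) {
            rem -= den;
            quot |= uint64_t{1} << bit;
        }
    }
    return {quot, rem};
}
```

Producing both results from one loop is what makes a combined UDIVREM node
attractive compared with separate UDIV and UREM expansions: the remainder
falls out of the same 64 steps that build the quotient.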


