[Mesa-dev] [PATCH v2] R600/SI: Add pattern for AMDGPUurecip

Wed Apr 10 05:51:54 PDT 2013

On Wed, 2013-04-10 at 12:59 +0200, Christian König wrote:
> On 10.04.2013 12:21, Michel Dänzer wrote:
> > On Wed, 2013-04-10 at 12:07 +0200, Christian König wrote:
> >> On 10.04.2013 11:46, Michel Dänzer wrote:
> >>
> >> But why the heck is multiplying with 0x4f800000 fixing the result?
> > I'm afraid I can't explain how it works, I basically copied it from the
> > Cayman section in R600Instructions.td...

[...]

> Anyway, even if I can't explain exactly why, it seems to work fine on
> both Cayman and SI, so the patch is:
> 
> Reviewed-by: Christian König <christian.koenig at amd.com>

Thanks. I think I can actually explain how it works after all:

The unsigned integer input value is converted to float, its reciprocal
is computed, and the result is multiplied by 2^32 (i.e. 1 << 32, encoded
as 0x4f800000 in single-precision floating point) and converted back to
unsigned integer. Since a single-precision float only carries 24
significant bits, I think this might lose up to 8 of the least
significant bits of the 32-bit result...
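
To make the steps above concrete, here is a rough Python sketch that
emulates the sequence in software. It is not the TableGen pattern from
the patch: the helper names (f32, bits_to_f32, urecip) are made up for
illustration, the reciprocal is assumed to be correctly rounded (the
hardware instruction may be less accurate), and the clamp on the final
conversion is an assumption as well. Input 0 is not handled.

```python
import struct

def f32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# 0x4f800000 has biased exponent 159 (159 - 127 = 32) and a zero
# mantissa, so it decodes to 2**32.
SCALE = bits_to_f32(0x4F800000)

def urecip(x):
    """Approximate 2**32 / x for an unsigned 32-bit x > 0 (illustrative only)."""
    xf = f32(float(x))        # unsigned int -> float
    r = f32(1.0 / xf)         # reciprocal (assumed correctly rounded here)
    scaled = f32(r * SCALE)   # multiply by 2**32
    # float -> unsigned int; clamping to 32 bits is an assumption
    return min(int(scaled), 0xFFFFFFFF)

if __name__ == "__main__":
    for x in (2, 3, 10, 1000):
        approx, exact = urecip(x), (1 << 32) // x
        print(f"x={x}: approx={approx:#010x} exact={exact:#010x} "
              f"diff={approx - exact}")
```

For x = 3 the emulated result is 0xAAAAAB shifted left by 7 bits: only
24 significant bits survive and the low bits come out as zero, which is
the precision loss described above.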

-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer