[Mesa-dev] [PATCH v2] R600/SI: Add pattern for AMDGPUurecip

Wed Apr 10 06:48:52 PDT 2013

On 04/10/2013 02:51 PM, Michel Dänzer wrote:
> On Wed, 2013-04-10 at 12:59 +0200, Christian König wrote:
>> On 10.04.2013 12:21, Michel Dänzer wrote:
>>> On Wed, 2013-04-10 at 12:07 +0200, Christian König wrote:
>>>> On 10.04.2013 11:46, Michel Dänzer wrote:
>>>>
>>>> But why the heck does multiplying by 0x4f800000 fix the result?
>>> I'm afraid I can't explain how it works, I basically copied it from the
>>> Cayman section in R600Instructions.td...
>
> [...]
>
>> Anyway, even if I can't explain exactly why, it seems to work fine on
>> both Cayman and SI, so the patch is:
>>
>> Reviewed-by: Christian König <christian.koenig at amd.com>
>
> Thanks. I think I can actually explain again how it works:
>
> The unsigned integer input value is converted to float, its reciprocal
> is computed, multiplied by 2^32 (i.e. 1 << 32, encoded as 0x4f800000 in
> single-precision floating point) and converted back to unsigned integer.
> Since a float only carries 24 bits of mantissa, I think this might lose
> up to 8 of the least significant bits of the result...

Very interesting. This seems to be worth a comment in the source code.
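
As a starting point for such a comment, here is a rough C sketch of the
sequence Michel describes. It is only an illustration, not the actual
pattern from the patch: the function names are made up, an exact 1.0f / x
division stands in for the hardware reciprocal instruction, and the
clamping of out-of-range results is an assumption about what a saturating
float-to-uint conversion would do.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE 754 single-precision float. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/*
 * Hypothetical emulation of the urecip sequence: approximates 2^32 / x
 * using only single-precision float operations.
 */
static uint32_t urecip_sketch(uint32_t x)
{
    /* 0x4f800000 is the float encoding of 2^32 (4294967296.0f). */
    const float two_pow_32 = bits_to_float(0x4f800000u);

    /* Assumption: treat division by zero as "largest possible result". */
    if (x == 0)
        return UINT32_MAX;

    float f = (float)x;              /* uint -> float                     */
    float r = 1.0f / f;              /* reciprocal (exact division here)  */
    float scaled = r * two_pow_32;   /* shift the fraction into 32 bits   */

    /* Assumption: clamp instead of hitting undefined behaviour in C when
     * the value does not fit into uint32_t (happens for x == 1). */
    if (scaled >= two_pow_32)
        return UINT32_MAX;

    return (uint32_t)scaled;         /* float -> uint, truncating         */
}
```

For example, urecip_sketch(2) gives 0x80000000, which is exact, while
urecip_sketch(3) only agrees with 0x55555555 in its upper 24 bits, which
matches the precision loss mentioned above.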

Tobi