[Mesa-dev] [PATCH v2] R600/SI: Add pattern for AMDGPUurecip

Owen Anderson resistor at mac.com
Wed Apr 10 10:28:06 PDT 2013


On Apr 10, 2013, at 6:48 AM, Tobias Grosser <tobias at grosser.es> wrote:

> On 04/10/2013 02:51 PM, Michel Dänzer wrote:
>> On Wed, 2013-04-10 at 12:59 +0200, Christian König wrote:
>>> On 10.04.2013 12:21, Michel Dänzer wrote:
>>>> On Wed, 2013-04-10 at 12:07 +0200, Christian König wrote:
>>>>> On 10.04.2013 11:46, Michel Dänzer wrote:
>>>>> 
>>>>> But why the heck does multiplying by 0x4f800000 fix the result?
>>>> I'm afraid I can't explain how it works, I basically copied it from the
>>>> Cayman section in R600Instructions.td...
>> 
>> [...]
>> 
>>> Anyway, even if I can't explain exactly why, it seems to work fine on
>>> both Cayman and SI, so the patch is:
>>> 
>>> Reviewed-by: Christian König <christian.koenig at amd.com>
>> 
>> Thanks. I think I can actually explain again how it works:
>> 
>> The unsigned integer input value is converted to float, its reciprocal
>> value is computed and multiplied by (1 << 32) (encoded as 0x4f800000 in
>> floating point) and converted back to unsigned integer. I think this
>> might lose up to 8 of the least significant bits of the result...
> 
> Very interesting. This seems to be worth a comment in the source code.

I agree.  It would be good to document the trick being applied here so that future target writers don't have to rediscover it each time.
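
For reference, here is a minimal C sketch of the sequence Michel describes, not the actual TableGen pattern from the patch. The names two_pow_32 and urecip_sketch are made up for illustration, and the saturating behaviour for x == 0 and x == 1 is an assumption on my part (plain C leaves an out-of-range float-to-unsigned conversion undefined, so the sketch has to pick something).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bit pattern 0x4f800000 as an IEEE-754 single.
 * Sign 0, biased exponent 0x9f (159 - 127 = 32), mantissa 0: exactly 2^32. */
static float two_pow_32(void)
{
    const uint32_t bits = 0x4f800000u;
    float f;
    memcpy(&f, &bits, sizeof f);
    assert(f == 4294967296.0f);
    return f;
}

/* Hypothetical helper: approximate 2^32 / x using only
 * uint -> float, a float reciprocal, one multiply, and float -> uint. */
static uint32_t urecip_sketch(uint32_t x)
{
    if (x == 0)
        return UINT32_MAX;              /* assumption: saturate on zero */

    /* A float carries a 24-bit significand, so the low bits of the
     * 32-bit result are not exact in general. */
    float scaled = (1.0f / (float)x) * two_pow_32();

    if (scaled >= 4294967296.0f)        /* x == 1 does not fit in 32 bits */
        return UINT32_MAX;              /* assumption: saturate */

    return (uint32_t)scaled;
}
```

Powers of two come out exact, e.g. urecip_sketch(2) yields 0x80000000. For other inputs the low-order bits can differ from the exact quotient, which is the precision loss mentioned above.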

--Owen


