[Mesa-dev] [PATCH v2] R600/SI: Add pattern for AMDGPUurecip
Owen Anderson
resistor at mac.com
Wed Apr 10 10:28:06 PDT 2013
On Apr 10, 2013, at 6:48 AM, Tobias Grosser <tobias at grosser.es> wrote:
> On 04/10/2013 02:51 PM, Michel Dänzer wrote:
>> On Wed, 2013-04-10 at 12:59 +0200, Christian König wrote:
>>> On 10.04.2013 12:21, Michel Dänzer wrote:
>>>> On Wed, 2013-04-10 at 12:07 +0200, Christian König wrote:
>>>>> On 10.04.2013 11:46, Michel Dänzer wrote:
>>>>>
>>>>> But why the heck is multiplying with 0x4f800000 fixing the result?
>>>> I'm afraid I can't explain how it works, I basically copied it from the
>>>> Cayman section in R600Instructions.td...
>>
>> [...]
>>
>>> Anyway, even if I can't explain exactly why, it seems to work fine on
>>> both Cayman and SI, so the patch is:
>>>
>>> Reviewed-by: Christian König <christian.koenig at amd.com>
>>
>> Thanks. I think I can actually explain again how it works:
>>
>> The unsigned integer input value is converted to float, its reciprocal
>> value is computed and multiplied by (1 << 32) (encoded as 0x4f800000 in
>> floating point) and converted back to unsigned integer. I think this
>> might lose up to 8 of the least significant bits of the result...
>
> Very interesting. This seems to be worth a comment in the source code.
I agree. It would be good to document the trick being applied here so that future target writers don't have to rediscover it each time.
--Owen
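
For reference, the trick Michel describes in the quoted text can be sketched in plain C as below. This is only an illustration of the arithmetic, not the TableGen pattern from the patch. The helper names bits_to_float and urecip_approx are made up for this sketch, and the clamp for inputs 0 and 1 is an assumption added to keep the C conversion well defined; the thread does not say how the hardware handles those cases.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE-754 single-precision float. */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: approximates 2^32 / x using float arithmetic. */
static uint32_t urecip_approx(uint32_t x) {
    /* Assumption: clamp x == 0 and x == 1, because the scaled value
     * (infinity or 2^32) does not fit in uint32_t and the conversion
     * would be undefined behaviour in C. */
    if (x <= 1u)
        return UINT32_MAX;

    /* 0x4f800000 has exponent 159 - 127 = 32 and a zero mantissa,
     * so it encodes exactly 2^32 = 4294967296.0f. */
    float scale = bits_to_float(0x4f800000u);

    /* uint -> float, reciprocal, scale by 2^32, float -> uint. */
    float r = (1.0f / (float)x) * scale;
    return (uint32_t)r;
}
```

A float carries only 24 significant bits, which is consistent with the estimate in the quoted text that up to 8 of the least significant bits of the 32-bit result may be lost. For power-of-two inputs the result is exact, for example urecip_approx(2) gives 0x80000000.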