[Libclc-dev] [PATCH 1/2] amdgcn/fmin: Explicitly check for NaNs
Arsenault, Matthew via Libclc-dev
libclc-dev at lists.llvm.org
Thu Nov 16 17:10:12 PST 2017
The compiler always assumes IEEE mode is enabled for compute kernels. I'm not sure if the driver respects that or not.
________________________________
From: Jeroen Ketema <j.ketema at xs4all.nl>
Sent: Thursday, November 16, 2017 4:46:12 PM
To: Jan Vesely
Cc: Arsenault, Matthew; libclc-dev at lists.llvm.org
Subject: Re: [Libclc-dev] [PATCH 1/2] amdgcn/fmin: Explicitly check for NaNs
If I’m not mistaken these all change sNaNs into qNaNs. Looking at [1], this seems reminiscent of ieee_mode. Are you sure the chip is correctly set up before the kernels are run? In particular, the IEEE bit (bit 9) of the Mode Register?
Jeroen
[1] http://developer.amd.com/wordpress/media/2013/07/AMD_Sea_Islands_Instruction_Set_Architecture1.pdf
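
The bit patterns in the CTS output quoted below can be checked against the sNaN-to-qNaN reading. The following is a minimal sketch, not part of the patch. It assumes IEEE 754 binary32 with the quiet bit at mantissa bit 22, and the helper names (is_nan, is_signaling, fmin_bits, analyse) are made up for illustration. fmin_bits only illustrates the "explicitly check for NaNs" idea from the subject line and does not address signed-zero ordering.

```python
import struct

QUIET_BIT = 0x00400000  # bit 22: top mantissa bit of a binary32 value
EXP_MASK = 0x7F800000   # exponent field
MANT_MASK = 0x007FFFFF  # mantissa field


def is_nan(bits):
    """True if the 32-bit pattern encodes a NaN (exponent all ones, mantissa non-zero)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_signaling(bits):
    """True if the pattern is a NaN with the quiet bit clear."""
    return is_nan(bits) and (bits & QUIET_BIT) == 0


def to_float(bits):
    """Reinterpret a 32-bit pattern as a float (only used for non-NaN operands)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fmin_bits(x, y):
    """Hypothetical reference fmin on bit patterns with an explicit NaN check:
    if exactly one operand is NaN, return the other operand."""
    if is_nan(x):
        return y
    if is_nan(y):
        return x
    return x if to_float(x) < to_float(y) else y


# (first operand, second operand, value actually returned), copied from the CTS log.
CASES = [
    (0x10108D49, 0xFFA66C5D, 0xFFE66C5D),
    (0x7FBA759E, 0x134A752C, 0x7FFA759E),
    (0x7F9A4655, 0xB4CF69C8, 0x7FDA4655),
    (0x56981E1B, 0x7FA7AFAE, 0x7FE7AFAE),
]


def analyse(x, y, got):
    """Summarise one failing case: was the NaN input signaling, and is the
    returned value just that input with the quiet bit set?"""
    nan_in = x if is_nan(x) else y
    return {
        "signaling_input": is_signaling(nan_in),
        "got_is_quieted_input": got == (nan_in | QUIET_BIT),
        "reference": fmin_bits(x, y),
    }


for x, y, got in CASES:
    r = analyse(x, y, got)
    print(f"{x:#010x} {y:#010x} got={got:#010x} "
          f"sNaN={r['signaling_input']} quieted={r['got_is_quieted_input']} "
          f"reference={r['reference']:#010x}")
```

In each of the four cases the value the CTS marks as expected (with `*`) is the non-NaN operand, which is also what fmin_bits returns.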
> On 15 Nov 2017, at 22:16, Jan Vesely via Libclc-dev <libclc-dev at lists.llvm.org> wrote:
>
> On Wed, 2017-11-15 at 11:43 -0800, Matt Arsenault via Libclc-dev wrote:
>> On 11/15/2017 09:21 AM, Jan Vesely via Libclc-dev wrote:
>>> v_min instruction fails to handle certain NaNs correctly.
>>>
>>> Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
>>> ---
>>> Fixes most of the fmin CTS failures on Carrizo. It still fails on denormals,
>>> but IMO that's a CTS bug.
>>> EG/NI do not seem to suffer from the same problem.
>>
>> What exact problem are you trying to solve? Do you have an example? NaNs
>> should work correctly.
>
> They don't.
> Some NaN encodings return an incorrect value.
> Examples from CTS:
> ERROR: fmin: -nan ulp error at {0x1.211a92p-95 (0x10108d49), -nan (0xffa66c5d)}: *0x1.211a92p-95 vs. -nan (0xffe66c5d) at index: 448
>
> ERROR: fmin: nan ulp error at {nan (0x7fba759e), 0x1.94ea58p-89 (0x134a752c)}: *0x1.94ea58p-89 vs. nan (0x7ffa759e) at index: 450
>
> ERROR: fmin: nan ulp error at {nan (0x7f9a4655), -0x1.9ed39p-22 (0xb4cf69c8)}: *-0x1.9ed39p-22 vs. nan (0x7fda4655) at index: 617
>
> ERROR: fmin: nan ulp error at {0x1.303c36p+46 (0x56981e1b), nan (0x7fa7afae)}: *0x1.303c36p+46 vs. nan (0x7fe7afae) at index: 739
>
> It only shows 4 errors because it used 4 testing threads; there might be more.
> Tested on Carrizo (fx9800p).
>
> Jan
>
>
>> We do have some issues with denorm flushing
>> behavior changing on gfx9.
>> _______________________________________________
>> Libclc-dev mailing list
>> Libclc-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev