[Libclc-dev] [PATCH 1/2] amdgcn/fmin: Explicitly check for NaNs

Mon Nov 27 14:00:03 PST 2017

On Mon, 2017-11-27 at 22:18 +0100, Jeroen Ketema via Libclc-dev wrote:
> > On 27 Nov 2017, at 21:27, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > 
> > On Fri, 2017-11-17 at 01:57 +0100, Jeroen Ketema wrote:
> > > Below:
> > > 
> > > s/correct/incorrect/
> > > 
> > > My apologies if that caused confusion.
> > > 
> > > Jeroen
> > > 
> > > > On 17 Nov 2017, at 01:55, Jeroen Ketema <j.ketema at xs4all.nl> wrote:
> > > > 
> > > > Mmm,
> > > > 
> > > > From the LLVM IR documentation “Follows the IEEE-754 semantics for
> > > > minNum, which also match for libm’s fmin.”
> > > > 
> > > > What does IEEE-754 mean here? 1985 or 2008? If 2008, then the
> > > > statement is incorrect, because libm treats sNaNs and qNaNs in the
> > > > same way. I’m not sure about 1985 (don’t have access to that
> > > > version of the spec at the moment).
> > 
> > llvm's fcanonicalize op, which among other things silences SNaNs,
> > should be implementable using llvm.minnum(x, x) if the environment
> > supports SNaNs[0].
> 
> Ok, so that requires following the IEEE-754 semantics, because sNaNs
> need to be turned into qNaNs.
> 
> On the other hand, the libm remark seems to suggest that you may drop
> in libm’s version if there’s no hardware support.
> 
> It might be best to get this clarified on llvm-dev?

Since those operations were added by Matt, he can comment either in
this thread or on the llvm patch. I don't think we'd get more info on
llvm-dev.

Jan
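
For readers following the minNum-versus-fmin question, here is a minimal
sketch in C of the two behaviours under discussion, working on raw binary32
bit patterns so that no floating-point unit can quiet a NaN behind our back.
The function names are hypothetical, the "quieting" variant is only my
reading of the IEEE-754-2008 minNum description quoted in this thread, and
neither function is taken from the libclc patch or from LLVM.

```c
#include <stdint.h>
#include <string.h>

#define ABS_MASK  0x7fffffffu  /* everything except the sign bit */
#define INF_BITS  0x7f800000u  /* +infinity; anything larger in magnitude is a NaN */
#define QUIET_BIT 0x00400000u  /* top fraction bit: set means quiet NaN */

static int is_nan_bits(uint32_t b)  { return (b & ABS_MASK) > INF_BITS; }
static int is_snan_bits(uint32_t b) { return is_nan_bits(b) && !(b & QUIET_BIT); }

/* Reinterpret a bit pattern as a float; only called on non-NaN values. */
static float as_float(uint32_t b) { float f; memcpy(&f, &b, sizeof f); return f; }

/* libm-style model (assumption): any NaN, signaling or quiet, is treated
 * as missing data and the other operand is returned. */
static uint32_t fmin_missing_data(uint32_t x, uint32_t y)
{
    if (is_nan_bits(x)) return y;
    if (is_nan_bits(y)) return x;
    return as_float(x) < as_float(y) ? x : y;
}

/* minNum-style model (assumption): a signaling NaN operand is returned
 * with its quiet bit set; quiet NaNs are still treated as missing data. */
static uint32_t fmin_quieting(uint32_t x, uint32_t y)
{
    if (is_snan_bits(x)) return x | QUIET_BIT;
    if (is_snan_bits(y)) return y | QUIET_BIT;
    return fmin_missing_data(x, y);
}
```

With the second CTS pair quoted below, fmin_missing_data(0x7fba759e,
0x134a752c) gives back 0x134a752c, while fmin_quieting gives 0x7ffa759e.
Under the quieting model, fmin_quieting(x, x) on a signaling NaN returns the
quieted x, which is the property the canonicalize remark relies on.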

> 
> Jeroen
> 
> > I think we need these patches if GCN backend insists on supporting
> > SNaNs for all compute kernels. It'd still be preferable to disable SNaN
> > support for OpenCL [1]. Pending the outcome of that patch I'll
> > restrict these patches to llvm 3.9/4.0/5.0.
> > 
> > Jan
> > 
> > [0] https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic
> > [1] https://reviews.llvm.org/D40514
> > 
> > > > 
> > > > Jeroen
> > > > 
> > > > > On 17 Nov 2017, at 01:46, Jeroen Ketema <j.ketema at xs4all.nl>
> > > > > wrote:
> > > > > 
> > > > > If I’m not mistaken these all change sNaNs into qNaNs. Looking at
> > > > > [1], this seems reminiscent of ieee_mode. Are you sure the chip
> > > > > is correctly set up before the kernels are run? In particular,
> > > > > the IEEE bit (bit 9) of the Mode Register?
> > > > > 
> > > > > Jeroen
> > > > > 
> > > > > [1] http://developer.amd.com/wordpress/media/2013/07/AMD_Sea_Islands_Instruction_Set_Architecture1.pdf
> > > > > 
> > > > > > On 15 Nov 2017, at 22:16, Jan Vesely via Libclc-dev <libclc-dev@lists.llvm.org> wrote:
> > > > > > 
> > > > > > > On Wed, 2017-11-15 at 11:43 -0800, Matt Arsenault via Libclc-dev wrote:
> > > > > > > On 11/15/2017 09:21 AM, Jan Vesely via Libclc-dev wrote:
> > > > > > > > v_min instruction fails to handle certain NaNs correctly.
> > > > > > > > 
> > > > > > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > > > > > > ---
> > > > > > > > Fixes most of fmin CTS on carrizo. It still fails on
> > > > > > > > denormals, but IMO that's a CTS bug.
> > > > > > > > EG/NI does not seem to suffer the same problem.
> > > > > > > 
> > > > > > > What exact problem are you trying to solve? Do you have an
> > > > > > > example? NaNs 
> > > > > > > should work correctly.
> > > > > > 
> > > > > > > They don't.
> > > > > > > Some NaN encodings return an incorrect value.
> > > > > > > Examples from CTS:
> > > > > > > ERROR: fmin: -nan ulp error at {0x1.211a92p-95 (0x10108d49), -nan (0xffa66c5d)}: *0x1.211a92p-95 vs. -nan (0xffe66c5d) at index: 448
> > > > > > > 
> > > > > > > ERROR: fmin: nan ulp error at {nan (0x7fba759e), 0x1.94ea58p-89 (0x134a752c)}: *0x1.94ea58p-89 vs. nan (0x7ffa759e) at index: 450
> > > > > > > 
> > > > > > > ERROR: fmin: nan ulp error at {nan (0x7f9a4655), -0x1.9ed39p-22 (0xb4cf69c8)}: *-0x1.9ed39p-22 vs. nan (0x7fda4655) at index: 617
> > > > > > > 
> > > > > > > ERROR: fmin: nan ulp error at {0x1.303c36p+46 (0x56981e1b), nan (0x7fa7afae)}: *0x1.303c36p+46 vs. nan (0x7fe7afae) at index: 739
> > > > > > 
> > > > > > it only shows 4 errors because it used 4 testing threads, there
> > > > > > might be more.
> > > > > > tested on carrizo (fx9800p).
> > > > > > 
> > > > > > Jan
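
The four CTS failures above share a pattern that is easy to check by hand:
every NaN input has the top fraction bit (0x00400000 in binary32) clear, and
the NaN printed after "vs." is the same pattern with that bit set. A small
sketch in C that decodes this, assuming the usual convention that a set top
fraction bit means "quiet", and reading the value after "vs." as what the
device returned. The helper names are made up for illustration.

```c
#include <stdint.h>

#define F32_EXP_MASK  0x7f800000u  /* exponent field, all ones for NaN/inf */
#define F32_FRAC_MASK 0x007fffffu  /* 23-bit fraction field */
#define F32_QUIET_BIT 0x00400000u  /* most significant fraction bit */

/* NaN: exponent all ones and a non-zero fraction. */
static int f32_bits_is_nan(uint32_t bits)
{
    return (bits & F32_EXP_MASK) == F32_EXP_MASK && (bits & F32_FRAC_MASK) != 0;
}

/* Signaling NaN: a NaN whose quiet bit is clear. */
static int f32_bits_is_snan(uint32_t bits)
{
    return f32_bits_is_nan(bits) && (bits & F32_QUIET_BIT) == 0;
}

/* Quieting a NaN sets the quiet bit and leaves sign and payload alone. */
static uint32_t f32_bits_quiet(uint32_t bits)
{
    return bits | F32_QUIET_BIT;
}
```

For example, f32_bits_is_snan(0xffa66c5d) holds and
f32_bits_quiet(0xffa66c5d) is 0xffe66c5d, the value reported at index 448.
The same relation holds for the other three pairs.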
> > > > > > 
> > > > > > 
> > > > > > > We do have some issues with denorm flushing 
> > > > > > > behavior changing on gfx9.
> > > > > > > _______________________________________________
> > > > > > > Libclc-dev mailing list
> > > > > > > Libclc-dev at lists.llvm.org
> > > > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > > > > > 
> > > 
> > > 
> 