[Libclc-dev] [PATCH 3/3] powr: Use denormal path only

Tue Apr 17 18:45:30 PDT 2018

On Fri, 2018-04-13 at 22:03 -0500, Aaron Watry wrote:
> These 3 look ok to me, and I just ran them through 1.2 CTS on my RX580
> as well (passed).
> 
> I did notice that clicking through the patches on the mailing list
> archive, the contents of the affected parts of these files are
> identical. Maybe a possible thing we can de-duplicate at some point
> once things are in better shape.

Yes, this is the result of disentangling the amd builtin ifdef maze. I'm
not a big fan of that approach, but maybe we can find a nicer way to
improve code reuse.
The same problem also affects rootn, but the same fix does not help on
Turks, so I'll dig a bit more.

Thanks for all the reviews. Just FYI, we are down to 5 failing math
tests:
asin: -4.06 ULP out of 4 allowed. This is not really a priority.
log10D: this should be ported from amd_builtins instead of the current approach. WIP.
remquo{,D}: not implemented. I have a port from amd_builtins, but it fails on denormals (in the quo part; the remainder is identical to the remainder function).
rootn: described above.

Moreover, the whole test segfaults around remquo (my suspicion is allocation failure due to memory leaks).

thanks,
Jan

> 
> --Aaron
> 
> On Thu, 2018-04-12 at 14:53 -0400, Jan Vesely via Libclc-dev wrote:
> > It's OK to either flush to 0 or return denormal result if the device
> > does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
> > Fixes CTS on carrizo and turks.
> > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > ---
> >  generic/lib/math/clc_powr.cl | 12 +-----------
> >  1 file changed, 1 insertion(+), 11 deletions(-)
> > 
> > diff --git a/generic/lib/math/clc_powr.cl b/generic/lib/math/clc_powr.cl
> > index 9074a8b..ef97d3c 100644
> > --- a/generic/lib/math/clc_powr.cl
> > +++ b/generic/lib/math/clc_powr.cl
> > @@ -165,17 +165,7 @@ _CLC_DEF _CLC_OVERLOAD float __clc_powr(float x, float y)
> >      tv = USE_TABLE(exp_tbl_ep, j);
> >  
> >      float expylogx = mad(tv.s0, poly, mad(tv.s1, poly, tv.s1)) + tv.s0;
> > -    float sexpylogx;
> > -    if (!__clc_fp32_subnormals_supported()) {
> > -		int explg = ((as_uint(expylogx) & EXPBITS_SP32 >> 23) - 127);
> > -		m = (23-(m + 149)) == 0 ? 1: m;
> > -		uint mantissa =  ((as_uint(expylogx) & MANTBITS_SP32)|IMPBIT_SP32) >> (23-(m + 149));
> > -		sexpylogx = as_float(mantissa);
> > -    } else {
> > -		sexpylogx = expylogx * as_float(0x1 << (m + 149));
> > -    }
> > -
> > -
> > +    float sexpylogx = expylogx * as_float(0x1 << (m + 149));
> >      float texpylogx = as_float(as_int(expylogx) + m2);
> >      expylogx = m < -125 ? sexpylogx : texpylogx;
> >  
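
For anyone reading the patch without the surrounding code: the line that
is kept relies on as_float(0x1 << (m + 149)) being exactly 2^m when m is
in the subnormal range, so the multiplication rescales expylogx into a
subnormal result. Below is a minimal host-side C sketch of that idea and
of the "flush to 0 or return denormal" acceptance rule from the commit
message. It is not the libclc or CTS code: bits_to_float, float_to_bits,
subnormal_pow2, scale_by_pow2, is_subnormal, acceptable_without_denormals
and self_check are hypothetical helper names, and the sketch assumes a
host CPU running with subnormals enabled (no flush-to-zero mode).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Host-side stand-in for OpenCL's as_float(): reinterpret 32 bits as a float. */
static float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Host-side stand-in for OpenCL's as_uint(). */
static uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* 2^m for -149 <= m <= -126, built by setting a single bit.
 * Bits 0..22 are the mantissa of a subnormal; bit 23 is the smallest
 * normal number, 2^-126. Outside this range the trick is not valid. */
static float subnormal_pow2(int m) {
    return bits_to_float(UINT32_C(1) << (m + 149));
}

/* Same shape as the line kept by the patch: value * 2^m. */
static float scale_by_pow2(float value, int m) {
    return value * subnormal_pow2(m);
}

/* True if f is a nonzero subnormal (exponent field 0, mantissa nonzero). */
static bool is_subnormal(float f) {
    uint32_t bits = float_to_bits(f);
    return (bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0;
}

/* Assumed acceptance rule for a device without denormal support:
 * either the exact reference value or, if the reference is subnormal,
 * a result flushed to zero. */
static bool acceptable_without_denormals(float result, float reference) {
    if (result == reference)
        return true;
    return is_subnormal(reference) && result == 0.0f;
}

/* Small self-check of the single-bit construction. */
static void self_check(void) {
    assert(subnormal_pow2(-149) == 0x1p-149f); /* smallest subnormal */
    assert(subnormal_pow2(-126) == 0x1p-126f); /* smallest normal */
    assert(is_subnormal(scale_by_pow2(1.5f, -140)));
}
```

In the patch the scaled value is only selected when m < -125, which stays
inside the range where the single-bit construction is a valid power of
two.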

-- 
Jan Vesely <jan.vesely at rutgers.edu>