[Libclc-dev] [PATCH v2 1/1] rootn: Flush denormals if not supported.
    Aaron Watry via Libclc-dev 
    libclc-dev at lists.llvm.org
       
    Wed May  2 20:16:49 PDT 2018
    
    
  
On Wed, 2018-05-02 at 21:51 -0400, Jan Vesely wrote:
> On Wed, 2018-05-02 at 07:03 -0500, Aaron Watry via Libclc-dev wrote:
> > Am I being dense or just lucky (device supports denormals?)? This
> > already passed on my RX580 before I applied your patch.
> 
> IIRC, the problem is not with denormal support (unless you enabled it
> explicitly), but that 'indx' variable was computed incorrectly. My
> guess would be that one of the earlier operations (mad?) improved wrt
> ULP precision (rootn still failed on my carrizo).
> Anyway, flushing denormals just hides the issue. It'll probably still
> fail if run with denormals enabled, but fixing denormal support is a
> story for another day.
> 
> > I'm currently rebuilding a newer llvm on my r600 box that hopefully
> > won't segfault when running rootn to test there.
> 
> thanks. It works OK on my turks when math_bruteforce is run in single
> thread mode.
Oh yeah, the compute memory pool on r600 isn't thread-safe...
Let's just say that the email I sent this morning was while the first
cup of coffee was still unconsumed, and I had a small child in my lap
trying to commandeer my mouse. Not a great time for deep thoughts. :)
--Aaron
> 
> Jan
> 
> > 
> > --Aaron
> > 
> > On Mon, Apr 30, 2018 at 1:05 PM, Jan Vesely via Libclc-dev
> > <libclc-dev at lists.llvm.org> wrote:
> > > On Tue, 2018-04-24 at 12:31 -0400, Jan Vesely wrote:
> > > > It's OK to either flush to 0 or return a denormal result if the
> > > > device does not support denormals. See sec 7.2 and 7.5.3 of OCL specs
> > > > 
> > > > v2: Use 0.0f explicitly instead of relying on GPU to flush it.
> > > > 
> > > > Fixes CTS on carrizo and turks
> > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > > ---
> > > > This removes the need for the second patch
> > > >  generic/lib/math/clc_rootn.cl | 11 +----------
> > > >  1 file changed, 1 insertion(+), 10 deletions(-)
> > > > 
> > > > diff --git a/generic/lib/math/clc_rootn.cl b/generic/lib/math/clc_rootn.cl
> > > > index d7ee185..0a2c98d 100644
> > > > --- a/generic/lib/math/clc_rootn.cl
> > > > +++ b/generic/lib/math/clc_rootn.cl
> > > > @@ -170,16 +170,7 @@ _CLC_DEF _CLC_OVERLOAD float __clc_rootn(float x, int ny)
> > > >      tv = USE_TABLE(exp_tbl_ep, j);
> > > > 
> > > >      float expylogx = mad(tv.s0, poly, mad(tv.s1, poly, tv.s1)) + tv.s0;
> > > > -    float sexpylogx;
> > > > -    if (!__clc_fp32_subnormals_supported()) {
> > > > -             int explg = ((as_uint(expylogx) & EXPBITS_SP32 >> 23) - 127);
> > > > -             m = (23-(m + 149)) == 0 ? 1: m;
> > > > -             uint mantissa =  ((as_uint(expylogx) & MANTBITS_SP32)|IMPBIT_SP32) >> (23-(m + 149));
> > > > -             sexpylogx = as_float(mantissa);
> > > > -    } else {
> > > > -             sexpylogx = expylogx * as_float(0x1 << (m + 149));
> > > > -    }
> > > > -
> > > > +    float sexpylogx = __clc_fp32_subnormals_supported() ? expylogx * as_float(0x1 << (m + 149)) : 0.0f;
> > > > 
> > > >      float texpylogx = as_float(as_int(expylogx) + m2);
> > > >      expylogx = m < -125 ? sexpylogx : texpylogx;
> > > 
> > > ping.
> > > _______________________________________________
> > > Libclc-dev mailing list
> > > Libclc-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > > 
> > 
> 
> 
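For anyone reading the quoted patch later: the scale factor as_float(0x1 << (m + 149)) is a float with a single mantissa bit set, which for m in [-149, -127] is the subnormal value 2^m. Below is a minimal host-side C sketch of that idea and of the flush-to-zero select. It is not the libclc code; the helper names (bits_to_float, pow2_by_halving, subnormal_scale, scale_or_flush) are made up for illustration, and the subnormals_supported flag is only a stand-in for __clc_fp32_subnormals_supported(). It assumes IEEE 754 binary32 floats and a host that does not flush subnormals.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float, similar to OpenCL's as_float().
 * Hypothetical helper, not part of libclc. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reference value 2^m for m <= 0, built by repeated halving.
 * Each step is exact for powers of two down to 2^-149. */
static float pow2_by_halving(int m)
{
    float p = 1.0f;
    for (int i = 0; i > m; --i)
        p *= 0.5f;
    return p;
}

/* Scale factor in the style of the patch: one set bit at position (m + 149).
 * For m in [-149, -127] this bit pattern is the subnormal float 2^m. */
static float subnormal_scale(int m)
{
    return bits_to_float((uint32_t)1 << (m + 149));
}

/* Stand-in for the patched expression: scale into the subnormal range when
 * subnormals are supported, otherwise return an explicit 0.0f. */
static float scale_or_flush(float expylogx, int m, int subnormals_supported)
{
    return subnormals_supported ? expylogx * subnormal_scale(m) : 0.0f;
}
```

With these helpers, subnormal_scale(-149) is the smallest positive subnormal, and scale_or_flush(x, m, 0) is 0.0f regardless of x, which mirrors the "use 0.0f explicitly" wording in the v2 note.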
    
    