[Libclc-dev] [PATCH 01/14] half_rsqrt: Switch implementation to native_rsqrt

Jan Vesely via Libclc-dev libclc-dev at lists.llvm.org
Thu Nov 9 13:55:44 PST 2017


On Thu, 2017-11-09 at 22:22 +0100, Jeroen Ketema via Libclc-dev wrote:
> > On 9 Nov 2017, at 22:15, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > 
> > On Thu, 2017-11-09 at 22:01 +0100, Jeroen Ketema wrote:
> > > This assumes that native_rsqrt is more accurate than half_rsqrt,
> > > which is not guaranteed by the OpenCL spec as far as I know.
> > 
> > Yes. The entire series assumes that native ops are accurate enough for
> > half_* (8192 ulps). I've tested this on Carrizo, and AFAIK it should be
> > generally OK for both GCN and EG+.**
> > It'd be safer to redirect half_ ops to the full ops and add per-target
> > overrides, but since I expect both nvidia and amdgpu to have those
> > overrides, it'd be just a bunch of dead code in the generic directory.
> 
> Maybe. Then again, no one is currently testing this on Nvidia.
> 
> In general I would be worried about edge cases, but these are
> apparently fine on AMD platforms.

I took a look at the CUDA math intrinsics[0], which should give us an idea
of the error bounds of the underlying PTX opcodes:
sqrt, rsqrt, recip look to be properly rounded
divide, log, log10, log2 look to be OK for half_ and even regular ops
exp, exp10 are not good enough
sin, cos: can't tell

I can add special overloads for the last 4 ops (exp, exp10, sin, cos) for nvptx.

Jan

[0] http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#intrinsic-functions

> 
> Jeroen
> 
> > 
> > Jan
> > 
> > **cos/sin/tan have issues with large inputs, but I think that can be
> > fixed in llvm by improving the initial scaling op.
> > 
> > > 
> > > Jeroen
> > > 
> > > > On 4 Nov 2017, at 02:32, Jan Vesely via Libclc-dev <libclc-dev at lists.llvm.org> wrote:
> > > > 
> > > > Passes CTS on carrizo
> > > > 
> > > > Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
> > > > ---
> > > > generic/lib/math/half_native_unary.inc | 11 +++++++++++
> > > > generic/lib/math/half_rsqrt.cl         | 26 ++------------------------
> > > > generic/lib/math/half_rsqrt.inc        | 25 -------------------------
> > > > 3 files changed, 13 insertions(+), 49 deletions(-)
> > > > create mode 100644 generic/lib/math/half_native_unary.inc
> > > > delete mode 100644 generic/lib/math/half_rsqrt.inc
> > > > 
> > > > diff --git a/generic/lib/math/half_native_unary.inc b/generic/lib/math/half_native_unary.inc
> > > > new file mode 100644
> > > > index 0000000..3ab1fba
> > > > --- /dev/null
> > > > +++ b/generic/lib/math/half_native_unary.inc
> > > > @@ -0,0 +1,11 @@
> > > > +#include <utils.h>
> > > > +
> > > > +#define __CLC_HALF_FUNC(x) __CLC_CONCAT(half_, x)
> > > > +#define __CLC_NATIVE_FUNC(x) __CLC_CONCAT(native_, x)
> > > > +
> > > > +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __CLC_HALF_FUNC(__CLC_FUNC)(__CLC_GENTYPE val) {
> > > > +  return __CLC_NATIVE_FUNC(__CLC_FUNC)(val);
> > > > +}
> > > > +
> > > > +#undef __CLC_NATIVE_FUNC
> > > > +#undef __CLC_HALF_FUNC
> > > > diff --git a/generic/lib/math/half_rsqrt.cl b/generic/lib/math/half_rsqrt.cl
> > > > index 726f65c..2585911 100644
> > > > --- a/generic/lib/math/half_rsqrt.cl
> > > > +++ b/generic/lib/math/half_rsqrt.cl
> > > > @@ -1,28 +1,6 @@
> > > > -/*
> > > > - * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
> > > > - *
> > > > - * Permission is hereby granted, free of charge, to any person obtaining a copy
> > > > - * of this software and associated documentation files (the "Software"), to deal
> > > > - * in the Software without restriction, including without limitation the rights
> > > > - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> > > > - * copies of the Software, and to permit persons to whom the Software is
> > > > - * furnished to do so, subject to the following conditions:
> > > > - *
> > > > - * The above copyright notice and this permission notice shall be included in
> > > > - * all copies or substantial portions of the Software.
> > > > - *
> > > > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > > - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > > - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > > > - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > > - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> > > > - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> > > > - * THE SOFTWARE.
> > > > - */
> > > > -
> > > > #include <clc/clc.h>
> > > > 
> > > > -#define __CLC_BODY <half_rsqrt.inc>
> > > > +#define __CLC_FUNC rsqrt
> > > > +#define __CLC_BODY <half_native_unary.inc>
> > > > #define __FLOAT_ONLY
> > > > #include <clc/math/gentype.inc>
> > > > -#undef __FLOAT_ONLY
> > > > diff --git a/generic/lib/math/half_rsqrt.inc b/generic/lib/math/half_rsqrt.inc
> > > > deleted file mode 100644
> > > > index 33ce6c2..0000000
> > > > --- a/generic/lib/math/half_rsqrt.inc
> > > > +++ /dev/null
> > > > @@ -1,25 +0,0 @@
> > > > -/*
> > > > - * Copyright (c) 2014,2015 Advanced Micro Devices, Inc.
> > > > - *
> > > > - * Permission is hereby granted, free of charge, to any person obtaining a copy
> > > > - * of this software and associated documentation files (the "Software"), to deal
> > > > - * in the Software without restriction, including without limitation the rights
> > > > - * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> > > > - * copies of the Software, and to permit persons to whom the Software is
> > > > - * furnished to do so, subject to the following conditions:
> > > > - *
> > > > - * The above copyright notice and this permission notice shall be included in
> > > > - * all copies or substantial portions of the Software.
> > > > - *
> > > > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > > > - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > > > - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > > > - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > > > - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> > > > - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> > > > - * THE SOFTWARE.
> > > > - */
> > > > -
> > > > -_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE half_rsqrt(__CLC_GENTYPE val) {
> > > > -  return rsqrt(val);
> > > > -}
> > > > -- 
> > > > 2.13.6
> > > > 
> > > > _______________________________________________
> > > > Libclc-dev mailing list
> > > > Libclc-dev at lists.llvm.org
> > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
> > > 
> > > 
> 