[Libclc-dev] [PATCH 2/2] math: Add tan implementation

Jan Vesely jan.vesely at rutgers.edu
Fri Sep 5 16:19:08 PDT 2014


On Fri, 2014-09-05 at 14:26 -0400, Matt Arsenault wrote:
> On Sep 5, 2014, at 1:57 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> 
> > On Thu, 2014-09-04 at 12:35 -0500, Aaron Watry wrote:
> >> Uses the algorithm:
> >> tan(x) = sin(x) / sqrt(1-sin^2(x))
> >> 
> >> An alternative is:
> >> tan(x) = sin(x) / cos(x)
> >> 
> >> Which produces more verbose bitcode and longer assembly.
> > 
> > this is weird. both EG and SI have both sin and cos instructions. Is the
> > input normalization code so bad that we are better of doing MUL+SUB+SQRT
> > instead?
> 
> Those are only useful for native_sin / native_cos. For the standard
> function, they are far from precise enough. The current (float) sin
> implementation should be correct, though native_sin right now is still
> defined to just be the regular sin function instead of the LLVM
> intrinsic

oh I didn't know the hw implementaion was so imprecise. In that case it
makes sense. Although I wonder why it ended up needing twice as many
instructions. it looks to me that sin and cos don't differ in more than
4 operations, so CSE should have eliminated most of it.
either way it's not going to be more efficient than this patch.

LGTM

> 
> 
> > 
> >> 
> >> Either way, the generated bitcode seems pretty nasty and a more optimized
> >> but still precise-enough solution is welcome.
> >> 
> >> Signed-off-by: Aaron Watry <awatry at gmail.com>
> >> ---
> >> generic/include/clc/clc.h        | 1 +
> >> generic/include/clc/math/tan.h   | 2 ++
> >> generic/include/clc/math/tan.inc | 1 +
> >> generic/lib/SOURCES              | 1 +
> >> generic/lib/math/tan.cl          | 8 ++++++++
> >> generic/lib/math/tan.inc         | 8 ++++++++
> >> 6 files changed, 21 insertions(+)
> >> create mode 100644 generic/include/clc/math/tan.h
> >> create mode 100644 generic/include/clc/math/tan.inc
> >> create mode 100644 generic/lib/math/tan.cl
> >> create mode 100644 generic/lib/math/tan.inc
> >> 
> >> diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
> >> index 079c674..69e44b6 100644
> >> --- a/generic/include/clc/clc.h
> >> +++ b/generic/include/clc/clc.h
> >> @@ -60,6 +60,7 @@
> >> #include <clc/math/sin.h>
> >> #include <clc/math/sincos.h>
> >> #include <clc/math/sqrt.h>
> >> +#include <clc/math/tan.h>
> >> #include <clc/math/trunc.h>
> >> #include <clc/math/native_cos.h>
> >> #include <clc/math/native_divide.h>
> >> diff --git a/generic/include/clc/math/tan.h b/generic/include/clc/math/tan.h
> >> new file mode 100644
> >> index 0000000..d2d52a9
> >> --- /dev/null
> >> +++ b/generic/include/clc/math/tan.h
> >> @@ -0,0 +1,2 @@
> >> +#define __CLC_BODY <clc/math/tan.inc>
> >> +#include <clc/math/gentype.inc>
> >> diff --git a/generic/include/clc/math/tan.inc b/generic/include/clc/math/tan.inc
> >> new file mode 100644
> >> index 0000000..50c5b1d
> >> --- /dev/null
> >> +++ b/generic/include/clc/math/tan.inc
> >> @@ -0,0 +1 @@
> >> +_CLC_OVERLOAD _CLC_DECL __CLC_GENTYPE tan(__CLC_GENTYPE x);
> >> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
> >> index 30e182f..0667d14 100644
> >> --- a/generic/lib/SOURCES
> >> +++ b/generic/lib/SOURCES
> >> @@ -48,6 +48,7 @@ math/pown.cl
> >> math/sin.cl
> >> math/sincos.cl
> >> math/sincos_helpers.cl
> >> +math/tan.cl
> >> relational/all.cl
> >> relational/any.cl
> >> relational/isequal.cl
> >> diff --git a/generic/lib/math/tan.cl b/generic/lib/math/tan.cl
> >> new file mode 100644
> >> index 0000000..a447999
> >> --- /dev/null
> >> +++ b/generic/lib/math/tan.cl
> >> @@ -0,0 +1,8 @@
> >> +#include <clc/clc.h>
> >> +
> >> +#ifdef cl_khr_fp64
> >> +#pragma OPENCL EXTENSION cl_khr_fp64 : enable
> >> +#endif
> >> +
> >> +#define __CLC_BODY <tan.inc>
> >> +#include <clc/math/gentype.inc>
> >> diff --git a/generic/lib/math/tan.inc b/generic/lib/math/tan.inc
> >> new file mode 100644
> >> index 0000000..8d9d9fe
> >> --- /dev/null
> >> +++ b/generic/lib/math/tan.inc
> >> @@ -0,0 +1,8 @@
> >> +/*
> >> + * Note: tan(x) = sin(x)/cos(x) also, but the final assembly ends up being
> >> + *       twice as long for R600 (maybe for others as well).
> >> + */
> >> +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE tan(__CLC_GENTYPE x) {
> >> +  __CLC_GENTYPE sinx = sin(x);
> >> +  return sinx / sqrt( (__CLC_GENTYPE) 1.0 - (sinx*sinx) );
> >> +}
> > 
> > -- 
> > Jan Vesely <jan.vesely at rutgers.edu>
> > _______________________________________________
> > Libclc-dev mailing list
> > Libclc-dev at pcc.me.uk
> > http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
> 

-- 
Jan Vesely <jan.vesely at rutgers.edu>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL: <http://lists.llvm.org/pipermail/libclc-dev/attachments/20140905/a269acc3/attachment.sig>


More information about the Libclc-dev mailing list