[Libclc-dev] [PATCH 2/2] math: Add tan implementation

Aaron Watry awatry at gmail.com
Fri Sep 5 16:35:20 PDT 2014


On Sep 5, 2014 6:19 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>
> On Fri, 2014-09-05 at 14:26 -0400, Matt Arsenault wrote: 
> > On Sep 5, 2014, at 1:57 PM, Jan Vesely <jan.vesely at rutgers.edu> wrote: 
> > 
> > > On Thu, 2014-09-04 at 12:35 -0500, Aaron Watry wrote: 
> > >> Uses the algorithm: 
> > >> tan(x) = sin(x) / sqrt(1-sin^2(x)) 
> > >> 
> > >> An alternative is: 
> > >> tan(x) = sin(x) / cos(x) 
> > >> 
> > >> Which produces more verbose bitcode and longer assembly. 
> > > 
> > > this is weird. both EG and SI have both sin and cos instructions. Is the 
> > > input normalization code so bad that we are better of doing MUL+SUB+SQRT 
> > > instead? 
> > 
> > Those are only useful for native_sin / native_cos. For the standard 
> > function, they are far from precise enough. The current (float) sin 
> > implementation should be correct, though native_sin right now is still 
> > defined to just be the regular sin function instead of the LLVM 
> > intrinsic 
>
> oh I didn't know the hw implementaion was so imprecise. In that case it 
> makes sense. Although I wonder why it ended up needing twice as many 
> instructions. it looks to me that sin and cos don't differ in more than 
> 4 operations, so CSE should have eliminated most of it. 
> either way it's not going to be more efficient than this patch. 
>
> LGTM

Thanks for the review.  I'm guessing that the CSE isn't recognizing the pattern correctly...  And then doing the sqrt, sub, and mul ends up being cheap in comparison to another sin or cos operation

--Aaron
>
> > 
> > 
> > > 
> > >> 
> > >> Either way, the generated bitcode seems pretty nasty and a more optimized 
> > >> but still precise-enough solution is welcome. 
> > >> 
> > >> Signed-off-by: Aaron Watry <awatry at gmail.com> 
> > >> --- 
> > >> generic/include/clc/clc.h        | 1 + 
> > >> generic/include/clc/math/tan.h   | 2 ++ 
> > >> generic/include/clc/math/tan.inc | 1 + 
> > >> generic/lib/SOURCES              | 1 + 
> > >> generic/lib/math/tan.cl          | 8 ++++++++ 
> > >> generic/lib/math/tan.inc         | 8 ++++++++ 
> > >> 6 files changed, 21 insertions(+) 
> > >> create mode 100644 generic/include/clc/math/tan.h 
> > >> create mode 100644 generic/include/clc/math/tan.inc 
> > >> create mode 100644 generic/lib/math/tan.cl 
> > >> create mode 100644 generic/lib/math/tan.inc 
> > >> 
> > >> diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h 
> > >> index 079c674..69e44b6 100644 
> > >> --- a/generic/include/clc/clc.h 
> > >> +++ b/generic/include/clc/clc.h 
> > >> @@ -60,6 +60,7 @@ 
> > >> #include <clc/math/sin.h> 
> > >> #include <clc/math/sincos.h> 
> > >> #include <clc/math/sqrt.h> 
> > >> +#include <clc/math/tan.h> 
> > >> #include <clc/math/trunc.h> 
> > >> #include <clc/math/native_cos.h> 
> > >> #include <clc/math/native_divide.h> 
> > >> diff --git a/generic/include/clc/math/tan.h b/generic/include/clc/math/tan.h 
> > >> new file mode 100644 
> > >> index 0000000..d2d52a9 
> > >> --- /dev/null 
> > >> +++ b/generic/include/clc/math/tan.h 
> > >> @@ -0,0 +1,2 @@ 
> > >> +#define __CLC_BODY <clc/math/tan.inc> 
> > >> +#include <clc/math/gentype.inc> 
> > >> diff --git a/generic/include/clc/math/tan.inc b/generic/include/clc/math/tan.inc 
> > >> new file mode 100644 
> > >> index 0000000..50c5b1d 
> > >> --- /dev/null 
> > >> +++ b/generic/include/clc/math/tan.inc 
> > >> @@ -0,0 +1 @@ 
> > >> +_CLC_OVERLOAD _CLC_DECL __CLC_GENTYPE tan(__CLC_GENTYPE x); 
> > >> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES 
> > >> index 30e182f..0667d14 100644 
> > >> --- a/generic/lib/SOURCES 
> > >> +++ b/generic/lib/SOURCES 
> > >> @@ -48,6 +48,7 @@ math/pown.cl 
> > >> math/sin.cl 
> > >> math/sincos.cl 
> > >> math/sincos_helpers.cl 
> > >> +math/tan.cl 
> > >> relational/all.cl 
> > >> relational/any.cl 
> > >> relational/isequal.cl 
> > >> diff --git a/generic/lib/math/tan.cl b/generic/lib/math/tan.cl 
> > >> new file mode 100644 
> > >> index 0000000..a447999 
> > >> --- /dev/null 
> > >> +++ b/generic/lib/math/tan.cl 
> > >> @@ -0,0 +1,8 @@ 
> > >> +#include <clc/clc.h> 
> > >> + 
> > >> +#ifdef cl_khr_fp64 
> > >> +#pragma OPENCL EXTENSION cl_khr_fp64 : enable 
> > >> +#endif 
> > >> + 
> > >> +#define __CLC_BODY <tan.inc> 
> > >> +#include <clc/math/gentype.inc> 
> > >> diff --git a/generic/lib/math/tan.inc b/generic/lib/math/tan.inc 
> > >> new file mode 100644 
> > >> index 0000000..8d9d9fe 
> > >> --- /dev/null 
> > >> +++ b/generic/lib/math/tan.inc 
> > >> @@ -0,0 +1,8 @@ 
> > >> +/* 
> > >> + * Note: tan(x) = sin(x)/cos(x) also, but the final assembly ends up being 
> > >> + *       twice as long for R600 (maybe for others as well). 
> > >> + */ 
> > >> +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE tan(__CLC_GENTYPE x) { 
> > >> +  __CLC_GENTYPE sinx = sin(x); 
> > >> +  return sinx / sqrt( (__CLC_GENTYPE) 1.0 - (sinx*sinx) ); 
> > >> +} 
> > > 
> > > -- 
> > > Jan Vesely <jan.vesely at rutgers.edu> 
> > > _______________________________________________ 
> > > Libclc-dev mailing list 
> > > Libclc-dev at pcc.me.uk 
> > > http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev 
> > 
>
> -- 
> Jan Vesely <jan.vesely at rutgers.edu> 


More information about the Libclc-dev mailing list