<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><br><div><div>On Sep 5, 2014, at 1:57 PM, Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu">jan.vesely@rutgers.edu</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">On Thu, 2014-09-04 at 12:35 -0500, Aaron Watry wrote:<br><blockquote type="cite">Uses the algorithm:<br>tan(x) = sin(x) / sqrt(1-sin^2(x))<br><br>An alternative is:<br>tan(x) = sin(x) / cos(x)<br><br>Which produces more verbose bitcode and longer assembly.<br></blockquote><br>this is weird. both EG and SI have both sin and cos instructions. Is the<br>input normalization code so bad that we are better off doing MUL+SUB+SQRT<br>instead?<br></div></blockquote><div><br></div><div>Those instructions are only useful for native_sin / native_cos. For the standard functions, they are far from precise enough. 
The current (float) sin implementation should be correct, though native_sin is currently still defined as the regular sin function rather than the LLVM intrinsic.</div><div><br></div><br><blockquote type="cite"><div style="font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br><blockquote type="cite"><br>Either way, the generated bitcode seems pretty nasty and a more optimized<br>but still precise-enough solution is welcome.<br><br>Signed-off-by: Aaron Watry <<a href="mailto:awatry@gmail.com">awatry@gmail.com</a>><br>---<br>generic/include/clc/clc.h | 1 +<br>generic/include/clc/math/tan.h | 2 ++<br>generic/include/clc/math/tan.inc | 1 +<br>generic/lib/SOURCES | 1 +<br>generic/lib/math/tan.cl | 8 ++++++++<br>generic/lib/math/tan.inc | 8 ++++++++<br>6 files changed, 21 insertions(+)<br>create mode 100644 generic/include/clc/math/tan.h<br>create mode 100644 generic/include/clc/math/tan.inc<br>create mode 100644 generic/lib/math/tan.cl<br>create mode 100644 generic/lib/math/tan.inc<br><br>diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h<br>index 079c674..69e44b6 100644<br>--- a/generic/include/clc/clc.h<br>+++ b/generic/include/clc/clc.h<br>@@ -60,6 +60,7 @@<br>#include <clc/math/sin.h><br>#include <clc/math/sincos.h><br>#include <clc/math/sqrt.h><br>+#include <clc/math/tan.h><br>#include <clc/math/trunc.h><br>#include <clc/math/native_cos.h><br>#include <clc/math/native_divide.h><br>diff --git a/generic/include/clc/math/tan.h b/generic/include/clc/math/tan.h<br>new file mode 100644<br>index 0000000..d2d52a9<br>--- /dev/null<br>+++ b/generic/include/clc/math/tan.h<br>@@ -0,0 +1,2 @@<br>+#define __CLC_BODY <clc/math/tan.inc><br>+#include <clc/math/gentype.inc><br>diff --git a/generic/include/clc/math/tan.inc 
b/generic/include/clc/math/tan.inc<br>new file mode 100644<br>index 0000000..50c5b1d<br>--- /dev/null<br>+++ b/generic/include/clc/math/tan.inc<br>@@ -0,0 +1 @@<br>+_CLC_OVERLOAD _CLC_DECL __CLC_GENTYPE tan(__CLC_GENTYPE x);<br>diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES<br>index 30e182f..0667d14 100644<br>--- a/generic/lib/SOURCES<br>+++ b/generic/lib/SOURCES<br>@@ -48,6 +48,7 @@ math/pown.cl<br>math/sin.cl<br>math/sincos.cl<br>math/sincos_helpers.cl<br>+math/tan.cl<br>relational/all.cl<br>relational/any.cl<br>relational/isequal.cl<br>diff --git a/generic/lib/math/tan.cl b/generic/lib/math/tan.cl<br>new file mode 100644<br>index 0000000..a447999<br>--- /dev/null<br>+++ b/generic/lib/math/tan.cl<br>@@ -0,0 +1,8 @@<br>+#include <clc/clc.h><br>+<br>+#ifdef cl_khr_fp64<br>+#pragma OPENCL EXTENSION cl_khr_fp64 : enable<br>+#endif<br>+<br>+#define __CLC_BODY <tan.inc><br>+#include <clc/math/gentype.inc><br>diff --git a/generic/lib/math/tan.inc b/generic/lib/math/tan.inc<br>new file mode 100644<br>index 0000000..8d9d9fe<br>--- /dev/null<br>+++ b/generic/lib/math/tan.inc<br>@@ -0,0 +1,8 @@<br>+/*<br>+ * Note: tan(x) = sin(x)/cos(x) also, but the final assembly ends up being<br>+ * twice as long for R600 (maybe for others as well).<br>+ */<br>+_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE tan(__CLC_GENTYPE x) {<br>+ __CLC_GENTYPE sinx = sin(x);<br>+ return sinx / sqrt( (__CLC_GENTYPE) 1.0 - (sinx*sinx) );<br>+}<br></blockquote><br>--<span class="Apple-converted-space"> </span><br>Jan Vesely <<a href="mailto:jan.vesely@rutgers.edu">jan.vesely@rutgers.edu</a>><br>_______________________________________________<br>Libclc-dev mailing list<br><a href="mailto:Libclc-dev@pcc.me.uk">Libclc-dev@pcc.me.uk</a><br><a href="http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev">http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev</a></div></blockquote></div><br></body></html>