<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 6, 2015 at 7:01 PM, Tom Stellard <span dir="ltr"><<a href="mailto:thomas.stellard@amd.com" target="_blank">thomas.stellard@amd.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The new implementation was ported from the AMD builtin library<br>
and has been tested with piglit, OpenCV, and the ocl conformance tests.<br>
---<br>
generic/lib/geometric/length.cl | 120 ++++++++++++++++++++++++++++++++++++++-<br>
generic/lib/geometric/length.inc | 3 -<br>
2 files changed, 117 insertions(+), 6 deletions(-)<br>
delete mode 100644 generic/lib/geometric/length.inc<br>
<br>
diff --git a/generic/lib/geometric/length.cl b/generic/lib/geometric/length.cl<br>
index ef087c7..3037372 100644<br>
--- a/generic/lib/geometric/length.cl<br>
+++ b/generic/lib/geometric/length.cl<br>
@@ -1,8 +1,122 @@<br>
+/*<br>
+ * Copyright (c) 2014 Advanced Micro Devices, Inc.<br>
+ *<br>
+ * Permission is hereby granted, free of charge, to any person obtaining a copy<br>
+ * of this software and associated documentation files (the "Software"), to deal<br>
+ * in the Software without restriction, including without limitation the rights<br>
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell<br>
+ * copies of the Software, and to permit persons to whom the Software is<br>
+ * furnished to do so, subject to the following conditions:<br>
+ *<br>
+ * The above copyright notice and this permission notice shall be included in<br>
+ * all copies or substantial portions of the Software.<br>
+ *<br>
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE<br>
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,<br>
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN<br>
+ * THE SOFTWARE.<br>
+ */<br>
+<br>
#include <clc/clc.h><br>
<br>
+_CLC_OVERLOAD _CLC_DEF float length(float p) {<br>
+ return fabs(p);<br>
+}<br>
+<br>
+_CLC_OVERLOAD _CLC_DEF float length(float2 p) {<br>
+ float l2 = dot(p, p);<br>
+<br>
+ if (l2 < FLT_MIN) {<br>
+ p *= 0x1.0p+86F;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-86F;<br>
+ } else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-65F;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+65F;<br>
+ }<br>
+<br>
+ return sqrt(l2);<br>
+}<br></blockquote><div><br></div><div>I'm assuming the FLT_MIN/INFINITY cases are correct here. Fixing the scalar float implementation is definitely a good thing.<br><br></div><div>My only suggestion is to consider folding the float2/float3/float4 versions into a single macro that is invoked three times (and likewise for the double versions).<br><br></div><div>That isn't required for this patch; it can be cleaned up later if needed.<br><br></div><div>More comments below.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+_CLC_OVERLOAD _CLC_DEF float length(float3 p) {<br>
+ float l2 = dot(p, p);<br>
+<br>
+ if (l2 < FLT_MIN) {<br>
+ p *= 0x1.0p+86F;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-86F;<br>
+ } else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-66F;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+66F;<br></blockquote><div><br></div><div>float2 scales by 0x1.0p-65F and 0x1.0p+65F, but float3/float4 use 0x1.0p-66F and 0x1.0p+66F. Is that difference intentional?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ }<br>
+<br>
+ return sqrt(l2);<br>
+}<br>
+<br>
+_CLC_OVERLOAD _CLC_DEF float length(float4 p) {<br>
+ float l2 = dot(p, p);<br>
+<br>
+ if (l2 < FLT_MIN) {<br>
+ p *= 0x1.0p+86F;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-86F;<br>
+ } else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-66f;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+66F;<br>
+ }<br>
+ return sqrt(l2);<br>
+}<br>
+<br>
#ifdef cl_khr_fp64<br>
#pragma OPENCL EXTENSION cl_khr_fp64 : enable<br>
-#endif<br>
<br>
-#define __CLC_BODY <length.inc><br>
-#include <clc/geometric/floatn.inc><br>
+_CLC_OVERLOAD _CLC_DEF double length(double p){<br>
+ return fabs(p);<br>
+}<br>
+<br>
+_CLC_OVERLOAD _CLC_DEF double length(double2 p) {<br>
+ double l2 = dot(p, p);<br>
+<br>
+ if (l2 < DBL_MIN) {<br>
+ p *= 0x1.0p+563;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-563;<br>
+ } else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-513;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+513;<br>
+ }<br>
+<br>
+ return sqrt(l2);<br>
+}<br>
+<br>
+_CLC_OVERLOAD _CLC_DEF double length(double3 p) {<br>
+ double l2 = dot(p, p);<br>
+<br>
+ if (l2 < DBL_MIN) {<br>
+ p *= 0x1.0p+563;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-563;<br>
+ } else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-514;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+514;<br></blockquote><br>The double2 version uses 0x1.0p-513 and 0x1.0p+513, but double3/double4 use 0x1.0p-514 and 0x1.0p+514. Is one of these wrong?<br><br></div><div class="gmail_quote">I don't care much about the style point at the top (combining the vector versions into a macro), but I would like an answer on the correctness of the exponents used in the float and double vector versions. At first glance, it doesn't seem that they can all be right.<br></div><div class="gmail_quote"><div><br></div><div>--Aaron<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+ }<br>
+<br>
+ return sqrt(l2);<br>
+}<br>
+<br>
+_CLC_OVERLOAD _CLC_DEF double<br>
+length(double4 p)<br>
+{<br>
+ double l2 = dot(p, p);<br>
+<br>
+ if (l2 < DBL_MIN) {<br>
+ p *= 0x1.0p+563;<br>
+ return sqrt(dot(p, p)) * 0x1.0p-563;<br>
+ }<br>
+ else if (l2 == INFINITY) {<br>
+ p *= 0x1.0p-514;<br>
+ return sqrt(dot(p, p)) * 0x1.0p+514;<br>
+ }<br>
+<br>
+ return sqrt(l2);<br>
+}<br>
+<br>
+#endif<br>
diff --git a/generic/lib/geometric/length.inc b/generic/lib/geometric/length.inc<br>
deleted file mode 100644<br>
index 5faaaff..0000000<br>
--- a/generic/lib/geometric/length.inc<br>
+++ /dev/null<br>
@@ -1,3 +0,0 @@<br>
-_CLC_OVERLOAD _CLC_DEF __CLC_FLOAT length(__CLC_FLOATN p) {<br>
- return native_sqrt(dot(p, p));<br>
-}<br>
<span class="HOEnZb"><font color="#888888">--<br>
2.0.4<br>
</font></span></blockquote></div><br></div></div>