<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 6, 2015 at 7:01 PM, Tom Stellard <span dir="ltr"><<a href="mailto:thomas.stellard@amd.com" target="_blank">thomas.stellard@amd.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The new implementation was ported from the AMD builtin library<br>

and has been tested with piglit, OpenCV, and the ocl conformance tests.<br>

---<br>

 generic/lib/geometric/<a href="http://length.cl" target="_blank">length.cl</a>  | 120 ++++++++++++++++++++++++++++++++++++++-<br>

 generic/lib/geometric/length.inc |   3 -<br>

 2 files changed, 117 insertions(+), 6 deletions(-)<br>

 delete mode 100644 generic/lib/geometric/length.inc<br>

<br>

diff --git a/generic/lib/geometric/<a href="http://length.cl" target="_blank">length.cl</a> b/generic/lib/geometric/<a href="http://length.cl" target="_blank">length.cl</a><br>

index ef087c7..3037372 100644<br>

--- a/generic/lib/geometric/<a href="http://length.cl" target="_blank">length.cl</a><br>

+++ b/generic/lib/geometric/<a href="http://length.cl" target="_blank">length.cl</a><br>

@@ -1,8 +1,122 @@<br>

+/*<br>

+ * Copyright (c) 2014 Advanced Micro Devices, Inc.<br>

+ *<br>

+ * Permission is hereby granted, free of charge, to any person obtaining a copy<br>

+ * of this software and associated documentation files (the "Software"), to deal<br>

+ * in the Software without restriction, including without limitation the rights<br>

+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell<br>

+ * copies of the Software, and to permit persons to whom the Software is<br>

+ * furnished to do so, subject to the following conditions:<br>

+ *<br>

+ * The above copyright notice and this permission notice shall be included in<br>

+ * all copies or substantial portions of the Software.<br>

+ *<br>

+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR<br>

+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,<br>

+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE<br>

+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER<br>

+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,<br>

+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN<br>

+ * THE SOFTWARE.<br>

+ */<br>

+<br>

 #include <clc/clc.h><br>

<br>

+_CLC_OVERLOAD _CLC_DEF float length(float p) {<br>

+  return fabs(p);<br>

+}<br>

+<br>

+_CLC_OVERLOAD _CLC_DEF float length(float2 p) {<br>

+  float l2 = dot(p, p);<br>

+<br>

+  if (l2 < FLT_MIN) {<br>

+    p *= 0x1.0p+86F;<br>

+    return sqrt(dot(p, p)) * 0x1.0p-86F;<br>

+  } else if (l2 == INFINITY) {<br>

+    p *= 0x1.0p-65F;<br>

+    return sqrt(dot(p, p)) * 0x1.0p+65F;<br>

+  }<br>

+<br>

+  return sqrt(l2);<br>

+}<br></blockquote><div><br></div><div>I'm assuming the FLT_MIN/INFINITY branches are correct here.  Correcting the scalar float implementation is definitely a good thing.<br><br></div><div>My only suggestion is to consider folding the float2/float3/float4 overloads into a single macro that is invoked three times (and likewise for the double versions).<br><br></div><div>That isn't required for this patch; it can be cleaned up later if needed.<br><br></div><div>More comments below.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

+<br>

+_CLC_OVERLOAD _CLC_DEF float length(float3 p) {<br>

+  float l2 = dot(p, p);<br>

+<br>

+  if (l2 < FLT_MIN) {<br>

+    p *= 0x1.0p+86F;<br>

+    return sqrt(dot(p, p)) * 0x1.0p-86F;<br>

+  } else if (l2 == INFINITY) {<br>

+    p *= 0x1.0p-66F;<br>

+    return sqrt(dot(p, p)) * 0x1.0p+66F;<br></blockquote><div><br></div><div>float2 scales by 0x1.0p-65F/0x1.0p+65F, but float3/float4 use 0x1.0p-66F/0x1.0p+66F.  Is the difference intentional?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

+  }<br>

+<br>

+  return sqrt(l2);<br>

+}<br>

+<br>

+_CLC_OVERLOAD _CLC_DEF float length(float4 p) {<br>

+  float l2 = dot(p, p);<br>

+<br>

+  if (l2 < FLT_MIN) {<br>

+    p *= 0x1.0p+86F;<br>

+    return sqrt(dot(p, p)) * 0x1.0p-86F;<br>

+  } else if (l2 == INFINITY) {<br>

+    p *= 0x1.0p-66f;<br>

+    return sqrt(dot(p, p)) * 0x1.0p+66F;<br>

+  }<br>

+  return sqrt(l2);<br>

+}<br>

+<br>

 #ifdef cl_khr_fp64<br>

 #pragma OPENCL EXTENSION cl_khr_fp64 : enable<br>

-#endif<br>

<br>

-#define __CLC_BODY <length.inc><br>

-#include <clc/geometric/floatn.inc><br>

+_CLC_OVERLOAD _CLC_DEF double length(double p){<br>

+  return fabs(p);<br>

+}<br>

+<br>

+_CLC_OVERLOAD _CLC_DEF double length(double2 p) {<br>

+  double l2 = dot(p, p);<br>

+<br>

+  if (l2 < DBL_MIN) {<br>

+      p *= 0x1.0p+563;<br>

+      return sqrt(dot(p, p)) * 0x1.0p-563;<br>

+  } else if (l2 == INFINITY) {<br>

+      p *= 0x1.0p-513;<br>

+      return sqrt(dot(p, p)) * 0x1.0p+513;<br>

+  }<br>

+<br>

+  return sqrt(l2);<br>

+}<br>

+<br>

+_CLC_OVERLOAD _CLC_DEF double length(double3 p) {<br>

+  double l2 = dot(p, p);<br>

+<br>

+  if (l2 < DBL_MIN) {<br>

+      p *= 0x1.0p+563;<br>

+      return sqrt(dot(p, p)) * 0x1.0p-563;<br>

+  } else if (l2 == INFINITY) {<br>

+      p *= 0x1.0p-514;<br>

+      return sqrt(dot(p, p)) * 0x1.0p+514;<br></blockquote><br>The double2 version uses 0x1.0p-513/0x1.0p+513, but double3/double4 use 0x1.0p-514/0x1.0p+514.  Is one of these wrong?<br><br></div><div class="gmail_quote">I don't care much about the style question at the top (combining the vector versions into a macro), but I would like an answer on the correctness of the exponents used in the float and double vector versions.  At first glance, it doesn't seem like they can all be right.<br></div><div class="gmail_quote"><div><br></div><div>--Aaron<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

+  }<br>

+<br>

+  return sqrt(l2);<br>

+}<br>

+<br>

+_CLC_OVERLOAD _CLC_DEF double<br>

+length(double4 p)<br>

+{<br>

+    double l2 = dot(p, p);<br>

+<br>

+    if (l2 < DBL_MIN) {<br>

+        p *= 0x1.0p+563;<br>

+        return sqrt(dot(p, p)) * 0x1.0p-563;<br>

+    }<br>

+    else if (l2 == INFINITY) {<br>

+        p *= 0x1.0p-514;<br>

+        return sqrt(dot(p, p)) * 0x1.0p+514;<br>

+    }<br>

+<br>

+    return sqrt(l2);<br>

+}<br>

+<br>

+#endif<br>

diff --git a/generic/lib/geometric/length.inc b/generic/lib/geometric/length.inc<br>

deleted file mode 100644<br>

index 5faaaff..0000000<br>

--- a/generic/lib/geometric/length.inc<br>

+++ /dev/null<br>

@@ -1,3 +0,0 @@<br>

-_CLC_OVERLOAD _CLC_DEF __CLC_FLOAT length(__CLC_FLOATN p) {<br>

-  return native_sqrt(dot(p, p));<br>

-}<br>

<span class="HOEnZb"><font color="#888888">--<br>

2.0.4<br>

<br>

<br>

_______________________________________________<br>

Libclc-dev mailing list<br>

<a href="mailto:Libclc-dev@pcc.me.uk">Libclc-dev@pcc.me.uk</a><br>

<a href="http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev" target="_blank">http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev</a><br>

</font></span></blockquote></div><br></div></div>