[Libclc-dev] [PATCH 1/8] Fix implementation of length builtin
Aaron Watry
awatry at gmail.com
Tue Mar 10 10:08:39 PDT 2015
On Fri, Mar 6, 2015 at 7:01 PM, Tom Stellard <thomas.stellard at amd.com>
wrote:
> The new implementation was ported from the AMD builtin library
> and has been tested with piglit, OpenCV, and the ocl conformance tests.
> ---
> generic/lib/geometric/length.cl | 120
> ++++++++++++++++++++++++++++++++++++++-
> generic/lib/geometric/length.inc | 3 -
> 2 files changed, 117 insertions(+), 6 deletions(-)
> delete mode 100644 generic/lib/geometric/length.inc
>
> diff --git a/generic/lib/geometric/length.cl b/generic/lib/geometric/
> length.cl
> index ef087c7..3037372 100644
> --- a/generic/lib/geometric/length.cl
> +++ b/generic/lib/geometric/length.cl
> @@ -1,8 +1,122 @@
> +/*
> + * Copyright (c) 2014 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining
> a copy
> + * of this software and associated documentation files (the "Software"),
> to deal
> + * in the Software without restriction, including without limitation the
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or
> sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be
> included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
> SHALL THE
> + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> IN
> + * THE SOFTWARE.
> + */
> +
> #include <clc/clc.h>
>
> +_CLC_OVERLOAD _CLC_DEF float length(float p) {
> + return fabs(p);
> +}
> +
> +_CLC_OVERLOAD _CLC_DEF float length(float2 p) {
> + float l2 = dot(p, p);
> +
> + if (l2 < FLT_MIN) {
> + p *= 0x1.0p+86F;
> + return sqrt(dot(p, p)) * 0x1.0p-86F;
> + } else if (l2 == INFINITY) {
> + p *= 0x1.0p-65F;
> + return sqrt(dot(p, p)) * 0x1.0p+65F;
> + }
> +
> + return sqrt(l2);
> +}
>
I'm assuming that the FLT_MIN/INFINITY cases are correct here. It's
definitely a good thing to correct the scalar float implementation.
The only suggestion that I have here is to consider combining the
float2/float3/float4 into a single macro that is invoked 3 times (and then
the same for the double version).
It's not necessary, as that can be cleaned up later if needed.
More comments, later...
> +
> +_CLC_OVERLOAD _CLC_DEF float length(float3 p) {
> + float l2 = dot(p, p);
> +
> + if (l2 < FLT_MIN) {
> + p *= 0x1.0p+86F;
> + return sqrt(dot(p, p)) * 0x1.0p-86F;
> + } else if (l2 == INFINITY) {
> + p *= 0x1.0p-66F;
> + return sqrt(dot(p, p)) * 0x1.0p+66F;
>
float2 uses p[+-]65F, but float3/float4 use p[+-]66F. Is this correct?
> + }
> +
> + return sqrt(l2);
> +}
> +
> +_CLC_OVERLOAD _CLC_DEF float length(float4 p) {
> + float l2 = dot(p, p);
> +
> + if (l2 < FLT_MIN) {
> + p *= 0x1.0p+86F;
> + return sqrt(dot(p, p)) * 0x1.0p-86F;
> + } else if (l2 == INFINITY) {
> + p *= 0x1.0p-66f;
> + return sqrt(dot(p, p)) * 0x1.0p+66F;
> + }
> + return sqrt(l2);
> +}
> +
> #ifdef cl_khr_fp64
> #pragma OPENCL EXTENSION cl_khr_fp64 : enable
> -#endif
>
> -#define __CLC_BODY <length.inc>
> -#include <clc/geometric/floatn.inc>
> +_CLC_OVERLOAD _CLC_DEF double length(double p){
> + return fabs(p);
> +}
> +
> +_CLC_OVERLOAD _CLC_DEF double length(double2 p) {
> + double l2 = dot(p, p);
> +
> + if (l2 < DBL_MIN) {
> + p *= 0x1.0p+563;
> + return sqrt(dot(p, p)) * 0x1.0p-563;
> + } else if (l2 == INFINITY) {
> + p *= 0x1.0p-513;
> + return sqrt(dot(p, p)) * 0x1.0p+513;
> + }
> +
> + return sqrt(l2);
> +}
> +
> +_CLC_OVERLOAD _CLC_DEF double length(double3 p) {
> + double l2 = dot(p, p);
> +
> + if (l2 < DBL_MIN) {
> + p *= 0x1.0p+563;
> + return sqrt(dot(p, p)) * 0x1.0p-563;
> + } else if (l2 == INFINITY) {
> + p *= 0x1.0p-514;
> + return sqrt(dot(p, p)) * 0x1.0p+514;
>
The double2 version used -513/+513, but double3/double4 use +514/-514. Is
one of these wrong?
The style question at the top, I don't care too much about (combining the
vector versions into a macro), but I would like to see if we can get an
answer about the correctness of the various exponents being used for the
float/double vector versions. It just doesn't seem like they can all be
right from a first glance.
--Aaron
> + }
> +
> + return sqrt(l2);
> +}
> +
> +_CLC_OVERLOAD _CLC_DEF double
> +length(double4 p)
> +{
> + double l2 = dot(p, p);
> +
> + if (l2 < DBL_MIN) {
> + p *= 0x1.0p+563;
> + return sqrt(dot(p, p)) * 0x1.0p-563;
> + }
> + else if (l2 == INFINITY) {
> + p *= 0x1.0p-514;
> + return sqrt(dot(p, p)) * 0x1.0p+514;
> + }
> +
> + return sqrt(l2);
> +}
> +
> +#endif
> diff --git a/generic/lib/geometric/length.inc
> b/generic/lib/geometric/length.inc
> deleted file mode 100644
> index 5faaaff..0000000
> --- a/generic/lib/geometric/length.inc
> +++ /dev/null
> @@ -1,3 +0,0 @@
> -_CLC_OVERLOAD _CLC_DEF __CLC_FLOAT length(__CLC_FLOATN p) {
> - return native_sqrt(dot(p, p));
> -}
> --
> 2.0.4
>
>
> _______________________________________________
> Libclc-dev mailing list
> Libclc-dev at pcc.me.uk
> http://www.pcc.me.uk/cgi-bin/mailman/listinfo/libclc-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/libclc-dev/attachments/20150310/cc289aa7/attachment-0001.html>
More information about the Libclc-dev
mailing list