[Libclc-dev] [PATCH] math: Add fmod implementation

Wed Sep 10 11:35:13 PDT 2014

On Wed, 2014-09-10 at 14:01 -0400, Matt Arsenault wrote:
> On Sep 10, 2014, at 1:46 PM, Aaron Watry <awatry at gmail.com> wrote:
> 
> > On Wed, Sep 10, 2014 at 12:17 PM, Matt Arsenault
> > <Matthew.Arsenault at amd.com> wrote:
> >> On 09/10/2014 11:59 AM, Aaron Watry wrote:
> >>> 
> >>> Passes piglit tests on evergreen (sent to piglit list).
> >>> 
> >>> Signed-off-by: Aaron Watry <awatry at gmail.com>
> >>> ---
> >>>  generic/include/clc/clc.h       |  1 +
> >>>  generic/include/clc/math/fmod.h |  7 +++++++
> >>>  generic/lib/SOURCES             |  1 +
> >>>  generic/lib/math/fmod.cl        | 15 +++++++++++++++
> >>>  4 files changed, 24 insertions(+)
> >>>  create mode 100644 generic/include/clc/math/fmod.h
> >>>  create mode 100644 generic/lib/math/fmod.cl
> >>> 
> >>> diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
> >>> index b8c1cb9..94557a1 100644
> >>> --- a/generic/include/clc/clc.h
> >>> +++ b/generic/include/clc/clc.h
> >>> @@ -47,6 +47,7 @@
> >>>  #include <clc/math/fma.h>
> >>>  #include <clc/math/fmax.h>
> >>>  #include <clc/math/fmin.h>
> >>> +#include <clc/math/fmod.h>
> >>>  #include <clc/math/hypot.h>
> >>>  #include <clc/math/log.h>
> >>>  #include <clc/math/log2.h>
> >>> diff --git a/generic/include/clc/math/fmod.h
> >>> b/generic/include/clc/math/fmod.h
> >>> new file mode 100644
> >>> index 0000000..737679f
> >>> --- /dev/null
> >>> +++ b/generic/include/clc/math/fmod.h
> >>> @@ -0,0 +1,7 @@
> >>> +#define __CLC_BODY <clc/math/binary_decl.inc>
> >>> +#define __CLC_FUNCTION fmod
> >>> +
> >>> +#include <clc/math/gentype.inc>
> >>> +
> >>> +#undef __CLC_BODY
> >>> +#undef __CLC_FUNCTION
> >>> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
> >>> index e4ba1d1..45e12aa 100644
> >>> --- a/generic/lib/SOURCES
> >>> +++ b/generic/lib/SOURCES
> >>> @@ -39,6 +39,7 @@ math/exp.cl
> >>>  math/exp10.cl
> >>>  math/fmax.cl
> >>>  math/fmin.cl
> >>> +math/fmod.cl
> >>>  math/hypot.cl
> >>>  math/mad.cl
> >>>  math/mix.cl
> >>> diff --git a/generic/lib/math/fmod.cl b/generic/lib/math/fmod.cl
> >>> new file mode 100644
> >>> index 0000000..091035b
> >>> --- /dev/null
> >>> +++ b/generic/lib/math/fmod.cl
> >>> @@ -0,0 +1,15 @@
> >>> +#include <clc/clc.h>
> >>> +
> >>> +#ifdef cl_khr_fp64
> >>> +#pragma OPENCL EXTENSION cl_khr_fp64 : enable
> >>> +#endif
> >>> +
> >>> +#define FUNCTION fmod
> >>> +#define FUNCTION_IMPL(x, y) ( (x) - (y) * trunc((x) / (y)))
> >>> +
> >>> +#define __CLC_BODY <binary_impl.inc>
> >>> +#include <clc/math/gentype.inc>
> >>> +
> >>> +#undef __CLC_BODY
> >>> +#undef FUNCTION
> >>> +#undef FUNCTION_IMPL
> >>> \ No newline at end of file
> >> 
> >> 
> >> I think this can use the LLVM frem instruction instead, and would be better
> >> expanded in the backend. I have most of a patch that expands ISD::FREM for
> >> SI that I forgot about somewhere
> >> 
> > 
> > Hi Matt,
> > 
> > There's both fmod and remainder functions in the CL built-in library,
> > and as near as I can tell, they just differ in how to treat the result
> > of x/y:
> > 
> > From the CL 1.2 spec (6.12.2):
> > gentype fmod (gentype x, gentype y) => Modulus. Returns x - y * trunc (x/y).
> > 
> > gentype remainder (gentype x, gentype y) => Compute the value r such
> > that r = x - n*y, where n
> > is the integer nearest the exact value of x/y. If there
> > are two integers closest to x/y, n shall be the even
> > one. If r is zero, it is given the same sign as x.
> > 
> > Do you happen to know which behavior the frem instruction gives us?
> > Truncate or Round half to nearest even?  I'm guessing that one of
> > these will be able to use the frem instruction, and the other won't,
> > but I haven't checked which is which yet.

There are both __builtin_fmod(f) and __builtin_remainder(f), but I
haven't found any documentation on them, or any code outside of
Basic/Builtins.def.

If they are based on math.h then both fmod and remainder seem to match
OCL definitions.

We have round-to-nearest-even instructions; I'm not sure whether using
__builtin or adding an amdgpu.rndne intrinsic is the better way to go.

jan

> > 
> > —Aaron
> 
> I’m not sure. I was operating under the assumption that frem matches
> fmod’s behavior, but I haven’t tested it particularly carefully. x86
> lowers frem into calls to fmod, and I assume the OpenCL version behaves
> the same as libm's.

-- 
Jan Vesely <jan.vesely at rutgers.edu>