[Libclc-dev] fast_length and fast_normalize
tom at stellard.net
Tue Mar 11 17:41:24 PDT 2014
On Wed, Mar 12, 2014 at 12:21:23AM +0000, Jeroen Ketema wrote:
> On 11 Mar 2014, at 23:14, Tom Stellard <tom at stellard.net> wrote:
> > On Tue, Mar 11, 2014 at 07:05:16PM +0000, Jeroen Ketema wrote:
> >> Hi all,
> >> I was wondering: Would it make sense to provide implementations of fast_length and fast_normalize even though currently no implementations of half_sqrt and half_rsqrt are provided by libclc?
> > Can fast_length and fast_normalize be implemented correctly without
> > half_sqrt and half_rsqrt?
> The OpenCL specification says that the result should be equal to something that involves half_sqrt and half_rsqrt, respectively. So, it seems to make most sense to use the definitions given by OpenCL directly.
> > Are there llvm intrinsics that we could
> > use for half_sqrt and half_rsqrt?
> The nvptx back-end has the sqrt.approx and rsqrt.approx intrinsics, but it’s not clear to me whether these have enough precision. Also this isn’t a solution for the r600 back-end.
I think the sqrt.approx and rsqrt.approx intrinsics are intended
to be used with the native_sqrt and native_rsqrt functions.
For a generic implementation, you may be able to do something with the
llvm.sqrt.* and the llvm.convert.to.fp16 / llvm.convert.from.fp16
intrinsics.
> > -Tom
> >> Thanks,
> >> Jeroen