[Libclc-dev] fast_length and fast_normalize

Tue Mar 18 15:56:46 PDT 2014

On Wed, Mar 12, 2014 at 10:21:54AM +0000, Jeroen Ketema wrote:
> 
> > For a generic implementation, you may be able to do something with the
> > llvm.sqrt.* and the llvm.convert.to.fp16 / llvm.convert.from.fp16
> > intrinsics.
> 
> The only way I can see doing something with these intrinsics is the following:
> 
> %2 = call i16 @llvm.convert.to.fp16(float %1)
> %3 = call float @llvm.convert.from.fp16(i16 %2)
> %4 = call float @llvm.sqrt.f32(float %3)
> %5 = call i16 @llvm.convert.to.fp16(float %4)
> %6 = call float @llvm.convert.from.fp16(i16 %5)
> 
> Which seems a bit roundabout.
> 
> Which brings me back to my original question: Would it make sense to provide an implementation of fast_length and fast_normalize even though there are no implementations (but only prototypes) for half_sqrt and half_rsqrt? 
> 

I think it would be best to implement half_sqrt and half_rsqrt too.
Using the intrinsic may be roundabout, but it is correct and targets may
override the generic implementation if they want to.

-Tom

> Jeroen
>