[Libclc-dev] [PATCH] relational: Fix signbit
Jeroen Ketema
j.ketema at imperial.ac.uk
Wed Jun 25 12:12:31 PDT 2014
Hi,
You’re correct. Mea culpa, I missed the fact that != returns -1 for vectors.
So, looks good to me!
Jeroen
On 25 Jun 2014, at 20:06, Aaron Watry <awatry at gmail.com> wrote:
> And that's what it should be doing.
>
> For scalar signbit, we return:
> return __builtin_signbitf(x);
>
> For vector arguments, float2 for example, we return:
> (int2)( (int2){__builtin_signbitf(x.s0), __builtin_signbitf(x.s1)} != (int2) 0)
>
> __builtin_signbitf(float) returns 0 when the sign bit is clear and 1
> when it is set. For vector calls, we need to convert that 1 to -1.
>
> The '!= 0' comparison takes care of returning -1 for us for vector
> calls, and we skip that comparison for scalar calls and just return
> __builtin_signbitf's result directly.
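
For anyone who wants to try the scalar/vector distinction on a host compiler, here is a minimal sketch. It assumes GCC or Clang and their vector extension rather than OpenCL C; float2_t, int2_t, signbit_scalar and signbit_vec2 are names made up for illustration, not libclc's. The scalar helper normalises with != 0 because, outside the bitcode in this thread, the builtin is only documented to return a nonzero value when the sign bit is set.

```c
#include <stdint.h>

/* Hypothetical stand-ins for OpenCL's float2/int2, built on the GCC/Clang
   vector extension; these typedefs are not libclc's definitions. */
typedef float float2_t __attribute__((vector_size(2 * sizeof(float))));
typedef int32_t int2_t __attribute__((vector_size(2 * sizeof(int32_t))));

/* Scalar case: normalise the builtin's result to 0 or 1. */
static int signbit_scalar(float x) {
    return __builtin_signbitf(x) != 0;
}

/* Vector case: build a vector of per-lane results, then compare with zero.
   A vector comparison yields -1 (all bits set) in true lanes, 0 elsewhere. */
static int2_t signbit_vec2(float2_t x) {
    int2_t bits = { signbit_scalar(x[0]), signbit_scalar(x[1]) };
    int2_t zero = { 0, 0 };
    return bits != zero;
}
```

With an input of { -1.5f, 2.0f }, the first lane of the result should be -1 and the second 0, while signbit_scalar(-1.5f) stays at 1.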
>
> In the bitcode from the previous email,
>
> %7 = insertelement <3 x i32> undef, i32 %.lobit.i.i, i32 0
> %8 = bitcast float %3 to i32
> %.lobit.i3.i = lshr i32 %8, 31
> %9 = insertelement <3 x i32> %7, i32 %.lobit.i3.i, i32 1
> %10 = bitcast float %5 to i32
> %.lobit.i2.i = lshr i32 %10, 31
> %11 = insertelement <3 x i32> %9, i32 %.lobit.i2.i, i32 2
> %12 = icmp ne <3 x i32> %11, zeroinitializer
> %13 = sext <3 x i1> %12 to <3 x i32>
>
> %7, %9, and %11 build a <3 x i32> vector
> %12 compares the vector to 0
> %13 sign extends the result from <3 x i1> to <3 x i32>
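
The same per-lane steps can be sketched in plain C, which may make the bitcode easier to follow. This is only an illustration and assumes a 32-bit IEEE 754 float; sign_lobit, lane_mask and signbit3 are hypothetical names, not anything in libclc or LLVM.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors "bitcast float to i32" followed by "lshr i32, 31". */
static uint32_t sign_lobit(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* reinterpret the bits, no conversion */
    return bits >> 31;               /* 1 if the sign bit is set, else 0 */
}

/* Mirrors "icmp ne ..., 0" followed by "sext i1 to i32" for one lane. */
static int32_t lane_mask(float x) {
    int ne_zero = sign_lobit(x) != 0u;  /* the i1 value: 0 or 1 */
    return -(int32_t)ne_zero;           /* sign extension: 1 becomes -1 */
}

/* Applies the per-lane steps to three floats, like the <3 x i32> case. */
static void signbit3(const float in[3], int32_t out[3]) {
    for (int i = 0; i < 3; ++i)
        out[i] = lane_mask(in[i]);
}
```

Note that -0.0f also produces -1 here, since only the sign bit is inspected.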
>
> Unless I'm very mistaken, that's exactly what we want here.
>
> --Aaron
>
> On Wed, Jun 25, 2014 at 1:40 PM, Jeroen Ketema <j.ketema at imperial.ac.uk> wrote:
>>
>> Hi Aaron,
>>
>> This change looks good to me. However, I’m still not totally convinced this patched version does the right thing. When I read this:
>>
>> http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/signbit.html
>>
>> it seems that in the scalar float case either 0 or 1 should be returned and in the vector case a vector filled with 0s and -1s (minus ones).
>>
>> Jeroen
>>
>> On 25 Jun 2014, at 19:17, Aaron Watry <awatry at gmail.com> wrote:
>>
>>> The vector components were mistakenly using () instead of {}, which caused
>>> all but the last vector component to be dropped on the floor.
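
A small plain-C sketch of why the parentheses lose components: without a type in front, (a, b, c) is a comma expression whose value is its last operand, whereas a braced compound literal keeps every element. The names record, with_parens and with_braces are made up for this illustration, and an int array stands in for the OpenCL vector type.

```c
/* Counts how many times record() runs, to show that the comma operator
   still evaluates every operand even though it discards all but the last. */
static int calls = 0;

static int record(int v) {
    ++calls;
    return v;
}

/* Parentheses only: this is the comma operator, so the value is c. */
static int with_parens(int a, int b, int c) {
    return (record(a), record(b), record(c));
}

/* Braces with a type: a compound literal that keeps all three values.
   Summing them means a missing element would change the result. */
static int with_braces(int a, int b, int c) {
    int *v = (int[3]){ record(a), record(b), record(c) };
    return v[0] + v[1] + v[2];
}
```

Calling with_parens(1, 2, 3) evaluates all three calls but returns only 3, while with_braces(1, 2, 3) returns 6.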
>>>
>>> CC: Jeroen Ketema <j.ketema at imperial.ac.uk>
>>> Signed-off-by: Aaron Watry <awatry at gmail.com>
>>> ---
>>> generic/lib/relational/signbit.cl | 14 +++++++-------
>>> 1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/generic/lib/relational/signbit.cl b/generic/lib/relational/signbit.cl
>>> index 1f496d9..f429960 100644
>>> --- a/generic/lib/relational/signbit.cl
>>> +++ b/generic/lib/relational/signbit.cl
>>> @@ -17,35 +17,35 @@ _CLC_DEF _CLC_OVERLOAD RET_TYPE FUNCTION(ARG_TYPE x) { \
>>>
>>> #define _CLC_DEFINE_RELATIONAL_UNARY_VEC3(RET_TYPE, FUNCTION, ARG_TYPE) \
>>> _CLC_DEF _CLC_OVERLOAD RET_TYPE FUNCTION(ARG_TYPE x) { \
>>> - return (RET_TYPE)((FUNCTION(x.s0), FUNCTION(x.s1), FUNCTION(x.s2)) != (RET_TYPE)0); \
>>> + return (RET_TYPE)( (RET_TYPE){FUNCTION(x.s0), FUNCTION(x.s1), FUNCTION(x.s2)} != (RET_TYPE)0); \
>>> } \
>>>
>>> #define _CLC_DEFINE_RELATIONAL_UNARY_VEC4(RET_TYPE, FUNCTION, ARG_TYPE) \
>>> _CLC_DEF _CLC_OVERLOAD RET_TYPE FUNCTION(ARG_TYPE x) { \
>>> return (RET_TYPE)( \
>>> - ( \
>>> + (RET_TYPE){ \
>>> FUNCTION(x.s0), FUNCTION(x.s1), FUNCTION(x.s2), FUNCTION(x.s3) \
>>> - ) != (RET_TYPE)0); \
>>> + } != (RET_TYPE)0); \
>>> } \
>>>
>>> #define _CLC_DEFINE_RELATIONAL_UNARY_VEC8(RET_TYPE, FUNCTION, ARG_TYPE) \
>>> _CLC_DEF _CLC_OVERLOAD RET_TYPE FUNCTION(ARG_TYPE x) { \
>>> return (RET_TYPE)( \
>>> - ( \
>>> + (RET_TYPE){ \
>>> FUNCTION(x.s0), FUNCTION(x.s1), FUNCTION(x.s2), FUNCTION(x.s3), \
>>> FUNCTION(x.s4), FUNCTION(x.s5), FUNCTION(x.s6), FUNCTION(x.s7) \
>>> - ) != (RET_TYPE)0); \
>>> + } != (RET_TYPE)0); \
>>> } \
>>>
>>> #define _CLC_DEFINE_RELATIONAL_UNARY_VEC16(RET_TYPE, FUNCTION, ARG_TYPE) \
>>> _CLC_DEF _CLC_OVERLOAD RET_TYPE FUNCTION(ARG_TYPE x) { \
>>> return (RET_TYPE)( \
>>> - ( \
>>> + (RET_TYPE){ \
>>> FUNCTION(x.s0), FUNCTION(x.s1), FUNCTION(x.s2), FUNCTION(x.s3), \
>>> FUNCTION(x.s4), FUNCTION(x.s5), FUNCTION(x.s6), FUNCTION(x.s7), \
>>> FUNCTION(x.s8), FUNCTION(x.s9), FUNCTION(x.sa), FUNCTION(x.sb), \
>>> FUNCTION(x.sc), FUNCTION(x.sd), FUNCTION(x.se), FUNCTION(x.sf) \
>>> - ) != (RET_TYPE)0); \
>>> + } != (RET_TYPE)0); \
>>> } \
>>>
>>>
>>> --
>>> 1.9.1
>>>
>>