[Libclc-dev] [PATCH v2 16/16] native_powr: Switch implementation to native_exp2 and native_log2
Jeroen Ketema via Libclc-dev
libclc-dev at lists.llvm.org
Tue Nov 14 13:42:52 PST 2017
> On 14 Nov 2017, at 22:37, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>
> On Tue, 2017-11-14 at 21:42 +0100, Jeroen Ketema via Libclc-dev wrote:
>> Hi Jan,
>>
>> Thanks for the clarification. Looking back at the code
>>
>> const __CLC_GENTYPE zero = (__CLC_GENTYPE)0.0f;
>> const __CLC_GENTYPE res = native_exp2(y * native_log2(x));
>> const __CLC_INTN condnan = isless(x, zero);
>> return select(res, (__CLC_GENTYPE)NAN, condnan);
>>
>> and with your comments in mind, can’t this be simplified to:
>>
>> return native_exp2(y * native_log2(x));
>>
>> ?
>>
>> If x < 0, then native_log2 yields NaN, and the NaN propagates through the operations.
>
> yeah, that works as well. I guess the special cases were also crafted
> for this optimization.
>
I’m fine with it either way. I do think it would be good to add some comments
in the code that explain what is going on. Something like
x^y == 2^{log2(x^y)} == 2^{y * log2(x)}, plus a comment about the x < 0 case,
would work for me.
Jeroen
> Jan
>
>>
>> Jeroen
>>
>>> On 13 Nov 2017, at 21:09, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>>>
>>> On Mon, 2017-11-13 at 20:50 +0100, Jeroen Ketema wrote:
>>>> This one is not immediately obvious to me. Maybe add some appropriate comment?
>>>
>>> it exploits that x == 2^{log2(x)} and log2(x^y) == y * log2(x)
>>> thus x^y == 2^{log2(x^y)} == 2^{y * log2(x)}.
>>> However, it only works for x > 0, which is OK since powr(<0, ...)
>>> should return NaN anyway.
>>> IMO powr was added to allow this kind of expansion instead of more
>>> expensive full pow (which needs to handle negative x).
>>>
>>> Jan
>>>
>>>>
>>>> Jeroen
>>>>
>>>>> On 6 Nov 2017, at 23:15, Jan Vesely via Libclc-dev <libclc-dev at lists.llvm.org> wrote:
>>>>>
>>>>> Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
>>>>> ---
>>>>> sent wrong patch the first time. This one passes the CTS.
>>>>>
>>>>> Jan
>>>>>
>>>>> generic/include/clc/math/native_powr.h | 8 +++++++-
>>>>> generic/lib/SOURCES | 1 +
>>>>> generic/lib/math/native_powr.cl | 5 +++++
>>>>> generic/lib/math/native_powr.inc | 6 ++++++
>>>>> 4 files changed, 19 insertions(+), 1 deletion(-)
>>>>> create mode 100644 generic/lib/math/native_powr.cl
>>>>> create mode 100644 generic/lib/math/native_powr.inc
>>>>>
>>>>> diff --git a/generic/include/clc/math/native_powr.h b/generic/include/clc/math/native_powr.h
>>>>> index e8a37d9..c31161a 100644
>>>>> --- a/generic/include/clc/math/native_powr.h
>>>>> +++ b/generic/include/clc/math/native_powr.h
>>>>> @@ -1 +1,7 @@
>>>>> -#define native_powr pow
>>>>> +#define __CLC_BODY <clc/math/binary_decl_tt.inc>
>>>>> +#define __CLC_FUNCTION native_powr
>>>>> +
>>>>> +#include <clc/math/gentype.inc>
>>>>> +
>>>>> +#undef __CLC_BODY
>>>>> +#undef __CLC_FUNCTION
>>>>> diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES
>>>>> index 355741c..ad5a743 100644
>>>>> --- a/generic/lib/SOURCES
>>>>> +++ b/generic/lib/SOURCES
>>>>> @@ -127,6 +127,7 @@ math/native_exp2.cl
>>>>> math/native_log.cl
>>>>> math/native_log10.cl
>>>>> math/native_log2.cl
>>>>> +math/native_powr.cl
>>>>> math/native_recip.cl
>>>>> math/native_rsqrt.cl
>>>>> math/native_sin.cl
>>>>> diff --git a/generic/lib/math/native_powr.cl b/generic/lib/math/native_powr.cl
>>>>> new file mode 100644
>>>>> index 0000000..452bc6f
>>>>> --- /dev/null
>>>>> +++ b/generic/lib/math/native_powr.cl
>>>>> @@ -0,0 +1,5 @@
>>>>> +#include <clc/clc.h>
>>>>> +
>>>>> +#define __CLC_BODY <native_powr.inc>
>>>>> +#define __FLOAT_ONLY
>>>>> +#include <clc/math/gentype.inc>
>>>>> diff --git a/generic/lib/math/native_powr.inc b/generic/lib/math/native_powr.inc
>>>>> new file mode 100644
>>>>> index 0000000..841b1ff
>>>>> --- /dev/null
>>>>> +++ b/generic/lib/math/native_powr.inc
>>>>> @@ -0,0 +1,6 @@
>>>>> +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE native_powr(__CLC_GENTYPE x, __CLC_GENTYPE y) {
>>>>> + const __CLC_GENTYPE zero = (__CLC_GENTYPE)0.0f;
>>>>> + const __CLC_GENTYPE res = native_exp2(y * native_log2(x));
>>>>> + const __CLC_INTN condnan = isless(x, zero);
>>>>> + return select(res, (__CLC_GENTYPE)NAN, condnan);
>>>>> +}
>>>>> --
>>>>> 2.13.6
>>>>>
>>>>> _______________________________________________
>>>>> Libclc-dev mailing list
>>>>> Libclc-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/libclc-dev
>>>>
>>>>
>>