[Libclc-dev] [PATCH 1/1] Implement log10

Fri Oct 24 13:59:55 PDT 2014

> On Oct 24, 2014, at 10:59 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> 
> On Wed, 2014-10-22 at 12:46 -0700, Matt Arsenault wrote:
>>> On Oct 22, 2014, at 9:32 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
>>> 
>>> [SNIP]
>>>> 
>>>> Last time I tried it, these functions failed the OpenCL conformance 
>>>> test. Have you considered porting from the amd-builtins branch instead?
> 
> Can you point me to that branch? I tried searching for it but had no luck
> (though I do remember reading news about it).

It’s the amd-builtins branch in the libclc repo. It doesn’t seem to be included in the git mirror, but I was able to clone it from SVN with:
git svn clone --branches=amd-builtins http://llvm.org/svn/llvm-project/libclc/ libclc_amd

>>> 
>>> I could not find any information about relative errors of hw
>>> instructions. I thought that _IEEE instructions produced correctly
>>> rounded results.
>>> I'm still not clear about how the relative error propagates across
>>> operations:
>> 
>> Which instructions? I’ve only found notes about precision on CI for
>> some instructions in some internal documentation. The general rule is
>> most of the f32 transcendental instructions should be good enough for
>> the OpenCL library functions (without denormals), but not the f64 ones
>> (which could only be used for the native_* versions)
> 
> In the R600/EG/Cayman ISA there is a bunch of instructions with an _IEEE
> suffix (in this case LOG_IEEE). The specs do not mention precision; the
> MUL_IEEE (and MULADD_IEEE*) entries say that they use IEEE rules for 0*x.
> I hoped that _IEEE meant that the operation also follows the rest
> of the rules, including precision.

I believe all of the basic arithmetic instructions should always be correctly rounded. The IEEE part of the name refers to the handling of NaN and other special cases. In SI, the convention appears to be that the “normal” instruction names have the IEEE NaN behavior and the _legacy_ ones do not.
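
To make the "special cases" point concrete, here is a rough host-side sketch in Python rather than anything from the ISA docs. mul_ieee is just the ordinary IEEE 754 multiply. mul_zero_wins is a hypothetical stand-in for a non-IEEE variant, and it assumes a "0 * anything = 0" rule, which is only my reading of the 0*x remark. Neither function says anything about rounding or precision.

```python
import math

def mul_ieee(a: float, b: float) -> float:
    """Ordinary IEEE 754 multiply: 0 * inf and 0 * NaN both give NaN."""
    return a * b

def mul_zero_wins(a: float, b: float) -> float:
    """Hypothetical non-IEEE variant (assumed rule): 0 * anything = 0."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b

# Compare the two conventions on a finite value, an infinity and a NaN.
for x in (2.0, math.inf, math.nan):
    print(x, mul_ieee(0.0, x), mul_zero_wins(0.0, x))
```

The two only differ when one operand is zero and the other is inf or NaN, so the naming is about special-case handling and not about accuracy.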

> 
>> 
>> 
>>> If we have log1p implemented (with error <= 2 ulp), would
>>> log(x) = log1p(x-1.0) result in error <= 2 ulp, since subtraction is
>>> correctly rounded?
>>> If so, would log(x) / LOG_2 (correctly rounded constant) give a correct
>>> log2 (error <= 3 ulp)?
>> 
>> I’m not sure. I know you wouldn’t want to do that from a performance
>> perspective since log1p is a longer function. With the AMD OpenCL
>> builtins, log1p is more than twice as long and uses 10 more VGPRs.
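
On the accuracy side of log(x) = log1p(x-1.0), a quick host-side experiment suggests the answer is "not for all x". The subtraction is correctly rounded, but its rounding error is relative to x-1, not to x. For x in [0.5, 2] the subtraction is exact (Sterbenz lemma), so nothing is lost there. For small x the result sits near -1, and log1p(y) has slope 1/(1+y) = 1/x at that point, so the rounding error gets amplified. The sketch below uses Python doubles instead of f32 and treats math.log as the reference, both of which are assumptions of the sketch. The helper names are made up for illustration.

```python
import math

def ulp(v: float) -> float:
    """Spacing of doubles around a normal, nonzero v."""
    _, e = math.frexp(abs(v))
    return math.ldexp(1.0, e - 53)

def log_via_log1p(x: float) -> float:
    """log(x) computed as log1p(x - 1.0), the identity asked about."""
    return math.log1p(x - 1.0)

def error_in_ulps(x: float) -> float:
    """Distance between log1p(x - 1) and log(x), in ulps of log(x)."""
    ref = math.log(x)  # treated as the reference value in this sketch
    return abs(log_via_log1p(x) - ref) / ulp(ref)

# x - 1 is exact for 0.75 and 1.5, but rounded for the small inputs.
for x in (0.75, 1.5, 1e-3, 1e-10):
    print(f"x = {x:g}: {error_in_ulps(x):.3g} ulp")
```

So the identity looks safe only on a range where x-1.0 is exact or nearly so. Outside of that it needs range reduction first.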
> 
> I meant whether there are error propagation rules in general (especially
> for the second case). Right now my understanding is that we'd need a
> comment/proof for every operation.
> 
> 
> jan
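
For the second case, the usual first-order rule is that relative errors add under multiplication and division. Write u = 2^-p for a format with p significand bits. A value that is off by k ulp has relative error of at most 2*k*u, the rounded constant adds at most u, and the correctly rounded division adds at most u. Converting back to ulps of the result gives a worst case of roughly 2*k + 2 ulp, so "2 ulp in, 3 ulp out" does not follow from that rule alone. The factor of two comes from results that land near a power of two. Below is a small sketch for measuring the actual error on the host. It uses Python doubles, the decimal module as a high-precision reference, and a hardcoded double for ln(2). All names are illustrative.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # reference precision, far beyond double

LN2 = 0.6931471805599453      # ln(2) rounded to double (the LOG_2 constant)
LN2_EXACT = Decimal(2).ln()   # ln(2) to 50 significant digits

def ulp(v: float) -> float:
    """Spacing of doubles around a normal, nonzero v."""
    _, e = math.frexp(abs(v))
    return math.ldexp(1.0, e - 53)

def log2_via_div(x: float) -> float:
    """log2(x) computed as log(x) / LOG_2, as in the question."""
    return math.log(x) / LN2

def error_in_ulps(x: float) -> float:
    """Error of log2_via_div(x) against a 50-digit reference, in ulps."""
    ref = Decimal(x).ln() / LN2_EXACT   # Decimal(x) converts the double exactly
    got = Decimal(log2_via_div(x))
    return float(abs(got - ref)) / ulp(float(ref))

samples = (3.0, 10.0, 0.1, 1e10, 7.5e-5)
for x in samples:
    print(f"x = {x:g}: {error_in_ulps(x):.3g} ulp")
```

A measurement like this only samples a few points. It can show that a bound is violated, but it is not a proof that one holds.
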
> 