[Libclc-dev] [PATCH 1/1] Implement log10

Fri Oct 24 10:59:51 PDT 2014

On Wed, 2014-10-22 at 12:46 -0700, Matt Arsenault wrote:
> > On Oct 22, 2014, at 9:32 AM, Jan Vesely <jan.vesely at rutgers.edu> wrote:
> > 
> > [SNIP]
> >> 
> >> Last time I tried it, these functions failed the OpenCL conformance 
> >> test. Have you considered porting from the amd-builtins branch instead?

Can you point me to that branch? I tried searching for it but had no
luck (though I do remember reading news about it).

> > 
> > is precision the problem?
> 
> Yes. I’ll try to run these on the conformance test later to verify. The
> double version will definitely not pass since f64 division is currently
> not implemented correctly (which I’m working on, although it involves
> fixing a lot of special cases).
> 
> I also don’t think the log2(10) will constant fold correctly (I think
> LLVM will fold the intrinsic for it depending on if the host compiler
> has the libfunc for it, which also introduces variability since there
> is no standardization of precision in standard C), so assuming this
> will produce correct results it should use the computed constant.

I agree, a precomputed constant is better.
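
For illustration, a minimal sketch of what "precomputed constant" means
here. It is Python on the host rather than OpenCL C, the names LOG10_2
and log10_via_log2 are made up for this example, and it says nothing
about whether the result would pass the conformance test:

```python
import math

# log10(2) = 1 / log2(10), written out as a decimal literal so that no
# library call (and no host-dependent constant folding) is involved.
# The literal is rounded once, to the nearest double, when it is parsed.
LOG10_2 = 0.30102999566398119521373889472449


def log10_via_log2(x):
    """Compute log10(x) as log2(x) * log10(2) using the fixed constant."""
    return math.log2(x) * LOG10_2
```

The point is only that the constant is a literal in the source, so its
value does not depend on how the compiler or the host libm evaluates an
expression like log2(10).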

> 
> > 
> > I could not find any information about relative errors of hw
> > instructions. I thought that _IEEE instructions produced correctly
> > rounded results.
> > I'm still not clear about how the relative error propagates across
> > operations:
> 
> Which instructions? I’ve only found notes about precision on CI for
> some instructions in some internal documentation. The general rule is
> most of the f32 transcendental instructions should be good enough for
> the OpenCL library functions (without denormals), but not the f64 ones
> (which could only be used for the native_* versions)

In the R600/EG/Cayman ISA there are a bunch of instructions with an
_IEEE suffix (in this case LOG_IEEE). The specs do not mention
precision; MUL_IEEE (and MULADD_IEEE*) only says that it uses IEEE rules
for 0*x. I had hoped that _IEEE also meant that the operation follows
the rest of the rules, including precision.

> 
> 
> > if we have log1p implemented ( with error <= 2ulp ) would
> > log(x) = log1p(x-1.0) result in error <= 2 ulp since subtraction is
> > correctly rounded?
> > if so would log(x) / LOG_2 (correctly rounded constant) give correct
> > log2 (error <= 3 ulp)?
> 
> I’m not sure. I know you wouldn’t want to do that from a performance
> perspective since log1p is a longer function. With the AMD OpenCL
> builtins, log1p is more than twice as long and uses 10 more VGPRs

I meant whether there are error propagation rules in general (especially
for the second case). Right now my understanding is that we'd need a
comment/proof for every operation.
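
To make the question concrete, here is a rough sketch that measures the
ulp error of the two compositions from the quoted text against a
high-precision reference. Assumptions: it runs in double precision on
the host libm (not float on the GPU), it uses Python's decimal module as
the reference, and all function names are made up for this example. It
measures; it does not prove any bound.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # reference precision, far beyond a double's ~17 digits

LN2 = 0.69314718055994530941723212145818  # ln(2), rounded once to double


def ulp_error(approx, exact):
    """Error of the float `approx` against the Decimal `exact`, in ulps of approx."""
    return float(abs(Decimal(approx) - exact) / Decimal(math.ulp(approx)))


def log_via_log1p(x):
    """log(x) = log1p(x - 1.0); the subtraction itself is correctly rounded."""
    return math.log1p(x - 1.0)


def log2_via_log(x):
    """log2(x) = log(x) / ln(2), with a correctly rounded constant."""
    return math.log(x) / LN2


def ref_log(x):
    """High-precision ln(x) of the exact value of the float x."""
    return Decimal(x).ln()


def ref_log2(x):
    """High-precision log2(x) of the exact value of the float x."""
    return Decimal(x).ln() / Decimal(2).ln()


if __name__ == "__main__":
    # Columns: x, ulp error of log1p(x - 1.0), ulp error of log(x) / LN2
    for x in (1.5, 3.0, 0.1, 1e-3, 1e-6):
        print(x,
              ulp_error(log_via_log1p(x), ref_log(x)),
              ulp_error(log2_via_log(x), ref_log2(x)))
```

For the division case, the relative errors of log(x), of the constant
and of the division add up to first order. For the log1p case, x - 1.0
is still correctly rounded when x is below 0.5, but its absolute
rounding error is large compared to x, so the composed result can be far
off there. A per-operation ulp bound does not carry through a
composition by itself; it also depends on how sensitive the next
operation is to its input.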

jan


-- 
Jan Vesely <jan.vesely at rutgers.edu>