[PATCH] [complex] Teach the complex math IR gen to emit direct math and a NaN-test prior to the call to the library function.

Fri Oct 17 18:04:16 PDT 2014

On Fri, Oct 17, 2014 at 2:23 AM, Steve Canon <scanon at apple.com> wrote:

> Apologies for delay in looking at this, I'm on vacation this week.
>

Not a problem. =]

> I don't love this approach because (a) it doesn't get us fully to where we
> want to be in performance, and (b) it's going to trash the floating-point
> flag state.  The performance issue is that we still have two comparisons
> and one or two branches for every complex op outside of no-nans, and the
> flags issue is as follows:
>
> The intention of IEEE-754 is that anything that is conceptually a single
> "operation" should raise at most one of divide-by-zero, invalid, overflow,
> or underflow.  A complex multiplication implemented with lazy checking
> may cause two of these to be raised:
>
>     (tiny, huge) * (tiny, huge) --> underflow + overflow
>     (0, huge) * (inf, huge) --> invalid + overflow
>
> My preferred approach would be to implement limited-range semantics as an
> option (via either pragma or flag), and have it implied by fast-math.
>

I don't really understand what you want here.

In the case of fast-math, the comparisons should vanish and I think we're
left with a minimal amount of math. If there is some more minimal way to
compute the result in the case of fast-math, please let me know?

In the case of *not* having fast-math and needing to be correct, I'm just not
in a position to come up with a more efficient but still numerically
correct implementation. I have no idea how to do it. And I'm not really
willing to sign up to do it because I don't have the time. =/ I don't think
that hoping for a future better world should obstruct getting this into the
tree as it (to the extent I'm aware) is a strict improvement on the status
quo.

>
> Now, all that being said, I haven't checked if today's compiler-rt
> implementations are even correct w.r.t. flags in this sense,
>

So, the code I am generating here is *exactly* the code we have in
compiler-rt. I don't know the first thing about actually implementing this
stuff and am completely leveraging the compiler-rt implementation. I'm also
not a numerics expert and not setting out to improve that implementation,
but if you or anyone else have a better implementation, I'm all ears.

> so it's not immediately obvious that this change makes anything worse
> today, and it will address *some* of the performance concerns of the
> earlier patch.
>

I'm pretty sure this is essentially just inlining the code from compiler-rt
around the call to the library function. =]
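
As a rough illustration of that shape (a sketch in C, not the actual IR gen
output), the multiplication does the direct math first and only falls back to
the library when both parts of the result are NaN. The names `mul_lazy` and
`make_complex` are hypothetical, and plain `a * b` stands in for the library
call, which compilers typically lower to `__muldc3` for double when fast-math
is off.

```c
#include <complex.h>
#include <math.h>
#include <string.h>

/* Hypothetical helper: build a complex value from its parts without doing
   any arithmetic, so infinities and NaNs pass through untouched. */
static double complex make_complex(double re, double im) {
    double parts[2] = { re, im };
    double complex z;
    memcpy(&z, parts, sizeof z);
    return z;
}

/* Hypothetical name: direct math, a NaN test, then the library call. */
static double complex mul_lazy(double complex a, double complex b) {
    double ar = creal(a), ai = cimag(a);
    double br = creal(b), bi = cimag(b);

    /* Fast path: the textbook four-multiply formula. */
    double re = ar * br - ai * bi;
    double im = ar * bi + ai * br;

    /* Only when both parts are NaN might the library recover an infinity. */
    if (isnan(re) && isnan(im))
        return a * b;  /* stand-in for the library call */

    return make_complex(re, im);
}
```

With fast-math (no NaNs), the `isnan` test folds away and only the four
multiplies and two additions remain.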

> It just seems contrary to the direction that we really want to be going in
> the longer-term w.r.t. numerical correctness.
>

I don't really know why this is less *correct*... but I'll take your
word on it. However, I also think that this future you're describing is
somewhat hypothetical really. Is there any hope of getting there? Is anyone
working on it?