[cfe-dev] Clang ignoring --fast-math for complex division, serious performance hit
Richard Campbell via cfe-dev
cfe-dev at lists.llvm.org
Mon Nov 6 10:11:38 PST 2017
> On Nov 6, 2017, at 9:59 AM, John McCall <rjmccall at apple.com> wrote:
>
>> On Nov 6, 2017, at 12:21 PM, Richard Campbell <rlcamp.pdx at gmail.com> wrote:
>> When one writes a critical inner loop that doesn’t contain any function calls, one should reasonably expect the compiler not to add them.
>
> Complex divide is a large, complicated operation when full precision and infinity-correctness are required. We appreciate that you have performance constraints, but implementing it with an outlined function is not an unreasonable choice.
Perhaps not, except that the compiler was previously doing the right thing without the associated very serious performance hit.
>
>> While there may be more low-hanging fruit, I don’t want it to get in the way of fixing this. My main concern is that there not be noticeable regressions. This particular regression has the potential to make certain calculations take HOURS longer than expected, had I not already been hacking my way around it. I would greatly prefer to write simple, maintainable code and let the compiler do the right thing on the hardware of today and tomorrow.
>
> Richard, let me be clear about your options here. If you're interested in working on this, that would be great, and I'd be happy to review your patches. If you're not interested in working on this, then you should file a bug and hope that someone else has the motivation to pick it up.
I’d be happy to submit a patch, but I don’t know where to begin with that process. I expect the fix will look nearly identical to the fix for the same problem when it occurred with __mulsc3 in July 2015. That fix seems to have landed with no explanation on the mailing list; otherwise I’d have a head start on fixing this myself.
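
For anyone following along, here is a minimal sketch of the kind of workaround I mean. It is illustrative only and not code from my project: the names cdiv_naive and divide_all are hypothetical. It spells out the textbook formula for complex division by hand, so the inner loop contains only plain multiplies, adds and divides rather than a call to a runtime helper such as __divsc3. The trade-off is exactly the one John describes: it makes no attempt to handle infinities, NaNs, or overflow in the denominator.

```c
#include <complex.h>
#include <stddef.h>

/* Textbook complex division: (a * conj(b)) / |b|^2.
 * Assumes finite inputs and a denominator that neither overflows nor
 * underflows; no infinity or NaN recovery is attempted. */
static inline float complex cdiv_naive(float complex a, float complex b)
{
    const float ar = crealf(a), ai = cimagf(a);
    const float br = crealf(b), bi = cimagf(b);
    const float denom = br * br + bi * bi;
    const float re = (ar * br + ai * bi) / denom;
    const float im = (ai * br - ar * bi) / denom;
    return re + im * I;
}

/* Hypothetical inner loop: element-wise quotient of two arrays. */
static void divide_all(float complex *out, const float complex *num,
                       const float complex *den, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = cdiv_naive(num[i], den[i]);
}
```

Writing `num[i] / den[i]` directly is what I would prefer to keep in the source; the helper above only exists to avoid the outlined call.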