[cfe-dev] what does -ffp-contract=fast allow?
Finkel, Hal J. via cfe-dev
cfe-dev at lists.llvm.org
Thu Nov 17 17:03:44 PST 2016
On Nov 17, 2016 5:53 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>
>> On Nov 17, 2016, at 4:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>>
>>>
>>> From: "Warren Ristow" <warren.ristow at sony.com>
>>> To: "Sanjay Patel" <spatel at rotateright.com>, "cfe-dev" <cfe-dev at lists.llvm.org>, "llvm-dev" <llvm-dev at lists.llvm.org>
>>> Cc: "Nicolai Hähnle" <nhaehnle at gmail.com>, "Hal Finkel" <hfinkel at anl.gov>, "Mehdi Amini" <mehdi.amini at apple.com>, "andrew kaylor" <andrew.kaylor at intel.com>
>>> Sent: Thursday, November 17, 2016 5:58:58 PM
>>> Subject: RE: what does -ffp-contract=fast allow?
>>>
>>> > Is this a bug? We transformed the original expression into:
>>> > x * y + x
>>>
>>> I’d say yes, it’s a bug.
>>>
>>>
>>>
>>> Unless -ffast-math is used (or some appropriate subset that gives us leeway, like -fno-honor-infinities or -fno-honor-nans, or somesuch), the reassociation isn’t allowed, and that blocks the madd contraction.
>>
>> I agree. FP contraction alone only allows us to do x*y+z -> fma(x,y,z).
>
>
> I agree too, but the more difficult question is "which flags are needed here?"
> Would FPContract + no-inf be enough? If not, why not, and how do we document it?
I think the relevant question is: is the contracted form at least as precise as the original for all inputs? If so, then this should be allowed with just fp-contract+no-inf. Otherwise, more is required.
-Hal
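
The x = INF, y = 0 case from the quoted example below can be checked directly. This is a minimal sketch, not the code from D26602: the function names original_form, reassociated_form and check_inf_times_zero are made up for illustration, and the reassociated version uses a plain x * y + x rather than an explicit FMA, since the NaN comes from the 0 * INF product whether or not the multiply and add are fused. It assumes IEEE-754 floats and a build without -ffast-math.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* The source-level expression from the thread. */
float original_form(float x, float y) {
    return x * (y + 1.0f);
}

/* Hypothetical hand-written equivalent of the transformed code:
   the multiply has been distributed over the add. */
float reassociated_form(float x, float y) {
    return x * y + x;
}

/* Checks the x = INF, y = 0 case described in the thread. */
void check_inf_times_zero(void) {
    float a = original_form(INFINITY, 0.0f);     /* INF * (0 + 1) */
    float b = reassociated_form(INFINITY, 0.0f); /* INF * 0 + INF */
    printf("original=%g reassociated=%g\n", a, b);
    assert(isinf(a)); /* stays infinite without reassociation */
    assert(isnan(b)); /* INF * 0 is NaN, and NaN + INF is NaN */
}
```

Calling check_inf_times_zero() from main should print an infinite value for the original form and a NaN for the reassociated one; the exact spelling of those values depends on the C library. For finite inputs such as x = 3, y = 2 both forms give 9, which is why the divergence is easy to miss.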
>
>
> —
> Mehdi
>
>
>
>>>
>>>
>>> From: Sanjay Patel [mailto:spatel at rotateright.com]
>>> Sent: Thursday, November 17, 2016 3:22 PM
>>> To: cfe-dev <cfe-dev at lists.llvm.org>; llvm-dev <llvm-dev at lists.llvm.org>
>>> Cc: Nicolai Hähnle <nhaehnle at gmail.com>; Hal Finkel <hfinkel at anl.gov>; Mehdi Amini <mehdi.amini at apple.com>; Ristow, Warren <warren.ristow at sony.com>; andrew.kaylor at intel.com
>>> Subject: what does -ffp-contract=fast allow?
>>>
>>>
>>>
>>> This is just paraphrasing from D26602, so credit to Nicolai for first raising the issue there.
>>>
>>> float foo(float x, float y) {
>>> return x * (y + 1);
>>> }
>>>
>>> $ ./clang -O2 xy1.c -S -o - -target aarch64 -ffp-contract=fast | grep fm
>>> fmadd s0, s1, s0, s0
>>>
>>> Is this a bug? We transformed the original expression into:
>>> x * y + x
>>>
>>> When x=INF and y=0, the code returns INF if we don't reassociate. With reassociation to FMA, it returns NAN because 0 * INF = NAN.
>>>
>>> 1. I used aarch64 as the example target, but this is not target-dependent (as long as the target has FMA).
>>>
>>> 2. This is *not* -ffast-math...or is it? The C standard only shows on/off settings for the associated FP_CONTRACT pragma.
>>>
>>> 3. AFAIK, clang has no documentation for -ffp-contract:
>>> http://clang.llvm.org/docs/UsersManual.html
>>>
>>> 4. GCC says:
>>> https://gcc.gnu.org/onlinedocs/gcc-6.2.0/gcc/Optimize-Options.html#Optimize-Options
>>> "-ffp-contract=fast enables floating-point expression contraction such as forming of fused multiply-add operations if the target has native support for them."
>>>
>>> 5. The LLVM backend (where this reassociation currently happens) shows:
>>> FPOpFusion::Fast - Enable fusion of FP ops wherever it's profitable.
>>
>>
>>
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>
>