[cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
Matthias Braun via cfe-dev
cfe-dev at lists.llvm.org
Wed Oct 12 11:36:52 PDT 2016
> On Oct 12, 2016, at 7:53 AM, Hal Finkel via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> ----- Original Message -----
>> From: "Sebastian Pop" <sebpop.llvm at gmail.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "Renato Golin" <renato.golin at linaro.org>, "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev"
>> <llvm-dev at lists.llvm.org>, "Matthias Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd"
>> <nd at arm.com>, "Abe Skolnik" <a.skolnik at samsung.com>
>> Sent: Wednesday, October 12, 2016 9:43:37 AM
>> Subject: Re: [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
>>
>> On Wed, Oct 12, 2016 at 10:28 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> ----- Original Message -----
>>>> From: "Renato Golin" <renato.golin at linaro.org>
>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>> Cc: "Sebastian Paul Pop" <s.pop at samsung.com>, "llvm-dev"
>>>> <llvm-dev at lists.llvm.org>, "Matthias Braun"
>>>> <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>, "nd"
>>>> <nd at arm.com>, "Abe Skolnik" <a.skolnik at samsung.com>,
>>>> "Sebastian Pop" <sebpop.llvm at gmail.com>
>>>> Sent: Wednesday, October 12, 2016 9:16:39 AM
>>>> Subject: Re: [test-suite] making polybench/symm succeed with
>>>> "-Ofast" and "-ffp-contract=on"
>>>>
>>>> On 12 October 2016 at 15:05, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>> This is something we need to understand. No, there's not always an
>>>>> error bar. With FMA formation and without non-IEEE-compliant
>>>>> optimizations (i.e. fast-math), the optimized answer should be
>>>>> identical to the non-optimized answer.
>>>>
>>>> What about architectures where this is never respected, like
>>>> Darwin?
>>>>
>>>> In the general case, indeed, optimisation levels should not change the
>>>> IEEE representation and the tests should be deterministic.
>>>>
>>>> But we can't guarantee this will always be the case.
>>>>
>>>>
>>>>> We still do see cross-system discrepancies sometimes because of
>>>>> differences in denormal handling, but on the same system that
>>>>> should be consistent (aside, perhaps, from compiler-level
>>>>> constant-folding issues).
>>>>
>>>> But the test-suite doesn't run on a single system, nor does it have
>>>> one reference_output for each system.
>>>
>>> I agree and understand, and we may need a tolerance in practice to
>>> deal with differences from denormal handling, etc. However, if
>>> Sebastian is seeing differences on the same system, we should
>>> understand why. Is he running on an ARM Darwin system, or an x86
>>> using fp80 arithmetic,
>>
>> My dev machine is an x86_64-linux. This is where I ran all my
>> reported results.
>> How do I determine whether I am using fp80 arithmetic?
>
> I don't think that Clang/LLVM uses it by default on x86_64. If you're using -Ofast, however, that would explain it. I recommend comparing -O3 against -O0 and making sure those give the same results. -Ofast enables -ffast-math, which can legitimately cause differences.
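
To illustrate why -ffast-math can legitimately change results: one of the things it permits is reassociating floating-point additions, and floating-point addition is not associative. The sketch below is only an illustration, not taken from polybench/symm. The function names are made up, and the two groupings are written out by hand to mimic what a reassociating optimizer might do. The volatile temporaries are there to force each intermediate to be rounded to double.

```c
/* Sum evaluated as written in the source: (a + b) + c. */
double sum_left_to_right(double a, double b, double c) {
    volatile double t = a + b;   /* force the intermediate to be rounded to double */
    return t + c;
}

/* The same sum after reassociation: a + (b + c).
   Under IEEE rules a compiler may not do this rewrite on its own;
   with -ffast-math it is allowed to. */
double sum_reassociated(double a, double b, double c) {
    volatile double t = b + c;   /* force the intermediate to be rounded to double */
    return a + t;
}
```

With IEEE 754 doubles and round-to-nearest, sum_left_to_right(1e16, -1e16, 1.0) gives 1.0, while sum_reassociated(1e16, -1e16, 1.0) gives 0.0, because -1e16 + 1.0 rounds back to -1e16. A reference output produced by one grouping will not match the other bit for bit.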

On x86_64 we generally use the SSE units as much as possible because they are faster. The only exception is long double, which uses x87/fp80. (32-bit x86 is a different story and generally uses more x87/fp80 because of ABI constraints.)
- Matthias
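
On the question of how to tell whether fp80 arithmetic is in use: one possible check, sketched here under the assumption of a C99 or later toolchain, is to look at FLT_EVAL_METHOD and LDBL_MANT_DIG from <float.h>. The helper names are invented for this example.

```c
#include <float.h>

/* Returns 1 if float/double expressions are evaluated in long double
   precision (FLT_EVAL_METHOD == 2, typical for x87 code generation),
   0 if they are evaluated in float or double (values 0 and 1),
   and -1 if the evaluation method is indeterminate or not reported. */
int double_uses_extended_precision(void) {
#ifdef FLT_EVAL_METHOD
    if (FLT_EVAL_METHOD == 2) return 1;
    if (FLT_EVAL_METHOD == 0 || FLT_EVAL_METHOD == 1) return 0;
#endif
    return -1;
}

/* Number of significand bits in long double: 64 for the x87 fp80 format,
   53 where long double is the same as double. */
int long_double_significand_bits(void) {
    return LDBL_MANT_DIG;
}
```

Printing both values from a small main() built with the same flags as the benchmark shows what the compiler reports for that configuration. If double_uses_extended_precision() returns 0, plain double arithmetic is not being carried out in fp80.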