[cfe-dev] [llvm-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Wed Oct 12 11:50:59 PDT 2016
----- Original Message -----
> From: "Mehdi Amini" <mehdi.amini at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Renato Golin" <renato.golin at linaro.org>, "Sebastian Paul Pop"
> <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Matthias
> Braun" <matze at braunis.de>, "Clang Dev" <cfe-dev at lists.llvm.org>,
> "nd" <nd at arm.com>, "Abe Skolnik" <a.skolnik at samsung.com>
> Sent: Wednesday, October 12, 2016 1:29:12 PM
> Subject: Re: [llvm-dev] [test-suite] making polybench/symm succeed
> with "-Ofast" and "-ffp-contract=on"
> > On Oct 12, 2016, at 7:05 AM, Hal Finkel via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
>
> > ----- Original Message -----
> > > From: "Renato Golin" <renato.golin at linaro.org>
> > > To: "Sebastian Pop" <sebpop.llvm at gmail.com>
> > > Cc: "Hal Finkel" <hfinkel at anl.gov>, "Sebastian Paul Pop"
> > > <s.pop at samsung.com>, "llvm-dev" <llvm-dev at lists.llvm.org>,
> > > "Matthias Braun" <matze at braunis.de>, "Clang Dev"
> > > <cfe-dev at lists.llvm.org>, "nd" <nd at arm.com>, "Abe Skolnik"
> > > <a.skolnik at samsung.com>
> > > Sent: Wednesday, October 12, 2016 8:35:16 AM
> > > Subject: Re: [test-suite] making polybench/symm succeed with
> > > "-Ofast" and "-ffp-contract=on"
> > >
> > > On 12 October 2016 at 14:26, Sebastian Pop
> > > <sebpop.llvm at gmail.com> wrote:
> > > > Correct me if I misunderstood: you would be ok changing the
> > > > reference output to exactly match the output of "-O0
> > > > -ffp-contract=off".
> > >
> > > No, that's not at all what I said.
> > >
> > > Matching identical outputs to FP tests makes no sense because
> > > there's *always* an error bar.
> >
> > This is something we need to understand. No, there's not always an
> > error bar. With FMA formation and without non-IEEE-compliant
> > optimizations (i.e. fast-math), the optimized answer should be
> > identical to the non-optimized answer.
>
> Can you clarify: in my mind the F in FMA is for “fused”, i.e. no
> intermediate truncation, i.e. not the same numerical result. But you
> imply the opposite above?
Sorry, I did not type that correctly. I meant *without* FMA formation and without non-IEEE-compliant optimizations (i.e. fast-math), the optimized answer should be identical to the non-optimized answer.
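
To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the test suite, and the helper names are hypothetical. It emulates a fused multiply-add with exact rational arithmetic (one rounding at the end) and compares it with the ordinary two-rounding a*b + c. Real FMA formation is done by the compiler and hardware; this only illustrates why the two results can differ.

```python
from fractions import Fraction


def unfused_mul_add(a: float, b: float, c: float) -> float:
    """a*b + c with two roundings: the product is rounded, then the sum."""
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulate an FMA for finite inputs: exact a*b + c, rounded once.

    Fraction holds the exact value of each double, and converting the
    exact result back to float rounds it to the nearest double.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product, 1 - 2**-60, is not
# representable as a double and rounds to 1.0.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = unfused_mul_add(a, b, c)  # 0.0: the 2**-60 term is lost
fused = fused_mul_add(a, b, c)      # -2**-60: the term survives

print(unfused, fused)
```

Neither result is "wrong"; they are two different, well-defined roundings of the same expression, which is why a run with contraction enabled need not match a run without it bit for bit.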
-Hal
> —
> Mehdi
> > If these don't match, then we should understand why. This used to be
> > a large problem because of fp80-related issues on x86 processors,
> > but even on x86 if we stick to SSE (etc.) FP instructions, this is
> > not an issue any more. We still do see cross-system discrepancies
> > sometimes because of differences in denormal handling, but on the
> > same system that should be consistent (aside, perhaps, from
> > compiler-level constant-folding issues).
> >
> > -Hal
> >
> > > The output of O0, O1, O2, O3, Ofast, Os, Oz should all be within
> > > the boundaries of an average and its associated error bar.
> > >
> > > By understanding what's the *expected* output and its associated
> > > error range we can accurately predict what will be the correct
> > > reference_output and the tolerance for each individual test.
> > >
> > > Your solution 2 "works" because you're doing the matching
> > > yourself, in the code, and for that, you pay the penalty of
> > > running it twice. But it's not easy to control the tolerance, nor
> > > it's stable for all platforms where we don't yet run the test
> > > suite.
> > >
> > > My original proposal, and what I'm still proposing here, is to
> > > understand the tests and make them right, by giving them proper
> > > references and tolerances. If the output is too large,
> > > reduce/sample in a way that doesn't increase the error ranges too
> > > much, enough to keep the tolerance low, so we can still catch bugs
> > > in the FP transformations.
> > >
> > > cheers,
> > > --renato
> >
> > --
> > Hal Finkel
> > Lead, Compiler Technology and Programming Languages
> > Leadership Computing Facility
> > Argonne National Laboratory
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory