[cfe-dev] [test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"

Sebastian Pop via cfe-dev cfe-dev at lists.llvm.org
Wed Oct 12 05:04:19 PDT 2016


On Wed, Oct 12, 2016 at 4:01 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 12 October 2016 at 05:35, Sebastian Pop <sebpop.llvm at gmail.com> wrote:
>> polybench/linear-algebra/solvers/gramschmidt/ exposes the same problems as symm.
>> It does not match the reference output at -O0 -ffp-contract=off,
>> and it passes all element comparisons only with FP_ABSTOLERANCE=1 for
>> "-Ofast" vs. "-O0 -ffp-contract=off".
>
> I think we're going about this in a completely wrong way.
>
> The current reference output is specific to fp-contract=off, and
> making it work for fp-contract=on makes no sense at all.

Yes.

I want to mention that there are two problems. One is the FP tolerance,
as you describe below.
The other is that the reference output does not match even
at "-O0 -ffp-contract=off". It might be that the reference output was recorded
at "-O3 -ffp-contract=off". I think that this hides either a compiler
bug or a test bug.

Sebastian

>
> For all we know, fp-contract=on generates *more accurate* results, not
> less. But it may also have less predictable results *across* different
> targets, thus the need for a tolerance.
>
> FP_TOLERANCE is *not* about making the new results match an old
> reference, but about showing the *real* uncertainties of FP
> transformation on *different* targets.
>
> So, if you want to fix this test for good, here are the steps you need to take:
>
> 1. Check out the test-suite on different platforms, x86_64, ARM,
> AArch64, PPC, MIPS. The more the merrier.
> 2. Enable fp-contract=on, run the tests on all platforms, record the
> outputs, ignore the differences.
> 3. Collate each platform's output for each test and see how different they are.
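
As a rough illustration of step 3, here is a minimal sketch that measures how far two platforms' outputs are apart. It assumes each output has already been parsed into an array of doubles; max_abs_diff and abs_diff are hypothetical helper names, not anything in the test-suite.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute difference without libm, so the sketch links with no extra flags. */
static double abs_diff(double a, double b) {
  return a > b ? a - b : b - a;
}

/* Largest element-wise absolute difference between the outputs of two
 * platforms. This is the smallest absolute tolerance under which the
 * two runs would compare equal. (Hypothetical helper.) */
static double max_abs_diff(const double *a, const double *b, size_t n) {
  assert(a != NULL && b != NULL);
  double worst = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = abs_diff(a[i], b[i]);
    if (d > worst)
      worst = d;
  }
  return worst;
}
```

Running this over every pair of platforms gives a first idea of whether the spread is small enough for a tolerance to be meaningful.
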
>
> To make it easier to compare, in the past, I've used this trick:
>
> 1. Run on one platform, e.g. x86_64, ignoring the reference
> 2. Copy the output of those tests back to the reference_output
> 3. Run on a different platform, tweaking the tolerance until it "passes"
> 4. Run on yet another platform, making sure you don't need to tweak
> the tolerance yet again
>
> If the tolerance is "too high" for that test, we can further discuss
> how to change it to make it better. If not, you found a solution.
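
To make "tweaking the tolerance until it passes" concrete, here is a sketch of an element-wise comparison with an absolute and a relative tolerance. It is only an assumption about how such a check could look, not the test-suite's actual comparison tool; value_matches and count_mismatches are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without libm, so the sketch links with no extra flags. */
static double abs_d(double x) {
  return x < 0.0 ? -x : x;
}

/* Returns 1 if `out` is close enough to `ref`: either the absolute
 * difference is within abs_tol, or the difference relative to the
 * reference value is within rel_tol. (Hypothetical helper.) */
static int value_matches(double ref, double out, double abs_tol,
                         double rel_tol) {
  double diff = abs_d(ref - out);
  if (diff <= abs_tol)
    return 1;
  return ref != 0.0 && diff / abs_d(ref) <= rel_tol;
}

/* Counts the elements of a run that fall outside the tolerance. */
static size_t count_mismatches(const double *ref, const double *out, size_t n,
                               double abs_tol, double rel_tol) {
  assert(ref != NULL && out != NULL);
  size_t bad = 0;
  for (size_t i = 0; i < n; i++)
    if (!value_matches(ref[i], out[i], abs_tol, rel_tol))
      bad++;
  return bad;
}
```

Starting from a tolerance of zero and raising it until count_mismatches returns 0 on the second platform gives the value to re-check on the third.
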
>
> If you want to make it even better, do some analysis on the
> distribution of the results, per test, and pick the average as the
> reference output and one or two standard deviations as the tolerance.
> This should pass on most architectures.
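
A sketch of that analysis for a single output element, assuming its value on each platform has been collected into an array. The function names are hypothetical. The check is done on squared values, so no sqrt (and no libm) is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of one output element as recorded on n platforms; this would
 * become the reference value. */
static double mean_of(const double *samples, size_t n) {
  assert(samples != NULL && n > 0);
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += samples[i];
  return sum / (double)n;
}

/* Population variance, i.e. the standard deviation squared. */
static double variance_of(const double *samples, size_t n) {
  double mean = mean_of(samples, n);
  double acc = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = samples[i] - mean;
    acc += d * d;
  }
  return acc / (double)n;
}

/* Returns 1 if x lies within k standard deviations of the mean.
 * Compared in squared form: (x - mean)^2 <= k^2 * variance. */
static int within_k_sigma(double x, double mean, double variance, double k) {
  double d = x - mean;
  return d * d <= k * k * variance;
}
```

With only a handful of platforms the standard deviation is a crude estimate, so the resulting tolerance is a starting point rather than a final answer.
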
>
> To simplify the analysis, you can reduce the output into a single
> number, say, adding all the results up. This will generate more
> inaccuracies than comparing each value, and if that's too large an
> error, then you reduce the number of samples.
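
The reduction to a single number could look like the sketch below. It assumes the result is an n x n matrix stored row-major as doubles; strided_checksum and its stride parameter are invented for illustration. A stride of 1 sums everything, and a larger stride reduces the number of samples.

```c
#include <assert.h>
#include <stddef.h>

/* Sums every stride-th element of an n x n row-major matrix into one
 * checksum. Fewer samples means less accumulated rounding error, at
 * the cost of checking fewer values. (Hypothetical helper.) */
static double strided_checksum(const double *a, size_t n, size_t stride) {
  assert(a != NULL && stride > 0);
  double sum = 0.0;
  for (size_t k = 0; k < n * n; k += stride)
    sum += a[k];
  return sum;
}
```

The checksum is then compared against the reference checksum with the same kind of tolerance as a single element would be.
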
>
> For example, on cholesky, we sampled every 16th item of the array:
>
>   for (i = 0; i < n; i++) {
>     for (j = 0; j < n; j++)
>       print_element(A[i][j], j*16, printmat);
>     fputs(printmat, stderr);
>   }
>
> using "print_element" because calling printf sucks.
>
> These modifications are ok, because they neither change the tests nor
> hide them from compiler changes.
>
> cheers,
> --renato
