[PATCH] D24925: [testsuite] turn fp-contract off for ARM because output checking is not flexible enough

Mon Sep 26 12:11:39 PDT 2016

rengolin added a comment.

In https://reviews.llvm.org/D24925#552730, @hfinkel wrote:

> Hold on... both the new and the old state are working states, they're just different.

No. In the extreme case, "(return 0) == 0" is a way to make the test "pass". Of course this it not the case, but changing tests to accomodate code changes always has some degree of relaxation. We just want to make sure the relaxation is meaningful.

> When you use FMAs, as the standard allows, the answer will often be different compared to not using them. However, making this change is actually the most-useful thing to do (I'll explain why I think this below).

Absolutely. But ARM and AArch64 have FMA, and if that change is beneficial to other architectures, and this is the default on Clang, than *NOT* testing it means real code may break and we won't notice.

Believe me, I've been there before and have suffered the consequences.

> I understand you're being sarcastic, but this is unfair.

I wasn't being sarcastic, it was meant as a hyperbole.

> This is not a random reason, we understand exactly why this change produces changes in program output, and I also believe that testing using -ffp-contract=off is the most-useful test configuration to have (although having others in addition would be great too).

But it's not testing what the user will have, and that's a recipe for disaster.

> Why is it the most useful? Because it is the testing configuration for which we can have the most-precise known-good answer for comparison. As you point out, having strict thresholds for these tests has caught important miscompilation bugs in the past. When you start using FMAs, one issue is that there are lots of places where you now might or might not actually use them, and so the space of potential correct answers is large. We can come up with loser tolerances to capture this freedom, but even doing this precisely is potentially degenerate with bugs for some inputs, and so only having the loser version is actually worse than just testing with FMAs disabled. I would love to have both, but in the mean time, this is the right thing to do.

So, as Steve said, FMA is generally *more* precise, even if not standard C. But this is not about precision, it's about behaviour. I have to test what users will use, not what is more stable or less controversial.

We need to understand what the breaking bots were all about. It's very likely that they're all due to the same underlying problem, so fixing one thing on all tests might make all this argument moot.

In the past, I have fixed problems like this by implementing algorithms by hand (like sin/cos) to make sure the behaviour is known and doesn't heavily rely on system/libc/choices. We'll probably have to do the same here, and I'm surprised that no one is trying to understand the breakage, but work around it.

> FWIW, the reason we do this on PPC/LInux, and have for a long time, is exactly because GCC on PPC/Linux defaults to using FMAs, and so GCC needs -ffp-contract=off to pass our test suite (independent of anything else). This is just a general statement about our reference outputs: they're all generated without FMAs. Our flags should encode that dependency so that we don't see spurious failures (with Clang or any other compiler).

If the behaviour defaults to ON, then I *have* to test it ON. That's a simple truth of validation that we should not budge.

https://reviews.llvm.org/D24925