[llvm-dev] RFC: Large, especially super-linear, compile time regressions are bugs.

Renato Golin via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 31 16:40:50 PDT 2016


On 31 March 2016 at 23:34, Mehdi Amini <mehdi.amini at apple.com> wrote:
> I'm not sure how "this commit regresses the compile time by 2%" is a dubious metric.
> The metric is not dubious IMO, it is what it is: a measurement.

Ignoring for a moment the slippery slope we have recently had with
compile time performance, a 2% regression is easier to accept for a
change that improves execution time by around 2% on most targets than
for one that benefits only a single target.

Different people look at performance through different eyes, and
companies have different expectations about it too, so the same
percentages can have a different impact on different people for the
very same change.

I guess my point is that no threshold will please everybody, and
people are more likely to "abuse" the metric when its results are far
from what they see as acceptable, even if everyone else is fine with it.

My point about metrics replacing thinking is not aimed at lazy
programmers (of which there are very few here), but at how far the
encoded threshold falls from your own. Bias is a *very* hard thing
to remove, even for extremely smart and experienced people.

So, while "witch hunt" is a very strong term for the mild bias we
all carry personally, we have recently seen how some discussions end
in rage when a group of people strongly disagrees with the rest,
self-reinforcing their bias to levels they would never reach alone.
In those cases the term stops being too strong, and may even be
fitting... Makes sense?


> I agree. Did you read the part where I was mentioning that we're working on the tooling part and that I was waiting for it to be done to start this thread?

I did, and I should have mentioned it in my reply. I think you guys
(and ARM) are doing an amazing job at quality measurement. I wasn't
trying to diminish your efforts, but IMHO the relationship between
effort and bias removal is not linear, i.e. you'd have to improve
quality exponentially to remove bias linearly. So the point at which
we're prepared to stop might not remove all the problems, and metrics
could still play a negative role.

I think I'm just asking for us to be aware of that, not trying to
stop any attempt to introduce metrics. If they remain relevant to the
final objective, and we're allowed to break them given good enough
arguments, it should work fine.


> How do you suggest we address the long trail of 1-3% slowdowns that led to the current situation (cf. the two links I posted in my previous email)?
> Because there *is* a problem here, and I'd really like someone to come up with a solution for that.

Indeed, we're now slower than GCC, and that's a position that looked
impossible two years ago. But I doubt reverting a few patches will
help. For this problem we'll need a task force to hunt down all the
dragons and surgically fix them, since by now all the relevant
patches are too far in the past.

For the future, emailing developers about compile time regressions
(as well as run time ones) is a good thing to have, and I'm all for
it. But I don't want it to become a tool that increases stress in the
community.


> Not sure why or what you mean? The fact that an optimization improves only some targets does not invalidate the point.

Sorry, I seem to have misinterpreted your point.

The fallacy is in measuring the "benefit" versus the regression
"effect". The former is very hard to measure, while the latter is
very precise. Comparing numbers with radically different standard
deviations can easily land us in "undefined behaviour" territory and
seed rage threads.
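
To make that concrete, here's a minimal sketch (Python, with made-up
numbers; the samples and the normal-approximation interval are
assumptions for illustration, not real data) of why those two kinds
of measurements don't compare cleanly:

    import statistics

    # Hypothetical samples, in percent change vs. a baseline.
    runtime_benefit = [-3.1, 0.4, -4.2, 1.0, -2.8, -0.9, -3.5, 0.7]  # noisy
    compile_cost = [2.0, 2.1, 1.9, 2.0, 2.0, 2.1, 2.0, 1.9]          # tight

    for name, samples in [("run-time benefit", runtime_benefit),
                          ("compile-time cost", compile_cost)]:
        mean = statistics.mean(samples)
        # Crude 95% half-width (1.96 * standard error), assuming
        # roughly normal samples; good enough for a sketch.
        half = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
        print(f"{name}: {mean:+.2f}% +/- {half:.2f}%")

The benefit's interval comes out wide while the cost's comes out
narrow, so treating the two point estimates as equally trustworthy
is exactly where the fallacy starts.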


> I'm talking about chasing and tracking every single commit where a developer would regress compile time *without even being aware*.

That's a goal worth pursuing regardless of the patch's benefit, and I
agree wholeheartedly. For that, I'm very grateful for the work you
guys are doing.

cheers,
--renato