[libcxx-commits] [libcxx] Reverts around time_zone (PR #95058)

Tue Jun 11 18:24:45 PDT 2024

vitalybuka wrote:

@ldionne  I am not sure  why you are on such defense on reverts. 

>  It doesn't make the code that was committed wrong as in "incorrect".
> if I enable a CI machine that is vastly underpowered and causes a bunch of tests to start timing out, whose fault is it? 

It's not about blaming who is this fault. It's about doing necessary to maintain the code health.

It's not ONLY about correctness. Our build bots are way to keep compiler from regressing. And time to time revert of correct code is needed to keep bots functioning, maybe just for upgrading or re-configuring build steps. Keeping attention to bots is the price to pay to have good diverse test coverage and detect REAL bugs early.

For problematic bots:
1. We can decide that those bots are not helpful, and remove them from master, or move to staging, where it does not spam commiters.  
2. Or maybe that bot is extremely important we will need to figure-out something else, including delaying  landing some patches.

But it's not the case we have here. That bot is consistently green and well maintained.

> So, to summarize: When we ship a bug and we can't quickly fix it forward, we revert. We all agree on that, it's the LLVM way. However, many things don't qualify as "a bug" yet still break CI. 

And until proven there is always safe to assume that there is a bug than CI failure. In particular case possibility of 'deadlock' what is concerned me.

And my point that threshold for revert should be significantly lower than TRUE bug. Some times it's need to be done to figure out how to proceed further. If it's CI issue, it's better to revert, bot maintainers may need some time catching up, e.g. when we upgrade some deps requirement, we may revert back and forth that patch, until we upgrade all bots. It's way better then we broke all bots and fix forward them next few weeks keeping then non-functional.

Reverting is a trivial thing, it's not some terrible thing that must never happens.
In healthy work flow it should be possible to speculatively revert a patch even if there is a doubt that this is the cause.
It's always trivial to reland the patch if the revert is not helpful.

E.g. here @mordante was able quickly guess problems, and if we reverted that yesterday, we'd already had it relanded with UNSUPPORTED spending less time than we spent arguing here.

It works for the rest of LLVM. works for large OSS projects, e.g. Chromium, why libcxx needs to be special?

To summarize:
1. cost of false revert is not zero, but extremely low
1. cost of false non-revert significant 
So reverting, even when not, or even just for fun ( don't quote me on this one :) ) , should be acceptable practice. We are not in the state when someone is abusing reverts, and if we have abuser, it would be easy to resolve.

>@vitalybuka Is there even still a problem? You marked the test as unsupported under hwasan, so it's not running under your configuration. Did that work? 

Yes, it works.
I guess will well have to do the same for asan, it's inconsistently timeouts there as well https://lab.llvm.org/buildbot/#/builders/239/builds/7415
Ideally the test needs  to be speedup somehow.

> A better fix would be to use --param long_tests=False to disable all the slow-running tests in the suite instead, and to mark the local_time test as REQUIRES: long_tests.

Thanks! I didn't know. We will use this. Then disabling asan is unnecessary,

This particular issue is resolved.

My main concern is that reverts in libcxx are often an issue and more complicated than in the rest of LLVM, making bot maintenance difficult. Can we do better?

https://github.com/llvm/llvm-project/pull/95058