[libcxx-commits] [libcxx] Reverts around time_zone (PR #95058)

Wed Jun 12 09:19:00 PDT 2024

mordante wrote:

> @ldionne I am not sure why you are on such defense on reverts.
> 
> > It doesn't make the code that was committed wrong as in "incorrect".
> > if I enable a CI machine that is vastly underpowered and causes a bunch of tests to start timing out, whose fault is it?
> 
> It's not about blaming who is this fault. It's about doing necessary to maintain the code health.
> 
> It's not ONLY about correctness. Our build bots are way to keep compiler from regressing. And time to time revert of correct code is needed to keep bots functioning, maybe just for upgrading or re-configuring build steps. Keeping attention to bots is the price to pay to have good diverse test coverage and detect REAL bugs early.
> 
> For problematic bots:
> 
>     1. We can decide that those bots are not helpful, and remove them from master, or move to staging, where it does not spam commiters.
> 
>     2. Or maybe that bot is extremely important we will need to figure-out something else, including delaying  landing some patches.
> 
> 
> But it's not the case we have here. That bot is consistently green and well maintained.

As mentioned before, the sanitizer bots have false positives every now and then. I agree with @ldionne that a time-out in a bot is not per-se an error in our code.

> > So, to summarize: When we ship a bug and we can't quickly fix it forward, we revert. We all agree on that, it's the LLVM way. However, many things don't qualify as "a bug" yet still break CI.
> 
> And until proven there is always safe to assume that there is a bug than CI failure. In particular case possibility of 'deadlock' what is concerned me.
> 
> And my point that threshold for revert should be significantly lower than TRUE bug. Some times it's need to be done to figure out how to proceed further. If it's CI issue, it's better to revert, bot maintainers may need some time catching up, e.g. when we upgrade some deps requirement, we may revert back and forth that patch, until we upgrade all bots. It's way better then we broke all bots and fix forward them next few weeks keeping then non-functional.
> 
> Reverting is a trivial thing, it's not some terrible thing that must never happens. In healthy work flow it should be possible to speculatively revert a patch even if there is a doubt that this is the cause. It's always trivial to reland the patch if the revert is not helpful.

Reverting and relanding is not always trivial. Temporary disabling tests on configurations that do not work is a solution we use in libc++.

> E.g. here @mordante was able quickly guess problems, and if we reverted that yesterday, we'd already had it relanded with UNSUPPORTED spending less time than we spent arguing here.
> 
> It works for the rest of LLVM. works for large OSS projects, e.g. Chromium, why libcxx needs to be special?
> 
> To summarize:
> 
>     1. cost of false revert is not zero, but extremely low
> 
>     2. cost of false non-revert significant
>        So reverting, even when not, or even just for fun ( don't quote me on this one :) ) , should be acceptable practice. We are not in the state when someone is abusing reverts, and if we have abuser, it would be easy to resolve.
> 
> 
> > @vitalybuka Is there even still a problem? You marked the test as unsupported under hwasan, so it's not running under your configuration. Did that work?
> 
> Yes, it works. I guess will well have to do the same for asan, it's inconsistently timeouts there as well https://lab.llvm.org/buildbot/#/builders/239/builds/7415 Ideally the test needs to be speedup somehow.
> 
> > A better fix would be to use --param long_tests=False to disable all the slow-running tests in the suite instead, and to mark the local_time test as REQUIRES: long_tests.
> 
> Thanks! I didn't know. We will use this. Then disabling asan is unnecessary,
> 
> This particular issue is resolved.
> 
> My main concern is that reverts in libcxx are often an issue and more complicated than in the rest of LLVM, making bot maintenance difficult. Can we do better?

We try to do better, in libc++ we have a pre-commit CI to test all our supported configurations. This means patches in libc++ typically have been tested in the CI more than typical patches in the LLVM project. My personal issue with reverting directly is that it sometimes is easier to add a patch that temporary disables a configuration. Especially since buildbot maintainers sometimes don't have time to look at an issue in a timely fashion. That means development of a feature is blocked until that can be resolved. (Using stacked patches in GitHub, unlike in Phabricator, is quite painful. This is already makes development of larger features harder than previously.)

Note that in cases where I expect issues in the code due to CI issues I revert my patches. In this case I only got 1 e-mail of a failing build, not of all 3 failing builds.

https://github.com/llvm/llvm-project/pull/95058