[PATCH] D79066: [Support] Class to facilitate file locking

Thu May 14 19:03:22 PDT 2020

dblaikie added a comment.

@labath - what do you think, for retrieving an Expected<FileLocker> - static member function of FileLocker, or a non-static member function of raw_ostream?

================
Comment at: llvm/include/llvm/Support/FileSystem.h:1218
+inline Expected<FileLocker>
+tryLock(int FD, std::chrono::milliseconds T = std::chrono::milliseconds(1000)) {
+  FileLocker Lock(FD, T);
----------------
sepavloff wrote:
> dblaikie wrote:
> > labath wrote:
> > > sepavloff wrote:
> > > > labath wrote:
> > > > > sepavloff wrote:
> > > > > > labath wrote:
> > > > > > > sepavloff wrote:
> > > > > > > > labath wrote:
> > > > > > > > > I don't think there should be a default value for the timeout for two reasons:
> > > > > > > > > - it makes it very unobvious that there is any timing out happening (`tryLock(FD)` looks a lot like `mutex.try_lock()` which only does an instantaneous lock check with no retries or time outs)
> > > > > > > > > - the appropriate value of the timeout sounds like something that is going to be very dependent on the exact use case, and the person using it should make a deliberate choice about the value (and explain it). (Speaking of that, I am still not sure your use case really needs timeouts, though that's more of a discussion for the other patch.)
> > > > > > > > You are right, removed default argument.
> > > > > > > > 
> > > > > > > > Timeout is a form of protection against deadlocks. For the particular task of writing to log file infinite wait time is also acceptable, but in other use cases (such as unit tests) timeouts may be convenient.
> > > > > > > I don't think unit tests are a good motivating example. There are plenty of ways a test can not terminate without locking files (or anything else). The solution to that is to have a (fairly high) timeout at the level of the test harness. That we already have.
> > > > > > > 
> > > > > > > So if the unit tests are the main motivation for introducing these timeouts, I think we should remove them, as they add a fair amount of complexity (and flakyness) to the equation.
> > > > > > > 
> > > > > > > Even for the clang use case, whether one prefers a deadlock over the program silently doing the wrong thing (perhaps when the system is under heavy load, like it is when running the test suite), depends on what exactly is he trying to achieve. But that's something I don't feel qualified to judge...
> > > > > > Unit tests are not a motivation for using timeouts, but they demonstrates the benefits. If users can specify timeouts, they have possibility to handle deadlocks. It is up to user to handle such case or not, but the possibility exists. Without timeouts user must behave as if deadlocks never happen. It limits usability of the solution and it is the motivation.
> > > > > Well.. it's not that the user _can_ specify timeouts. In the current setup, the user _must_ specify a timeout -- there is no blocking equivalent of this function. Which is fine, if the only use case we have for it right now wants/needs/etc. a timeout (we are not writing a general purpose library here). However, I remain unconvinced that a timeout is called for in this situation (but you don't have to convince me here -- that's another review).
> > > > Collection of build statistics usually is a side effect of build. If something goes wrong and lock cannot be released, it can cause build failure. It is often more acceptable to get incomplete statistics than to break a build. Timeouts can work as watchdog,  excluding log machinery from culprits.
> > > > 
> > > > However in some cases blocking operations may be more preferable. I think we can solve this problem by adding blocking function as well. It is done in this version.
> > > (I think adding the blocking is good because it may prevent needless proliferation of calls to the timeouted versions. I just want to clarify that I did not consider that as a requirement for this patch.)
> > +1 please don't include timeout support if there's no immediate planned use case.
> The time out support is motivated by the following consideration.
> 
> At present compiler jobs executed in parallel build are independent. Each compiler process has exclusive access to any file it writes to. Allowing writes to the same log file creates dependency between compiler processes, which may affect build process. For example, the following hypothetical scenario. If some process having lock to the shared log file is suspended, other processes may be suspended waiting the lock. The memory is occupied with waiting processes, and the build can fail due to resource shortage. Using timeouts can alleviate this issue at expense of log completeness.
That's a nice hypothetical scenario - but please add it when there's a concrete need in the LLVM project for that functionality, not before.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79066/new/

https://reviews.llvm.org/D79066