[PATCH] D78896: [Support] Add file lock/unlock functions

Pavel Labath via Phabricator via llvm-commits <llvm-commits at lists.llvm.org>
Mon Jun 1 02:37:49 PDT 2020


labath added inline comments.


================
Comment at: llvm/unittests/Support/Path.cpp:2054-2109
+  std::mutex Start2M;
+  std::condition_variable Start2CV;
+  bool Start2 = false;
+  std::mutex LockM;
+  std::condition_variable LockCV;
+  bool Locked = false;
+
----------------
sepavloff wrote:
> labath wrote:
> > sepavloff wrote:
> > > labath wrote:
> > > > sepavloff wrote:
> > > > > labath wrote:
> > > > > > labath wrote:
> > > > > > > I get the impression this code is much more complicated than needed. There's a lot of synchronization going on, but it still does not guarantee that the file is unlocked while the other thread is inside the `tryLock` call (my goal was to get coverage for the while loop). How about something like:
> > > > > > > ```
> > > > > > > EC = fs::tryLockFile(FD1);
> > > > > > > ASSERT_NO_ERROR(EC);
> > > > > > > EC = fs::tryLockFile(FD2);
> > > > > > > ASSERT_ERROR(EC);
> > > > > > > std::thread LockThread([&] { EC2 = fs::tryLockFile(FD2, std::chrono::minutes(1)); });
> > > > > > > std::this_thread::sleep_for(std::chrono::seconds(1));
> > > > > > > EC = fs::unlockFile(FD1);
> > > > > > > ASSERT_NO_ERROR(EC);
> > > > > > > LockThread.join();
> > > > > > > ASSERT_NO_ERROR(EC2);
> > > > > > > EC = fs::unlockFile(FD2);
> > > > > > > ASSERT_NO_ERROR(EC);
> > > > > > > ```
> > > > > > > It still does not guarantee that the other thread is inside the `tryLockFile` call, but it comes as close as we can get, and it avoids all the condition_variable overhead.
> > > > > > Waiting on a response to this. I still feel that this can be organized in a much simpler way without using so many explicit synchronization primitives.
> > > > > I put a diagram explaining what the test does. Actually there are two events, which ensure that the first attempt to lock the file occurs in Thread2 after Thread1 locks the file but before it releases it. So the two calls to `tryLockFile` check both cases, successful and unsuccessful. Each event requires a mutex and a condvar, so we have 4 synchronization objects. Simpler variants do not guarantee checking both cases.
> > > > I believe that each of these events requires *synchronization*, but a condition variable is not the only way to achieve that. Starting and joining a thread is also a form of synchronization, and it is much simpler (and something you have to do anyway).
> > > > 
> > > > So, instead of starting a thread which, as its first order of business, blocks on a condition variable, you could just delay starting the thread until the condition is satisfied. Basically -- remove `cv.wait_for` and replace `cv.notify` with the creation of the thread object. Then, instead of waiting for the other thread to unblock you, you can just `join` it.
> > > > 
> > > > For the second condition variable, you just create a fresh thread again. Having the thread bodies be smaller would make the code much easier to follow, and it's not like we have to worry about the performance overhead of creating a bunch of small threads here...
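> > > > Roughly, the transformation described above would look like this (a condensed sketch; `WaitUntilLocked`/`NotifyLocked` are hypothetical stand-ins for the existing condvar plumbing):
> > > > ```
> > > > // Before: the thread is started early and immediately blocks on a
> > > > // condition variable until the file has been locked.
> > > > std::thread T([&] {
> > > >   WaitUntilLocked(); // cv.wait(...) under the hood
> > > >   EC2 = fs::tryLockFile(FD2, std::chrono::seconds(5));
> > > > });
> > > > ASSERT_NO_ERROR(fs::tryLockFile(FD1));
> > > > NotifyLocked();      // cv.notify_one() under the hood
> > > > ...
> > > >
> > > > // After: lock first, then create the thread; starting the thread and
> > > > // joining it provide all the necessary synchronization.
> > > > ASSERT_NO_ERROR(fs::tryLockFile(FD1));
> > > > std::thread T2([&] { EC2 = fs::tryLockFile(FD2, std::chrono::seconds(5)); });
> > > > ASSERT_NO_ERROR(fs::unlockFile(FD1));
> > > > T2.join();
> > > > ASSERT_NO_ERROR(EC2);
> > > > ```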
> > > Starting the second thread inside the first instead of waiting on an event indeed makes the test more compact; I rewrote the test accordingly.
> > > 
> > > As for the second lock attempt, which happens after the file is unlocked, using a new thread makes the logic more obscure. I moved the synchronization machinery into a new class, `Event`, which should make the test shorter and clearer.
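> > > A minimal sketch of what such an `Event` helper could look like (one mutex, one condition variable and a flag; the exact interface in the updated patch may differ):
> > > ```
> > > #include <condition_variable>
> > > #include <mutex>
> > >
> > > class Event {
> > >   std::mutex M;
> > >   std::condition_variable CV;
> > >   bool Signaled = false;
> > >
> > > public:
> > >   // Wake any thread blocked in wait().
> > >   void signal() {
> > >     {
> > >       std::lock_guard<std::mutex> Guard(M);
> > >       Signaled = true;
> > >     }
> > >     CV.notify_all();
> > >   }
> > >
> > >   // Block until signal() has been called.
> > >   void wait() {
> > >     std::unique_lock<std::mutex> Guard(M);
> > >     CV.wait(Guard, [this] { return Signaled; });
> > >   }
> > > };
> > > ```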
> > Creating the Event class does make it a bit better, but I still maintain that this test is too complicated for what it really tests. Take a look at the following test:
> > ```
> > TEST_F(FileSystemTest, lockFileThread) {
> > #if LLVM_ENABLE_THREADS
> >   int FD1, FD2;
> >   SmallString<64> TempPath;
> >   ASSERT_NO_ERROR(fs::createTemporaryFile("test", "temp", FD1, TempPath));
> >   FileRemover Cleanup(TempPath);
> >   ASSERT_NO_ERROR(fs::openFileForReadWrite(TempPath, FD2, fs::CD_OpenExisting,
> >                                            fs::OF_Append));
> > 
> > 
> >   ASSERT_NO_ERROR(fs::tryLockFile(FD1));
> >   ASSERT_ERROR(fs::tryLockFile(FD2));
> >   std::future<std::error_code> Future = std::async(std::launch::async, [&] {
> >     return fs::tryLockFile(FD2, std::chrono::seconds(5));
> >   });
> >   ASSERT_NO_ERROR(fs::unlockFile(FD1));
> >   ASSERT_NO_ERROR(Future.get());
> >   fs::unlockFile(FD2);
> > 
> >   ASSERT_NO_ERROR(fs::tryLockFile(FD1));
> >   ASSERT_ERROR(fs::tryLockFile(FD2));
> >   Future = std::async(std::launch::async, [&] { return fs::lockFile(FD2); });
> >   ASSERT_NO_ERROR(fs::unlockFile(FD1));
> >   ASSERT_NO_ERROR(Future.get());
> >   fs::unlockFile(FD2);
> > 
> > #endif
> > }
> > ```
> > It tests the same thing as the test you wrote -- I obtained it by applying a series of semantics-preserving simplifications to your version. This included fairly simple things like:
> > - inlining
> > - replacing patterns like `std::thread(foo).join()` with direct calls to `foo`
> > - moving code which does not block outside of a thread -- e.g. asserting that a lock attempt fails does not need to be done in a separate thread because it does not block. Only the blocking calls do.
> > - replacing a thread consisting of a single expression with a call to `std::async`
> > - removing unused variables produced by all of this
> > 
> > However, the end result is a test which is about 3 times shorter than the original (28 lines vs 88), and it's almost linear -- each parallel section is only three lines long. I think it'd be pretty hard to argue that this is not more readable than the original.
> Thank you very much for the code and detailed explanations!
> 
> The way your code checks the result of `fs::tryLockFile` means it relies on a particular sequence of statement execution. For the test to succeed, the main thread must continue after creating the separate thread and execute `fs::unlockFile`. In that case, when the separate thread starts, it sees an unlocked file. It is the most probable case, but not the only one. If rescheduling of the main thread occurs after thread creation but before the execution of `fs::unlockFile`, or there is a core ready to execute the new thread, this test will fail.
> 
> There is no guarantee of the order in which statements execute on different threads unless synchronization objects are used.
I'm sorry, but I am unable to follow this line of reasoning. You're talking about this block of code, right?
```
  std::future<std::error_code> Future = std::async(std::launch::async, [&] {
    return fs::tryLockFile(FD2, std::chrono::seconds(5));
  });
  ASSERT_NO_ERROR(fs::unlockFile(FD1));
  ASSERT_NO_ERROR(Future.get());
  fs::unlockFile(FD2);
```
Before the `std::async` statement, FD1 is locked, FD2 is unlocked. After it, there are two actions that can execute in arbitrary order (or concurrently): `fs::unlockFile(FD1)` on the main thread and `fs::tryLockFile(FD2, std::chrono::seconds(5))` on the "async" thread. If the `unlockFile` executes first, it will unlock the file, and the subsequent `tryLockFile` will immediately succeed. If `tryLockFile` is scheduled first, then it will get a lock failure and will start to wait. While it waits, the main thread will get scheduled, unlock the file, and then the `tryLockFile` will succeed again. If an evil scheduler decides not to schedule the main thread for five seconds, then `tryLockFile` will fail, but there's nothing we can do about that except increase the timeout.

This is the exact same situation that can happen with condition variables:
```
// thread 2
    ...
    DoUnlockEvent.signal();
    if (UseBlockingCall)
      ECT2b = fs::lockFile(FD2);
    else
      ECT2b = fs::tryLockFile(FD2, std::chrono::seconds(5));
    ...

// thread 1
    ECT1a = fs::tryLockFile(FD1);
    if (ECT1a)
      return;
    auto Thread2 = std::thread(Thread2Body);
    DoUnlockEvent.wait();
    ECT1b = fs::unlockFile(FD1);
    ...
```
After thread1 is unblocked by `DoUnlockEvent.signal();`, we again have two runnable threads (the `if` statement on thread 2, and the `fs::unlockFile(FD1)` call on thread 1) and it is up to the scheduler to determine their order.

It's not true that there are no synchronization objects here. We have `std::future` and the thread object contained within. Creation of the future (via `std::async`) establishes a happens-before relationship between the actions taken before `std::async` is called on the main thread, and the body of the async thread. This is exactly what happens with `DoUnlockEvent.signal()` and `DoUnlockEvent.wait()`. And calling `future::get` establishes a happens-before relationship between the body of the async thread and the code that comes after the `get` call. That's exactly what would happen with `Thread2.join()` in your example.

My point is that "launching an async thread" is a simpler way of synchronizing than "waiting on a cv", and "getting a future" is simpler than "joining a thread which 'returns' a result through a global variable".
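
To make that concrete, here is a minimal standalone sketch of those two happens-before edges (just an illustration, not code from the patch):
```
#include <cassert>
#include <future>

int main() {
  int Before = 0, Inside = 0;
  Before = 1; // written on the main thread, before std::async is called
  std::future<void> F = std::async(std::launch::async, [&] {
    assert(Before == 1); // visible: the async() call happens-before the body
    Inside = 2;          // written inside the async thread
  });
  F.get();               // the body happens-before the return from get()
  assert(Inside == 2);   // so this read is race-free and sees the write
  return 0;
}
```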


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78896/new/

https://reviews.llvm.org/D78896




