[PATCH] D88666: DirectoryWatcher: add an implementation for Windows

Adrian McCarthy via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 6 14:10:18 PDT 2020


amccarth requested changes to this revision.
amccarth added a comment.
This revision now requires changes to proceed.

Some of this is nitpicky/opinionated, but the race condition is real.  We need a reliable way to signal the watcher thread when it's time to exit.  Options I see are:

1. Use the FindFirstChange function to get a handle to wait on for the directory change and create a separate event to signal when it's time to exit.  The watcher thread would use WaitForMultipleObjects.  If' it's a directory change, then it can make a synchronous call the ReadDirectoryChangesW, knowing that there's info available.  (It could possibly even do the callback at that point, without the need for a separate handler thread.)

2. Continue to use the ReadDirectoryChangesW with overlapped IO, but, instead of waiting in GetOverlappedResult, it would first use WaitForMultipleObjects on the event in the overlapped IO and a distinct event used to tell the threat to exit (as in #1).



================
Comment at: clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp:36
+  alignas(DWORD)
+  CHAR Buffer[4 * (sizeof(FILE_NOTIFY_INFORMATION) + MAX_PATH * sizeof(WCHAR))];
+
----------------
compnerd wrote:
> amccarth wrote:
> > If it were me, I'd probably make this a `std::vector`.
> > 
> > * If an off-by-one bug causes an overrun of one WCHAR, you could trash a crucial member variable.  On the heap, the damage is less likely to be catastrophic.
> > * You wouldn't need `alignas`.
> > * I don't think these are created in a tight loop, so the overhead doesn't concern me.
> > 
> > Also, I'd probably go with a slightly more descriptive name, like `Notifications` rather than `Buffer`.
> The `alignas` is because the documentation states that the buffer should be DWORD aligned.  It is more for pedantic reasons rather than anything else.  I think that making it a catastrophic failure is a good thing though - it would catch the error :)  You are correct about the allocation - it is once per watch.  I'll rename it at least.
But it's still an arbitrarily-sized buffer in the middle of a class definition.  If you change your mind about how big to make it, it changes the definition of the class.  The buffer is going to be accessed using pointer arithmetic, which is generally dangerous.  Moving the buffer out of the class avoids both of those problems.

The alignas here is _not_ pedantic, it's essential.  Without it, you could easily have an alignment problem.  But if you used a vector, you'd know that it would always be suitably aligned.


================
Comment at: clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp:82
+    DirectoryWatcherCallback Receiver)
+    : Callback(Receiver) {
+  // Pre-compute the real location as we will be handing over the directory
----------------
compnerd wrote:
> amccarth wrote:
> > There's a lot going on in this constructor.  Is this how the other implementations are arranged?
> > 
> > Would it make sense to just initialize the object, and save most of the actual work to a `Watch` method?
> Largely the same.  However, the majority of the "work" is actually the thread proc for the two threads.
Let me put it another way.  Constructors cannot return errors and LLVM does use exceptions, so things that can fail generally shouldn't be in the constructor.  This code is accessing the file system, creating and event, and spawning two threads.  Any of those things can fail, but you've got no way to let the caller know whether something went wrong.

If the lambdas were really short, then it would be easy to see that they're thread procs.  But they're not, so they're hard to find and understand.  If they were private member functions with descriptive names, the code would be easier to understand.


================
Comment at: clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp:92
+    Length = GetFinalPathNameByHandleW(DirectoryHandle, Buffer.data(),
+                                       Buffer.capacity(), 0);
+    Buffer.resize(Length);
----------------
Using `Buffer.data()` when you've only reserved space is undefined behavior.  You should used `resize` instead of `reserve` and then pass the `size` rather than the `capacity`.

Be aware that, while unlikely, this could still fail.  The directory could have been removed or renamed between calls or the caller could have passed a bad handle to begin with.


================
Comment at: clang/lib/DirectoryWatcher/windows/DirectoryWatcher-windows.cpp:195
+  // Signal the Watcher to exit.
+  SetEvent(Overlapped.hEvent);
+  HandlerThread.join();
----------------
I don't think this is a reliable way to get the watcher thread to exit.

You've overloaded the meaning of the event object and are racing the i/o system who plans to use the event for its own purposes.

Suppose the watcher thread is waiting in GetOverlappedResult.  If you set the event, then the other fields of the OVERLAPPED struct are in an indeterminant state.  I don't see how the watcher thread can distinguish your exit signal from a normal completion.

In another case, suppose to set the event when the watcher thread has just completed a GetOverlappedResult but hasn't yet started the next ReadDirectoryChangesW.  The watcher thread won't notice the signal.  It'll just loop around and the next ReadDirectoryChangesW will reset it.  This will hang.

This one reason why I suggested FindFirstChange and WaitForMultipleObjects.  It lets you have distinct events objects for distinct events.  It also has the benefit that you wouldn't need the overlapped IO and you possibly could combine the watcher and handler functions into a single thread.


================
Comment at: clang/unittests/DirectoryWatcher/CMakeLists.txt:1
-if(APPLE OR CMAKE_SYSTEM_NAME MATCHES "Linux")
+if(APPLE OR CMAKE_SYSTEM_NAME MATCHES "Linux" OR CMAKE_SYSTEM_NAME STREQUAL Windows)
 
----------------
I'm not a Cmake expert, but I"m curious way `MATCHES "Linux"` but `STREQUAL Windows`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88666/new/

https://reviews.llvm.org/D88666



More information about the cfe-commits mailing list