[PATCH] D145101: [clang][deps] NFC: Simplify worker loop

Wed Mar 1 12:47:43 PST 2023

jansvoboda11 added inline comments.

================
Comment at: clang/tools/clang-scan-deps/ClangScanDeps.cpp:802
   for (unsigned I = 0; I < Pool.getThreadCount(); ++I) {
-    Pool.async([I, &Lock, &Index, &Inputs, &HadErrors, &FD, &PD, &WorkerTools,
-                &DependencyOS, &Errs]() {
+    Pool.async([&, I]() {
       llvm::StringSet<> AlreadySeenModules;
----------------
benlangmuir wrote:
> Why are we second-guessing the `ThreadPool` at all? I would think we should do
> 
> ```
> for (unsigned Index = 0; E = Inputs.size(); Index + E; ++Index) {
>   Pool.async([Index, &]{ ... });
> }
> ```
> 
> Then the thread pool is responsible for dispatching the tasks when it has available resources instead of us manually looping inside the threads.
I guess it was originally done this way because we have a number of `DependencyScanning{Tool,Worker}` instances (the same number as there are threads in the pool) we want to reuse to get the advantage of local caching in `DependencyScanningWorkerFilesystem`. I guess having one `Worker` instance "pinned" to one thread might get us better memory access patterns? (I'm just guessing here.)

We could implement your simplification and keep the "pinning" if the thread pool gave us index of the thread the async task is running on. But it doesn't seem like we have that API.

That means we'd probably need to implement some `Worker` queue each async task would take from. I'm not sure that's better/simpler than monotonically increasing input index.

Maybe @Bigcheese can explain this in more detail?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145101/new/

https://reviews.llvm.org/D145101