[PATCH] D114845: [llvm] [Debuginfod] DebuginfodCollection and DebuginfodServer for tracking local debuginfo.

Thu Mar 3 11:52:28 PST 2022

noajshu marked an inline comment as done and an inline comment as not done.
noajshu added a comment.

Thanks @mysterymath and @fche2 for many very helpful comments!
@mysterymath suggested we could perform the first update manually, then print a message like "ready to accept connections". This way the test client knows when it can ask for artifacts. This seems logical so I will make this change in D114846 <https://reviews.llvm.org/D114846>.

Regarding logging:
@mysterymath pointed out that logging by inserting strings into `dbgs()` is not thread-safe. @fche2 pointed out that elfutils' debuginfod exports Prometheus metrics. I would advocate for keeping some logging facility in the application, if only because it is helpful for testing and debugging the code. I am not aware of an existing logging framework in LLVM, so I have created a simple `std::queue`-based logging class, `DebuginfodLog`, which uses a `sys::Mutex` to synchronize access. In the future this could be upgraded to support Prometheus exports or other features. Please let me know if you think this will suffice, or if you would prefer an alternative solution to the logging problem. Thanks a lot!
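
To make the idea concrete, here is a minimal sketch of a mutex-guarded, queue-based log. It is not the code in the patch: the class name `SketchLog` and its `push`/`pop` methods are hypothetical, and it uses `std::mutex` from the standard library in place of `sys::Mutex` so that it compiles on its own.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the DebuginfodLog class described above.
// Assumption: std::mutex replaces sys::Mutex to keep the sketch standalone.
class SketchLog {
  std::mutex QueueMutex;
  std::queue<std::string> Entries;

public:
  // Appends a message; safe to call from several threads at once.
  void push(std::string Message) {
    std::lock_guard<std::mutex> Guard(QueueMutex);
    Entries.push(std::move(Message));
  }

  // Moves the oldest message into Out and returns true,
  // or returns false and leaves Out untouched if the log is empty.
  bool pop(std::string &Out) {
    std::lock_guard<std::mutex> Guard(QueueMutex);
    if (Entries.empty())
      return false;
    Out = std::move(Entries.front());
    Entries.pop();
    return true;
  }
};
```

Worker threads would call `push`, and a single consumer (for example the thread that owns the output stream) would drain the queue with `pop`, so that only one thread ever writes to the stream.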

================
Comment at: llvm/lib/Debuginfod/Debuginfod.cpp:315-319
+    // Wait for the number of concurrent jobs to go down
+    while (NumTasksRemaining > Concurrency) {
+      LLVM_DEBUG(dbgs() << NumTasksRemaining << " tasks remaining\n";);
+      std::this_thread::sleep_for(std::chrono::milliseconds(50));
+    }
----------------
mysterymath wrote:
> Is there an advantage for manually managing the concurrency here, over passing it as an argument to ThreadPool, `std::min`-ed with the hardware concurrency?
> From a cursory look at ThreadPool's API, each async call after the thread pool is full should just more-or-less push a std::function<void()> onto a vector.
As this is within a directory iterator loop, my concern was the case where the directory contains a large number of files. If we add tasks to the ThreadPool faster than they are completed, the memory used by that vector of `std::function<void()>`s is unbounded. So I thought it best to manage the progress through the loop more manually. What do you think?
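
As a point of comparison for the polling loop above, here is a rough sketch of the same back-pressure idea using a condition variable instead of `sleep_for`. It is standard C++ only and does not use LLVM's ThreadPool; `InFlightLimiter` and `runBounded` are hypothetical names. One thread per task is spawned purely to keep the sketch self-contained, so the vector of thread handles still grows with the task count; in real code the limiter would sit in front of a pool's submit call.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: blocks the producer while Limit tasks are in flight.
class InFlightLimiter {
  std::mutex M;
  std::condition_variable SlotFreed;
  std::size_t InFlight = 0;
  const std::size_t Limit;

public:
  explicit InFlightLimiter(std::size_t Limit) : Limit(Limit) {
    assert(Limit > 0 && "a zero limit would block forever");
  }

  // Waits until fewer than Limit tasks are in flight, then claims a slot.
  void acquire() {
    std::unique_lock<std::mutex> Lock(M);
    SlotFreed.wait(Lock, [this] { return InFlight < Limit; });
    ++InFlight;
  }

  // Frees a slot and wakes the producer if it is waiting.
  void release() {
    {
      std::lock_guard<std::mutex> Guard(M);
      --InFlight;
    }
    SlotFreed.notify_one();
  }
};

// Runs Task NumTasks times with at most Limit copies running at once.
// Returns the highest number of simultaneously running tasks observed.
std::size_t runBounded(std::size_t NumTasks, std::size_t Limit,
                       const std::function<void()> &Task) {
  InFlightLimiter Limiter(Limit);
  std::atomic<std::size_t> Running{0}, Peak{0};
  std::vector<std::thread> Workers;
  for (std::size_t I = 0; I < NumTasks; ++I) {
    Limiter.acquire(); // The producer stalls here instead of queueing work.
    Workers.emplace_back([&] {
      std::size_t Now = ++Running;
      std::size_t Seen = Peak.load();
      // Record a new peak unless another thread already recorded a higher one.
      while (Now > Seen && !Peak.compare_exchange_weak(Seen, Now)) {
      }
      Task();
      --Running;
      Limiter.release();
    });
  }
  for (std::thread &W : Workers)
    W.join();
  return Peak.load();
}
```

In the directory-iterator loop, the call to `acquire` would take the place of the `while` loop with `sleep_for`, and each task would call `release` when it finishes, so the iterator advances only as fast as tasks complete.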

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114845/new/

https://reviews.llvm.org/D114845