[Lldb-commits] [lldb] DynamicLoaderDarwin load images in parallel with preload (PR #110646)

Mon Oct 14 15:27:50 PDT 2024

================
@@ -642,26 +652,86 @@ ModuleSP DynamicLoaderDarwin::GetDYLDModule() {
 
 void DynamicLoaderDarwin::ClearDYLDModule() { m_dyld_module_wp.reset(); }
 
+template <typename InputIterator, typename ResultType>
+static std::vector<ResultType> parallel_map(
+    llvm::ThreadPoolInterface &threadPool, InputIterator first,
+    InputIterator last,
+    llvm::function_ref<ResultType(
+        const typename std::iterator_traits<InputIterator>::value_type &)>
+        transform) {
+  const auto size = std::distance(first, last);
+  std::vector<ResultType> results(size);
+  if (size > 0) {
+    llvm::ThreadPoolTaskGroup taskGroup(threadPool);
+    auto it = first;
+    for (ssize_t i = 0; i < size; ++i, ++it) {
+      taskGroup.async([&, i, it]() { results[i] = transform(*it); });
+    }
+    taskGroup.wait();
+  }
+  return results;
+}
----------------
DmT021 wrote:

We can but it may not be better. The implementations of map and parallel_map aren't important from the business logic perspective. The only important thing about PreloadModulesFromImageInfos is checking the setting. 
Also, the code will be a bit more complicated than shown in the snippet you provided:
- we are creating a task group and waiting for it even when we are going to load the modules sequentially.
- we are checking the setting on each iteration of the loop.

This is probably almost free (I'm not sure if that's the case for `task_group.wait()` though) but it's still a bit of an eyesore.

But a version of this without these two disadvantages will have a repeating for loop. Something like
```
std::vector<std::pair<DynamicLoaderDarwin::ImageInfo, ModuleSP>>
DynamicLoaderDarwin::PreloadModulesFromImageInfos(
    const ImageInfo::collection &image_infos) {
  const auto size = image_infos.size();
  std::vector<std::pair<DynamicLoaderDarwin::ImageInfo, ModuleSP>> images(
      size);
  auto lambda = [&](size_t i, ImageInfo::collection::const_iterator it) {
    const auto &image_info = *it;
    images[i] = std::make_pair(
        image_info, FindTargetModuleForImageInfo(image_info, true, nullptr));
  };
  auto it = image_infos.begin();
  bool is_parallel_load =
      DynamicLoaderDarwinProperties::GetGlobal().GetEnableParallelImageLoad();
  if (is_parallel_load) {
    if (size > 0) {
      llvm::ThreadPoolTaskGroup taskGroup(Debugger::GetThreadPool());
      for (size_t i = 0; i < size; ++i, ++it) {
        taskGroup.async(lambda, i, it);
      }
      taskGroup.wait();
    }
  } else {
    for (size_t i = 0; i < size; ++i, ++it) {
      lambda(i, it);
    }
  }
  return images;
}
```

So now we have a bigger room for error in future refactoring here and the important part (checking the setting) is less visible in the code as well.

Also, consider how would this code look like if had `std::transform` with `ExecutionPolicy` available. I think it'd look about the same as the current implementation with `map` and `parallel_map`.

https://github.com/llvm/llvm-project/pull/110646