[Mlir-commits] [mlir] 8ff4276 - [mlir] Use thread-pool's notion of thread count instead of requerying system.

Thu Dec 23 17:05:06 PST 2021

Author: Stella Laurenzo
Date: 2021-12-23T17:04:45-08:00
New Revision: 8ff42766d1effcee2e4ebb0016b02430263cb4f0

URL: https://github.com/llvm/llvm-project/commit/8ff42766d1effcee2e4ebb0016b02430263cb4f0
DIFF: https://github.com/llvm/llvm-project/commit/8ff42766d1effcee2e4ebb0016b02430263cb4f0.diff

LOG: [mlir] Use thread-pool's notion of thread count instead of requerying system.

The computed number of hardware threads can change over the life of the process based on affinity changes. Since we need a data structure that is at least as large as the maximum parallelism, it is important to use the value that was actually latched for the thread pool we will be dispatching work to.

Also adds an assert specifically for if it doesn't line up (I was getting a crash on an index into the vector).

Differential Revision: https://reviews.llvm.org/D116257

Added: 
    

Modified: 
    mlir/lib/Transforms/Inliner.cpp

Removed: 
    


################################################################################
diff  --git a/mlir/lib/Transforms/Inliner.cpp b/mlir/lib/Transforms/Inliner.cpp
index 32b12e7a3bf40..cbeb94980dcca 100644

--- a/mlir/lib/Transforms/Inliner.cpp
+++ b/mlir/lib/Transforms/Inliner.cpp
@@ -663,7 +663,7 @@ LogicalResult InlinerPass::optimizeSCC(CallGraph &cg, CGUseList &useList,
 
   // Optimize each of the nodes within the SCC in parallel.
   if (failed(optimizeSCCAsync(nodesToVisit, context)))
-      return failure();
+    return failure();
 
   // Recompute the uses held by each of the nodes.
   for (CallGraphNode *node : nodesToVisit)
@@ -674,11 +674,13 @@ LogicalResult InlinerPass::optimizeSCC(CallGraph &cg, CGUseList &useList,
 LogicalResult
 InlinerPass::optimizeSCCAsync(MutableArrayRef<CallGraphNode *> nodesToVisit,
                               MLIRContext *ctx) {
-  // Ensure that there are enough pipeline maps for the optimizer to run in
-  // parallel. Note: The number of pass managers here needs to remain constant
+  // We must maintain a fixed pool of pass managers which is at least as large
+  // as the maximum parallelism of the failableParallelForEach below.
+  // Note: The number of pass managers here needs to remain constant
   // to prevent issues with pass instrumentations that rely on having the same
   // pass manager for the main thread.
-  size_t numThreads = llvm::hardware_concurrency().compute_thread_count();
+  llvm::ThreadPool &threadPool = ctx->getThreadPool();
+  size_t numThreads = threadPool.getThreadCount();
   if (opPipelines.size() < numThreads) {
     // Reserve before resizing so that we can use a reference to the first
     // element.
@@ -700,6 +702,8 @@ InlinerPass::optimizeSCCAsync(MutableArrayRef<CallGraphNode *> nodesToVisit,
       bool expectedInactive = false;
       return isActive.compare_exchange_strong(expectedInactive, true);
     });
+    assert(it != activePMs.end() &&
+           "could not find inactive pass manager for thread");
     unsigned pmIndex = it - activePMs.begin();
 
     // Optimize this callable node.