[PATCH] D101699: [Support/Parallel] Add a special case for 0/1 items to llvm::parallel_for_each.
Chris Lattner via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat May 1 14:07:34 PDT 2021
lattner created this revision.
Herald added subscribers: dexonsmith, rriddle.
lattner requested review of this revision.
Herald added subscribers: llvm-commits, stephenneuendorffer.
Herald added a project: LLVM.
This avoids the non-trivial overhead of creating a TaskGroup in these degenerate
cases, and it also exposes more parallelism. The default executor underlying
TaskGroup prevents recursive parallelism: while one task group is alive, any
nested task groups run serially.
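The serialization described above can be modeled with a small sketch. This is
a hypothetical stand-in, not LLVM's TaskGroup: the names GroupSketch,
LiveGroups, nestedGroupIsParallel and topLevelGroupIsParallel are invented for
illustration, and the rule "only the first live group runs in parallel" is an
assumption based on the description in this message.

```cpp
#include <atomic>

// Hypothetical model, not LLVM's TaskGroup: a group may run its tasks in
// parallel only if no other group was alive when it was created.
static std::atomic<int> LiveGroups{0};

struct GroupSketch {
  bool Parallel; // true only for the outermost live group
  GroupSketch() : Parallel(LiveGroups.fetch_add(1) == 0) {}
  ~GroupSketch() { LiveGroups.fetch_sub(1); }
};

// Reports whether a group created inside a live outer group may run in
// parallel. Under the assumed rule it may not.
inline bool nestedGroupIsParallel() {
  GroupSketch Outer;
  GroupSketch Inner;
  return Inner.Parallel;
}

// Reports whether a group created with no other group alive may run in
// parallel.
inline bool topLevelGroupIsParallel() {
  GroupSketch Only;
  return Only.Parallel;
}
```

In this model the outer group costs the inner one its parallelism even when
the outer group holds only a single task.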
This is a big issue for some MLIR dialects that have a single instance of an
outer op (e.g. a firrtl.circuit) containing many ops that could be processed in
parallel (e.g. firrtl.module ops). This patch side-steps the problem by not
creating the TaskGroup when it isn't needed. See this issue for more details:
https://github.com/llvm/circt/issues/993
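As a rough illustration of the single-outer-op case, the sketch below runs a
nested loop with and without the 0/1 fast path and counts how many loops end up
serialized. Everything here is a hypothetical model rather than the LLVM
implementation: GroupSketch, forEachSketch, countSerializedLoops and the
UseFastPath switch are invented names, the loops run sequentially, and a loop
counts as "serialized" when its group was created while another group was
alive.

```cpp
#include <atomic>
#include <iterator>
#include <vector>

// Hypothetical model of a task group: only the first live group is parallel.
static std::atomic<int> LiveGroups{0};
static int SerializedLoops = 0;

struct GroupSketch {
  bool Parallel;
  GroupSketch() : Parallel(LiveGroups.fetch_add(1) == 0) {}
  ~GroupSketch() { LiveGroups.fetch_sub(1); }
};

// Stand-in for a parallel for-each. With UseFastPath set, zero or one items
// are handled inline without creating a group, mirroring the patch.
template <class IterTy, class FuncTy>
void forEachSketch(IterTy Begin, IterTy End, FuncTy Fn, bool UseFastPath) {
  auto NumItems = std::distance(Begin, End);
  if (UseFastPath && NumItems <= 1) {
    if (NumItems)
      Fn(*Begin);
    return;
  }
  GroupSketch Group;
  if (!Group.Parallel)
    ++SerializedLoops; // this loop would have been forced to run serially
  for (; Begin != End; ++Begin)
    Fn(*Begin);
}

// One outer "circuit" holding four "modules", as in the firrtl example.
inline int countSerializedLoops(bool UseFastPath) {
  SerializedLoops = 0;
  std::vector<std::vector<int>> Circuits = {{1, 2, 3, 4}};
  forEachSketch(
      Circuits.begin(), Circuits.end(),
      [&](std::vector<int> &Modules) {
        forEachSketch(Modules.begin(), Modules.end(), [](int &) {},
                      UseFastPath);
      },
      UseFastPath);
  return SerializedLoops;
}
```

Without the fast path the inner loop over modules is serialized by the
single-entry outer group. With it, the outer loop creates no group and the
inner loop is the first one alive.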
Note that this is not a great solution for the general case of nested
parallelism. A redesign of TaskGroup would be better, but it would be a much
more invasive change.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D101699
Files:
llvm/include/llvm/Support/Parallel.h
Index: llvm/include/llvm/Support/Parallel.h
===================================================================
--- llvm/include/llvm/Support/Parallel.h
+++ llvm/include/llvm/Support/Parallel.h
@@ -129,9 +129,20 @@
 template <class IterTy, class FuncTy>
 void parallel_for_each(IterTy Begin, IterTy End, FuncTy Fn) {
+  // If we have zero or one items, then do not incur the overhead of spinning up
+  // a task group. They are surprisingly expensive, and because they do not
+  // support nested parallelism, a single entry task group can block parallel
+  // execution underneath them.
+  auto NumItems = std::distance(Begin, End);
+  if (NumItems <= 1) {
+    if (NumItems)
+      Fn(*Begin);
+    return;
+  }
+
   // Limit the number of tasks to MaxTasksPerGroup to limit job scheduling
   // overhead on large inputs.
-  ptrdiff_t TaskSize = std::distance(Begin, End) / MaxTasksPerGroup;
+  ptrdiff_t TaskSize = NumItems / MaxTasksPerGroup;
   if (TaskSize == 0)
     TaskSize = 1;
@@ -145,9 +156,20 @@
 template <class IndexTy, class FuncTy>
 void parallel_for_each_n(IndexTy Begin, IndexTy End, FuncTy Fn) {
+  // If we have zero or one items, then do not incur the overhead of spinning up
+  // a task group. They are surprisingly expensive, and because they do not
+  // support nested parallelism, a single entry task group can block parallel
+  // execution underneath them.
+  auto NumItems = End - Begin;
+  if (NumItems <= 1) {
+    if (NumItems)
+      Fn(Begin);
+    return;
+  }
+
   // Limit the number of tasks to MaxTasksPerGroup to limit job scheduling
   // overhead on large inputs.
-  ptrdiff_t TaskSize = (End - Begin) / MaxTasksPerGroup;
+  ptrdiff_t TaskSize = NumItems / MaxTasksPerGroup;
   if (TaskSize == 0)
     TaskSize = 1;