[lld] r287042 - Use one task per iteration in parallel_for_loop.

Rafael Espíndola via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 16 05:47:07 PST 2016


> If you don't want to revert it, how about this.
>
> The problem in the original code is the task size is fixed to 1024. We can
> make it adaptive to the size of the input, so that we will always have
> reasonable number of tasks.

So, there is quite a bit of work to be done if we get serious about
threads. We have to investigate why the pool executor has such a high
overhead and figure out the right way to split the work so that the
threads do not step on each other. We should very likely also
create one thread per physical core, not one per SMT thread.

As an example of a possible improvement: when writing the output file,
it is probably profitable to make write do nothing but write, and to
partition the input sections across the whole file so that each thread
can allocate local memory, relocate there, and then write its work to
the correct output offset with a single contiguous write call.
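
To make that idea concrete, here is a rough sketch, not lld code. Piece
and writePartitioned are hypothetical names, plain std::thread stands in
for whatever executor would really be used, and the pieces are assumed to
be sorted by output offset and non-overlapping. Each worker stages its
contiguous range in a local buffer and then copies it out once.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical stand-in for an input section: its bytes plus the offset
// at which they belong in the output buffer.
struct Piece {
  std::vector<uint8_t> Data;
  std::size_t OutOffset;
};

// Assumption: Pieces is sorted by OutOffset and the pieces do not overlap,
// so each worker owns one contiguous byte range of Out.
void writePartitioned(const std::vector<Piece> &Pieces, uint8_t *Out,
                      std::size_t NumThreads) {
  if (Pieces.empty())
    return;
  NumThreads = std::max<std::size_t>(1, std::min(NumThreads, Pieces.size()));
  std::size_t PerThread = (Pieces.size() + NumThreads - 1) / NumThreads;

  std::vector<std::thread> Workers;
  for (std::size_t Begin = 0; Begin < Pieces.size(); Begin += PerThread) {
    std::size_t End = std::min(Begin + PerThread, Pieces.size());
    Workers.emplace_back([&Pieces, Out, Begin, End] {
      std::size_t Start = Pieces[Begin].OutOffset;
      std::size_t Stop =
          Pieces[End - 1].OutOffset + Pieces[End - 1].Data.size();
      // Thread-local staging buffer; gaps between pieces stay zero here.
      std::vector<uint8_t> Local(Stop - Start);
      for (std::size_t I = Begin; I < End; ++I)
        std::copy(Pieces[I].Data.begin(), Pieces[I].Data.end(),
                  Local.begin() + (Pieces[I].OutOffset - Start));
      // Relocations would be applied to Local at this point.
      if (!Local.empty())
        std::memcpy(Out + Start, Local.data(), Local.size()); // single write
    });
  }
  for (std::thread &W : Workers)
    W.join();
}
```

Note that the sketch zero-fills any padding inside a worker's range, so
a real version would have to apply the filler pattern to the local
buffer as well.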

So in general finding the correct granularity is something that I
think should be explicitly done in the caller.
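
For illustration only, a caller could derive the chunk size from the
input size and the thread count instead of relying on a fixed 1024, in
the spirit of the adaptive suggestion quoted above. adaptiveTaskSize and
the default of four tasks per thread are assumptions of this sketch, not
anything taken from lld.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: returns how many items each task should process so
// that about NumThreads * TasksPerThread tasks are created in total.
std::size_t adaptiveTaskSize(std::size_t NumItems, std::size_t NumThreads,
                             std::size_t TasksPerThread = 4) {
  std::size_t WantedTasks =
      std::max<std::size_t>(1, NumThreads * TasksPerThread);
  // Round up so the number of chunks never exceeds WantedTasks, and never
  // return 0 so the caller's loop always makes progress.
  return std::max<std::size_t>(1, (NumItems + WantedTasks - 1) / WantedTasks);
}
```

With 100000 items and 8 threads this gives 3125 items per task, that is
32 tasks; small inputs fall back to one item per task.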

Given that we still have a lot of work before threading becomes a
priority, how about the attached compromise? It just writes each
output section in parallel. In my testcase it brings the linker back to
the previous performance when not using --build-id.

Cheers,
Rafael
-------------- next part --------------
diff --git a/ELF/OutputSections.cpp b/ELF/OutputSections.cpp
index e2c0afb..675b320 100644
--- a/ELF/OutputSections.cpp
+++ b/ELF/OutputSections.cpp
@@ -18,7 +18,6 @@
 #include "SymbolTable.h"
 #include "SyntheticSections.h"
 #include "Target.h"
-#include "lld/Core/Parallel.h"
 #include "llvm/Support/Dwarf.h"
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/MathExtras.h"
@@ -610,13 +609,8 @@ template <class ELFT> void OutputSection<ELFT>::writeTo(uint8_t *Buf) {
   ArrayRef<uint8_t> Filler = Script<ELFT>::X->getFiller(this->Name);
   if (!Filler.empty())
     fill(Buf, this->Size, Filler);
-  if (Config->Threads) {
-    parallel_for_each(Sections.begin(), Sections.end(),
-                      [=](InputSection<ELFT> *C) { C->writeTo(Buf); });
-  } else {
-    for (InputSection<ELFT> *C : Sections)
-      C->writeTo(Buf);
-  }
+  for (InputSection<ELFT> *C : Sections)
+    C->writeTo(Buf);
   // Linker scripts may have BYTE()-family commands with which you
   // can write arbitrary bytes to the output. Process them if any.
   Script<ELFT>::X->writeDataBytes(this->Name, Buf);
diff --git a/ELF/Writer.cpp b/ELF/Writer.cpp
index 9ab4cda..8762ca7 100644
--- a/ELF/Writer.cpp
+++ b/ELF/Writer.cpp
@@ -17,6 +17,7 @@
 #include "SymbolTable.h"
 #include "SyntheticSections.h"
 #include "Target.h"
+#include "lld/Core/Parallel.h"
 
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringSwitch.h"
@@ -1545,9 +1546,18 @@ template <class ELFT> void Writer<ELFT>::writeSections() {
     Out<ELFT>::Opd->writeTo(Buf + Out<ELFT>::Opd->Offset);
   }
 
-  for (OutputSectionBase *Sec : OutputSections)
-    if (Sec != Out<ELFT>::Opd && Sec != Out<ELFT>::EhFrameHdr)
-      Sec->writeTo(Buf + Sec->Offset);
+  if (Config->Threads) {
+    parallel_for_each(OutputSections.begin(), OutputSections.end(),
+                      [=](OutputSectionBase *Sec) {
+                        if (Sec != Out<ELFT>::Opd &&
+                            Sec != Out<ELFT>::EhFrameHdr)
+                          Sec->writeTo(Buf + Sec->Offset);
+                      });
+  } else {
+    for (OutputSectionBase *Sec : OutputSections)
+      if (Sec != Out<ELFT>::Opd && Sec != Out<ELFT>::EhFrameHdr)
+        Sec->writeTo(Buf + Sec->Offset);
+  }
 
   OutputSectionBase *ARMExidx = findSection(".ARM.exidx");
   if (!Config->Relocatable)

