[lld] r303797 - Improve parallelism of ICF.

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Wed May 24 12:22:35 PDT 2017


Author: ruiu
Date: Wed May 24 14:22:34 2017
New Revision: 303797

URL: http://llvm.org/viewvc/llvm-project?rev=303797&view=rev
Log:
Improve parallelism of ICF.

This is the only place we use threads for ICF. The intention of this code
was to split an input vector into 256 shards and process them in parallel.
What the code was actually doing was to split an input into 257 shards,
process the first 256 shards in parallel, and the remaining one in serial.

That means this code takes ceil(256/n)+1 steps instead of ceil(256/n),
where n is the number of available CPU cores. As n grows, the former
converges to 2 while the latter converges to 1.
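
The arithmetic above can be checked with a small sketch. It assumes the
simplified model used in this message, where every shard (including the
trailing serial one) costs one step and n shards run at once. The names
ceilDiv, stepsBefore and stepsAfter are hypothetical and do not appear
in lld.

```cpp
#include <cstddef>

// Integer ceiling division: smallest k such that k * b >= a (b must be > 0).
constexpr std::size_t ceilDiv(std::size_t a, std::size_t b) {
  return (a + b - 1) / b;
}

// Old behavior under the simplified model: 256 shards run in parallel
// on `cores` cores, then one extra shard runs serially.
constexpr std::size_t stepsBefore(std::size_t cores) {
  return ceilDiv(256, cores) + 1;
}

// New behavior: the trailing elements belong to the last parallel shard,
// so there is no extra serial step.
constexpr std::size_t stepsAfter(std::size_t cores) {
  return ceilDiv(256, cores);
}

// With at least 256 cores the counts reach the limits quoted above.
static_assert(stepsBefore(256) == 2, "old code needs two steps");
static_assert(stepsAfter(256) == 1, "new code needs one step");
```

On a 4-core machine the difference is small (65 steps versus 64); it
only becomes a factor of two as the core count approaches 256.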

This patch fixes the above issue by making the last parallel shard extend
to the end of the input.
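
As a rough illustration of the new boundary computation, the sketch
below lists the [begin, end) range each shard would cover. It mirrors
the End expression in the diff, but shardRanges is a hypothetical
helper written for this example, not a function in lld, and it only
computes ranges rather than running anything in parallel.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open [begin, end) range of every shard for an input of
// `size` elements. `numShards` must be greater than zero.
std::vector<std::pair<std::size_t, std::size_t>>
shardRanges(std::size_t size, std::size_t numShards = 256) {
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  ranges.reserve(numShards);
  std::size_t step = size / numShards;
  for (std::size_t i = 0; i < numShards; ++i) {
    // The last shard absorbs the remainder left by the integer division,
    // so no separate serial pass over the tail is needed.
    std::size_t end = (i == numShards - 1) ? size : (i + 1) * step;
    ranges.emplace_back(i * step, end);
  }
  return ranges;
}
```

For an input of 1000 elements, Step is 3, so the first 255 shards cover
3 elements each and the last shard covers [765, 1000).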

Modified:
    lld/trunk/COFF/ICF.cpp
    lld/trunk/ELF/ICF.cpp

Modified: lld/trunk/COFF/ICF.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/COFF/ICF.cpp?rev=303797&r1=303796&r2=303797&view=diff
==============================================================================
--- lld/trunk/COFF/ICF.cpp (original)
+++ lld/trunk/COFF/ICF.cpp Wed May 24 14:22:34 2017
@@ -193,9 +193,9 @@ void ICF::forEachClass(std::function<voi
   size_t NumShards = 256;
   size_t Step = Chunks.size() / NumShards;
   for_each_n(parallel::par, size_t(0), NumShards, [&](size_t I) {
-    forEachClassRange(I * Step, (I + 1) * Step, Fn);
+    size_t End = (I == NumShards - 1) ? Chunks.size() : (I + 1) * Step;
+    forEachClassRange(I * Step, End, Fn);
   });
-  forEachClassRange(Step * NumShards, Chunks.size(), Fn);
 }
 
 // Merge identical COMDAT sections.

Modified: lld/trunk/ELF/ICF.cpp
URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/ICF.cpp?rev=303797&r1=303796&r2=303797&view=diff
==============================================================================
--- lld/trunk/ELF/ICF.cpp (original)
+++ lld/trunk/ELF/ICF.cpp Wed May 24 14:22:34 2017
@@ -326,9 +326,9 @@ void ICF<ELFT>::forEachClass(std::functi
   size_t NumShards = 256;
   size_t Step = Sections.size() / NumShards;
   parallelForEachN(0, NumShards, [&](size_t I) {
-    forEachClassRange(I * Step, (I + 1) * Step, Fn);
+    size_t End = (I == NumShards - 1) ? Sections.size() : (I + 1) * Step;
+    forEachClassRange(I * Step, End, Fn);
   });
-  forEachClassRange(Step * NumShards, Sections.size(), Fn);
   ++Cnt;
 }
 



