<div dir="ltr">Seems no real difference even on my beefy 20-core 40-thread machine, probably because 256/n is still large today.</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 24, 2017 at 12:25 PM, Davide Italiano <span dir="ltr"><<a href="mailto:davide@freebsd.org" target="_blank">davide@freebsd.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Wed, May 24, 2017 at 12:22 PM, Rui Ueyama via llvm-commits<br>

<<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<br>

> Author: ruiu<br>

> Date: Wed May 24 14:22:34 2017<br>

> New Revision: 303797<br>

><br>

> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=303797&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-<wbr>project?rev=303797&view=rev</a><br>

> Log:<br>

> Improve parallelism of ICF.<br>

><br>

> This is the only place we use threads for ICF. The intention of this code<br>

> was to split an input vector into 256 shards and process them in parallel.<br>

> What the code was actually doing was to split an input into 257 shards,<br>

> process the first 256 shards in parallel, and the remaining one in serial.<br>

><br>

> That means this code takes ceil(256/n)+1 steps instead of ceil(256/n), where n<br>

> is the number of available CPU cores. As n grows, the former converges to 2 while<br>

> the latter converges to 1.<br>

><br>

> This patch fixes the above issue.<br>

><br>

<br>

</span>Nice. Any impact on performance?<br>

<br>

Thanks!<br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Davide<br>

<br>

"There are no solved problems; there are only problems that are more<br>

or less solved" -- Henri Poincare<br>

</font></span></blockquote></div><br></div>
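For anyone curious why the difference is hard to measure, here is a rough sketch of the arithmetic in the commit message. It is not the LLD code. It assumes every shard, including the leftover serial one, costs one unit of time, which probably overstates the serial tail. The function names are made up for illustration.

```python
import math

NUM_SHARDS = 256  # shard count named in the commit message


def steps_before(n: int) -> int:
    """Steps with the old behaviour: 256 parallel shards plus one serial shard."""
    return math.ceil(NUM_SHARDS / n) + 1


def steps_after(n: int) -> int:
    """Steps once the leftover work is folded into the 256 parallel shards."""
    return math.ceil(NUM_SHARDS / n)


def relative_saving(n: int) -> float:
    """Fraction of steps saved by the fix under the one-unit-per-shard assumption."""
    return 1 - steps_after(n) / steps_before(n)


if __name__ == "__main__":
    # n = 256 is the limiting case where the old code takes 2 steps and the new one 1.
    for n in (4, 20, 40, 256):
        print(n, steps_before(n), steps_after(n), round(relative_saving(n), 3))
```

With n = 20 this gives 14 steps versus 13, and with n = 40 it gives 8 versus 7, so the extra serial step is a small share of the total until n gets close to 256.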