<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Sep 18, 2016 at 4:12 AM, Carsten Mattner <span dir="ltr"><<a href="mailto:carstenmattner@gmail.com" target="_blank">carstenmattner@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Sun, Sep 18, 2016 at 5:45 AM, Xinliang David Li <<a href="mailto:xinliangli@gmail.com">xinliangli@gmail.com</a>> wrote:<br>

> As Mehdi mentioned, ThinLTO backend processes use very little memory, so you<br>

> may get away without any additional flags (neither -Wl,--plugin-opt=jobs=..,<br>

> nor -Dxxx for cmake to limit link parallelism) if your build machine has<br>

> enough memory. Here is some build-time data for linking 52 binaries in<br>

> parallel (with ThinLTO) in a clang build (link parallelism equals ninja<br>

> parallelism). The machine has 32 logical cores and 64GB memory.<br>

><br>

> 1) Using the default ninja parallelism, the peak 1min load-average is 537.<br>

> The total elapsed time is 9m43s<br>

> 2) Using ninja -j16, the peak load is 411. The elapsed time is 8m26s<br>

> 3) ninja -j8 : elapsed time is 8m34s<br>

> 4) ninja -j4 : elapsed time is 8m50s<br>

> 5) ninja -j2 : elapsed time is 9m54s<br>

> 6) ninja -j1 : elapsed time is 12m3s<br>

><br>

> As you can see, serial ThinLTO linking across multiple binaries does not<br>

> give you the best performance. The build performance peaked at j16 in this<br>

> configuration. You may need to find your best LLVM_PARALLEL_LINK_JOBS<br>

> value.<br>

<br>

</span>What did you set LLVM_PARALLEL_LINK_JOBS to?<br>

Maybe I should first try to leave it unset and see if it fits within<br>

my machine's<br>

hardware limits.<br></blockquote><div><br></div><div>It was left unset in the experiments.</div><div><br></div><div>David </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class=""><br>

> Having said that, there is definitely room for ThinLTO usability<br>

> improvement, so that the ThinLTO parallel backend coordinates well with the<br>

> build system's parallelism and the user does not need to figure out the<br>

> sweet spot.<br>

<br>

</span>Definitely. If parallelism can be controlled on multiple layers, an<br>

outer layer's<br>

setting ought to influence it in a reasonable way to make it more intuitive<br>

to use.<br>

</blockquote></div><br></div></div>
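<div>For anyone repeating the experiment on their own machine, the timings quoted above can be compared with a few lines of Python. This is only a rough sketch: the durations are copied from the thread, while the names TIMINGS, to_seconds and rank are made up for illustration and are not part of LLVM, ninja or cmake.</div>
<pre>
```python
import re

# Elapsed times quoted in the thread (32 logical cores, 64GB memory).
# "default" is ninja's default parallelism; the other keys are ninja -jN.
# The dictionary name and layout are illustrative assumptions.
TIMINGS = {
    "default": "9m43s",
    "j16": "8m26s",
    "j8": "8m34s",
    "j4": "8m50s",
    "j2": "9m54s",
    "j1": "12m3s",
}


def to_seconds(text):
    """Convert a duration such as '8m26s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError("unrecognized duration: " + text)
    return int(match.group(1)) * 60 + int(match.group(2))


def rank(timings):
    """Return (label, seconds) pairs sorted fastest first."""
    pairs = [(label, to_seconds(value)) for label, value in timings.items()]
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    baseline = to_seconds(TIMINGS["j1"])
    for label, seconds in rank(TIMINGS):
        # Speedup is relative to the fully serial ninja -j1 run.
        print(f"{label:8s} {seconds:4d}s  {baseline / seconds:.2f}x vs j1")
```
</pre>
<div>Swapping in your own measurements should show where the sweet spot sits for your core count and memory; the ranking here only reflects the single 32-core configuration reported in the thread.</div>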