<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Nov 13, 2013 at 7:40 PM, Rick Foos <span dir="ltr"><<a href="mailto:rfoos@codeaurora.org" target="_blank">rfoos@codeaurora.org</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div class="im">

    <div>On 11/13/2013 06:19 PM, Sean Silva

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra"><br>

          <br>

          <div class="gmail_quote">On Wed, Nov 13, 2013 at 2:41 PM, Rick

            Foos <span dir="ltr"><<a href="mailto:rfoos@codeaurora.org" target="_blank">rfoos@codeaurora.org</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div>Sorry for the delay, <br>

                  <br>

                  Our problem with running the sanitizers is that the
                  load average running under Ninja reached 146, and a
                  short time later the system crashed, which required calling
                  someone to power cycle the box...<br>

                </div>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>I'm curious what is causing so much load. All our tests
              are mostly single-threaded, so if only #cores jobs are
              spawned (or #cores + 2, which is what ninja uses when
              #cores > 2), there should only be #cores + 2 jobs
              running simultaneously (certainly not 146/32 ~4.5 per core). Is lit
              spawning too many jobs?</div>

            <div><br>

            </div>

          </div>

        </div>

      </div>

    </blockquote></div>

    A bare ninja command in the test step, so no -j or -l control.<div class="im"><br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>Does the machine have enough RAM?</div>

            <div><br>

            </div>

          </div>

        </div>

      </div>

    </blockquote></div>

    24G RAM. 40Mb L2<div class="im"><br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div> <br>

                  The address sanitizer by itself leaves a load average
                  of 40. This means the OS is over 100% utilization and is
                  thrashing a bit. Load average doesn't say what exactly
                  is thrashing.<br>

                  <br>

                  Ninja supports make's -j and -l options. The -l option
                  (maximum load average) is the key. <br>

                  <br>

                  The load average should be less than the total number

                  of cores (hyperthreads too) before Ninja launches

                  another task. <br>

                  <br>

                  A Load Average at or lower than 100%  technically

                  should benefit performance, and maximize throughput.

                  However, I will be happy if I don't have to call

                  someone to power cycle the server :)<br>

                </div>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>I don't think that's quite how it works. As long as you
              have enough RAM, the only performance loss from having a
              bunch of jobs waiting is context-switching overhead. That
              can be minimized either by lowering the preempt timer
              rate (called HZ in Linux; 100, which is common for
              servers doing batch jobs, dilutes the overhead to basically
              nothing) or, on a recent kernel, by arranging things
              to run tickless, in which case there is
              essentially no overhead. If load is less than #cores, then
              you don't have a job running on every core, which means
              that some cores are idle and you are losing
              performance. The other killer is jobs blocking on disk IO
              *with no other jobs to be scheduled in the meantime*;
              generally you have to keep load above 100% to avoid that
              problem.</div>

            <div><br>

            </div>

            <div>-- Sean Silva<br>

            </div>

          </div>

        </div>

      </div>

    </blockquote></div>

    ninja --help<br>

    usage: ninja [options] [targets...]<br>

    ...<br>

      -j N     run N jobs in parallel [default=10]<br>

      -l N     do not start new jobs if the load average is greater than

    N<br>

    <br>

    As far as what load average means:<br>

    <a href="http://serverfault.com/questions/251947/what-does-load-average-mean" target="_blank">http://serverfault.com/questions/251947/what-does-load-average-mean</a><br>

<a href="http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages" target="_blank">http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages</a><br>

    <br>

    Everything seems to say 100% load is when Loadaverage = number of

    Processors.<br></div></blockquote><div><br></div><div>The term "load" here is only vaguely related to the colloquial meaning, so "100% load" should not be understood as "perfect" or "maximum". It's literally just the time-averaged number of jobs running or waiting to run. The bridge analogy in the second link is fairly accurate. Notice that even if you are at >100% load, the bridge is still being used at full capacity (as many cars as possible are crossing the bridge simultaneously). If load is >100%, then that might impact the *latency* for getting to a particular job (in the analogy: how long it takes for a particular car to get across the bridge *including the waiting time in the queue*), but for a batch operation like running tests, where only total throughput counts, that doesn't matter.</div>
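<div><br></div><div>To put rough numbers on that, here is a small Python sketch of the arithmetic. The function names are my own (hypothetical, not part of ninja or lit), and 146 and 32 are just the figures from this thread. It treats the load average as a count of runnable jobs, which is a simplification: on Linux the figure also includes tasks blocked in uninterruptible IO.</div>
<pre>
```python
import os


def load_per_thread(load_average, hardware_threads):
    """Average number of runnable jobs per hardware thread."""
    return load_average / hardware_threads


def waiting_jobs(load_average, hardware_threads):
    """Approximate number of jobs queued behind the ones that are running.

    Assumes every job counted in the load average is runnable, so anything
    beyond one job per hardware thread is waiting for a CPU.
    """
    return max(0.0, load_average - hardware_threads)


if __name__ == "__main__":
    # Figures quoted in this thread: load average 146 on 32 hardware threads.
    print(load_per_thread(146, 32))  # 4.5625
    print(waiting_jobs(146, 32))     # 114

    # Same calculation for the machine this runs on (Unix only).
    one_minute, _, _ = os.getloadavg()
    threads = os.cpu_count() or 1
    print(load_per_thread(one_minute, threads), waiting_jobs(one_minute, threads))
```
</pre>
<div>Under that assumption, a load of 146 on 32 hardware threads means every thread has work and roughly 114 jobs are waiting their turn. That queue lengthens the wait for any single job, but it does not by itself reduce how many jobs finish per second.</div>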

<div>  </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">

    <br>

    ----<br>

    While the Ninja build step seemed OK, -j10 and all, the test section

    seemed to be the problem.<br>

    <br>

    Ninja continuously launched the address measurement tasks with no

    limits.<br></div></blockquote><div><br></div><div>What "address measurement"?</div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div bgcolor="#FFFFFF" text="#000000">

    <br>

    When combined with a thread sanitizer doing the same thing, the
    load average reached 146, followed by a crash.<br>

    <br>

     In my testing after -l is used, the load average is mostly below
    32. Some other builders are running on the same machine, and they are not yet
    controlled by load average. My guess is that when all builders are
    throttled by load average, it will be very close to 100% utilization
    when everything is running.<br>

    <br>

    Ninja for sure needs this control in the sanitizers. An experiment

    with Make is in order to prove the point.<div><div class="h5"><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

              <div bgcolor="#FFFFFF" text="#000000">

                <div> <br>

                  So the maximum load average of a 16 core machine with

                  hyperthreads is 32 (keeping it simple). This needs to

                  be passed to all make and Ninja build steps on that

                  slave to maximize throughput.<br>

                  <br>

                  For now, I'm looking at a minimal patch to include

                  jobs and a new loadaverage variable for the

                  sanitizers. <br>

                  <br>

                  Longer term, all buildslaves should define a maximum
                  load average, and all make/ninja steps should pass the -j
                  and -l options.<br>

                  <br>

                  Best Regards,<br>

                  Rick

                  <div>

                    <div><br>

                      <br>

                      On 11/13/2013 11:21 AM, Sergey Matveev wrote:<br>

                    </div>

                  </div>

                </div>

                <div>

                  <div>

                    <blockquote type="cite">

                      <div dir="ltr">+kcc</div>

                      <div class="gmail_extra"><br>

                        <br>

                        <div class="gmail_quote">On Wed, Nov 13, 2013 at

                          6:41 AM, Shankar Easwaran <span dir="ltr"><<a href="mailto:shankare@codeaurora.org" target="_blank">shankare@codeaurora.org</a>></span>

                          wrote:<br>

                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Sorry

                            for another indirection. Rick Foos is

                            working on it. I think there is some good

                            news here :)<br>

                            <br>

                            Cced Rick + adding Galina,Dmitri.<br>

                            <br>

                            Thanks<br>

                            <br>

                            Shankar Easwaran

                            <div>

                              <div><br>

                                <br>

                                On 11/12/2013 8:37 PM, Rui Ueyama wrote:<br>

                                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                  Shankar tried to set it up recently.<br>

                                  <br>

                                  <br>

                                  On Tue, Nov 12, 2013 at 6:31 PM, Sean

                                  Silva <<a href="mailto:silvas@purdue.edu" target="_blank">silvas@purdue.edu</a>>

                                  wrote:<br>

                                  <br>

                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                    Sanitizers?<br>

                                    <br>

                                    There have been a couple of these

                                    sorts of bugs recently... we really<br>

                                    ought to have some sanitizer bots...<br>

                                    <br>

                                    -- Sean Silva<br>

                                    <br>

                                    <br>

                                    On Tue, Nov 12, 2013 at 9:21 PM, Rui

                                    Ueyama <<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>>

                                    wrote:<br>

                                    <br>

                                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                                      Author: ruiu<br>

                                      Date: Tue Nov 12 20:21:51 2013<br>

                                      New Revision: 194545<br>

                                      <br>

                                      URL: <a href="http://llvm.org/viewvc/llvm-project?rev=194545&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=194545&view=rev</a><br>

                                      Log:<br>

                                      [PECOFF] Fix use-after-return.<br>

                                      <br>

                                      Modified:<br>

                                       lld/trunk/lib/Driver/WinLinkDriver.cpp<br>

                                      <br>

                                      Modified:

                                      lld/trunk/lib/Driver/WinLinkDriver.cpp<br>

                                      URL:<br>

                                      <a href="http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff</a><br>

                                      <br>

==============================================================================<br>

                                      ---

                                      lld/trunk/lib/Driver/WinLinkDriver.cpp

                                      (original)<br>

                                      +++

                                      lld/trunk/lib/Driver/WinLinkDriver.cpp

                                      Tue Nov 12 20:21:51 2013<br>

                                      @@ -842,7 +842,7 @@

                                      WinLinkDriver::parse(int argc,

                                      const cha<br>

                                      <br>

                                            case OPT_INPUT:<br>

                                      inputElements.push_back(std::unique_ptr<InputElement>(<br>

                                      -          new PECOFFFileNode(ctx,

                                      inputArg->getValue())));<br>

                                      +          new PECOFFFileNode(ctx,<br>

ctx.allocateString(inputArg->getValue()))));<br>

                                              break;<br>

                                      <br>

                                        #define

                                      DEFINE_BOOLEAN_FLAG(name, setter)

                                            \<br>

                                      @@ -892,9 +892,11 @@

                                      WinLinkDriver::parse(int argc,

                                      const cha<br>

                                          // start with a hypen or a

                                      slash. This is not compatible with

                                      link.exe<br>

                                          // but useful for us to test

                                      lld on Unix.<br>

                                          if (llvm::opt::Arg *dashdash =

                                      parsedArgs->getLastArg(OPT_DASH_DASH))

                                      {<br>

                                      -    for (const StringRef value :

                                      dashdash->getValues())<br>

                                      -      inputElements.push_back(<br>

                                      -        

                                       std::unique_ptr<InputElement>(new

                                      PECOFFFileNode(ctx, value)));<br>

                                      +    for (const StringRef value :

                                      dashdash->getValues()) {<br>

                                      +    

                                       std::unique_ptr<InputElement>

                                      elem(<br>

                                      +          new PECOFFFileNode(ctx,

                                      ctx.allocateString(value)));<br>

                                      +    

                                       inputElements.push_back(std::move(elem));<br>

                                      +    }<br>

                                          }<br>

                                      <br>

                                          // Add the libraries specified

                                      by /defaultlib unless they are

                                      already<br>

                                      added<br>

                                      <br>

                                      <br>

_______________________________________________<br>

                                      llvm-commits mailing list<br>

                                      <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

                                      <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

                                      <br>

                                    </blockquote>

                                    <br>

                                  </blockquote>

                                </blockquote>

                                <br>

                                <br>

                              </div>

                            </div>

                            <span><font color="#888888"> -- <br>

                                Qualcomm Innovation Center, Inc. is a

                                member of Code Aurora Forum, hosted by

                                the Linux Foundation</font></span>

                            <div>

                              <div><br>

                                <br>


                              </div>

                            </div>

                          </blockquote>

                        </div>

                        <br>

                      </div>

                      <br>


                    </blockquote>

                    <br>

                    <br>

                  </div>

                </div>

                <span><font color="#888888">

                    <pre cols="72">-- 

Rick Foos

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation</pre>

                  </font></span></div>

              <br>


              <br>

            </blockquote>

          </div>

          <br>

        </div>

      </div>

    </blockquote>

    <br>

    <br>

    <pre cols="72">-- 

Rick Foos

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation</pre>

  </div></div></div>

</blockquote></div><br></div></div>