<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 11/13/2013 06:19 PM, Sean Silva
wrote:<br>
</div>
<blockquote
cite="mid:CAHnXoa=R0kN6=WhMEM2HO_cTNXR0wOtwHcQW=bj0c3jJ7pXSwQ@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Nov 13, 2013 at 2:41 PM, Rick
Foos <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:rfoos@codeaurora.org" target="_blank">rfoos@codeaurora.org</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Sorry for the delay, <br>
<br>
Our problem with running the sanitizers is that the
load average under Ninja reached 146, and shortly
afterwards the system crashed, which required calling
someone to power cycle the box...<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I'm curious what is causing so much load? All our tests
are mostly single-threaded, so if only #cores jobs are
spawned (or #cores + 2 which is what ninja uses when
#cores > 2), there should only be #cores + 2 jobs
running simultaneously (certainly not 146/32 ~4.5). Is lit
spawning too many jobs?</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
The test step runs a bare ninja command, so there is no -j or -l control.<br>
<blockquote
cite="mid:CAHnXoa=R0kN6=WhMEM2HO_cTNXR0wOtwHcQW=bj0c3jJ7pXSwQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Does the machine have enough RAM?</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
24G RAM. 40Mb L2<br>
<blockquote
cite="mid:CAHnXoa=R0kN6=WhMEM2HO_cTNXR0wOtwHcQW=bj0c3jJ7pXSwQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> <br>
The address sanitizer by itself leaves a load average
of 40. This means the OS is over 100% utilization and is
thrashing a bit. Load average doesn't say what exactly
is thrashing.<br>
<br>
Ninja supports make's -j and -l options. The -l option,
which sets the maximum load average, is the key.<br>
<br>
The load average should be less than the total number
of cores (hyperthreads too) before Ninja launches
another task. <br>
<br>
A Load Average at or lower than 100% technically
should benefit performance, and maximize throughput.
However, I will be happy if I don't have to call
someone to power cycle the server :)<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>I don't think that's quite how it works. As long as you
have enough RAM, the only performance loss due to having a
bunch of jobs waiting is context switching overhead, but
that can be minimized by either lowering the preempt timer
rate (what is called HZ in linux; 100 which is common for
servers doing batch jobs dilutes the overhead to basically
nothing) or if you are running a recent kernel then you
can arrange things to run tickless and then there will be
essentially no overhead. If load is less than #cores, then
you don't have a job running on every core, which means
that those cores are essentially idle and you are losing
performance. The other killer is jobs blocking on disk IO
*with no other jobs to be scheduled in the meantime*;
generally you have to keep load above 100% to avoid that
problem.</div>
<div><br>
</div>
<div>-- Sean Silva<br>
</div>
</div>
</div>
</div>
</blockquote>
ninja --help<br>
usage: ninja [options] [targets...]<br>
...<br>
-j N run N jobs in parallel [default=10]<br>
-l N do not start new jobs if the load average is greater than
N<br>
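<br>
To make the two limits concrete, here is a minimal Python sketch of the rule the help text describes. It is only an illustration, not Ninja's actual implementation, and the function name may_start_job and its parameters are made up for this example.<br>
<pre>
```python
def may_start_job(running_jobs, max_jobs, load_average, max_load=None):
    """Decide whether one more job may start under -j and -l style limits.

    Hypothetical helper: max_jobs plays the role of -j N, and max_load
    plays the role of -l N (None means no load limit was given).
    """
    # -j: never run more than max_jobs jobs at once.
    if running_jobs >= max_jobs:
        return False
    # -l: do not start new jobs if the load average is greater than N.
    if max_load is not None and load_average > max_load:
        return False
    return True


# With no -l, a load average of 146 does not stop new jobs from starting.
print(may_start_job(running_jobs=5, max_jobs=10, load_average=146.0))
# With -l 32, the same load average blocks new jobs.
print(may_start_job(running_jobs=5, max_jobs=10, load_average=146.0, max_load=32))
```
</pre>
The first call prints True and the second prints False, which is the difference between the unthrottled test step and one that passes -l.<br>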
<br>
As far as what load average means:<br>
<a class="moz-txt-link-freetext" href="http://serverfault.com/questions/251947/what-does-load-average-mean">http://serverfault.com/questions/251947/what-does-load-average-mean</a><br>
<a class="moz-txt-link-freetext" href="http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages">http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages</a><br>
<br>
Everything I have read says that 100% load is when the load average
equals the number of processors.<br>
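<br>
As a rough sketch of that rule of thumb, the Python snippet below divides a load average by the number of logical processors. The helper name load_utilization is hypothetical, and the inputs are the figures from this thread (32 logical CPUs, load averages of 40 and 146).<br>
<pre>
```python
import os


def load_utilization(load_average, logical_cpus):
    """Return the load average as a fraction of logical CPUs (1.0 is 100%)."""
    if not logical_cpus > 0:
        raise ValueError("logical_cpus must be positive")
    return load_average / logical_cpus


# Figures from this thread: 16 cores with hyperthreads, so 32 logical CPUs.
print(f"{load_utilization(146, 32):.2f}")  # 4.56
print(f"{load_utilization(40, 32):.2f}")   # 1.25

# On a Unix machine the same ratio can be read from the live system.
if hasattr(os, "getloadavg"):
    one_minute = os.getloadavg()[0]
    print(load_utilization(one_minute, os.cpu_count() or 1))
```
</pre>
By this measure a load average of 32 on this box is 100%, and the 146 seen before the crash is roughly 4.5 times that.<br>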
<br>
----<br>
While the Ninja build step seemed OK, -j10 and all, the test section
seemed to be the problem.<br>
<br>
Ninja kept launching the address sanitizer test tasks with no
limit.<br>
<br>
Combined with a thread sanitizer builder doing the same thing, the
load average reached 146, followed by a crash.<br>
<br>
In my testing with -l, the load average stays mostly below 32.
Some other builders are running on the same machine, and they are not
yet controlled by load average. My guess is that once all builders are
throttled by load average, the machine will be very close to 100%
utilization when everything is running.<br>
<br>
Ninja definitely needs this control in the sanitizer builders. An
experiment with Make is in order to prove the point.<br>
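<br>
A minimal sketch of what passing both options from a build step could look like, in Python. The helper name ninja_command and the target name check-asan are assumptions for illustration only; the -j and -l flags are the ones shown in the ninja --help output above.<br>
<pre>
```python
import os


def ninja_command(targets, jobs=None, max_load=None):
    """Build a ninja argument list with explicit -j and -l limits.

    Hypothetical helper: when a limit is not given, it defaults to the
    number of logical CPUs reported by the operating system.
    """
    cpus = os.cpu_count() or 1
    jobs = cpus if jobs is None else jobs
    max_load = cpus if max_load is None else max_load
    return ["ninja", "-j", str(jobs), "-l", str(max_load), *targets]


# 16 cores with hyperthreads, so both limits are set to 32.
print(ninja_command(["check-asan"], jobs=32, max_load=32))
# ['ninja', '-j', '32', '-l', '32', 'check-asan']
```
</pre>
The resulting list can be handed to subprocess.run or to a buildbot shell step, so the limits live in one place per slave instead of in each step.<br>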
<br>
<blockquote
cite="mid:CAHnXoa=R0kN6=WhMEM2HO_cTNXR0wOtwHcQW=bj0c3jJ7pXSwQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> <br>
So the maximum load average of a 16-core machine with
hyperthreads is 32 (keeping it simple). This needs to
be passed to all make and Ninja build steps on that
slave to maximize throughput.<br>
<br>
For now, I'm looking at a minimal patch to include
jobs and a new loadaverage variable for the
sanitizers. <br>
<br>
Longer term, all buildslaves should define a maximum
load average, and all make/ninja steps should pass the
-j and -l options.<br>
<br>
Best Regards,<br>
Rick
<div>
<div class="h5"><br>
<br>
On 11/13/2013 11:21 AM, Sergey Matveev wrote:<br>
</div>
</div>
</div>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">+kcc</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Nov 13, 2013 at
6:41 AM, Shankar Easwaran <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:shankare@codeaurora.org"
target="_blank">shankare@codeaurora.org</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Sorry
for another indirection. Rick foos is
working on it. I think there is some good
news here :)<br>
<br>
Cced Rick + adding Galina,Dmitri.<br>
<br>
Thanks<br>
<br>
Shankar Easwaran
<div>
<div><br>
<br>
On 11/12/2013 8:37 PM, Rui Ueyama wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Shankar tried to set it up recently.<br>
<br>
<br>
On Tue, Nov 12, 2013 at 6:31 PM, Sean
Silva &lt;<a moz-do-not-send="true"
href="mailto:silvas@purdue.edu"
target="_blank">silvas@purdue.edu</a>&gt;
wrote:<br>
<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Sanitizers?<br>
<br>
There have been a couple of these
sorts of bugs recently... we really<br>
ought to have some sanitizer bots...<br>
<br>
-- Sean Silva<br>
<br>
<br>
On Tue, Nov 12, 2013 at 9:21 PM, Rui
Ueyama &lt;<a moz-do-not-send="true"
href="mailto:ruiu@google.com"
target="_blank">ruiu@google.com</a>&gt;
wrote:<br>
<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
Author: ruiu<br>
Date: Tue Nov 12 20:21:51 2013<br>
New Revision: 194545<br>
<br>
URL: <a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project?rev=194545&view=rev"
target="_blank">http://llvm.org/viewvc/llvm-project?rev=194545&view=rev</a><br>
Log:<br>
[PECOFF] Fix use-after-return.<br>
<br>
Modified:<br>
lld/trunk/lib/Driver/WinLinkDriver.cpp<br>
<br>
Modified:
lld/trunk/lib/Driver/WinLinkDriver.cpp<br>
URL:<br>
<a moz-do-not-send="true"
href="http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff"
target="_blank">http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff</a><br>
<br>
==============================================================================<br>
---
lld/trunk/lib/Driver/WinLinkDriver.cpp
(original)<br>
+++
lld/trunk/lib/Driver/WinLinkDriver.cpp
Tue Nov 12 20:21:51 2013<br>
@@ -842,7 +842,7 @@
WinLinkDriver::parse(int argc,
const cha<br>
<br>
case OPT_INPUT:<br>
inputElements.push_back(std::unique_ptr&lt;InputElement&gt;(<br>
- new PECOFFFileNode(ctx,
inputArg->getValue())));<br>
+ new PECOFFFileNode(ctx,<br>
ctx.allocateString(inputArg->getValue()))));<br>
break;<br>
<br>
#define
DEFINE_BOOLEAN_FLAG(name, setter)
\<br>
@@ -892,9 +892,11 @@
WinLinkDriver::parse(int argc,
const cha<br>
// start with a hypen or a
slash. This is not compatible with
link.exe<br>
// but useful for us to test
lld on Unix.<br>
if (llvm::opt::Arg *dashdash =
parsedArgs->getLastArg(OPT_DASH_DASH))
{<br>
- for (const StringRef value :
dashdash->getValues())<br>
- inputElements.push_back(<br>
-
std::unique_ptr&lt;InputElement&gt;(new
PECOFFFileNode(ctx, value)));<br>
+ for (const StringRef value :
dashdash->getValues()) {<br>
+
std::unique_ptr&lt;InputElement&gt;
elem(<br>
+ new PECOFFFileNode(ctx,
ctx.allocateString(value)));<br>
+
inputElements.push_back(std::move(elem));<br>
+ }<br>
}<br>
<br>
// Add the libraries specified
by /defaultlib unless they are
already<br>
added<br>
<br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a moz-do-not-send="true"
href="mailto:llvm-commits@cs.uiuc.edu"
target="_blank">llvm-commits@cs.uiuc.edu</a><br>
<a moz-do-not-send="true"
href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits"
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
<br>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
<br>
</div>
</div>
<span><font color="#888888"> -- <br>
Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by
the Linux Foundation</font></span>
<div>
<div><br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
<br>
</div>
</div>
<span class=""><font color="#888888">
<pre cols="72">--
Rick Foos
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation</pre>
</font></span></div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Rick Foos
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation</pre>
</body>
</html>