[lld] r194545 - [PECOFF] Fix use-after-return.

Rick Foos rfoos at codeaurora.org
Wed Nov 13 16:40:34 PST 2013


On 11/13/2013 06:19 PM, Sean Silva wrote:
>
>
>
> On Wed, Nov 13, 2013 at 2:41 PM, Rick Foos <rfoos at codeaurora.org 
> <mailto:rfoos at codeaurora.org>> wrote:
>
>     Sorry for the delay,
>
>     Our problem with running the sanitizers is that the load average
>     under Ninja reached 146, and a short time later the system crashed,
>     requiring a call to someone to power cycle the box...
>
>
> I'm curious what is causing so much load? All our tests are mostly
> single-threaded, so if only #cores jobs are spawned (or #cores + 2,
> which is what ninja uses when #cores > 2), there should only be #cores
> + 2 jobs running simultaneously (certainly not 146, which is 146/32 ~
> 4.5x the hardware threads). Is lit spawning too many jobs?
>
A bare ninja command is used in the test step, so there is no -j or -l control.
> Does the machine have enough RAM?
>
24 GB RAM, 40 MB L2 cache.
>
>
>     The address sanitizer by itself leaves a load average of 40. This
>     means the OS is over 100% utilization and is thrashing a bit. The
>     load average doesn't say what exactly is thrashing.
>
>     Ninja supports make's -j and -l options. The -l option, a maximum
>     load average, is the key.
>
>     The load average should be less than the total number of cores
>     (hyperthreads too) before Ninja launches another task.
>
>     A load average at or below 100% should technically benefit
>     performance and maximize throughput. However, I will be happy if
>     I don't have to call someone to power cycle the server :)
>
>
> I don't think that's quite how it works. As long as you have enough 
> RAM, the only performance loss due to having a bunch of jobs waiting 
> is context-switching overhead, but that can be minimized by either 
> lowering the preempt timer rate (what is called HZ in Linux; 100, 
> which is common for servers doing batch jobs, dilutes the overhead to 
> basically nothing), or, if you are running a recent kernel, you can 
> arrange things to run tickless and then there will be essentially no 
> overhead. If load is less than #cores, then you don't have a job 
> running on every core, which means that those cores are essentially 
> idle and you are losing performance. The other killer is jobs blocking 
> on disk IO *with no other jobs to be scheduled in the meantime*; 
> generally you have to keep load above 100% to avoid that problem.
>
> -- Sean Silva
ninja --help
usage: ninja [options] [targets...]
...
   -j N     run N jobs in parallel [default=10]
   -l N     do not start new jobs if the load average is greater than N

As far as what load average means:
http://serverfault.com/questions/251947/what-does-load-average-mean
http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages

Everything seems to say 100% load is when the load average equals the 
number of processors.
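
As a quick sanity check, here is a rough sketch of how to see where
that threshold sits on a given slave (nproc and uptime are standard
Linux tools; the 32 below is just what a 16-core hyperthreaded box
would report, not a measurement from our machine):

   $ nproc      # hardware threads visible to the scheduler
   32
   $ uptime     # the last three fields are the 1-, 5-, and 15-minute
                # load averages; "100% load" on such a box is 32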

----
While the Ninja build step seemed OK (-j10 and all), the test step 
seemed to be the problem.

Ninja continuously launched the address sanitizer test tasks with no 
limit.

When combined with the thread sanitizer doing the same thing, the load 
average reached 146, followed by a crash.

In my testing, once -l is used the load average stays mostly below 32. 
There are some other builders running on the box that are not 
controlled by the load average; my guess is that when all builders are 
throttled by -l, utilization will be very close to 100% when everything 
is running.

Ninja definitely needs this control for the sanitizer builds. An 
experiment with make is in order to prove the point.
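
As a rough sketch of what the throttled build and test steps might look
like (the hard-coded 32 and the check-all target are illustrative
assumptions, not what the bot currently runs; -j and -l behave as in
the help text above):

   # Build step: cap parallelism, and stop launching new jobs while the
   # load average is above the hardware thread count.
   ninja -j32 -l32

   # Test step: same limits, so ninja won't start new test jobs while
   # the load average is above 32.
   ninja -j32 -l32 check-all

   # Equivalent flags for a make-based builder.
   make -j32 -l32 check-all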

>
>     So the maximum load average of a 16-core machine with
>     hyperthreading is 32 (keeping it simple). This needs to be passed
>     to all make and Ninja build steps on that slave to maximize
>     throughput.
>
>     For now, I'm looking at a minimal patch to pass the jobs setting
>     and a new loadaverage variable for the sanitizers.
>
>     Longer term, all buildslaves should define a maximum loadaverage,
>     and all make/ninja steps should pass the -j and -l options.
>
>     Best Regards,
>     Rick
>
>
>     On 11/13/2013 11:21 AM, Sergey Matveev wrote:
>>     +kcc
>>
>>
>>     On Wed, Nov 13, 2013 at 6:41 AM, Shankar Easwaran
>>     <shankare at codeaurora.org <mailto:shankare at codeaurora.org>> wrote:
>>
>>         Sorry for another indirection. Rick Foos is working on it. I
>>         think there is some good news here :)
>>
>>         CCed Rick + adding Galina, Dmitri.
>>
>>         Thanks
>>
>>         Shankar Easwaran
>>
>>
>>         On 11/12/2013 8:37 PM, Rui Ueyama wrote:
>>
>>             Shankar tried to set it up recently.
>>
>>
>>             On Tue, Nov 12, 2013 at 6:31 PM, Sean Silva
>>             <silvas at purdue.edu <mailto:silvas at purdue.edu>> wrote:
>>
>>                 Sanitizers?
>>
>>                 There have been a couple of these sorts of bugs
>>                 recently... we really
>>                 ought to have some sanitizer bots...
>>
>>                 -- Sean Silva
>>
>>
>>                 On Tue, Nov 12, 2013 at 9:21 PM, Rui Ueyama
>>                 <ruiu at google.com <mailto:ruiu at google.com>> wrote:
>>
>>                     Author: ruiu
>>                     Date: Tue Nov 12 20:21:51 2013
>>                     New Revision: 194545
>>
>>                     URL:
>>                     http://llvm.org/viewvc/llvm-project?rev=194545&view=rev
>>                     Log:
>>                     [PECOFF] Fix use-after-return.
>>
>>                     Modified:
>>                      lld/trunk/lib/Driver/WinLinkDriver.cpp
>>
>>                     Modified: lld/trunk/lib/Driver/WinLinkDriver.cpp
>>                     URL:
>>                     http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff
>>
>>                     ==============================================================================
>>                     --- lld/trunk/lib/Driver/WinLinkDriver.cpp (original)
>>                     +++ lld/trunk/lib/Driver/WinLinkDriver.cpp Tue Nov 12 20:21:51 2013
>>                     @@ -842,7 +842,7 @@ WinLinkDriver::parse(int argc, const cha
>>
>>                          case OPT_INPUT:
>>                            inputElements.push_back(std::unique_ptr<InputElement>(
>>                     -          new PECOFFFileNode(ctx, inputArg->getValue())));
>>                     +          new PECOFFFileNode(ctx, ctx.allocateString(inputArg->getValue()))));
>>                            break;
>>
>>                      #define DEFINE_BOOLEAN_FLAG(name, setter)       \
>>                     @@ -892,9 +892,11 @@ WinLinkDriver::parse(int argc, const cha
>>                        // start with a hypen or a slash. This is not compatible with link.exe
>>                        // but useful for us to test lld on Unix.
>>                        if (llvm::opt::Arg *dashdash = parsedArgs->getLastArg(OPT_DASH_DASH)) {
>>                     -    for (const StringRef value : dashdash->getValues())
>>                     -      inputElements.push_back(
>>                     -          std::unique_ptr<InputElement>(new PECOFFFileNode(ctx, value)));
>>                     +    for (const StringRef value : dashdash->getValues()) {
>>                     +      std::unique_ptr<InputElement> elem(
>>                     +          new PECOFFFileNode(ctx, ctx.allocateString(value)));
>>                     +      inputElements.push_back(std::move(elem));
>>                     +    }
>>                        }
>>
>>                        // Add the libraries specified by /defaultlib unless they are already added
>>
>>
>>
>>
>>
>>
>>         -- 
>>         Qualcomm Innovation Center, Inc. is a member of Code Aurora
>>         Forum, hosted by the Linux Foundation
>>
>>
>>
>>
>>
>>
>
>
>     -- 
>     Rick Foos
>     Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>
>
>
>


-- 
Rick Foos
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
