[lld] r194545 - [PECOFF] Fix use-after-return.

Sean Silva silvas at purdue.edu
Wed Nov 13 17:12:23 PST 2013


On Wed, Nov 13, 2013 at 7:40 PM, Rick Foos <rfoos at codeaurora.org> wrote:

>  On 11/13/2013 06:19 PM, Sean Silva wrote:
>
>
>
>
> On Wed, Nov 13, 2013 at 2:41 PM, Rick Foos <rfoos at codeaurora.org> wrote:
>
>>  Sorry for the delay,
>>
>> Our problem with running the sanitizers is that the load average under
>> Ninja reached 146, and a short time later the system crashed, requiring us
>> to call someone to power cycle the box...
>>
>
>  I'm curious what is causing so much load? Almost all of our tests are
> single-threaded, so if only #cores jobs are spawned (or #cores + 2, which is
> what ninja uses when #cores > 2), there should only be #cores + 2 jobs
> running simultaneously (certainly not 146, which is 146/32 ~ 4.5x
> oversubscription). Is lit spawning too many jobs?
>
>    The test step runs a bare ninja command, so there is no -j or -l control.
>
>   Does the machine have enough RAM?
>
>    24 GB RAM. 40 MB L2.
>
>
>
>
>>
>> The address sanitizer by itself leaves a load average of 40. This means
>> the OS is over 100% utilization and is thrashing a bit. Load average
>> doesn't say what exactly is thrashing.
>>
>> Ninja supports make's -j and -l options. The -l option (maximum load
>> average) is the key.
>>
>> The load average should be less than the total number of cores (including
>> hyperthreads) before Ninja launches another task.
>>
>> A load average at or below 100% should technically benefit performance
>> and maximize throughput. However, I will be happy if I don't have to call
>> someone to power cycle the server :)
>>
>
>  I don't think that's quite how it works. As long as you have enough RAM,
> the only performance loss from having a bunch of jobs waiting is
> context-switching overhead, and that can be minimized either by lowering
> the preemption timer rate (what is called HZ in Linux; 100, which is common
> for servers doing batch jobs, dilutes the overhead to basically nothing)
> or, if you are running a recent kernel, by arranging things to run
> tickless, in which case there is essentially no overhead. If load is less
> than #cores, then you don't have a job running on every core, which means
> some cores are essentially idle and you are losing performance. The other
> killer is jobs blocking on disk IO *with no other jobs to be scheduled in
> the meantime*; generally you have to keep load above 100% to avoid that
> problem.
>
>  -- Sean Silva
>
> ninja --help
> usage: ninja [options] [targets...]
> ...
>   -j N     run N jobs in parallel [default=10]
>   -l N     do not start new jobs if the load average is greater than N
>
> As far as what load average means:
> http://serverfault.com/questions/251947/what-does-load-average-mean
> http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
>
> Everything seems to say 100% load is when load average = number of
> processors.
>

This term "load" is only vaguely related to the colloquial meaning, so
"100% load" should not be understood as "perfect" or "maximum". It's
literally just the time-averaged number of jobs available to run. The
bridge analogy in the second link is fairly accurate. Notice that even if
you are at >100% load, the bridge is still being used at full capacity (as
many cars as possible are crossing the bridge simultaneously). If load is
>100%, then that might impact the *latency* for getting to a particular job
(in the analogy: how long it takes for a particular car to get across the
bridge *including the waiting time in the queue*), but for a batch
operation like running tests that doesn't matter.
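
Concretely, the throttling being discussed amounts to pinning both the job
count and the load cap to the hardware thread count. A sketch for this
32-thread slave (check-all is just an illustrative target; -j and -l are the
flags from the ninja --help excerpt quoted above):

  ninja -j32 -l32 check-all

With -l set, ninja still runs up to 32 jobs in parallel, but holds off
starting new ones while the system load average is above 32, so it backs off
instead of compounding load created by other builders on the machine.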


>
> ----
> While the Ninja build step seemed OK (-j10 and all), the test step seemed
> to be the problem.
>
> Ninja continuously launched the address measurement tasks with no limits.
>

What "address measurement"?

-- Sean Silva


>
> When combined with a thread sanitizer doing the same thing, the load
> average hit 146, followed by a crash.
>
>  In my testing, after -l is used the load average stays mostly below 32.
> There are some other builders going on that are not controlled by load
> average. My guess is that when all builders are throttled by load average,
> utilization will be very close to 100% when everything is running.
>
> Ninja for sure needs this control for the sanitizer builds. An experiment
> with Make is in order to prove the point.
>
>
>
>>  So the maximum load average of a 16-core machine with hyperthreads is 32
>> (keeping it simple). This needs to be passed to all make and Ninja build
>> steps on that slave to maximize throughput.
>>
>> For now, I'm looking at a minimal patch to pass the jobs setting and a new
>> loadaverage variable to the sanitizer builders.
>>
>> Longer term, all buildslaves should define a maximum load average, and all
>> make/ninja steps should pass the -j and -l options.
>>
>> Best Regards,
>> Rick
>>
>>
>> On 11/13/2013 11:21 AM, Sergey Matveev wrote:
>>
>> +kcc
>>
>>
>> On Wed, Nov 13, 2013 at 6:41 AM, Shankar Easwaran <
>> shankare at codeaurora.org> wrote:
>>
>>> Sorry for another indirection. Rick Foos is working on it. I think there
>>> is some good news here :)
>>>
>>> CCed Rick + adding Galina, Dmitri.
>>>
>>> Thanks
>>>
>>> Shankar Easwaran
>>>
>>>
>>> On 11/12/2013 8:37 PM, Rui Ueyama wrote:
>>>
>>>> Shankar tried to set it up recently.
>>>>
>>>>
>>>> On Tue, Nov 12, 2013 at 6:31 PM, Sean Silva <silvas at purdue.edu> wrote:
>>>>
>>>>  Sanitizers?
>>>>>
>>>>> There have been a couple of these sorts of bugs recently... we really
>>>>> ought to have some sanitizer bots...
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>> On Tue, Nov 12, 2013 at 9:21 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>>  Author: ruiu
>>>>>> Date: Tue Nov 12 20:21:51 2013
>>>>>> New Revision: 194545
>>>>>>
>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=194545&view=rev
>>>>>> Log:
>>>>>> [PECOFF] Fix use-after-return.
>>>>>>
>>>>>> Modified:
>>>>>>      lld/trunk/lib/Driver/WinLinkDriver.cpp
>>>>>>
>>>>>> Modified: lld/trunk/lib/Driver/WinLinkDriver.cpp
>>>>>> URL:
>>>>>>
>>>>>> http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff
>>>>>>
>>>>>>
>>>>>> ==============================================================================
>>>>>> --- lld/trunk/lib/Driver/WinLinkDriver.cpp (original)
>>>>>> +++ lld/trunk/lib/Driver/WinLinkDriver.cpp Tue Nov 12 20:21:51 2013
>>>>>> @@ -842,7 +842,7 @@ WinLinkDriver::parse(int argc, const cha
>>>>>>
>>>>>>      case OPT_INPUT:
>>>>>>        inputElements.push_back(std::unique_ptr<InputElement>(
>>>>>> -          new PECOFFFileNode(ctx, inputArg->getValue())));
>>>>>> +          new PECOFFFileNode(ctx, ctx.allocateString(inputArg->getValue()))));
>>>>>>        break;
>>>>>>
>>>>>>  #define DEFINE_BOOLEAN_FLAG(name, setter)       \
>>>>>> @@ -892,9 +892,11 @@ WinLinkDriver::parse(int argc, const cha
>>>>>>    // start with a hypen or a slash. This is not compatible with link.exe
>>>>>>    // but useful for us to test lld on Unix.
>>>>>>    if (llvm::opt::Arg *dashdash = parsedArgs->getLastArg(OPT_DASH_DASH)) {
>>>>>> -    for (const StringRef value : dashdash->getValues())
>>>>>> -      inputElements.push_back(
>>>>>> -          std::unique_ptr<InputElement>(new PECOFFFileNode(ctx, value)));
>>>>>> +    for (const StringRef value : dashdash->getValues()) {
>>>>>> +      std::unique_ptr<InputElement> elem(
>>>>>> +          new PECOFFFileNode(ctx, ctx.allocateString(value)));
>>>>>> +      inputElements.push_back(std::move(elem));
>>>>>> +    }
>>>>>>    }
>>>>>>
>>>>>>    // Add the libraries specified by /defaultlib unless they are already added
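
For context, the use-after-return being fixed is the classic dangling-string
pattern: PECOFFFileNode was handed a string owned by the parsed argument
list, a local that is destroyed when parse() returns, so the node kept a
pointer into dead storage; the patch copies the string into context-owned
memory first via ctx.allocateString(). A minimal sketch of the pattern, with
illustrative names rather than lld's actual classes (the vector of strings
below merely stands in for the context's allocator):

  #include <memory>
  #include <string>
  #include <vector>

  struct Context {
    // Copy a string into context-owned storage so pointers into it outlive
    // the caller's locals (playing the role of ctx.allocateString()).
    const char *allocateString(const std::string &s) {
      strings.push_back(std::make_unique<std::string>(s));
      return strings.back()->c_str();
    }
    std::vector<std::unique_ptr<std::string>> strings;
  };

  struct FileNode {
    explicit FileNode(const char *path) : path(path) {} // stores the pointer
    const char *path;
  };

  std::unique_ptr<FileNode> parse(Context &ctx) {
    // Stand-in for the parsed argument list: a local whose storage is
    // destroyed when parse() returns.
    std::string localArg = "foo.obj";
    // Buggy: the node would keep a pointer into localArg's dead buffer.
    //   return std::make_unique<FileNode>(localArg.c_str());
    // Fixed: copy into storage that lives as long as the context.
    return std::make_unique<FileNode>(ctx.allocateString(localArg));
  }

Judging from the diff above, that copy-into-the-context is exactly what both
call sites now do.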
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>
>>>>>>
>>>>>
>>>
>>>   --
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> hosted by the Linux Foundation
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>>
>>   --
>> Rick Foos
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
>
> --
> Rick Foos
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>
>