[lld] r194545 - [PECOFF] Fix use-after-return.

Fri Nov 15 01:40:09 PST 2013

Because it's a sanitized symbolizer.
If you are doing a bootstrap, you should point it to the
llvm-symbolizer from the previous stage with ASAN_SYMBOLIZER_PATH. I
believe lit configs have some support for this somewhere.
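
For what it's worth, here is a minimal sketch of what I mean. The function name and the <stage>/bin/llvm-symbolizer layout are assumptions for illustration, so adjust them to wherever your previous stage actually lives:

```shell
# Sketch: point ASan at an unsanitized llvm-symbolizer from an earlier stage.
# Assumption: the stage directory keeps its tools under <stage>/bin.
use_symbolizer_from() {
    candidate="$1/bin/llvm-symbolizer"
    if [ -x "$candidate" ]; then
        # ASAN_SYMBOLIZER_PATH is read by the sanitizer runtime at startup.
        ASAN_SYMBOLIZER_PATH=$candidate
        export ASAN_SYMBOLIZER_PATH
    else
        echo "no executable llvm-symbolizer under $1/bin" >&2
        return 1
    fi
}
```

Call it with the previous stage's build directory before starting check-all.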

On Fri, Nov 15, 2013 at 8:46 AM, Sean Silva <silvas at purdue.edu> wrote:
>
>
>
> On Thu, Nov 14, 2013 at 6:30 PM, Rick Foos <rfoos at codeaurora.org> wrote:
>>
>> There is a problem with threads. I'll try to describe what I'm seeing.
>>
>> Thanks for looking at this,
>> Rick
>>
>> ninja '-j 12' '-l 32' check-all
>> Launches 200+ llvm-symbolizers and consumes 24G of memory, going into swap
>> space.
>>
>> It doesn't halt but keeps going with a load average of 80, 44 zombies,
>> and in this run 10 llvm-symbolizers (highlighted) at the top.
>>
>> Quite a bit of the memory is released later on, and the testing
>> continues...
>>
>> The last line of stdout stays the same. No interim test results are
>> displayed.
>>
>> [189/189] Running all regression tests
>>
>> repeating sequence:
>> A large number of llvm-symbolizers (200+) are launched.
>> They run for a few minutes, and then complete. The top 10 llvm-symbolizers
>> stay resident.
>>
>> On average 132 kworkers are running.
>> On average 76 llvm-symbolizers are running, but they do drop to near 0
>> before restarting.
>
>
> This "thundering herd" of symbolizers seems really problematic. They are all
> likely reporting the same bug. As a quick experiment, you should try the
> following:
>
> $ mv llvm-symbolizer llvm-symbolizer_REAL
> $ echo 'exec flock ./symbolizer.lock ./llvm-symbolizer_REAL' >llvm-symbolizer
> $ chmod +x llvm-symbolizer
>
> That should make sure that only a single llvm-symbolizer ever runs. It will
> completely serialize the symbolizers, but that still might be a win over
> swapping. You can also add the `-n` option to flock to cause it to fail if
> there is already another symbolizer running (that might be useful so that
> the build finishes quickly, while still getting at least one sanitizer error
> report).
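
A slightly fuller sketch of the wrapper idea, in case it helps. The function name and the choice to keep the lock file next to the binary are assumptions. Unlike the one-liner, the generated wrapper has a shebang line and forwards its arguments with "$@":

```shell
# Sketch: replace llvm-symbolizer with a wrapper that serializes runs
# through flock(1). Assumes flock from util-linux is installed.
install_serialized_symbolizer() {
    bin_dir=$1
    real="$bin_dir/llvm-symbolizer_REAL"
    wrapper="$bin_dir/llvm-symbolizer"

    # Move the real binary aside only once, so re-running is harmless.
    if [ ! -e "$real" ]; then
        mv "$wrapper" "$real" || return 1
    fi

    # The wrapper takes an exclusive lock, then runs the real symbolizer
    # with the original arguments. Adding -n after "flock" makes it fail
    # immediately when another symbolizer already holds the lock.
    cat > "$wrapper" <<EOF
#!/bin/sh
exec flock "$bin_dir/symbolizer.lock" "$real" "\$@"
EOF
    chmod +x "$wrapper"
}
```

Run it once against the directory that holds llvm-symbolizer. To undo it, move llvm-symbolizer_REAL back over the wrapper.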
>
> Also, wtf is llvm-symbolizer doing that needs so much memory??? That seems
> like the root cause of this issue...
>
>>
>>
>> As time goes on, the top llvm-symbolizers go from 50% CPU to 100% CPU, and
>> now up to 116% CPU.
>>
>>
>>
>>
>>
>> ---
>>
>> top - 15:16:28 up 16 min,  1 user,  load average: 80.91, 69.35, 38.58
>> Tasks: 466 total,  66 running, 356 sleeping,   0 stopped,  44 zombie
>> %Cpu(s): 28.8 us, 71.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
>> KiB Mem:  24520168 total,  1735968 used, 22784200 free,    10240 buffers
>> KiB Swap:  1999868 total,   144028 used,  1855840 free,   116280 cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
>> 54979 buildbot  20   0 1024g  12m   12 R    46  0.1   4:09.50 llvm-symbolizer
>> 55000 buildbot  20   0 1024g  12m   12 R    46  0.1   4:09.02 llvm-symbolizer
>> 54771 buildbot  20   0 97.0t  27m   48 R    44  0.1   4:10.47 llvm-symbolizer
>> 54923 buildbot  20   0 1024g  12m   12 R    44  0.1   4:07.50 llvm-symbolizer
>> 54769 buildbot  20   0 97.0t  27m   48 R    44  0.1   4:09.85 llvm-symbolizer
>> 55144 buildbot  20   0 1024g  12m   12 R    44  0.1   4:07.72 llvm-symbolizer
>> 54882 buildbot  20   0 1024g  12m   12 R    43  0.1   4:11.09 llvm-symbolizer
>> 54975 buildbot  20   0 1024g  12m   12 R    42  0.1   4:08.50 llvm-symbolizer
>> 54922 buildbot  20   0 1024g  12m   12 R    41  0.1   4:09.29 llvm-symbolizer
>> 54958 buildbot  20   0 1024g  12m   12 R    39  0.1   4:07.27 llvm-symbolizer
>
>
> Why is the symbolizer using so much virtual address space? I know that the
> sanitizers themselves need a lot for their shadow memory, but just
> symbolizing should hardly use any...
>
>>
>>     1 root      20   0 26920 1500  536 S    11  0.0   0:49.61 init
>>    10 root      20   0     0    0    0 S     2  0.0   0:11.64 rcu_sched
>>   209 root      20   0     0    0    0 S     2  0.0   0:10.44 kworker/0:1
>>    15 root      20   0     0    0    0 S     2  0.0   0:09.85 kworker/1:0
>>   178 root      20   0     0    0    0 S     2  0.0   0:08.85 kworker/24:1
>>   202 root      20   0     0    0    0 S     2  0.0   0:09.95 kworker/12:1
>>   205 root      20   0     0    0    0 S     2  0.0   0:09.71 kworker/15:1
>>
>> ---- pstree
>> systemadmin at quicbuild03:~$ pstree
>> init-+-acpid
>>      |-avahi-daemon---avahi-daemon
>>      |-bluetoothd
>>      |-buildslave-+-ninja---sh---python-+-23*[python---bash]
>>      |            |                     |-8*[python-+-bash]
>>      |            |                     |           `-{python}]
>>      |            |                     |-python---bash---FileCheck-+-llvm-symb+
>>      |            |                     |                           `-{FileChec+
>>      |            |                     `-{python}
>>      |            `-{buildslave}
>>      |-buildslave---{buildslave}
>>      |-console-kit-dae---64*[{console-kit-dae}]
>>      |-cron
>>      |-cups-browsed
>>      |-cupsd
>>      |-dbus-daemon
>>      |-exim4
>>      |-6*[getty]
>>      |-irqbalance
>>      |-13*[llvm-symbolizer-+-llvm-symbolizer]
>>      |                     `-{llvm-symbolizer}]
>>      |-2*[llvm-symbolizer---{llvm-symbolizer}]
>>      |-2*[llvm-symbolizer---llvm-symbolizer]
>>      |-45*[llvm-symbolizer]
>
>
> This is really strange. Does llvm-symbolizer double-fork or something? How
> are these getting de-parented?
>
> -- Sean Silva
>
>
>>
>>      |-nrpe
>>      |-nscd---21*[{nscd}]
>>      |-ntpd
>>      |-polkitd---{polkitd}
>>      |-rpc.idmapd
>>      |-rpc.statd
>>      |-rpcbind
>>      |-rsyslogd---3*[{rsyslogd}]
>>      |-sshd---sshd---sshd---bash---pstree
>>      |-udevd---2*[udevd]
>>      |-upstart-file-br
>>      |-upstart-socket-
>>      |-upstart-udev-br
>>      `-whoopsie---{whoopsie}
>>
>>
>>
>>
>> On 11/14/2013 04:47 PM, Sergey Matveev wrote:
>>
>> +kcc, samsonov (please don't remove people from CC)
>>
>> You mean in the presence of threads? There's no such option because it's
>> not supposed to interfere with the symbolizer. If it does then it's a bug,
>> someone from our team will follow up on this tomorrow.
>>
>> Sergey
>>
>> On Fri, Nov 15, 2013 at 2:01 AM, Rick Foos <rfoos at codeaurora.org> wrote:
>>>
>>> Thank you Sergey!
>>>
>>> AddressSanitizer running alone on a server is stable without the
>>> symbolizer option. It is running all the tests in a reasonable amount of
>>> time, and there are no llvm-symbolizer tasks.
>>>
>>> The problem is coming from Threads, and I'm trying to prove that now.
>>>
>>> If threads runs clean by itself alone on a server, there is an
>>> interaction with both address and threads running at the same time.
>>>
>>> Is there a similar feature to disable symbolizer in threads?
>>>
>>> Best Regards,
>>> Rick
>>>
>>>
>>> On 11/14/2013 03:51 PM, Sergey Matveev wrote:
>>>
>>> ASAN_OPTIONS=symbolize=false
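
If the bot already sets ASAN_OPTIONS, the new setting has to be appended rather than assigned. A small sketch, with a made-up function name, that relies on ASAN_OPTIONS being a colon-separated list:

```shell
# Sketch: turn off symbolization without discarding options already set.
asan_disable_symbolize() {
    if [ -n "${ASAN_OPTIONS:-}" ]; then
        # Keep the existing options and append ours after a colon.
        ASAN_OPTIONS="$ASAN_OPTIONS:symbolize=false"
    else
        ASAN_OPTIONS="symbolize=false"
    fi
    export ASAN_OPTIONS
}
```

With this in place the reports still print raw addresses, just without spawning llvm-symbolizer.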
>>>
>>>
>>> On Fri, Nov 15, 2013 at 1:14 AM, Nick Kledzik <kledzik at apple.com> wrote:
>>>>
>>>>
>>>> On Nov 14, 2013, at 9:07 AM, Rick Foos <rfoos at codeaurora.org> wrote:
>>>>
>>>> Status: System in swap overnight. Stopped both buildmaster and slave.
>>>> 187 llvm-symbolizer tasks were still running. Tasks did not stop after
>>>>
>>>> Retried this morning, no other workload, 8 llvm-symbolizer tasks
>>>> consuming 100% on each cpu
>>>>
>>>>
>>>> Doesn’t that mean that Asan found some problems, but is stuck trying to
>>>> symbolicate the backtraces?   Is there a way to run Asan and *not*
>>>> symbolicate?
>>>>
>>>> This also seems like a bug (infinite loop?) in llvm-symbolizer.
>>>>
>>>> -Nick
>>>>
>>>>
>>>> . 7 zombie tasks.
>>>>
>>>> So not quite ready this morning. If anyone knows of an llvm-sanitizer
>>>> issue like this it would help.
>>>>
>>>> From: llvm-commits-bounces at cs.uiuc.edu
>>>> [mailto:llvm-commits-bounces at cs.uiuc.edu] On Behalf Of Rick Foos
>>>> Sent: Wednesday, November 13, 2013 1:42 PM
>>>> To: Sergey Matveev; Shankar Easwaran
>>>> Cc: llvm-commits at cs.uiuc.edu; Galina Kistanova
>>>> Subject: Re: [lld] r194545 - [PECOFF] Fix use-after-return.
>>>>
>>>> Sorry for the delay,
>>>>
>>>> Our problem with running the sanitizers is that the load average running
>>>> under Ninja reached 146, and a short time later the system crashed,
>>>> requiring a call to someone to power cycle the box...
>>>>
>>>> The address sanitizer by itself leaves a load average of 40. This means the
>>>> OS is over 100% utilization and is thrashing a bit. Load average doesn't say
>>>> what exactly is thrashing.
>>>>
>>>> Ninja supports make's -j and -l options. The -l option, the maximum load
>>>> average, is the key.
>>>>
>>>> The load average should be less than the total number of cores
>>>> (hyperthreads too) before Ninja launches another task.
>>>>
>>>> A Load Average at or lower than 100%  technically should benefit
>>>> performance, and maximize throughput. However, I will be happy if I don't
>>>> have to call someone to power cycle the server :)
>>>>
>>>> So the maximum load average of a 16 core machine with hyperthreads is 32
>>>> (keeping it simple). This needs to be passed to all make's and Ninja build
>>>> steps on that slave to maximize throughput.
>>>>
>>>> For now, I'm looking at a minimal patch to include jobs and a new
>>>> loadaverage variable for the sanitizers.
>>>>
>>>> Longer term, all buildslaves should define maximum loadaverage, and all
>>>> make/ninja steps should pass -j, and -l options.
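
The load-average rule above can be sketched as follows. The function names are made up for illustration, and the limit is simply the number of online hardware threads, which gives 32 on the 16-core hyperthreaded box described:

```shell
# Sketch: derive ninja's -l value from the number of online hardware threads.
hw_threads() {
    # nproc comes with GNU coreutils; getconf is the fallback.
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

load_limit_flag() {
    # An explicit count may be passed in; otherwise detect it.
    n=${1:-$(hw_threads)}
    printf '%s %s\n' -l "$n"
}

# Example use (not run here): ninja -j 12 $(load_limit_flag) check-all
```

The same value would work for make's -l, so one buildslave-level setting could feed both kinds of build step.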
>>>>
>>>> Best Regards,
>>>> Rick
>>>>
>>>> On 11/13/2013 11:21 AM, Sergey Matveev wrote:
>>>>
>>>> +kcc
>>>>
>>>>
>>>>
>>>> On Wed, Nov 13, 2013 at 6:41 AM, Shankar Easwaran
>>>> <shankare at codeaurora.org> wrote:
>>>> Sorry for another indirection. Rick Foos is working on it. I think there
>>>> is some good news here :)
>>>>
>>>> Cced Rick + adding Galina,Dmitri.
>>>>
>>>> Thanks
>>>>
>>>> Shankar Easwaran
>>>>
>>>>
>>>> On 11/12/2013 8:37 PM, Rui Ueyama wrote:
>>>>
>>>> Shankar tried to set it up recently.
>>>>
>>>>
>>>> On Tue, Nov 12, 2013 at 6:31 PM, Sean Silva <silvas at purdue.edu> wrote:
>>>>
>>>> Sanitizers?
>>>>
>>>> There have been a couple of these sorts of bugs recently... we really
>>>> ought to have some sanitizer bots...
>>>>
>>>> -- Sean Silva
>>>>
>>>>
>>>> On Tue, Nov 12, 2013 at 9:21 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>
>>>> Author: ruiu
>>>> Date: Tue Nov 12 20:21:51 2013
>>>> New Revision: 194545
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=194545&view=rev
>>>> Log:
>>>> [PECOFF] Fix use-after-return.
>>>>
>>>> Modified:
>>>>      lld/trunk/lib/Driver/WinLinkDriver.cpp
>>>>
>>>> Modified: lld/trunk/lib/Driver/WinLinkDriver.cpp
>>>> URL: http://llvm.org/viewvc/llvm-project/lld/trunk/lib/Driver/WinLinkDriver.cpp?rev=194545&r1=194544&r2=194545&view=diff
>>>>
>>>>
>>>> ==============================================================================
>>>> --- lld/trunk/lib/Driver/WinLinkDriver.cpp (original)
>>>> +++ lld/trunk/lib/Driver/WinLinkDriver.cpp Tue Nov 12 20:21:51 2013
>>>> @@ -842,7 +842,7 @@ WinLinkDriver::parse(int argc, const cha
>>>>
>>>>       case OPT_INPUT:
>>>>         inputElements.push_back(std::unique_ptr<InputElement>(
>>>> -          new PECOFFFileNode(ctx, inputArg->getValue())));
>>>> +          new PECOFFFileNode(ctx, ctx.allocateString(inputArg->getValue()))));
>>>>         break;
>>>>
>>>>   #define DEFINE_BOOLEAN_FLAG(name, setter)       \
>>>> @@ -892,9 +892,11 @@ WinLinkDriver::parse(int argc, const cha
>>>>     // start with a hypen or a slash. This is not compatible with link.exe
>>>>     // but useful for us to test lld on Unix.
>>>>     if (llvm::opt::Arg *dashdash = parsedArgs->getLastArg(OPT_DASH_DASH)) {
>>>> -    for (const StringRef value : dashdash->getValues())
>>>> -      inputElements.push_back(
>>>> -          std::unique_ptr<InputElement>(new PECOFFFileNode(ctx, value)));
>>>> +    for (const StringRef value : dashdash->getValues()) {
>>>> +      std::unique_ptr<InputElement> elem(
>>>> +          new PECOFFFileNode(ctx, ctx.allocateString(value)));
>>>> +      inputElements.push_back(std::move(elem));
>>>> +    }
>>>>     }
>>>>
>>>>     // Add the libraries specified by /defaultlib unless they are already added
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>> hosted by the Linux Foundation
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Rick Foos
>>>>
>>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>> hosted by The Linux Foundation
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>