[lld] r288606 - Add comments about the use of threads in LLD.

Rui Ueyama via llvm-commits llvm-commits at lists.llvm.org
Sun Dec 4 14:50:06 PST 2016


Lazy symbols are created only for symbols defined in object files inside
static archives, so in that sense LLD doesn't know the complete picture of
the symbols up front. When you pull a file out of an archive, new symbols
appear, and you need to resolve them again. Isn't this a sequential process?
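To make that dependency concrete, here is a toy, self-contained sketch of the
fetch-and-resolve loop that archive semantics imply (this is not LLD's code;
Member, resolve, and the lazy-symbol index are simplified stand-ins): fetching
one member can introduce new undefined symbols, which may in turn require
fetching further members.

// sketch.cpp -- toy model of iterative archive symbol resolution.
#include <map>
#include <set>
#include <string>
#include <vector>

struct Member {
  std::vector<std::string> Defines;     // symbols this archive member defines
  std::vector<std::string> References;  // symbols it leaves undefined
};

// Hypothetical driver loop. LazyIndex maps a symbol name to the (not yet
// fetched) archive member that would define it.
std::set<std::string> resolve(std::set<std::string> Undefs,
                              std::map<std::string, const Member *> LazyIndex) {
  std::set<std::string> Defined;
  std::set<const Member *> Fetched;
  bool Progress = true;
  while (Progress) {                     // each pass may expose new undefineds
    Progress = false;
    std::set<std::string> Next;
    for (const std::string &Sym : Undefs) {
      auto It = LazyIndex.find(Sym);
      if (It == LazyIndex.end() || Fetched.count(It->second)) {
        Next.insert(Sym);                // nothing new to fetch for this name
        continue;
      }
      const Member *M = It->second;
      Fetched.insert(M);                 // "pull the file out of the archive"
      Progress = true;
      for (const std::string &D : M->Defines)
        Defined.insert(D);
      for (const std::string &R : M->References)
        Next.insert(R);                  // new symbols appear
    }
    Undefs.clear();
    for (const std::string &S : Next)
      if (!Defined.count(S))
        Undefs.insert(S);                // still unresolved after this pass
  }
  return Undefs;                         // whatever remains is truly undefined
}

int main() {
  Member A{{"foo"}, {"bar"}};            // defines foo, references bar
  Member B{{"bar"}, {}};                 // defines bar
  std::map<std::string, const Member *> Lazy{{"foo", &A}, {"bar", &B}};
  std::set<std::string> Remaining = resolve({"foo"}, Lazy);
  return Remaining.empty() ? 0 : 1;      // fetching A exposes bar, which fetches B
}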

On Sun, Dec 4, 2016 at 2:03 PM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Sun, Dec 4, 2016 at 7:09 AM, Rui Ueyama <ruiu at google.com> wrote:
>
>> On Sun, Dec 4, 2016 at 1:55 AM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>>
>>>
>>> On Sat, Dec 3, 2016 at 3:35 PM, Rui Ueyama via llvm-commits <
>>> llvm-commits at lists.llvm.org> wrote:
>>>
>>>> Author: ruiu
>>>> Date: Sat Dec  3 17:35:22 2016
>>>> New Revision: 288606
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=288606&view=rev
>>>> Log:
>>>> Add comments about the use of threads in LLD.
>>>>
>>>> Modified:
>>>>     lld/trunk/ELF/Threads.h
>>>>
>>>> Modified: lld/trunk/ELF/Threads.h
>>>> URL: http://llvm.org/viewvc/llvm-project/lld/trunk/ELF/Threads.h?rev=288606&r1=288605&r2=288606&view=diff
>>>> ==============================================================================
>>>> --- lld/trunk/ELF/Threads.h (original)
>>>> +++ lld/trunk/ELF/Threads.h Sat Dec  3 17:35:22 2016
>>>> @@ -6,6 +6,54 @@
>>>>  // License. See LICENSE.TXT for details.
>>>>  //
>>>>  //===----------------------------------------------------------------------===//
>>>> +//
>>>> +// LLD supports threads to distribute workloads to multiple cores. Using
>>>> +// multiple cores is most effective when more than one core is idle. At the
>>>> +// last step of a build, it is often the case that the linker is the only
>>>> +// active process on a computer. So we are naturally interested in using
>>>> +// threads wisely to reduce the latency of delivering results to users.
>>>> +//
>>>> +// That said, we don't want to do "too clever" things with threads. The
>>>> +// correctness of complex multi-threaded algorithms is sometimes extremely
>>>> +// hard to justify, and they can easily mess up the entire design.
>>>> +//
>>>> +// Fortunately, when a linker links large programs (when the link time is
>>>> +// most critical), it spends most of its time working on a massive number
>>>> +// of small pieces of data of the same kind. Here are two examples:
>>>> +//
>>>> +//  - We have hundreds of thousands of input sections that need to be
>>>> +//    copied to the result file at the last step of the link. Once we fix
>>>> +//    the file layout, each section can be copied to its destination and
>>>> +//    its relocations can be applied independently.
>>>> +//
>>>> +//  - We have tens of millions of small strings when constructing a
>>>> +//    mergeable string section.
>>>> +//
>>>> +// For cases such as the former, we can just use parallel_for_each
>>>> +// instead of std::for_each (or a plain for loop). Because the tasks are
>>>> +// completely independent from each other, we can run them in parallel
>>>> +// without any coordination between them. That is very easy to understand
>>>> +// and justify.
>>>> +//
>>>> +// For cases such as the latter, we can use parallel algorithms to deal
>>>> +// with massive amounts of data. We have to write a tailored algorithm for
>>>> +// each problem, but the complexity of multi-threading is isolated in a
>>>> +// single pass and doesn't affect the linker's overall design.
>>>> +//
>>>> +// The above approach seems to be working fairly well. As an example, when
>>>> +// linking Chromium (output size 1.6 GB), using 4 cores reduces the latency
>>>> +// to 75% of the single-core time (from 12.66 seconds to 9.55 seconds) on
>>>> +// my machine. Using 40 cores reduces it to 63% (from 12.66 seconds to 7.95
>>>> +// seconds). Because of Amdahl's law, the speedup is not linear, but as you
>>>> +// add more cores, it gets faster.
>>>> +//
>>>> +// On a final note, if you are trying to optimize, keep the axiom "don't
>>>> +// guess, measure!" in mind. Some important passes of the linker are not
>>>> +// that slow. For example, resolving all symbols is not a very heavy pass,
>>>> +// although it would be very hard to parallelize. You want to first
>>>> +// identify a slow pass and then optimize it.
>>>>
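As a standalone illustration of the first bullet in the comment above, here is
a minimal sketch of that pattern. It uses plain std::thread rather than LLD's
parallel_for_each helper, and InputSection/parallelCopy are made-up names for
illustration: once the layout has fixed every section's output offset, the
copies touch disjoint regions of the output buffer and need no coordination.

// parallel_copy.cpp -- embarrassingly parallel section copy (illustration only).
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

struct InputSection {
  const char *Data;         // section contents
  std::size_t Size;
  std::size_t OutputOffset; // decided when the file layout was fixed
};

// Hypothetical stand-in for parallel_for_each: split the index range into one
// contiguous chunk per hardware thread. Each section writes to a disjoint
// region of the output buffer, so the workers need no locking.
void parallelCopy(std::vector<InputSection> &Sections, char *OutBuf) {
  unsigned NumThreads = std::max(1u, std::thread::hardware_concurrency());
  std::size_t N = Sections.size();
  std::size_t Chunk = (N + NumThreads - 1) / NumThreads;
  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < NumThreads; ++T) {
    std::size_t Begin = T * Chunk, End = std::min(N, Begin + Chunk);
    if (Begin >= End)
      break;
    Workers.emplace_back([&Sections, OutBuf, Begin, End] {
      for (std::size_t I = Begin; I < End; ++I) {
        const InputSection &S = Sections[I];
        std::memcpy(OutBuf + S.OutputOffset, S.Data, S.Size);
        // (In the real linker, this is also where S's relocations would be
        // applied, since they only touch S's own bytes in the output.)
      }
    });
  }
  for (std::thread &W : Workers)
    W.join();
}

int main() {
  const char *A = "hello", *B = "world";
  std::vector<InputSection> Sections = {{A, 5, 0}, {B, 5, 5}};
  std::vector<char> Out(10);
  parallelCopy(Sections, Out.data());
  return 0;
}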
>>>
>>> Actually, LLD's symbol resolution (the approach with Lazy symbols for
>>> archives) is a perfect example of a MapReduce-type problem, so it is
>>> quite parallelizable.
>>> You basically have a huge number of (SymbolName, SymbolValue) pairs, and
>>> you want to coalesce all values with the same SymbolName into pairs
>>> (SymbolName, [SymbolValue1, SymbolValue2, ...]); you can then process
>>> all the SymbolValueN's in each group to see which is the real
>>> definition. This is precisely the problem that MapReduce solves.
>>>
>>
>> How do you handle static archives?
>>
>
> LLD's archive semantics insert lazy symbols for all the archive members,
> so it isn't a problem.
>
> -- Sean Silva
>
>
>>
>>
>>>
>>> (Note: I don't necessarily mean that it needs to be done in a
>>> distributed fashion, just that the core problem is really one of
>>> coalescing values with the same keys.)
>>>
>>> MapReduce's core abstraction is also a good tool for deduplicating
>>> strings.
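(A toy sketch of that idea applied to string deduplication, with made-up names
rather than LLD's merge-section code: identical strings always hash to the same
shard, so each shard can be deduplicated on its own thread with no locking.)

// dedup.cpp -- sharded string deduplication (illustration only).
#include <functional>
#include <string>
#include <thread>
#include <unordered_set>
#include <vector>

std::vector<std::unordered_set<std::string>>
dedupSharded(const std::vector<std::string> &Strings, unsigned NumShards) {
  // Partition: route each string to a shard based on its hash.
  std::vector<std::vector<const std::string *>> Buckets(NumShards);
  std::hash<std::string> Hasher;
  for (const std::string &S : Strings)
    Buckets[Hasher(S) % NumShards].push_back(&S);

  // Each shard deduplicates its own bucket in parallel; the threads touch
  // disjoint elements of Shards, so no synchronization is needed.
  std::vector<std::unordered_set<std::string>> Shards(NumShards);
  std::vector<std::thread> Workers;
  for (unsigned I = 0; I < NumShards; ++I)
    Workers.emplace_back([&Buckets, &Shards, I] {
      for (const std::string *S : Buckets[I])
        Shards[I].insert(*S);
    });
  for (std::thread &W : Workers)
    W.join();
  return Shards;
}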
>>>
>>>
>>> Richard Smith and I were actually brainstorming at the latest llvm
>>> social about whether a distributed linker may be a good fit for the
>>> linking problem at Google (but it was just brainstorming; obviously that
>>> would be a huge effort, and we would need very serious justification
>>> before embarking on it).
>>>
>>> -- Sean Silva
>>>
>>>
>>>> +//
>>>> +//===----------------------------------------------------------------------===//
>>>>
>>>>  #ifndef LLD_ELF_THREADS_H
>>>>  #define LLD_ELF_THREADS_H
>>>>
>>>>
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>
>>>
>>
>