[LLVMdev] [lld] Current performance issues

Fri Dec 6 17:50:53 PST 2013

I happened to start making performance improvements for LLD this week, and
have already submitted the following patches for it. My test is to link LLD
on Windows, and the link time has improved from roughly 10 seconds to 6
seconds this week.

r196504: [PECOFF] Handle .lib files as if they are grouped by
--{start,end}-group.
r196628: Re-submit r195852 with GroupedSectionsPass change.

I'm now looking into the layout pass, as it takes ~20% of the link time. It
looks to me like the main reason the pass is so slow is that we do too much
work in the _compare() function. We might be able to cache _compare()'s
result. Or a simple parallel_sort might work.
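
To make the caching idea concrete, here is a minimal sketch, not LLD's
actual code. Atom, expensiveKey() and sortWithCachedKeys() are hypothetical
names, and it assumes the ordering can be reduced to a single integer key
per atom, which may not hold for the real _compare(). The expensive part is
computed once per atom, and the sort then compares only the cached keys.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for an atom; not LLD's real class.
struct Atom {
  std::uint64_t fileOrdinal;
  std::uint64_t atomOrdinal;
};

// Hypothetical stand-in for the expensive work a comparator would
// otherwise repeat on every call.
static std::uint64_t expensiveKey(const Atom &a) {
  return (a.fileOrdinal << 32) | (a.atomOrdinal & 0xffffffffu);
}

// Compute each key once (O(n) key computations instead of O(n log n)),
// then sort (key, index) pairs so the comparison only touches
// contiguous, precomputed data.
std::vector<const Atom *> sortWithCachedKeys(const std::vector<Atom> &atoms) {
  std::vector<std::pair<std::uint64_t, std::size_t>> keyed;
  keyed.reserve(atoms.size());
  for (std::size_t i = 0; i < atoms.size(); ++i)
    keyed.emplace_back(expensiveKey(atoms[i]), i);

  // Pairs compare by key first, then by original index, so equal keys
  // keep their input order.
  std::sort(keyed.begin(), keyed.end());

  std::vector<const Atom *> sorted;
  sorted.reserve(atoms.size());
  for (const auto &entry : keyed)
    sorted.push_back(&atoms[entry.second]);
  return sorted;
}
```

Because the keys sit in one contiguous vector, this should also touch less
memory per comparison, though whether it helps in practice would need
measuring.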

I've also noticed that reading from archive files is slow because, as you
wrote, it's not multi-threaded. How would you parallelize it?

On Sat, Dec 7, 2013 at 10:30 AM, Michael Spencer <bigcheesegs at gmail.com> wrote:

> So I started doing performance analysis again, and we've slowed down
> quite a bit. My current test is statically linking clang for Linux on
> Windows. I currently care mostly about Windows performance as that's
> where we run it.
>
> Here's a rough breakdown of time usage (doesn't add up to 100% because
> of rounding):
>
> 8.8% - fs::get_magic from the driver.
> 0.8% - Reading the files on the command line.
> 29%  - Resolver. ~90% of this is reading objects out of archives.
> This can be parallelized, and I have an outdated patch which does
> this.
> 51%  - Passes. Mostly the layout pass. And in the layout pass it's
> mostly due to cache misses. I've already tried parallelizing the sort;
> it doesn't help much.
> 9%   - Writer. Most of this is in prep work. The actual writing to
> disk and applying relocations take very little time.
> 1%   - Unaccounted for.
>
> I'm going to do some work to solve the get_magic and resolver issue
> with threads. I think we really need to look into how the layout pass
> is handled. If the cache effects are bad enough, we may actually need
> to change to a non-virtual POD based interface for atoms. Meaning that
> readers fill in atom data at the start, instead of figuring it out at
> runtime.
>
> - Michael Spencer
>
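
The non-virtual, POD-based atom idea from the quoted message could look
roughly like the sketch below. This is an illustration under assumptions,
not a proposed LLD interface: AtomRecord, its fields, and layoutSort() are
hypothetical names. The point is that a reader fills in plain fields up
front, so the layout pass compares values stored contiguously in a vector
instead of making virtual calls through pointers scattered across memory.

```cpp
#include <algorithm>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical flat atom record that a reader would fill in once,
// at load time. Field names are illustrative only.
struct AtomRecord {
  std::uint64_t ordinal;      // position in the input
  std::uint32_t sectionIndex; // which output section the atom belongs to
  std::uint32_t alignment;    // required alignment in bytes
};

// The record has no virtual functions and can be copied with memcpy.
static_assert(std::is_trivially_copyable<AtomRecord>::value &&
                  std::is_standard_layout<AtomRecord>::value,
              "AtomRecord is expected to be a POD-style type");

// Order atoms by section, then by input ordinal. The comparator reads
// only fields stored inline in the vector, with no virtual dispatch.
void layoutSort(std::vector<AtomRecord> &atoms) {
  std::stable_sort(atoms.begin(), atoms.end(),
                   [](const AtomRecord &a, const AtomRecord &b) {
                     if (a.sectionIndex != b.sectionIndex)
                       return a.sectionIndex < b.sectionIndex;
                     return a.ordinal < b.ordinal;
                   });
}
```

The trade-off is that every reader has to compute all fields eagerly, even
for atoms that are never selected by the resolver.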