[PATCH] [opt] Replace the recursive walk for GC with a worklist algorithm.

Sean Silva chisophugis at gmail.com
Mon Jun 29 12:02:11 PDT 2015


On Fri, Jun 26, 2015 at 9:02 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:

> Hi ruiu,
>
> This flattens the entire liveness walk from a recursive mark approach to
> a worklist approach. It also sinks the worklist management completely
> out of the SectionChunk and into the Writer by exposing the ability to
> iterate over children of a chunk and over the symbol bodies of relocated
> symbols. I'm not 100% happy with the API names, so suggestions welcome
> there.
>
> This allows us to use a single worklist for the entire recursive walk
> and would also be a natural place to take advantage of parallelism at
> some future point.
>
> With this, we completely inline away the GC walk into the
> Writer::markLive function and it makes it very easy to profile what is
> slow. Currently, time is being wasted checking whether a Chunk is a
> SectionChunk (it essentially always is), finding (or skipping)
> a replacement for a symbol, and chasing pointers between symbols and
> their chunks. There are a bunch of things we can do to fix this, and it's
> easier to do them after this change IMO.
>
> This change alone saves 1-2% of the time for my self-link of lld.exe
> (which, ironically, I'm running and benchmarking on Linux).
>
> Perhaps more notably, we'll no longer blow out the stack for large
> links. =]
>
> Just as an FYI, at this point, I/O is really starting to dominate the
> profile. Well over 10% of the time appears to be inside the kernel doing
> page table silliness. I think a decent chunk of this can be nuked as
> well, but it's a little odd as cross-linking in this way isn't really
> the primary goal here.
>
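
For readers following along, the worklist formulation described above is
roughly the following. This is a minimal, self-contained sketch, not the
actual lld code: the real classes live in COFF/Chunks.h and
COFF/Symbols.h, and the Successors edge here is an assumption standing
in for both a chunk's associated children and the chunks of symbols
referenced by its relocations.

  #include <vector>

  // Pared-down stand-in for lld's SectionChunk; the field and type
  // names here are illustrative assumptions, not the real API.
  struct SectionChunk {
    bool Live = false;
    // Chunks this one keeps alive: associated children plus the chunks
    // containing symbols referenced by this chunk's relocations.
    std::vector<SectionChunk *> Successors;
  };

  // Worklist-based mark phase. Instead of recursing from each root
  // (which can blow out the stack on large links), push roots onto an
  // explicit worklist and loop until it drains. Each chunk is enqueued
  // at most once because it is marked live before being pushed.
  static void markLive(const std::vector<SectionChunk *> &Roots) {
    std::vector<SectionChunk *> Worklist;
    auto Enqueue = [&](SectionChunk *C) {
      if (!C->Live) {
        C->Live = true;
        Worklist.push_back(C);
      }
    };
    for (SectionChunk *C : Roots)
      Enqueue(C);
    while (!Worklist.empty()) {
      SectionChunk *C = Worklist.back();
      Worklist.pop_back();
      for (SectionChunk *S : C->Successors)
        Enqueue(S);
    }
  }

A single explicit worklist like this is also the natural seam for the
future parallelism the patch description mentions: apart from the Live
flag, which could become an atomic test-and-set, each iteration of the
loop body is independent.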

This is really curious. Are you using huge pages on Linux? I know on Mac
I've had workloads that slow down by 4x simply due to the cost of
faulting in the memory (just minor faults; nothing disk related), since
Mac only uses 4K pages. Switching to 2M pages should alleviate that by a
factor of 512 (2M / 4K), unless the memory workload can't make use of 2M
pages (e.g. if all the input object files are so small that a 2M page
doesn't really help).
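
For reference, on Linux one way to opt a large anonymous allocation into
2M pages is transparent huge pages via madvise(2). A minimal sketch of
the mechanism follows; this is not something the patch does, and
MADV_HUGEPAGE is only a hint that the kernel may ignore:

  #include <sys/mman.h>
  #include <cstddef>

  // Map an anonymous region and ask the kernel to back it with
  // transparent huge pages. If THP is disabled or memory is too
  // fragmented, the kernel silently falls back to 4K pages.
  void *allocHuge(std::size_t Size) {
    void *P = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (P == MAP_FAILED)
      return nullptr;
    madvise(P, Size, MADV_HUGEPAGE);
    return P;
  }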

-- Sean Silva


>
> Depends on D10789
>
> http://reviews.llvm.org/D10790
>
> Files:
>   COFF/Chunks.cpp
>   COFF/Chunks.h
>   COFF/Symbols.h
>   COFF/Writer.cpp
>