[llvm-dev] Making LLD PDB generation faster

Zachary Turner via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 25 06:34:26 PST 2019


I don’t think changing the compiler or linker is supported with the VS
generator, but I also don’t think it’s a bug.
On Mon, Feb 25, 2019 at 6:31 AM Alexandre Ganea <alexandre.ganea at ubisoft.com>
wrote:

> Can you please try using Ninja instead?
>
> cmake -G Ninja f:/svn/llvm -DCMAKE_BUILD_TYPE=Release
> -DLLVM_OPTIMIZED_TABLEGEN=true -DLLVM_EXTERNAL_LLD_SOURCE_DIR=f:/svn/lld
> -DLLVM_TOOL_LLD_BUILD=true -DLLVM_ENABLE_LLD=true
> -DCMAKE_C_COMPILER="C:/Program Files/LLVM/bin/clang-cl.exe"
> -DCMAKE_CXX_COMPILER="C:/Program Files/LLVM/bin/clang-cl.exe"
> -DCMAKE_LINKER="C:/Program Files/LLVM/bin/lld-link.exe"
> -DLLVM_ENABLE_PDB=true
>
> It will be faster to compile. The setup I use is the above Ninja cmd-line
> for compiling optimized builds; and in addition, I keep the Visual Studio
> generator, as you do, but only for having a .sln to debug. It is a bit
> annoying to cmake twice, in two different build folders, but you can write
> a batch script.
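As a rough sketch of the "cmake twice" script mentioned above, the helper below builds the two configure lines and runs each in its own build folder. It is written in Python rather than batch. The folder names `build-ninja` and `build-vs` and the function names are hypothetical, and only a subset of the flags quoted in this thread is included.

```python
import subprocess

# Paths and flags copied from the commands quoted in this thread; the
# build folder names used below are hypothetical.
SRC = "f:/svn/llvm"
LLVM_BIN = "C:/Program Files/LLVM/bin"


def ninja_command(src=SRC):
    """Configure line for the optimized Ninja build."""
    return [
        "cmake", "-G", "Ninja", src,
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLVM_OPTIMIZED_TABLEGEN=true",
        f"-DCMAKE_C_COMPILER={LLVM_BIN}/clang-cl.exe",
        f"-DCMAKE_CXX_COMPILER={LLVM_BIN}/clang-cl.exe",
        f"-DCMAKE_LINKER={LLVM_BIN}/lld-link.exe",
    ]


def vs_command(src=SRC):
    """Configure line for the Visual Studio solution kept only for debugging."""
    return ["cmake", "-G", "Visual Studio 15 2017", "-A", "x64", src]


def configure_all(run=subprocess.run):
    """Run each configure step in its own build folder (folders must exist)."""
    for folder, cmd in (("build-ninja", ninja_command()),
                        ("build-vs", vs_command())):
        run(cmd, cwd=folder, check=True)
```

Calling `configure_all()` runs both configure steps; passing a different `run` callable makes it possible to log the commands without executing them.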
>
> If the above works, maybe you should log the bug on https://bugs.llvm.org/
> so it is not forgotten.
>
> Alex.
>
> -----Original Message-----
> From: Leonardo Santagada <santagada at gmail.com>
> Sent: Monday, February 25, 2019 9:04 AM
> To: Alexandre Ganea <alexandre.ganea at ubisoft.com>
> Cc: Zachary Turner <zturner at google.com>; Reid Kleckner <rnk at google.com>;
> llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] Making LLD PDB generation faster
>
> Ok, so there's a lot of confusion in cmake regarding using llvm as a
> toolset. It still does all its checks against cl.exe (not clang-cl) and
> somehow overrides CMAKE_LINKER to be link.exe. I tried setting it in a
> couple of places, including:
>
> cmake -G "Visual Studio 15 2017" -A x64 -T"llvm",host=x64
> -DCMAKE_LINKER="C:/Program Files/LLVM/bin/lld-link.exe"
> -DCMAKE_CXX_COMPILER="C:/Program Files/LLVM/bin/clang-cl.exe"
> -DCMAKE_C_COMPILER="C:/Program Files/LLVM/bin/clang-cl.exe"
> -DLLVM_ENABLE_LTO=true -DLLVM_ENABLE_PDB=true -DLLVM_ENABLE_PROJECTS=lld
> ../llvm
>
> but it seems like the generator overrides it.
>
>
> ps: Created a phabricator account
>
> On Mon, Feb 25, 2019 at 2:48 PM Alexandre Ganea <
> alexandre.ganea at ubisoft.com> wrote:
> >
> > That's good news. To get debug info, you could try adding /Z7 on the
> > cmake cmd-line, such as -DCMAKE_CXX_FLAGS="/Z7". Or use the
> > 'RelWithDebInfo' target instead of 'Release' and add
> > -DCMAKE_CXX_FLAGS="/Ob2" (because that target uses /Ob1 as a default).
> >
> > Can you please send a patch on Phabricator if you fix the
> > LLVM_ENABLE_PDB issue with Clang? The goal is to have performance
> > out-of-the-box.
> >
> > Alex.
> >
> > -----Original Message-----
> > From: Leonardo Santagada <santagada at gmail.com>
> > Sent: Monday, February 25, 2019 7:36 AM
> > To: Alexandre Ganea <alexandre.ganea at ubisoft.com>
> > Cc: Zachary Turner <zturner at google.com>; Reid Kleckner
> > <rnk at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
> > Subject: Re: [llvm-dev] Making LLD PDB generation faster
> >
> > With your patch for cmake and reconfiguring it with "cmake -G "Visual
> > Studio 15 2017" -A x64 -T"llvm",host=x64 -DLLVM_ENABLE_PDB=true
> > -DLLVM_ENABLE_PROJECTS=lld ../llvm" we get these results:
> >
> >   Input File Reading:          1602 ms (  3.5%)
> >   Code Layout:                  493 ms (  1.1%)
> >   PDB Emission (Cumulative):  43127 ms ( 94.5%)
> >     Add Objects:              34577 ms ( 75.8%)
> >       Type Merging:           26709 ms ( 58.5%)
> >       Symbol Merging:          7598 ms ( 16.7%)
> >     TPI Stream Layout:         1107 ms (  2.4%)
> >     Globals Stream Layout:      602 ms (  1.3%)
> >     Commit to Disk:            5636 ms ( 12.4%)
> >   Commit Output File:            16 ms (  0.0%)
> > -------------------------------------------------
> > Total Link Time:              45626 ms (100.0%)
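One rough way to read this report against the earlier one quoted further down in the thread (81900 ms total) is to compute per-phase speedup factors. The sketch below uses only timings copied from the two reports; the dictionary and function names are illustrative.

```python
# Timings in milliseconds, copied from the two reports in this thread.
before = {
    "Type Merging": 57534,
    "Symbol Merging": 10822,
    "Commit to Disk": 7007,
    "Total Link Time": 81900,
}
after = {
    "Type Merging": 26709,
    "Symbol Merging": 7598,
    "Commit to Disk": 5636,
    "Total Link Time": 45626,
}


def speedups(before, after):
    """Return before/after ratios for every phase present in both reports."""
    return {phase: before[phase] / after[phase]
            for phase in before if phase in after}


for phase, factor in speedups(before, after).items():
    print(f"{phase:<18}{factor:.2f}x")
```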
> >
> > Unfortunately no PDBs were generated for lld.exe (or any other
> > binaries), so I can't debug them. It seems like LLVM_ENABLE_PDB is not
> > made to support using clang to compile itself, as it tries to add /Zi to
> > the targets instead of /Z7 and global hashes. I can patch it over here,
> > but we probably want to fix this in cmake and in the docs, as it's not
> > clear at all how to compile lld as a high-performance 64-bit binary.
> >
> > On Mon, Feb 25, 2019 at 2:38 AM Alexandre Ganea <
> alexandre.ganea at ubisoft.com> wrote:
> > >
> > > How do you compile LLD? There's a big difference between using
> > > MSVC vs Clang. The parallel ghash patch I was mentioning is almost
> > > 2x as fast when using Clang 7.0+ vs. MSVC 15.9+, I don't know
> > > exactly why. I also suggest you use the Release target. You should
> > > also grab this patch:
> > > https://reviews.llvm.org/D55056 - I had to revert it because it was
> > > causing issues with LLDB. But it will give an improvement for LLD.
> > > Please let me know if that improves your timings.
> > >
> > > The page faults are probably the OS loading from disk: most, if not
> > > all the files are accessed by LLD by mmap'ing them.
> > >
> > > The lockless DenseHash I was talking about will be published in an
> > > upcoming patch. As for reproducibility, this can be an issue on
> > > build systems. But on local machines, we could explicitly state that
> > > we want non-deterministic builds, through some cmd-line flag. If your
> > > 57sec for "Type Merging" transforms into 5sec when non-deterministic,
> > > I think that's worth it.
> > >
> > > Alex.
> > >
> > > -----Original Message-----
> > > From: Leonardo Santagada <santagada at gmail.com>
> > > Sent: Sunday, February 24, 2019 6:43 PM
> > > To: Alexandre Ganea <alexandre.ganea at ubisoft.com>
> > > Cc: Zachary Turner <zturner at google.com>; Reid Kleckner
> > > <rnk at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
> > > Subject: Re: [llvm-dev] Making LLD PDB generation faster
> > >
> > > More info inline, I think there are a couple of misconceptions about
> > > what I'm doing:
> > >
> > > 1) I already patch all my .obj files to contain .debug$H entries so
> > > it is all ghashed already
> > > 2) All the 35s is spent adding to the DenseMap
> > >
> > > Here are my current times (lld-link.exe compiled with -O2, so no
> > > lto/pgo); lld generates a 141 MB binary and a 1.2GB pdb file:
> > >
> > >   Input File Reading:          1724 ms (  2.1%)
> > >   Code Layout:                  482 ms (  0.6%)
> > >   PDB Emission (Cumulative):  79261 ms ( 96.8%)
> > >     Add Objects:              68650 ms ( 83.8%)
> > >       Type Merging:           57534 ms ( 70.2%)
> > >       Symbol Merging:         10822 ms ( 13.2%)
> > >     TPI Stream Layout:         1501 ms (  1.8%)
> > >     Globals Stream Layout:      770 ms (  0.9%)
> > >     Commit to Disk:            7007 ms (  8.6%)
> > >   Commit Output File:            19 ms (  0.0%)
> > > -------------------------------------------------
> > > Total Link Time:              81900 ms (100.0%)
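To see at a glance which phase dominates, a report like the one above can be parsed and each phase expressed as a share of the total. This is a minimal sketch: the regular expression is an assumption fitted to the lines shown here, not a documented output format, and the function names are hypothetical.

```python
import re

# Report text copied from the message above.
REPORT = """\
  Input File Reading:          1724 ms (  2.1%)
  Code Layout:                  482 ms (  0.6%)
  PDB Emission (Cumulative):  79261 ms ( 96.8%)
    Add Objects:              68650 ms ( 83.8%)
      Type Merging:           57534 ms ( 70.2%)
      Symbol Merging:         10822 ms ( 13.2%)
    TPI Stream Layout:         1501 ms (  1.8%)
    Globals Stream Layout:      770 ms (  0.9%)
    Commit to Disk:            7007 ms (  8.6%)
  Commit Output File:            19 ms (  0.0%)
-------------------------------------------------
Total Link Time:              81900 ms (100.0%)
"""

# Matches "<phase name>: <integer> ms"; the separator line has no colon
# and is skipped.
LINE = re.compile(r"^\s*([^:]+):\s+(\d+) ms")


def parse_report(text):
    """Map each phase name to its duration in milliseconds."""
    timings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            timings[match.group(1).strip()] = int(match.group(2))
    return timings


def shares(timings, total_key="Total Link Time"):
    """Express every phase as a percentage of the total link time."""
    total = timings[total_key]
    return {name: 100.0 * ms / total for name, ms in timings.items()}
```

Sorting the result of `shares(parse_report(REPORT))` by value lists the phases from most to least expensive; nested phases are counted inside their parents, so the percentages do not sum to 100.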
> > >
> > > Our target is < 20 seconds for linking; anything below 40 seconds
> > > would be ok. Ideal times would be around 8s (at which point it will
> > > mostly beat link.exe incremental linking).
> > >
> > > My tip for profiling is to use superluminal
> > > (https://www.superluminal.eu/); it's the easiest way to see everything
> > > your code is doing.
> > >
> > > On Sun, Feb 24, 2019 at 5:18 PM Alexandre Ganea <
> alexandre.ganea at ubisoft.com> wrote:
> > > >
> > > > Leonardo, to answer your questions: yes to all of them. You can
> > > > take a look at this prototype/proposal:
> > > > https://reviews.llvm.org/D55585
> > > >
> > > > Overall, computing ghashes in parallel at link-time and merging
> > > > Types with them is less costly than the current approach to merging.
> > > > The 35sec you’re seeing for merging should go down to about 15sec.
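The general shape of "hash in parallel, then merge" can be sketched as follows. This is a toy model and not the code from D55585: real ghashes also fold in the hashes of referenced types, whereas this sketch hashes raw record bytes only, and the function names, the 8-byte truncation and the worker count are assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_records(records):
    """Hash every type record of one object file (SHA-1 cut to 8 bytes)."""
    return [hashlib.sha1(record).digest()[:8] for record in records]


def merge_types(object_files, workers=4):
    """Hash each object file in parallel, then merge serially in input order.

    Returns the deduplicated records plus, for every object file, a table
    mapping its local type indices to indices in the merged list.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the merge stays deterministic.
        hashed = list(pool.map(hash_records, object_files))

    merged = []      # deduplicated type records
    index_of = {}    # hash -> index into merged
    remap = []       # per object file: local index -> merged index
    for records, hashes in zip(object_files, hashed):
        local = []
        for record, digest in zip(records, hashes):
            index = index_of.get(digest)
            if index is None:
                index = len(merged)
                index_of[digest] = index
                merged.append(record)
            local.append(index)
        remap.append(local)
    return merged, remap
```

Only the hashing step runs concurrently here; the hash-map insertion remains serial, which is the part the thread identifies as the bottleneck.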
> > >
> > > I don't do much computing of ghashes as we already preprocess all .obj
> > > files from msvc to add a .debug$H to them. The whole 35 seconds is spent
> > > in just the densehash findbucket function. The rest of the time is
> > > mostly pagefaults (I guess to load in obj data and to grow the final
> > > pdb?).
> > >
> > > > The patch doesn’t parallelize (yet) the Type merging itself, but we
> > > > have an alternate multithread-suitable implementation of DenseHash
> > > > which already supports lockless, wait-free insert/fetch/resize.
> > >
> > > Where is this lockless densehash? This is the part where I would love
> > > to help, but if there is a densehash it is probably just creating the
> > > threads and letting them merge the results. I'm a bit afraid for the
> > > reproducibility of builds, but as we already don't have that with
> > > link.exe we are not really losing anything.
> > >
> > > >
> > > >
> > > > The prototype allows for testing different hashing algorithms, and
> > > > indeed xxHash seems to be the best general-purpose choice. I’ve also
> > > > added support for more specialized hardware-based hashes, like Casey
> > > > Muratori’s Meow Hash (uses hardware AES SSE 4.2 instructions), which
> > > > brings the figures down a bit.
> > > >
> > >
> > > I remembered Meow hashes needing at least k bytes of data, but looking
> > > at their website right now there is no mention of it. Hashing
> > > performance doesn't have much of an impact for us, as we do it per .obj
> > > file, distributed through our company, so the time to calculate those
> > > is completely distributed.
> > >
> > > >
> > > >
> > > > Future changes could write the computed ghash stream back to OBJs if
> > > > /INCREMENTAL is specified (just an idea). Incrementally linking will
> > > > be faster that way when working with MSVC OBJs.
> > > >
> > >
> > > I already have a patch for llvm-objcopy that adds a -add-ghashes
> > > option that does this. I will be cleaning it up this week and
> > > sending a PR for it.
> > >
> > > >
> > > >
> > > > As for creating PDBs for independent projects, that would most
> > > > likely help. However, the ghash stream would need to be stored in the
> > > > PDB in that case (currently, ghashes are dropped after merging). That
> > > > could help when using rarely compiled projects, used along with
> > > > network caches.
> > >
> > > I actually meant a .lib, with all the obj files inside plus a merged
> > > .debug$H entry. No pdb generation or changes necessary; we just run the
> > > same code that merges types in lld and do that at the librarian level.
> > >
> > > >
> > > > I will start sending smaller patches to converge towards the
> > > > functionality of the prototype above.
> > > >
> > > >
> > > >
> > > > Best,
> > > >
> > > > Alex.
> > > >
> > > >
> > > >
> > > > From: Zachary Turner <zturner at google.com>
> > > > Sent: Sunday, February 24, 2019 1:20 AM
> > > > To: Leonardo Santagada <santagada at gmail.com>
> > > > Cc: Alexandre Ganea <alexandre.ganea at ubisoft.com>; Reid Kleckner
> > > > <rnk at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
> > > > Subject: Re: [llvm-dev] Making LLD PDB generation faster
> > > >
> > > >
> > > >
> > > > +Reid and Alexandre, who have been doing work in this area recently
> > > >
> > > >
> > > >
> > > > On Sat, Feb 23, 2019 at 4:07 AM Leonardo Santagada via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > Is anyone working on making the PDB generation in LLD faster?
> > > > Looking at a trace for linking one of our binaries (it takes
> > > > 1min6s-1min20s) I see two things:
> > > >
> > > > 1) LookupBucketFor(Val, ConstFoundBucket); takes 35s so almost
> > > > half of the time of linking, mostly finding duplicates
> > > > 2) There is no parallelization inside of addObjectsToPDB
> > > >
> > > > Is anyone working on those? Also, has anyone thought about merging
> > > > .obj files to deduplicate type information, so we can do the linking
> > > > per project to generate something like a lib file, but with
> > > > deduplicated debug information (as far as I know, an actual .lib just
> > > > puts all pdbs or /Z7 debug info inside a file without dedup).
> > > >
> > > > Just looking at the code it seems it is much more mature and also
> > > > the choice of SHA1_8 seems interesting (still don't know why not
> > > > use xxHash64).
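One consideration when comparing hash choices is the output width. Assuming SHA1_8 denotes SHA-1 truncated to 8 bytes (64 bits), and assuming the hash behaves like a uniform random function, the standard birthday approximation estimates the chance of at least one accidental collision among n distinct records as 1 - exp(-n(n-1)/2^65). The function name and the record count in the usage note are hypothetical.

```python
import math


def collision_probability(num_records, hash_bits=64):
    """Birthday-bound estimate of at least one collision among distinct records.

    Assumes the hash output is uniformly distributed over 2**hash_bits values.
    """
    pairs = num_records * (num_records - 1) / 2
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-pairs / 2.0 ** hash_bits)
```

For example, `collision_probability(10_000_000)` gives the estimate for ten million distinct type records with a 64-bit hash; passing `hash_bits=32` shows how quickly a narrower hash stops being usable for deduplication.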
> > > >
> > > > ps: My code to add ghashes to msvc compiled .obj files is almost
> > > > ready to be pushed as an option for llvm-objcopy.
> > > >
> > > > --
> > > >
> > > > Leonardo Santagada
> > > > _______________________________________________
> > > > LLVM Developers mailing list
> > > > llvm-dev at lists.llvm.org
> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev