[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Leonardo Santagada via llvm-dev llvm-dev at lists.llvm.org
Sat Jan 20 13:34:14 PST 2018


If we get to < 30s, I think most users would prefer it to link.exe. I'm just
hoping there are still some more optimizations to be had to get closer to ELF
linking times (around 10-15s here).

On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <zturner at google.com> wrote:

> Generally speaking, a good rule of thumb is that /debug:ghash will be close
> to or faster than /debug:fastlink, but with none of the penalties, like slow
> debugger load times.
> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <zturner at google.com>
> wrote:
>
>> Chrome is actually one of my exact benchmark cases. When building
>> blink_core.dll and browser_tests.exe, I get anywhere from a 20-40%
>> reduction in link time. We have some other optimizations in the pipeline,
>> but they are not upstream yet.
>>
>> My best time so far (including other optimizations not yet upstream) is
>> 28s on blink_core.dll, compared to 110s with /debug.
>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>
>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> You probably don't want to go down the same route that clang goes
>>>> through to write the object file.  If you think yaml2coff is convoluted,
>>>> the way clang does it will just give you a headache.  There are multiple
>>>> abstractions involved to account for different object file formats (ELF,
>>>> COFF, MachO) and output formats (Assembly, binary file).  At least with
>>>> yaml2coff
>>>>
>>>
>>> I think your phrase got cut off there, but yeah, I just found AsmPrinter.cpp
>>> and it is convoluted.
>>>
>>>
>>>
>>>> It's true that yaml2coff is using the COFFParser structure, but if you
>>>> look at the writeCOFF function in yaml2coff it's pretty bare-metal.
>>>> The logic you need will be almost identical, except that instead of
>>>> checking the COFFParser for the various fields, you'll check the existing
>>>> COFFObjectFile, which should have similar fields.
>>>>
>>>> The only thing you need to do differently is, when writing the section
>>>> table and section contents, to insert a new entry.  Since you're injecting
>>>> a section into the middle, you'll also probably need to push back the file
>>>> pointers of all subsequent sections so that they don't overlap (e.g. if
>>>> the original sections are 1, 2, 3, 4, 5 and you insert between 2 and 3,
>>>> then the original sections 3, 4, and 5 would need to have their
>>>> FilePointerToRawData offset by the size of the new section).
>>>>
>>>
>>> I have the PE/COFF spec open here, and I'm glad I read a bit of it,
>>> so I actually know what you are talking about... yeah, it doesn't seem too
>>> complicated.
>>>
>>>
>>>
>>>> If you need to know what values to put for the other fields in a
>>>> section header, run `dumpbin /headers foo.obj` on a clang-generated object
>>>> file that has a .debug$H section already (e.g. run clang with
>>>> -emit-codeview-ghash-section, and look at the properties of the .debug$H
>>>> section and use the same values).
>>>>
>>>
>>> Thanks, I will do that, and then also look at how the CodeView part of the
>>> code does it if I can't understand some of it.
>>>
>>>
>>>> The only invariant that needs to be maintained is that
>>>> Section[N]->FilePointerToRawData ==
>>>> Section[N-1]->FilePointerToRawData + Section[N-1]->SizeOfRawData.
>>>>
>>>
>>> Well, that, and all the sections need to be in the final file... but I'm
>>> hopeful.
>>>
>>>
>>> Does anyone have times for linking a big project like Chrome with this, so
>>> that at least I know what kind of performance to expect?
>>>
>>> My numbers are something like:
>>>
>>> - 1 pdb per .obj file: link.exe takes ~15 minutes and 16GB of RAM;
>>> lld-link.exe takes 2:30 minutes and ~8GB of RAM.
>>> - around 10 pdbs per folder: link.exe takes 1 minute and 2-3GB of RAM;
>>> lld-link.exe takes 1:30 minutes and ~6GB of RAM.
>>> - /debug:fastlink: link.exe takes 40 seconds, but then there are 20 seconds
>>> of loading at the first breakpoint in the debugger, and we lose DIA support
>>> for listing symbols.
>>> - incremental: link.exe takes 8 seconds, but it only works when very
>>> minor changes happen.
>>>
>>> We have a non-negligible number of symbols used in some runtime systems.
>>>
>>>
>>>>
>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <
>>>> santagada at gmail.com> wrote:
>>>>
>>>>> Thanks for the tips. I now have something that reads the .obj file,
>>>>> finds .debug$T sections, and global-hashes them (proof-of-concept kind of
>>>>> code). What I can't find is how clang itself writes the COFF files
>>>>> with global hashes, as that might help me understand how to create the
>>>>> .debug$H section, how to update the file's section count, and how to
>>>>> properly write this back.
>>>>>
>>>>> The code in yaml2coff expects to work on the yaml
>>>>> COFFParser struct, and I'm having quite a headache turning the
>>>>> COFFObjectFile into a COFFParser object or something compatible...
>>>>> Tomorrow I might try the roundabout path of coff2yaml and then yaml2coff
>>>>> with the hashes header... but it seems way too inefficient and convoluted.
>>>>>
>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <
>>>>>> santagada at gmail.com> wrote:
>>>>>>
>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <zturner at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <
>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> No I didn't, I used cl.exe from the visual studio toolchain. What
>>>>>>>>> I'm proposing is a tool for processing .obj files in COFF format, reading
>>>>>>>>> them and generating the GHASH part.
>>>>>>>>>
>>>>>>>>> To make our build faster we use hundreds of unity-build files
>>>>>>>>> (.cpp files that #include a lot of other .cpp files, a.k.a. munch
>>>>>>>>> files), but we still have a lot of single .cpp files as well (in
>>>>>>>>> total something like 3.4k .obj files).
>>>>>>>>>
>>>>>>>>> ps: sorry for sending to the wrong list; I was reading about the LLVM
>>>>>>>>> mailing lists and jumped when I saw what I thought was an
>>>>>>>>> lld-exclusive list.
>>>>>>>>>
>>>>>>>>
>>>>>>>> A tool like this would be useful, yes.  We've talked about it
>>>>>>>> internally as well and agreed it would be useful; we just haven't
>>>>>>>> prioritized it.  If you're interested in submitting a patch along those
>>>>>>>> lines, though, I think it would be a good addition.
>>>>>>>>
>>>>>>>> I'm not sure what the best place for it would be.  llvm-readobj and
>>>>>>>> llvm-objdump seem like obvious choices, but they are intended to be
>>>>>>>> read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>
>>>>>>>> llvm-pdbutil is kind of a hodgepodge of everything else related to
>>>>>>>> PDBs and symbols, so I wouldn't be opposed to making a new subcommand there
>>>>>>>> called "ghash" or something that could process an object file and output a
>>>>>>>> new object file with a .debug$H section.
>>>>>>>>
>>>>>>>> A third option would be to make a new tool for it.
>>>>>>>>
>>>>>>>> I don't think it would be that hard to write.  If you're interested
>>>>>>>> in trying to make a patch for this, I can offer some guidance on where
>>>>>>>> to look in the code.  Otherwise it's something that we'll probably get
>>>>>>>> to; I'm just not sure when.
>>>>>>>>
>>>>>>>>>
>>>>>>> I would love to write it and contribute it back, please do tell. I
>>>>>>> did find some of the ghash code in lld, but I'm fuzzy on the LLVM
>>>>>>> CodeView part of it and have never seen llvm-readobj/objdump or
>>>>>>> llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>
>>>>>>>
>>>>>> Luckily, all of the important code is hidden behind library calls,
>>>>>> and it should already just do the right thing, so I suspect you won't
>>>>>> need to know much about CodeView to do this.
>>>>>>
>>>>>> I think Peter has the right idea about putting this in llvm-objcopy.
>>>>>>
>>>>>> You can look at one of the existing CopyBinary functions there, which
>>>>>> currently only work for ELF, but you can just make a new overload that
>>>>>> accepts a COFFObjectFile.
>>>>>>
>>>>>> I would probably start by iterating over each of the sections
>>>>>> (getNumberOfSections / getSectionName) looking for .debug$T and .debug$H
>>>>>> sections.
>>>>>>
>>>>>> If you find a .debug$H section then you can just skip that object
>>>>>> file.
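In pseudocode, that per-object decision looks something like this (a
hypothetical helper for illustration; Action and classify are made-up names,
not LLVM API):

```cpp
// Hypothetical sketch of the per-object decision described above: skip objects
// that already carry a .debug$H section, hash the ones that have type records
// in .debug$T, and pass everything else through unchanged.
#include <string>
#include <vector>

enum class Action {
  Skip,     // .debug$H already present: nothing to do
  AddGhash, // .debug$T present but no .debug$H: compute and inject hashes
  Copy      // no type records at all: copy the object unchanged
};

Action classify(const std::vector<std::string> &SectionNames) {
  bool HasDebugT = false, HasDebugH = false;
  for (const std::string &Name : SectionNames) {
    if (Name == ".debug$T")
      HasDebugT = true;
    else if (Name == ".debug$H")
      HasDebugH = true;
  }
  if (HasDebugH)
    return Action::Skip;
  if (HasDebugT)
    return Action::AddGhash;
  return Action::Copy;
}
```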
>>>>>>
>>>>>> If you find a .debug$T but not a .debug$H, then basically do the same
>>>>>> thing that LLD does in PDBLinker::mergeDebugT (create a CVTypeArray and
>>>>>> pass it to GloballyHashedType::hashTypes; that will return an array
>>>>>> of hash values).  The format of .debug$H is the header, followed by the
>>>>>> hash values.  Then when you're writing the list of sections, just add in
>>>>>> the .debug$H section right after the .debug$T section.
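For the emission side, the payload is roughly a small fixed header followed by
one fixed-size hash per type record. Here is a rough sketch with placeholder
values; the header field layout, the magic constant, and the 8-byte hash size
are assumptions for illustration only, so check LLVM's CodeView sources for
the real format:

```cpp
// Rough sketch of serializing a .debug$H payload: a small fixed header
// followed by one fixed-size hash per type record.  DebugHHeader, the
// placeholder magic/algorithm values, and the 8-byte hash size are
// assumptions, not LLVM's real definitions.
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr uint32_t kMagicPlaceholder = 0;   // placeholder, not the real magic
constexpr uint16_t kHashAlgPlaceholder = 0; // placeholder algorithm id

struct DebugHHeader { // hypothetical layout
  uint32_t Magic;
  uint16_t Version;
  uint16_t HashAlgorithm;
};

std::vector<uint8_t>
buildDebugH(const std::vector<std::array<uint8_t, 8>> &Hashes) {
  std::vector<uint8_t> Out(sizeof(DebugHHeader) + Hashes.size() * 8);
  DebugHHeader Hdr{kMagicPlaceholder, 0, kHashAlgPlaceholder};
  std::memcpy(Out.data(), &Hdr, sizeof(Hdr));
  uint8_t *P = Out.data() + sizeof(Hdr);
  for (const auto &H : Hashes) { // append each hash right after the header
    std::memcpy(P, H.data(), H.size());
    P += H.size();
  }
  return Out;
}
```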
>>>>>>
>>>>>> Currently llvm-objcopy only writes ELF files, so it would need to be
>>>>>> taught to write COFF files.  We have code to do this in the yaml2obj
>>>>>> utility (specifically, in yaml2coff.cpp in the function writeCOFF).  There
>>>>>> may be a way to move this code to somewhere else (llvm/Object/COFF.h?) so
>>>>>> that it can be re-used by both yaml2coff and llvm-objcopy, but in the worst
>>>>>> case scenario you could copy the code and re-write it to work with these
>>>>>> new structures.
>>>>>>
>>>>>> Lastly, you'll probably want to put all of this behind an option in
>>>>>> llvm-objcopy, such as -add-codeview-ghash-section.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Leonardo Santagada
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Leonardo Santagada
>>>
>>


-- 

Leonardo Santagada