[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Leonardo Santagada via llvm-dev llvm-dev at lists.llvm.org
Mon Jan 22 15:28:23 PST 2018


OK, some information was lost in getting this example to you; I'm sorry for
not being clear.

We have a huge code base. Let's say 90% of it doesn't include either
header, 9% includes win32.h, and 1% includes both. I will try to discover
why, but my guess is that those files include both a third-party header that
includes windows.h and some of our libraries that use win32.h.

I will try to fully understand this tomorrow.

I guess clang will never implement this, so finishing the object copier
is the best solution until all our code is ported to clang.

On 23 Jan 2018 00:02, "Zachary Turner" <zturner at google.com> wrote:

> You said win32.h doesn't include windows.h, but main.cpp does.  So what's
> the disadvantage of just including it in win32.h anyway, since it's already
> going to be in every translation unit?  (Unless you didn't mean to #include
> it in main.cpp)
>
>
> I guess all I can do is warn you how bad of an idea this is.  For
> starters, I already found a bug in your code ;-)
>
> // stdint.h
> typedef int                int32_t;
>
> // winnt.h
> typedef long LONG;
>
> // windef.h
> typedef struct tagPOINT
> {
>     LONG  x;   // long x
>     LONG  y;   // long y
> } POINT, *PPOINT, NEAR *NPPOINT, FAR *LPPOINT;
>
> // win32.h
> typedef int32_t LONG;
>
> struct POINT
> {
> LONG x;   // int x
> LONG y;   // int y
> };
>
> So POINT is defined in two different ways.  In your minimal interface, it's
> declared as two int32_t's, which are ints.  In the actual Windows header
> files, it's declared as two longs.
>
> This might seem like an unimportant bug since int and long are the same
> size, but int and long also mangle differently and affect overload
> resolution, so you could get weird linker errors or call the wrong
> function overload.
>
> Plus, it illustrates the fact that this struct *actually is* a different
> type from the one in the windows header.
>
> You said at the end that you never intentionally import win32.h and
> windows.h from the same translation unit.  But then in this example you
> did.  I wonder if you could enforce that by doing this:
>
> // win32.h
> #pragma once
>
> // Error if windows.h was included before us.
> #if defined(_WINDOWS_)
> #error "You're including win32.h after having already included windows.h.
> Don't do this!"
> #endif
>
> // And also make sure windows.h can't get included after us
> #define _WINDOWS_
>
> For the record, I tried the test case you linked when windows.h is not
> included in main.cpp and it works (but still has the bug about int and
> long).
>
> On Mon, Jan 22, 2018 at 2:23 PM Leonardo Santagada <santagada at gmail.com>
> wrote:
>
>> It is super gross, but we copy parts of windows.h because having all of
>> it is both gigantic and very, very messy. So our win32.h has a couple
>> thousand lines instead of the 30k+ in windows.h, and we try to have zero
>> macros. win32.h doesn't include windows.h, so using ::BOOL wouldn't work. We
>> don't want to create a namespace, we just want a cleaner interface to the
>> Windows API. The namespace with C linkage is the way to trick cl into
>> letting us have both windows.h and win32.h in some files. I really
>> don't see any way for us to have this win32.h without this cl behavior, so
>> maybe we should either put windows.h in a precompiled header somewhere and
>> not care that it is infecting everything, or have one place we can call to
>> clean up after including windows.h (a massive set of undefs).
>>
>> So using declarations can't work, because we never intentionally include
>> windows.h and win32.h in the same translation unit.
>>
>> On Mon, Jan 22, 2018 at 7:08 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>> This is pretty gross, honestly :)
>>>
>>> Can't you just use using declarations?
>>>
>>> namespace Win32 {
>>> extern "C" {
>>>
>>> using ::BOOL;
>>> using ::LONG;
>>> using ::POINT;
>>> using ::LPPOINT;
>>>
>>> using ::GetCursorPos;
>>> }
>>> }
>>>
>>> This works with clang-cl.
>>>
>>> On Mon, Jan 22, 2018 at 5:39 AM Leonardo Santagada <santagada at gmail.com>
>>> wrote:
>>>
>>>> Here is a minimal example; we do this so we don't have to import the
>>>> whole Windows API everywhere.
>>>>
>>>> https://gist.github.com/santagada/7977e929d31c629c4bf18ebb987f6be3
>>>>
>>>> On Sun, Jan 21, 2018 at 2:31 AM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> Clang-cl maintains compatibility with msvc even in cases where it's
>>>>> non-standards-compliant (e.g. two-phase name lookup), but we try to keep
>>>>> these cases few and far between.
>>>>>
>>>>> To help me understand your case, do you mean you copy windows.h and
>>>>> modify it? How does this lead to the same struct being defined twice? If I
>>>>> were to write this:
>>>>>
>>>>> struct Foo {};
>>>>> struct Foo {};
>>>>>
>>>>> Is this a small repro of the issue you’re talking about?
>>>>>
>>>>> On Sat, Jan 20, 2018 at 3:44 PM Leonardo Santagada <
>>>>> santagada at gmail.com> wrote:
>>>>>
>>>>>> I can totally see something like incremental linking with simple
>>>>>> padding between objs and a mapping file (which could also help with edit
>>>>>> and continue, something we would also love to have).
>>>>>>
>>>>>> We have another developer doing the port to support clang-cl, but
>>>>>> although most of our code also goes through a version of clang, migrating
>>>>>> the rest to clang-cl has been a fight. From what I heard, the main problem
>>>>>> is that we have a copy of parts of windows.h (so as not to bring in the
>>>>>> awful parts of it, like lower-case macros), and that totally works with cl,
>>>>>> but clang (at least 6.0) complains about two structs/vars with the same
>>>>>> name, even though they are exactly the same. Making clang-cl as broken as
>>>>>> cl.exe is not an option, I suppose? I would love to turn on a flag
>>>>>> --accept-that-cl-made-bad-decisions-and-live-with-it and have this
>>>>>> at least until this is completely fixed in our code base.
>>>>>>
>>>>>> The biggest wins from moving to clang-cl would be a better, more
>>>>>> standards-compliant compiler, no 1-minute compiles on heavily templated
>>>>>> files, and maybe the holy grail of ThinLTO.
>>>>>>
>>>>>> On Sat, Jan 20, 2018 at 10:56 PM, Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> 10-15s will be hard without true incremental linking.
>>>>>>>
>>>>>>> At some point that's going to be the only way to get any faster, but
>>>>>>> incremental linking is hard (putting it lightly), and since our full links
>>>>>>> are already really fast we think we can get reasonably close to link.exe
>>>>>>> incremental speeds with full links.  But it's never enough and I will
>>>>>>> always want it to be faster, so you may see incremental linking in the
>>>>>>> future after we hit a performance wall with full link speed :)
>>>>>>>
>>>>>>> In any case, I'm definitely interested in seeing what kind of
>>>>>>> numbers you get with /debug:ghash after you get this llvm-objcopy feature
>>>>>>> implemented.  So keep me updated :)
>>>>>>>
>>>>>>> As an aside, have you tried building with clang instead of cl?  If
>>>>>>> you build with clang you wouldn't even have to do this llvm-objcopy work,
>>>>>>> because it would "just work".  If you've tried but ran into issues I'm
>>>>>>> interested in hearing about those too.  On the other hand, it's also
>>>>>>> reasonable to only switch one thing at a time.
>>>>>>>
>>>>>>> On Sat, Jan 20, 2018 at 1:34 PM Leonardo Santagada <
>>>>>>> santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> if we get to < 30s I think most users would prefer it to link.exe,
>>>>>>>> just hopping there is still some more optimizations to get closer to ELF
>>>>>>>> linking times (around 10-15s here).
>>>>>>>>
>>>>>>>> On Sat, Jan 20, 2018 at 9:50 PM, Zachary Turner <zturner at google.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Generally speaking, a good rule of thumb is that /debug:ghash will
>>>>>>>>> be close to or faster than /debug:fastlink, but with none of the
>>>>>>>>> penalties like slow debugging.
>>>>>>>>> On Sat, Jan 20, 2018 at 12:44 PM Zachary Turner <
>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Chrome is actually one of my exact benchmark cases. When building
>>>>>>>>>> blink_core.dll and browser_tests.exe, i get anywhere from a 20-40%
>>>>>>>>>> reduction in link time. We have some other optimizations in the pipeline
>>>>>>>>>> but not upstream yet.
>>>>>>>>>>
>>>>>>>>>> My best time so far (including other optimizations not yet
>>>>>>>>>> upstream) is 28s on blink_core.dll, compared to 110s with /debug
>>>>>>>>>> On Sat, Jan 20, 2018 at 12:28 PM Leonardo Santagada <
>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Sat, Jan 20, 2018 at 9:05 PM, Zachary Turner <
>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> You probably don't want to go down the same route that clang
>>>>>>>>>>>> goes through to write the object file.  If you think yaml2coff is
>>>>>>>>>>>> convoluted, the way clang does it will just give you a headache.  There are
>>>>>>>>>>>> multiple abstractions involved to account for different object file formats
>>>>>>>>>>>> (ELF, COFF, MachO) and output formats (Assembly, binary file).  At least
>>>>>>>>>>>> with yaml2coff
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I think your sentence got cut off there, but yeah, I just found
>>>>>>>>>>> AsmPrinter.cpp and it is convoluted.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> It's true that yaml2coff is using the COFFParser structure, but
>>>>>>>>>>>> if you look at the writeCOFF function in yaml2coff it's pretty
>>>>>>>>>>>> bare-metal.  The logic you need will be almost identical, except that
>>>>>>>>>>>> instead of checking the COFFParser for the various fields, you'll check the
>>>>>>>>>>>> existing COFFObjectFile, which should have similar fields.
>>>>>>>>>>>>
>>>>>>>>>>>> The only thing you need to do differently is, when writing the
>>>>>>>>>>>> section table and section contents, to insert a new entry.  Since
>>>>>>>>>>>> you're injecting a section into the middle, you'll also probably need to
>>>>>>>>>>>> push back the file pointers of all subsequent sections so that they don't
>>>>>>>>>>>> overlap.  (E.g. if the original sections are 1, 2, 3, 4, 5 and you insert
>>>>>>>>>>>> between 2 and 3, then the original sections 3, 4, and 5 would need to have
>>>>>>>>>>>> their FilePointerToRawData offset by the size of the new section.)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have the PE/COFF spec open here and I'm happy that I read a
>>>>>>>>>>> bit of it so I actually know what you are talking about... yeah it doesn't
>>>>>>>>>>> seem too complicated.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> If you need to know what values to put for the other fields in
>>>>>>>>>>>> a section header, run `dumpbin /headers foo.obj` on a clang-generated
>>>>>>>>>>>> object file that has a .debug$H section already (e.g. run clang with
>>>>>>>>>>>> -emit-codeview-ghash-section, and look at the properties of the .debug$H
>>>>>>>>>>>> section and use the same values).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks I will do that and then also look at how the CodeView
>>>>>>>>>>> part of the code does it if I can't understand some of it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> The only invariant that needs to be maintained is that
>>>>>>>>>>>> Section[N]->FilePointerOfRawData == Section[N-1]->FilePointerOfRawData +
>>>>>>>>>>>> Section[N-1]->SizeOfRawData
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Well, that and all the sections need to be in the final file...
>>>>>>>>>>> But I'm hopeful.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does anyone have times for linking a big project like chrome with
>>>>>>>>>>> this, so that I at least know what kind of performance to expect?
>>>>>>>>>>>
>>>>>>>>>>> My numbers are something like:
>>>>>>>>>>>
>>>>>>>>>>> 1 pdb per obj file: link.exe takes ~15 minutes and 16GB of ram;
>>>>>>>>>>> lld-link.exe takes 2:30 minutes and ~8GB of ram.
>>>>>>>>>>> around 10 pdbs per folder: link.exe takes 1 minute and 2-3GB of
>>>>>>>>>>> ram; lld-link.exe takes 1:30 minutes and ~6GB of ram.
>>>>>>>>>>> fastlink: link.exe takes 40 seconds, but then there are 20 seconds
>>>>>>>>>>> of loading at the first breakpoint in the debugger, and we lose DIA
>>>>>>>>>>> support for listing symbols.
>>>>>>>>>>> incremental: link.exe takes 8 seconds, but it only works for
>>>>>>>>>>> very minor changes.
>>>>>>>>>>>
>>>>>>>>>>> We have a non-negligible number of symbols used by some runtime
>>>>>>>>>>> systems.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jan 20, 2018 at 11:52 AM Leonardo Santagada <
>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the tips. I now have something that reads the obj
>>>>>>>>>>>>> file, finds .debug$T sections, and globally hashes them
>>>>>>>>>>>>> (proof-of-concept kind of code). What I can't find is how clang itself
>>>>>>>>>>>>> writes COFF files with global hashes, as that might help me understand how
>>>>>>>>>>>>> to create the .debug$H section, how to update the file's section count,
>>>>>>>>>>>>> and how to properly write this back.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The code in yaml2coff expects to work on the yaml
>>>>>>>>>>>>> COFFParser struct, and I'm having quite a bit of a headache turning the
>>>>>>>>>>>>> COFFObjectFile into a COFFParser object or something compatible...
>>>>>>>>>>>>> Tomorrow I might try the very inefficient path of coff2yaml followed by
>>>>>>>>>>>>> yaml2coff with the hashes header... but it seems way too inefficient and
>>>>>>>>>>>>> convoluted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 10:38 PM, Zachary Turner <
>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 1:02 PM Leonardo Santagada <
>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 9:44 PM, Zachary Turner <
>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jan 19, 2018 at 12:29 PM Leonardo Santagada <
>>>>>>>>>>>>>>>> santagada at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> No I didn't, I used cl.exe from the visual studio
>>>>>>>>>>>>>>>>> toolchain. What I'm proposing is a tool for processing .obj files in COFF
>>>>>>>>>>>>>>>>> format, reading them and generating the GHASH part.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To make our build faster we use hundreds of unity-build
>>>>>>>>>>>>>>>>> files (.cpp's that #include a lot of other .cpp's, aka munch files),
>>>>>>>>>>>>>>>>> but we still have a lot of single .cpp's as well (in total, something
>>>>>>>>>>>>>>>>> like 3.4k .obj files).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> PS: sorry for sending this to the wrong list; I was reading
>>>>>>>>>>>>>>>>> about the LLVM mailing lists and jumped when I saw what I thought was
>>>>>>>>>>>>>>>>> an lld-exclusive list.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A tool like this would be useful, yes.  We've talked about
>>>>>>>>>>>>>>>> it internally as well and agreed it would be useful, we just haven't
>>>>>>>>>>>>>>>> prioritized it.  If you're interested in submitting a patch along those
>>>>>>>>>>>>>>>> lines though, I think it would be a good addition.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm not sure what the best place for it would be.
>>>>>>>>>>>>>>>> llvm-readobj and llvm-objdump seem like obvious choices, but they are
>>>>>>>>>>>>>>>> intended to be read-only, so perhaps they wouldn't be a good fit.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> llvm-pdbutil is kind of a hodgepodge of everything else
>>>>>>>>>>>>>>>> related to PDBs and symbols, so I wouldn't be opposed to making a new
>>>>>>>>>>>>>>>> subcommand there called "ghash" or something that could process an object
>>>>>>>>>>>>>>>> file and output a new object file with a .debug$H section.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A third option would be to make a new tool for it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't think it would be that hard to write.  If you're
>>>>>>>>>>>>>>>> interested in trying to make a patch for this, I can offer some guidance on
>>>>>>>>>>>>>>>> where to look in the code.  Otherwise it's something that we'll probably
>>>>>>>>>>>>>>>> get to, I'm just not sure when.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would love to write it and contribute it back, so please do
>>>>>>>>>>>>>>> tell. I did find some of the ghash code in lld, but I'm fuzzy on the
>>>>>>>>>>>>>>> LLVM CodeView part of it and have never seen llvm-readobj/objdump or
>>>>>>>>>>>>>>> llvm-pdbutil, but I'm not afraid to look :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  Luckily all of the important code is hidden behind library
>>>>>>>>>>>>>> calls, and it should already just do the right thing, so I suspect you
>>>>>>>>>>>>>> won't need to know much about CodeView to do this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think Peter has the right idea about putting this in
>>>>>>>>>>>>>> llvm-objcopy.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You can look at one of the existing CopyBinary functions
>>>>>>>>>>>>>> there, which currently only work for ELF, but you can just make a new
>>>>>>>>>>>>>> overload that accepts a COFFObjectFile.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would probably start by iterating over each of the sections
>>>>>>>>>>>>>> (getNumberOfSections / getSectionName) looking for .debug$T and .debug$H
>>>>>>>>>>>>>> sections.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you find a .debug$H section then you can just skip that
>>>>>>>>>>>>>> object file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you find a .debug$T but not a .debug$H, then basically do
>>>>>>>>>>>>>> the same thing that LLD does in PDBLinker::mergeDebugT (create a
>>>>>>>>>>>>>> CVTypeArray and pass it to GloballyHashedType::hashTypes).
>>>>>>>>>>>>>> That will return an array of hash values.  (The format of .debug$H is
>>>>>>>>>>>>>> the header, followed by the hash values.)  Then, when you're writing the
>>>>>>>>>>>>>> list of sections, just add the .debug$H section right after the .debug$T
>>>>>>>>>>>>>> section.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Currently llvm-objcopy only writes ELF files, so it would
>>>>>>>>>>>>>> need to be taught to write COFF files.  We have code to do this in the
>>>>>>>>>>>>>> yaml2obj utility (specifically, in yaml2coff.cpp in the function
>>>>>>>>>>>>>> writeCOFF).  There may be a way to move this code to somewhere else
>>>>>>>>>>>>>> (llvm/Object/COFF.h?) so that it can be re-used by both yaml2coff and
>>>>>>>>>>>>>> llvm-objcopy, but in the worst case scenario you could copy the code and
>>>>>>>>>>>>>> re-write it to work with these new structures.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Lastly, you'll probably want to put all of this behind an
>>>>>>>>>>>>>> option in llvm-objcopy such as -add-codeview-ghash-section
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> Leonardo Santagada
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>