[llvm-dev] [lldb-dev] Trying out lld to link windows binaries (using msvc as a compiler)

Leonardo Santagada via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 26 10:32:15 PST 2018


yeah, apparently .bss carries the uninitialized-data flag, and that flag is
not being respected when the COFF file is laid out (those sections should be
skipped), but I don't know what to do with .data, as it doesn't have a size.
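
A minimal sketch of that layout rule, not the actual yaml2coff or llvm-objcopy code. The SectionLayout struct and assignRawDataOffsets are hypothetical names; the flag value is the one from the PE/COFF specification. It assumes no alignment padding between sections and treats a zero-size section like an uninitialized one (no file bytes, file pointer 0), which is only a guess for the .data case:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Flag value from the PE/COFF specification.
constexpr uint32_t IMAGE_SCN_CNT_UNINITIALIZED_DATA = 0x00000080;

// Hypothetical, simplified section record (not LLVM's actual type).
struct SectionLayout {
  uint32_t Characteristics = 0;
  uint32_t SizeOfRawData = 0;    // declared size, left untouched
  uint32_t PointerToRawData = 0; // filled in by assignRawDataOffsets
};

// Assigns file offsets starting at FirstOffset. Sections flagged as
// uninitialized data, or with a zero size, keep PointerToRawData == 0 and
// take no space in the file. Returns the offset just past the last
// section's raw data.
inline uint32_t assignRawDataOffsets(std::vector<SectionLayout> &Sections,
                                     uint32_t FirstOffset) {
  uint32_t Offset = FirstOffset;
  for (SectionLayout &S : Sections) {
    const bool HasFileBytes =
        S.SizeOfRawData != 0 &&
        (S.Characteristics & IMAGE_SCN_CNT_UNINITIALIZED_DATA) == 0;
    if (!HasFileBytes) {
      S.PointerToRawData = 0;
      continue;
    }
    S.PointerToRawData = Offset;
    Offset += S.SizeOfRawData;
  }
  return Offset;
}
```

With this rule a .bss section keeps its declared Size of Raw Data but gets a file pointer of 0, so nothing is written for it.
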

On Fri, Jan 26, 2018 at 7:23 PM, Zachary Turner <zturner at google.com> wrote:

> dumpbin has some clues.  I ran dumpbin /all on both object files and
> diffed the results.
>
> In the good object file, Section #2 (.data) has File Pointer to Raw Data =
> 208, but in the bad file Section #2 (.data) has File Pointer to Raw Data =
> 0.
> Also, Section #3 (.bss) in the good file has Size of Raw Data = 4, but in
> the bad file Section #3 (.bss) has Size of Raw Data = 0.
>
>
>
> On Fri, Jan 26, 2018 at 10:06 AM Zachary Turner <zturner at google.com>
> wrote:
>
>> Interesting.  If it is generating yaml files that can't be decoded, then
>> definitely sounds like a bug.  If you can provide a reduced test case we
>> can try to fix it, but admittedly it can often take some effort to generate
>> a reduced test case.  The best way is to use creduce.  Use cl or clang-cl
>> and write the pre-processed output to a file, then run creduce on that file
>> with an interestingness test that roundtrips from obj2yaml to yaml2obj and
>> exits with 0 only while the error still reproduces (creduce keeps a variant
>> when the test returns 0).  Then let it run for a couple of hours (or days)
>> and you should come back to a minimal repro.
>>
>> Granted, it's understandable if you don't have the time for that :)
>>
>>
>> Also, I got rid of my local changes and re-ran the test case and I'm
>> seeing what you see.  The 2 yaml files are identical, but the 2 binary
>> files aren't.
>>
>> 00000004: 83 94
>> 00000050: 00 08
>> 00000051: 00 02
>> 00000074: 00 04
>> 0000077C: 04 0F
>> 000007A0: 0F 04
>> 000007C5: 61 62
>> 000007D0: 62 61
>>
>> Luckily 00000004 is a pretty easy offset to identify, so we should be
>> able to figure this out.  It looks like some header fields probably aren't
>> being initialized correctly (not sure why obj2yaml isn't printing this
>> information).
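
The first four differing offsets in the list above can be mapped to header fields by hand. Below is a rough sketch of that arithmetic, assuming a regular (non-bigobj) COFF object with no optional header, so a 20-byte file header is followed directly by 40-byte section headers, as laid out in the PE/COFF specification. describeHeaderOffset and its helpers are hypothetical names, and the caller is assumed to pass only offsets that fall inside the headers:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr uint32_t FileHeaderSize = 20;
constexpr uint32_t SectionHeaderSize = 40;

struct Field {
  uint32_t Begin;   // offset of the field within its header
  const char *Name; // field name as used in the PE/COFF specification
};

// Returns the name of the last field that starts at or before Rel.
inline const char *lookupField(const Field *Fields, int Count, uint32_t Rel) {
  const char *Name = Fields[0].Name;
  for (int I = 0; I < Count; ++I)
    if (Fields[I].Begin <= Rel)
      Name = Fields[I].Name;
  return Name;
}

// Maps a file offset inside the COFF headers to the field it belongs to.
inline std::string describeHeaderOffset(uint32_t Offset) {
  static const Field FileFields[] = {
      {0, "Machine"},          {2, "NumberOfSections"},
      {4, "TimeDateStamp"},    {8, "PointerToSymbolTable"},
      {12, "NumberOfSymbols"}, {16, "SizeOfOptionalHeader"},
      {18, "Characteristics"}};
  static const Field SectionFields[] = {
      {0, "Name"},
      {8, "VirtualSize"},
      {12, "VirtualAddress"},
      {16, "SizeOfRawData"},
      {20, "PointerToRawData"},
      {24, "PointerToRelocations"},
      {28, "PointerToLinenumbers"},
      {32, "NumberOfRelocations"},
      {34, "NumberOfLinenumbers"},
      {36, "Characteristics"}};
  if (Offset < FileHeaderSize)
    return std::string("file header: ") + lookupField(FileFields, 7, Offset);
  const uint32_t Rel = Offset - FileHeaderSize;
  const uint32_t Index = Rel / SectionHeaderSize + 1; // 1-based, like dumpbin
  return "section #" + std::to_string(Index) + ": " +
         lookupField(SectionFields, 10, Rel % SectionHeaderSize);
}
```

Under those assumptions, 0x04 lands in the file header's TimeDateStamp, 0x50 and 0x51 in section #2's PointerToRawData, and 0x74 in section #3's SizeOfRawData. The offsets from 0x77C onward are past the headers and are not covered by this sketch.
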
>>
>> On Fri, Jan 26, 2018 at 9:59 AM Leonardo Santagada <santagada at gmail.com>
>> wrote:
>>
>>> I'm now thinking that there's a bug in either obj2yaml or yaml2obj,
>>> because if I run just those two tools on my codebase they generate yaml
>>> files that can't be decoded.  I will now try not adding any section to the
>>> obj file in llvm-objcopy, to see if I can link with obj files that I
>>> rewrite (but without adding symbols or sections).
>>>
>>> One of the bugs that does annoy me is that the TimeDateStamp is not
>>> carried over when obj2yaml writes a file.  Also, the layout function in
>>> yaml2coff generates different indexes for the sections.  None of them look
>>> wrong, but it seems to leave some padding; I didn't have time to look too
>>> closely at why.
>>>
>>> On Fri, Jan 26, 2018 at 6:52 PM, Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> Hmm, ok.  In that case let me try again without my local changes.
>>>> Maybe they are getting in the way :-/
>>>>
>>>>
>>>> On Fri, Jan 26, 2018 at 9:51 AM Leonardo Santagada <santagada at gmail.com>
>>>> wrote:
>>>>
>>>>> it is identical to me... weird.
>>>>>
>>>>> On Fri, Jan 26, 2018 at 6:49 PM, Zachary Turner <zturner at google.com>
>>>>> wrote:
>>>>>
>>>>>> (Ignore the fact that my hashes are 8 bytes in the "good" file; this
>>>>>> is due to some local changes I've been experimenting with)
>>>>>>
>>>>>> On Fri, Jan 26, 2018 at 9:48 AM Zachary Turner <zturner at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I did this:
>>>>>>>
>>>>>>> // a.cpp
>>>>>>> static int x = 0;
>>>>>>> void b(int);
>>>>>>> void a(int) {
>>>>>>>   if (x)
>>>>>>>     b(x);
>>>>>>> }
>>>>>>> int main(int argc, char **argv) {
>>>>>>>   a(argc);
>>>>>>>   return x;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> clang-cl /Z7 /c a.cpp /Foa.noghash.obj
>>>>>>> clang-cl /Z7 /c a.cpp -mllvm -emit-codeview-ghash-section /Foa.ghash.good.obj
>>>>>>> llvm-objcopy a.noghash.obj a.ghash.bad.obj
>>>>>>> obj2yaml a.ghash.good.obj > a.ghash.good.yaml
>>>>>>> obj2yaml a.ghash.bad.obj > a.ghash.bad.yaml
>>>>>>>
>>>>>>> Then open these 2 yaml files up in a diff viewer.  It looks like the
>>>>>>> hashes aren't getting emitted at all.  For example, in the good yaml file I
>>>>>>> see this:
>>>>>>>
>>>>>>>   - Name:            '.debug$H'
>>>>>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>>>>>     Alignment:       4
>>>>>>>     SectionData:     C5C93301000001005549419E78044E3896D45CD7009428758BE4A1E2B3E022BA267DEE221F5C42B17BCA182AF84584814A8B5E7E3FB17B397A9E3DEA75CD5627
>>>>>>>     GlobalHashes:
>>>>>>>       Version:         0
>>>>>>>       HashAlgorithm:   1
>>>>>>>       HashValues:
>>>>>>>         - 5549419E78044E38
>>>>>>>         - 96D45CD700942875
>>>>>>>         - 8BE4A1E2B3E022BA
>>>>>>>         - 267DEE221F5C42B1
>>>>>>>         - 7BCA182AF8458481
>>>>>>>         - 4A8B5E7E3FB17B39
>>>>>>>         - 7A9E3DEA75CD5627
>>>>>>>   - Name:            .pdata
>>>>>>>
>>>>>>> And in the bad yaml file I see this:
>>>>>>>   - Name:            '.debug$H'
>>>>>>>     Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE, IMAGE_SCN_MEM_READ ]
>>>>>>>     Alignment:       4
>>>>>>>     SectionData:     C5C9330100000000
>>>>>>>     GlobalHashes:
>>>>>>>       Version:         0
>>>>>>>       HashAlgorithm:   0
>>>>>>>   - Name:            .pdata
>>>>>>>
>>>>>>> Don't focus too much on trying to figure out weird linker errors.
>>>>>>> Just get the output of obj2yaml to be identical when run under a diff
>>>>>>> utility, then everything should work fine.
>>>>>>>
>>>>>>> On Fri, Jan 26, 2018 at 7:27 AM Leonardo Santagada <santagada at gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm so close I can almost smell it :)
>>>>>>>>
>>>>>>>> I know how bad the code looks, and I don't intend to submit this, but
>>>>>>>> if you want to try it out it's at:
>>>>>>>> https://gist.github.com/santagada/544136b1ee143bf31653b1158ac6829e
>>>>>>>>
>>>>>>>> I'm seeing: lld-link.exe: error: duplicate symbol:
>>>>>>>> "<redacted_unmangled>" (<redacted>) in <internal> and in
>>>>>>>> <redacted_filename>.obj.  Looking at the .yaml dump, the symbols are all
>>>>>>>> similar to this:
>>>>>>>>
>>>>>>>>   - Name: <redacted>
>>>>>>>>     Value: 0
>>>>>>>>     SectionNumber: 0
>>>>>>>>     SimpleType: IMAGE_SYM_TYPE_NULL
>>>>>>>>     ComplexType: IMAGE_SYM_DTYPE_FUNCTION
>>>>>>>>     StorageClass: IMAGE_SYM_CLASS_WEAK_EXTERNAL
>>>>>>>>     WeakExternal:
>>>>>>>>       TagIndex: 134
>>>>>>>>       Characteristics: IMAGE_WEAK_EXTERN_SEARCH_LIBRARY
>>>>>>>>
>>>>>>>> On Thu, Jan 25, 2018 at 8:01 PM, Zachary Turner <zturner at google.com> wrote:
>>>>>>>>
>>>>>>>>> I haven't really dabbled in this part of the COFF format
>>>>>>>>> personally, so hopefully I'm not leading you astray :)
>>>>>>>>>
>>>>>>>>> But I checked the code for coff2yaml, and I see this:
>>>>>>>>>
>>>>>>>>>       } else if (Symbol.isSectionDefinition()) {
>>>>>>>>>         // This symbol represents a section definition.
>>>>>>>>>         assert(Symbol.getNumberOfAuxSymbols() == 1 &&
>>>>>>>>>                "Expected a single aux symbol to describe this section!");
>>>>>>>>>         const object::coff_aux_section_definition *ObjSD =
>>>>>>>>>             reinterpret_cast<const object::coff_aux_section_definition *>(
>>>>>>>>>                 AuxData.data());
>>>>>>>>>
>>>>>>>>> So it looks like you need exactly 1 aux symbol for each section
>>>>>>>>> symbol.
>>>>>>>>>
>>>>>>>>> I then scrolled up in this function to figure out where AuxData
>>>>>>>>> comes from, and it comes from COFFObjectFile::getSymbolAuxData.
>>>>>>>>> I think that function holds the clue to what you need to do.  It looks like
>>>>>>>>> you need to set coff::symbol::NumberOfAuxSymbols to 1, and then
>>>>>>>>> there is a comment in getSymbolAuxData which says:
>>>>>>>>>
>>>>>>>>>     // AUX data comes immediately after the symbol in COFF
>>>>>>>>>     Aux = reinterpret_cast<const uint8_t *>(Symbol.getRawPtr()) +
>>>>>>>>>           SymbolSize;
>>>>>>>>>
>>>>>>>>> So I think you just need to write the bytes immediately after the
>>>>>>>>> coff::symbol.  The thing you need to write looks like a
>>>>>>>>> coff::coff_aux_section_definition structure.
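
As a rough illustration of what those bytes look like, here is a sketch that encodes an aux section definition record by hand. It assumes the regular (non-bigobj) 18-byte symbol record and the field order given in the PE/COFF specification (Length, NumberOfRelocations, NumberOfLinenumbers, CheckSum, Number, Selection, then 3 unused bytes). The AuxSectionDefinition struct and encodeAuxSectionDefinition are hypothetical stand-ins, not LLVM's types:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain-struct stand-in for the aux section definition record.
struct AuxSectionDefinition {
  uint32_t Length = 0;              // size of the section data in bytes
  uint16_t NumberOfRelocations = 0;
  uint16_t NumberOfLinenumbers = 0;
  uint32_t CheckSum = 0;            // checksum of the section contents
  uint16_t Number = 0;              // associated section, used with COMDAT
  uint8_t Selection = 0;            // COMDAT selection type
};

// An aux record occupies the same number of bytes as a symbol record.
constexpr std::size_t SymbolRecordSize = 18;

// Encodes the record as little-endian bytes; the 3 trailing bytes stay zero.
inline std::array<uint8_t, SymbolRecordSize>
encodeAuxSectionDefinition(const AuxSectionDefinition &A) {
  std::array<uint8_t, SymbolRecordSize> Out{};
  std::size_t Pos = 0;
  auto Put = [&Out, &Pos](uint32_t Value, std::size_t NumBytes) {
    for (std::size_t I = 0; I < NumBytes; ++I)
      Out[Pos++] = static_cast<uint8_t>((Value >> (8 * I)) & 0xff);
  };
  Put(A.Length, 4);
  Put(A.NumberOfRelocations, 2);
  Put(A.NumberOfLinenumbers, 2);
  Put(A.CheckSum, 4);
  Put(A.Number, 2);
  Put(A.Selection, 1);
  return Out;
}
```

The encoded record would go directly after the 18 bytes of the section symbol itself, with that symbol's NumberOfAuxSymbols set to 1.
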
>>>>>>>>>
>>>>>>>>> For the CheckSum, look at WinCOFFObjectWriter::writeSection.  It
>>>>>>>>> looks like it's a CRC32 of the actual section contents, which you can
>>>>>>>>> generate with a couple of lines of code:
>>>>>>>>>
>>>>>>>>>   JamCRC JC(/*Init=*/0);
>>>>>>>>>   JC.update(DebugHContents);
>>>>>>>>>   AuxSymbol.CheckSum = JC.getCRC();
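
For experimenting outside of LLVM, the following is a self-contained sketch of the same kind of checksum. It assumes JamCRC is the usual reflected CRC-32 (polynomial 0xEDB88320) without the final bit inversion, which is worth double-checking against llvm/Support/JamCRC.h. jamCrc is a hypothetical helper name, and the bitwise loop trades speed for brevity compared with a table-driven version:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Reflected CRC-32 without the final inversion, starting from Init.
inline uint32_t jamCrc(const uint8_t *Data, std::size_t Size, uint32_t Init) {
  uint32_t Crc = Init;
  for (std::size_t I = 0; I < Size; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit) {
      // Shift right by one; apply the polynomial when the dropped bit was 1.
      const uint32_t Mask = 0u - (Crc & 1u);
      Crc = (Crc >> 1) ^ (0xEDB88320u & Mask);
    }
  }
  return Crc;
}
```

To mirror the snippet above, the call would pass the .debug$H contents with Init set to 0 and store the result in the aux record's CheckSum field.
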
>>>>>>>>>
>>>>>>>>> Hope this helps
>>>>>>>>>
>>>>>>>>
>>>
>>>
>>> --
>>>
>>> Leonardo Santagada
>>>
>>


-- 

Leonardo Santagada