[llvm-dev] lld-link with MSVC6 object files

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Thu Oct 3 02:38:50 PDT 2019


On Wed, Oct 2, 2019 at 8:59 PM Paul Moran <bankybooks at gmail.com> wrote:

> Yeah ideally I wanted the tool chain to just produce the same binary. I
> suppose running a disassembly step could work to ensure that only offsets
> to imports have changed. But I think this would still give me issues with
> comparing data sections since offsets to constant strings and globals could
> also be swapped around too?
>
> I believe in GCC this can be "fixed" by using a linker script. MSVC
> doesn't have anything like this however.
>
> I haven't looked at lld-link's binary output yet - but I would have
> imagined that the import table format and the way that global data is
> created must be done in the same way? It would just be orderings/lack of
> "rich" header and other things that lld-link does differently?
>

lld's section layout is not the same as MSVC's, and even if you made the
section layout the same, many other details would still differ. For
example, executables contain string tables for string constants (e.g.
imported symbol names), and two string tables that contain the same set of
strings don't have to store those strings in the same order.
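
To make the string-table point concrete, here is a minimal sketch. The two
tables and the symbol names in them are made up for illustration and are not
taken from any real binary: the blobs differ byte for byte, yet they hold the
same set of strings.

```python
def parse_string_table(blob):
    """Split a NUL-separated string table into a list of strings."""
    return [s.decode("ascii") for s in blob.split(b"\0") if s]


def same_strings(a, b):
    """True if both tables hold the same strings, ignoring their order."""
    return sorted(parse_string_table(a)) == sorted(parse_string_table(b))


# Hypothetical tables: identical contents, different order.
TABLE_A = b"ExitProcess\0GetLastError\0"
TABLE_B = b"GetLastError\0ExitProcess\0"

if __name__ == "__main__":
    print("byte-identical:", TABLE_A == TABLE_B)
    print("same strings:  ", same_strings(TABLE_A, TABLE_B))
```

A byte-wise comparison of two executables fails on differences like this even
when the executables are semantically equivalent.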


> On Wed, Oct 2, 2019 at 9:33 AM Rui Ueyama <ruiu at google.com> wrote:
>
>> I think it would be quite hard to hack lld so that the linker produces
>> the same output as Microsoft link.exe. Although lld can produce the
>> semantically same executables as link.exe, every detail is different. If
>> you are working on it as a long-term project, it is probably doable, but it
>> doesn't seem like it is something you can easily hack.
>>
>> Have you considered disassembling the original binary and your new binary
>> and comparing the two as text files? If only imported functions are
>> different, the text outputs will be mostly the same, and you would be able
>> to tell if you succeeded recovering the source code.
>>
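
The text comparison suggested above could be sketched roughly as follows. This
assumes listings in the llvm-objdump style quoted further down in this thread
(address, raw bytes, then the instruction text); the function names are
hypothetical. It drops the address and byte columns so that only the
instruction text is compared.

```python
import difflib
import re

# Matches "<addr>: <hex bytes> <instruction text>" and captures the text.
INSN = re.compile(r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s)+\s*(.*)$")


def normalize(listing):
    """Keep only the instruction text of each disassembly line."""
    out = []
    for line in listing.splitlines():
        m = INSN.match(line)
        if m:
            # Collapse runs of whitespace so column layout does not matter.
            out.append(" ".join(m.group(1).split()))
    return out


def diff_listings(old, new):
    """Return unified-diff lines between two normalized listings."""
    return list(difflib.unified_diff(normalize(old), normalize(new),
                                     "original", "rebuilt", lineterm=""))
```

An empty result means the instruction streams match once addresses and
encodings are ignored. Operands that embed addresses would still show up as
differences, so a real comparison would probably need to mask those as well.
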
>> On Wed, Oct 2, 2019 at 5:18 PM Paul Moran <bankybooks at gmail.com> wrote:
>>
>>> That isn't the case but my idea is that I can hack a copy of lld-link to
>>> produce the same output. Since the other option is to use the MSVC6 linker
>>> which will do things like randomly reorder the imported functions
>>> and the like. I can't change that without doing something crazy like
>>> reverse engineering the linker and patching something in there to force a
>>> particular ordering. I suspect that the imported function order isn't the
>>> only thing that it might change on a rebuild.
>>>
>>>
>>> On Wed, Oct 2, 2019 at 8:36 AM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> On Tue, Oct 1, 2019 at 8:18 PM Paul Moran <bankybooks at gmail.com> wrote:
>>>>
>>>>> I have the most edge of edge use cases :). I am recovering the lost
>>>>> source code to an application built with MSVC 6. However because I want to
>>>>> produce byte for byte exact output I need to ensure that the import table
>>>>> is in the same order as the original binary.
>>>>>
>>>>
>>>> I'm not sure if I follow this part -- if you build an executable using
>>>> lld-link and compare it with an executable built with MSVC linker, they are
>>>> almost always different. lld-link doesn't attempt to produce the byte-wise
>>>> same outputs as MSVC. So, if you want to compare lld-link-produced output,
>>>> the other file needs to be built with lld-link too. But is that the case?
>>>>
>>>>> Since the MSVC6 linker has no way of doing this I figured I could hack
>>>>> this feature into lld-link. I need to also set the PDB path in the debug
>>>>> data but a newer version of the MS linker can do this and I believe
>>>>> lld-link already supports this too.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 1, 2019 at 8:58 AM Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>>> Out of curiosity, why do you want to use lld-link with a compiler
>>>>>> that was released 20 years ago?
>>>>>>
>>>>>> On Tue, Oct 1, 2019 at 7:02 AM Zachary Turner via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> I would expect it to be able to link the object file, even if it
>>>>>>> ignored debug info.  It's a bit strange that it complains about bad file
>>>>>>> magic.
>>>>>>>
>>>>>>> It might be tricky to get debug information working and produce a
>>>>>>> valid PDB file since that is pretty old and the format has changed both
>>>>>>> with how it was stored in the object file itself as well as the format of
>>>>>>> the PDB file.
>>>>>>>
>>>>>>> My guess is that the "magic" it's complaining about is not the magic
>>>>>>> of the object file itself but rather the first 4 bytes of the .debug$S (or
>>>>>>> was it the .debug$T?) section.  Perhaps a simple fix in this case is that
>>>>>>> instead of erroring out if we encounter an "older" magic, we just link as
>>>>>>> if debug info was not present to begin with.
>>>>>>>
>>>>>>> This will at least make it work.  If you want to actually consume
>>>>>>> the debug info though, you're in for a fun ride :)
>>>>>>>
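
One way to test the guess above is to read the first 4 bytes of each debug
section directly. The sketch below walks the COFF file header and section
headers as laid out in Microsoft's PE/COFF documentation. The constant
C13_MAGIC = 4 is my understanding of the signature for the newer CodeView
format and should be treated as an assumption, as should the function name.

```python
import struct

COFF_HEADER = struct.Struct("<HHIIIHH")         # 20-byte COFF file header
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40-byte section header
C13_MAGIC = 4  # assumption: signature value of the C13 CodeView format


def debug_section_magics(data):
    """Return (name, first 4 bytes as a little-endian int) for every
    .debug$S / .debug$T section in a COFF object file image."""
    header = COFF_HEADER.unpack_from(data, 0)
    nsections, opt_size = header[1], header[5]
    offset = COFF_HEADER.size + opt_size
    found = []
    for i in range(nsections):
        fields = SECTION_HEADER.unpack_from(
            data, offset + i * SECTION_HEADER.size)
        name = fields[0].rstrip(b"\0").decode("ascii", "replace")
        raw_size, raw_ptr = fields[3], fields[4]
        if name in (".debug$S", ".debug$T") and raw_size >= 4:
            (magic,) = struct.unpack_from("<I", data, raw_ptr)
            found.append((name, magic))
    return found
```

Running this over test1.obj and printing any entry whose value differs from
C13_MAGIC would show whether the debug sections, rather than the object file
itself, carry the signature that is being rejected.
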
>>>>>>> On Mon, Sep 30, 2019 at 2:19 PM Paul Moran via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> It sounds like perhaps it might mostly work with some tweaks -
>>>>>>>> given it's complaining about bad file magic. I'll see if I can get lld-link
>>>>>>>> to build locally and hack out the magic checks to see if it works.
>>>>>>>>
>>>>>>>> On Mon, Sep 30, 2019 at 10:14 PM David Blaikie <dblaikie at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Sep 30, 2019 at 2:07 PM Paul Moran <bankybooks at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> MSVC 6 is 1998 not 1989 :)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ah, I just glanced briefly at the Wikipedia article (
>>>>>>>>> https://en.wikipedia.org/wiki/Microsoft_Visual_C%2B%2B ) &
>>>>>>>>> misread the "C 6.0" and didn't notice it was distinct from "Visual C++ 6.0"
>>>>>>>>> - thanks for the catch!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The latest MSVC linker can link these object files. Is this just
>>>>>>>>>> because it has support for C13 types and some other code path for whatever
>>>>>>>>>> MSVC6 uses? After some digging around it appears to be this format:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#coff-file-header-object-and-image
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Which is the COFF object file format? Does lld-link support this
>>>>>>>>>> format?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> COFF is still the windows object file format, and the Windows
>>>>>>>>> support in lld is COFF support, yeah. I guess there might be some format
>>>>>>>>> variations that haven't been implemented in lld, though. It's mostly an "on
>>>>>>>>> demand" sort of approach.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Sep 30, 2019 at 7:39 PM Alexandre Ganea <
>>>>>>>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The CodeView library in LLVM only supports Codeview C13 types,
>>>>>>>>>>> that is, MSVC 7.0 / Visual Studio 2002 or after.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf
>>>>>>>>>>> Of* David Blaikie via llvm-dev
>>>>>>>>>>> *Sent:* September 30, 2019 2:38 PM
>>>>>>>>>>> *To:* Paul Moran <bankybooks at gmail.com>; Rui Ueyama <
>>>>>>>>>>> ruiu at google.com>
>>>>>>>>>>> *Cc:* llvm-dev at lists.llvm.org
>>>>>>>>>>> *Subject:* Re: [llvm-dev] lld-link with MSVC6 object files
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> MSVC 6 as in the Visual Studio released in 1989? Yes, I imagine
>>>>>>>>>>> that's a bit outside the intended support window.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 30, 2019 at 11:18 AM Paul Moran via llvm-dev <
>>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have a question about lld-link. What obj file formats should
>>>>>>>>>>> it support? When I try to use an obj from MSVC 6.0 it complains that the
>>>>>>>>>>> file magic is not valid.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> However, when running llvm-objdump it reports:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> test1.obj:      file format COFF-i386
>>>>>>>>>>>
>>>>>>>>>>> Disassembly of section .text:
>>>>>>>>>>> 0000000000000000 _main:
>>>>>>>>>>>        0:       68 00 00 00 00  pushl   $0
>>>>>>>>>>>        5:       e8 00 00 00 00  calll   0 <_main+0xa>
>>>>>>>>>>>        a:       83 c4 04        addl    $4, %esp
>>>>>>>>>>>        d:       33 c0           xorl    %eax, %eax
>>>>>>>>>>>        f:       c3              retl
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>