[llvm-dev] lld: ELF/COFF main() interface

Thu Jan 7 17:21:02 PST 2016

On Thu, Jan 7, 2016 at 5:19 PM Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
>
> On 01/07/2016 05:12 PM, Chandler Carruth via llvm-dev wrote:
>
> On Thu, Jan 7, 2016 at 4:05 PM Rui Ueyama <ruiu at google.com> wrote:
>
>> By organizing it as a library, I'm expecting something coarse. I don't
>> expect to reorganize the linker itself as a collection of small libraries,
>> but make the entire linker available as a library, so that you can link
>> stuff in-process. More specifically, I expect that the library would
>> basically export one function, link(std::vector<StringRef>), which takes
>> command line arguments, and returns a memory buffer for a newly created
>> executable. We may want to allow a mix of StringRef and MemoryBuffer as
>> input, so that you can directly pass in-memory objects to the linker, but
>> the basic idea remains the same.
>>
>> Are we on the same page?
>>
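
A minimal sketch of the coarse, single-entry-point interface described above, assuming nothing about lld's actual API. The names linkInProcess and OutputImage are hypothetical, std::string stands in for StringRef, and the body is a stub that only checks its arguments rather than linking anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the memory buffer holding the linked output.
using OutputImage = std::vector<std::uint8_t>;

// Hypothetical coarse entry point: takes command-line style arguments and
// returns the output image, or std::nullopt on failure. This stub only
// checks that at least one input is named and that "-o" has a value.
std::optional<OutputImage> linkInProcess(const std::vector<std::string> &args) {
  bool sawInput = false;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] == "-o") {
      if (++i == args.size())
        return std::nullopt; // "-o" given without a file name
      continue;
    }
    sawInput = true;
  }
  if (!sawInput)
    return std::nullopt;
  // Placeholder contents: the four-byte ELF magic, not a real executable.
  return OutputImage{0x7f, 'E', 'L', 'F'};
}
```

A caller would pass the same arguments it would otherwise put on the command line and check the returned optional instead of a process exit code.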
>
> Let me answer this below, where I think you get to the core of the problem.
>
>
>>
>> On Thu, Jan 7, 2016 at 3:44 PM, Chandler Carruth <chandlerc at gmail.com>
>> wrote:
>>
>>> On Thu, Jan 7, 2016 at 3:18 PM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> On Thu, Jan 7, 2016 at 2:56 PM, Chandler Carruth <chandlerc at gmail.com>
>>>> wrote:
>>>>
>>>>> On Thu, Jan 7, 2016 at 7:18 AM Rui Ueyama via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> On Thu, Jan 7, 2016 at 7:03 AM, Arseny Kapoulkine via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> In the process of migrating from the old lld ELF linker to the new one
>>>>>>> (previously ELF2), I noticed the interface lost several important features
>>>>>>> (ordered by importance for my use case):
>>>>>>>
>>>>>>> 1. Detecting errors in the first place. The new linker seems to call
>>>>>>> exit(1) for any error.
>>>>>>>
>>>>>>> 2. Reporting messages to non-stderr outputs. Previously all link
>>>>>>> functions had a raw_ostream argument so it was possible to delay the error
>>>>>>> output, aggregate it for multiple linked files, output via a different
>>>>>>> format, etc.
>>>>>>>
>>>>>>> 3. Linking multiple outputs in parallel (useful for test drivers) in
>>>>>>> a single process. Not really an interface issue but there are at least two
>>>>>>> global pointers (Config & Driver) that refer to stack variables and are
>>>>>>> used in various places in the code.
>>>>>>>
>>>>>>> All of this seems to indicate a departure from the linker being
>>>>>>> useable as a library. To maintain the previous behavior you'd have to use a
>>>>>>> linker binary & popen.
>>>>>>>
>>>>>>> Is this a conscious design decision or a temporary limitation?
>>>>>>>
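
To make points 2 and 3 above concrete, here is a rough sketch, not lld code, of per-invocation state that carries its own diagnostic stream instead of relying on process-wide Config/Driver pointers and stderr. LinkContext and runLink are hypothetical names, std::ostream stands in for raw_ostream, and the "link" itself is a stub that only validates input names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-invocation state. Each link owns its settings and its
// diagnostic stream, so nothing is shared through global pointers.
struct LinkContext {
  std::string outputPath;
  std::ostream &diag;      // where messages go; need not be stderr
  unsigned errorCount = 0;

  LinkContext(std::string out, std::ostream &d)
      : outputPath(std::move(out)), diag(d) {}

  // Records the error and returns to the caller instead of exiting.
  void error(const std::string &msg) {
    diag << outputPath << ": error: " << msg << '\n';
    ++errorCount;
  }
};

// Hypothetical driver: returns true on success. The stub only rejects an
// empty input list and empty input names.
bool runLink(LinkContext &ctx, const std::vector<std::string> &inputs) {
  if (inputs.empty())
    ctx.error("no input files");
  for (const std::string &in : inputs)
    if (in.empty())
      ctx.error("empty input file name");
  return ctx.errorCount == 0;
}
```

Because two LinkContext objects share no state, a test driver could in principle run one link per thread and aggregate or reformat each stream's contents afterwards.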
>>>>>>
>>>>>> That the new ELF and COFF linkers are designed as commands instead of
>>>>>> libraries is very much an intended design change.
>>>>>>
>>>>>
>>>>> I disagree.
>>>>>
>>>>> During the discussion, there was a *specific* discussion of both the
>>>>> new COFF port and ELF port continuing to be libraries with a common command
>>>>> line driver.
>>>>>
>>>>
>>>> There was a discussion that we would keep the same entry point for the
>>>> old and the new, but I don't remember if I promised that we were going to
>>>> organize the new linker as a library.
>>>>
>>>
>>> Ok, myself and essentially everyone else thought this was clear. If it
>>> isn't, let's clarify:
>>>
>>> I think it is absolutely critical and important that LLD's architecture
>>> remain one where all functionality is available as a library. This is *the*
>>> design goal of LLVM and all of LLVM's infrastructure. This applies just as
>>> much to LLD as it does to Clang.
>>>
>>> You say that it isn't compelling to match Clang's design, but in fact it
>>> is. You would need an overwhelming argument to *diverge* from Clang's
>>> design.
>>>
>>> The fact that it makes the design more challenging is not compelling at
>>> all. Yes, building libraries that can be re-used and making the binary
>>> calling it equally efficient is more challenging, but that is the express
>>> mission of LLVM and every project within it.
>>>
>>>
>>>> The new one is designed as a command from day one. (Precisely speaking,
>>>> the original code propagated errors all the way up to the entry point, so
>>>> you could call it and expect it to always return. Rafael introduced the
>>>> error() function later, and we now depend on that function not returning.)
>>>>
>>>
>>> I think this last was a mistake.
>>>
>>> The fact that the code propagates errors all the way up is fine, and
>>> even good. We don't necessarily need to be able to *recover* from link
>>> errors and try some other path.
>>>
>>> But we absolutely need the design to be a *library* that can be embedded
>>> into other programs and tools. I can't even begin to count the use cases
>>> for this.
>>>
>>> So please, let's go back to where we *do not* rely on never-returning
>>> error handling. That is an absolute mistake.
>>>
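
As an illustration of error handling that returns rather than exits, the following sketch propagates a failure up to the caller as a value. It assumes nothing about lld's internals: LinkError, SymbolTable, resolve, and resolveAll are hypothetical names, and symbol resolution is reduced to a map lookup.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical error value: an empty optional means success.
using LinkError = std::optional<std::string>;

// Hypothetical symbol table mapping defined symbol names to addresses.
using SymbolTable = std::map<std::string, unsigned long>;

// Returns an error message instead of calling exit(1) when the symbol is
// undefined; on success, stores the address in addr.
LinkError resolve(const SymbolTable &defined, const std::string &name,
                  unsigned long &addr) {
  auto it = defined.find(name);
  if (it == defined.end())
    return "undefined symbol: " + name;
  addr = it->second;
  return std::nullopt;
}

// Propagates the first failure to the caller, which decides what to do.
LinkError resolveAll(const SymbolTable &defined,
                     const std::vector<std::string> &refs) {
  for (const std::string &name : refs) {
    unsigned long addr = 0;
    if (LinkError err = resolve(defined, name, addr))
      return err;
  }
  return std::nullopt;
}
```

A command-line driver can still print the message and exit with status 1 at the top level, while an embedding program keeps running and reports the failure however it likes.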
>>>
>>>>
>>>>> If you want to consider changing that, we should have a fresh (and
>>>>> broad) discussion, but it goes pretty firmly against the design of the
>>>>> entire LLVM project. I also don't really understand why it would be
>>>>> beneficial.
>>>>>
>>>>
>>>> I'm not against organizing it as a library as long as it does not make
>>>> things too complicated
>>>>
>>>
>>> I am certain that it will make things more complicated, but that is the
>>> technical challenge that we must overcome. It will be hard, but I am
>>> absolutely confident it is possible to have an elegant library design here.
>>> It may not be as simple as a pure command line tool, but it will be
>>> *dramatically* more powerful, general, and broadly applicable.
>>>
>>> The design of LLVM is not the simplest way to build a compiler. But it
>>> is valuable to all of those working on it precisely because of this
>>> flexibility imparted by its library oriented design. This is absolutely not
>>> something that we should lose from the linker.
>>>
>>>
>>>> , and I guess reorganizing the existing code as a library is relatively
>>>> easy because it's still pretty small, but I don't really want to focus on
>>>> that until it becomes usable as an alternative to GNU ld or gold. I want to
>>>> focus on the linker features themselves at this moment. Once it's complete,
>>>> it becomes more clear how to organize it.
>>>>
>>>
>>> Ok, now we're talking about something totally reasonable.
>>>
>>> If it is easier for you all to develop this first as a command line
>>> tool, and then make it work as a library, sure, go for it. You're doing the
>>> work, I can hardly tell you how to go about it. ;]
>>>
>>
>> It is not only easier for me to develop but is also super important for
>> avoiding over-designing the API of the library. Until we know what we need
>> to do and what can be done, it is too easy to make the mistake of designing
>> an API that is supposed to cover everything, including hypothetical,
>> unrealistic use cases. Such an API would slow development significantly,
>> and it is painful to abandon once we realize it was not needed.
>>
>
> I'm very sympathetic to the problem of not wanting to design an API until
> the concrete use cases for it appear. That makes perfect sense.
>
> We just need to be *ready* to extend the library API (and potentially find
> a more fine grained layering if one is actually called for) when a
> reasonable and real use case arises for some users of LLD. Once we have
> people that actually have a use case and want to introduce a certain
> interface to the library that supports it, we need to work with them to
> figure out how to effectively support their use case.
>
> At the least, we clearly need the super simple interface[1] that the
> command line tool would use, but an in-process linker could also probably
> use.
>
> We might need minor extensions to effectively support Arseny's use case (I
> think an in-process linker is a *very* reasonable thing to support, I'd
> even like to teach the Clang driver to optionally work that way to be more
> efficient on platforms like Windows). But I have to imagine that the
> interface for an in-process static linker and the command line linker are
> extremely similar if not precisely the same.
>
> At some point, it might also make sense to support more interesting
> linking scenarios such as linking a PIC "shared object" that can be mapped
> into the running process for JIT users. But I think it is reasonable to
> build the interface that those users need when those users are ready to
> leverage LLD. That way we can work with them to make sure we don't build
> the wrong interface or an overly complicated one (as you say).
>
> I don't disagree with anything Chandler said, but it's worth noting that
> we *already* have a specialized in-process linker used by MCJIT to resolve
> relocations and patch things like symbolic calls. It'd be really really
> nice if the new linker library supported that use case.
>
>
> This is, in fact, the goal that Dave and I mentioned :)

Lang and I have been talking about this wrt MCJIT for a long time.

-eric

> Make sense?
> -Chandler
>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>