[cfe-dev] LLVM/Clang and getting rid of the system linker (GNU ld or MSVC link.exe)

Ruben Van Boxem vanboxem.ruben at gmail.com
Sun Apr 17 12:31:44 PDT 2011


2011/4/17 Michael Spencer <bigcheesegs at gmail.com>:
> On Sun, Apr 17, 2011 at 10:29 AM, Ruben Van Boxem
> <vanboxem.ruben at gmail.com> wrote:
>> Hi,
>>
>> I've been thinking recently about how Clang can solve its system
>> linker dependency problem. I've seen some proposals and thoughts on
>> the mailing list about this issue, so I think this will be interesting
>> to explore. Please excuse my non-professional language, I am not a
>> Computer Science master, just a very interested hobbyist.
>
> I would not call depending on the system linker a problem. The system
> linker saves us from worrying about the nitty gritty details of how
> the system works.
>
>> Concretely, Clang now uses the system linker (GNU ld or MSVC link.exe)
>> to turn the series of object files into an executable/library. This is
>> not very good, because Clang depends on the whims of a completely
>> different project to work correctly, and this is very platform
>> dependent on too high a level.
>
> What do you mean by on too high a level? Also, linkers don't really
> change much, and in fact are kinda standardized, so I wouldn't use the
> term whim. Dealing with linkers is a rather small part of the Clang
> codebase.

Let's talk Windows, because that's what I'm mostly concerned with.
Projects that cross-compile (like VLC, and any GNU project) have to use
a GNU toolchain, because GNU ld is the only linker capable of producing
a win32 executable on a non-Windows OS (you are not legally allowed to
use Visual Studio or the Windows SDK on a non-Windows machine). On top
of that, it has its problems. For one, its architecture command-line
arguments differ from the GCC arguments that mean the same thing.
Secondly, patching it requires a lot of work, as it's not a simple
project and carries a lot of legacy code.
Thirdly, it requires a UNIX shell to build (which is really my biggest
gripe), which either means a toolchain needs to be cross-compiled, or
you need to use slow MSYS/Cygwin to handle that (which I currently
do), and those are not without their caveats. I agree the mingw
runtime also requires a Bash shell, but I don't think this is the
biggest hurdle to overcome. Finally, there's the "whim" part. GNU ld
is tightly bound to GCC, meaning that any change in the latter will be
reflected by a change in the former. This is fine as long as Clang
doesn't decide to do things differently. Again, I don't know the
details, but both projects may conflict someday, and then you're
either stuck with forking the "old" ld, or starting to replace it
then. I'd rather see a replacement now.

On a side note, the license of GNU ld is, well, GPL. I'm sure that's a
showstopper for people trying to bundle integrated linker
functionality in their commercial projects.

>
>> So the current setup is (how I see it):
>> 1. Clang compiles C/C++ to object files (either GNU *.o or MSVC *.obj
>> files). This happens through LLVM IR and accompanying optimization
>> runs of the LLVM toolchain.
>> 2. The system linker is responsible for all the usual link stuff:
>> turning the object files into native binaries, doing some
>> link-time-optimizations while it works its magic.
>>
>> How I would propose to have it in an ideal world:
>>
>> 1. Clang compiles C/C++ into LLVM IR.
>> 2. LLVM toolchain stuff optimizes everything as well as it can.
>> 3. LLVM linker (+Clang?)  links together the IR files into one object
>> file (perhaps existing GNU .o or MSVC .obj files), executing its
>> link-time-optimizations in the process.
>> 4. A *simple* tool turns the complete object file into native
>> executable format, adding the platform-dependent parts that are
>> missing from the semi-platform-agnostic file created in the previous
>> step. This tool can in the first steps of the implementation be the
>> system linker, but all it would do is do the object->executable
>> conversion.
>
> Here's where the main problem is. There is no *simple* tool to do
> this. You need a full linker to turn an object file into an
> executable. You have to link to the c and system libraries for any
> real program, even "puts("hello world");". There's a lot of code that
> gets run before main is entered.

Agreed, there is more to this than I let on. But one can always
hope and be naive :)

Would my LTO story work though? Unfortunately, it would have to
exclude external (static) libraries.
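
To make the question concrete, here is a rough sketch of the IR-level
pipeline from my list, written as a small Python helper that only
builds the command lines and runs nothing. The function name
ir_pipeline and the intermediate file names (whole.bc and so on) are
made up for illustration. The tools are the stock clang, llvm-link,
opt and llc, but the exact spelling of the optimization flag for opt
may differ between LLVM versions, so treat it as an assumption.

```python
from typing import List


def ir_pipeline(sources: List[str], output: str) -> List[List[str]]:
    """Return the commands for a 'merge IR first, then optimize' build.

    Hypothetical helper: it only describes the steps, it does not run them.
    """
    commands: List[List[str]] = []
    bitcode: List[str] = []
    for src in sources:
        bc = src.rsplit(".", 1)[0] + ".bc"
        # Step 1: compile each translation unit to LLVM bitcode
        # instead of a native object file.
        commands.append(["clang", "-c", "-emit-llvm", src, "-o", bc])
        bitcode.append(bc)
    # Step 3: merge all bitcode files into one module.
    commands.append(["llvm-link", *bitcode, "-o", "whole.bc"])
    # Step 2 applied to the merged module, so the optimizer sees the
    # whole program (flag spelling is an assumption, see above).
    commands.append(["opt", "-O2", "whole.bc", "-o", "whole.opt.bc"])
    # Lower the optimized module to a single native object file.
    commands.append(["llc", "-filetype=obj", "whole.opt.bc", "-o", "whole.o"])
    # Step 4: for now the clang driver still calls the system linker here,
    # which pulls in the C library and the startup code.
    commands.append(["clang", "whole.o", "-o", output])
    return commands


if __name__ == "__main__":
    for cmd in ir_pipeline(["main.c", "util.c"], "app"):
        print(" ".join(cmd))
```

Note that external static libraries never show up as bitcode in this
sketch; they only enter at the last command, which is exactly the
limitation I mean.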

>
>> The "linker" may be the assembler or something else, this is what I
>> don't know. What I do know is that this setup (if possible) provides a
>> way to integrate more LLVM optimizations in a C/C++/<other language of
>> your choice> toolchain, remove any complicated linker applications,
>> and be easily extensible to new platforms.
>>
>> I understand the hand-wavingness of this whole story, but any comments
>> or thoughts on what is wrong in my reasoning or not as simple as it
>> seems are very welcome.
>>
>> Thanks!
>>
>> Ruben
>
> I agree that LLVM should have a linker, and I am currently writing one
> (see Object Files in LLVM from last year's dev meeting). I intend for
> it to replace the system linker on the major platforms (Win, Linux,
> FBSD, Mac).

Aha, so that presentation did turn out to be a work in progress! That's
great news! Would you have a timeline for a usable form of that project
;)? I would offer to help, but I fear that my knowledge is nowhere near
enough to be useful.

Thanks!

>
> - Michael Spencer
>



