[llvm-dev] Need help with code generation

mats petersson via llvm-dev llvm-dev at lists.llvm.org
Sun Mar 20 08:53:29 PDT 2016


Yes, in my case, `main` is in the runtime library and is the first thing
that runs from MY code, and in turn calls the __PascalMain that the
compiler generated. The compiler knows about certain built-in functions
such as `write` and `writeln`, and will replace those with corresponding
calls to the runtime library function equivalent.
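
As a rough sketch of that arrangement (not the actual lacsap runtime): the
names `rt_write_str`, `rt_writeln` and `runtime_main` below are made up for
illustration, and `__PascalMain` is defined by hand here only so the example
links without a compiler-generated object file.

```c
#include <stdio.h>

/* Hypothetical runtime routines that a compiler could emit calls to
   in place of the built-ins `write` and `writeln`. */
void rt_write_str(const char *s)
{
    fputs(s, stdout);
}

void rt_writeln(void)
{
    fputc('\n', stdout);
}

/* Stand-in for the compiler-generated program body. Normally this symbol
   would come from the object file the compiler produced. */
void __PascalMain(void)
{
    rt_write_str("Hello from Pascal");
    rt_writeln();
}

/* In the runtime library this function would be the C `main`. It is named
   runtime_main here only so the sketch can be called from other code. */
int runtime_main(int argc, char **argv)
{
    (void)argc;  /* the Pascal program never sees the C arguments */
    (void)argv;
    /* initialisation of other modules would run here */
    __PascalMain();
    return 0;
}
```

In a real runtime library `runtime_main` would simply be `main`, so the C
startup code calls it, and a `writeln('...')` in the source would be lowered
to calls along the lines of the two `rt_` functions.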

Of course, you can completely forgo the convenience of the C library and
implement your own library, starting with Linux or Windows system calls
written in assembler [and if you plan on supporting further OSes, obviously
you need to implement those too]. It's worth noting that system calls
differ per architecture, so if you plan on supporting more than x86-32 and
x86-64, you will need to write assembler code for each of those too. Oh,
and system calls are quite different between OSes.
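
To give a feel for what going without the C library involves, here is a
minimal sketch of a single `write` system call issued directly on Linux. The
function name `raw_write` is made up for illustration. The system call
numbers (1 on x86-64, 64 on AArch64) are the Linux ones, and any other
platform falls back to the libc `write` purely so the sketch still builds
there.

```c
#include <stddef.h>

#if !(defined(__linux__) && (defined(__x86_64__) || defined(__aarch64__)))
#include <unistd.h>   /* fallback only: this path still depends on libc */
#endif

/* Write len bytes from buf to fd, bypassing libc where possible.
   On the two Linux targets the raw kernel result is returned, so an
   error shows up as a negative errno value. */
long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64: call number in rax, arguments in rdi, rsi, rdx.
       The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__aarch64__)
    /* AArch64: call number in x8, arguments in x0, x1, x2. */
    register long x8 __asm__("x8") = 64;
    register long x0 __asm__("x0") = fd;
    register const void *x1 __asm__("x1") = buf;
    register size_t x2 __asm__("x2") = len;
    __asm__ volatile ("svc #0"
                      : "+r"(x0)
                      : "r"(x8), "r"(x1), "r"(x2)
                      : "memory");
    return x0;
#else
    /* Not one of the targets above: use libc (returns -1 on error). */
    return (long)write(fd, buf, len);
#endif
}
```

Even this one call needs a separate branch per architecture, and a complete
runtime would need the same for every system call it uses, on every OS it
supports.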

You will then be able to simply call `ld -o name-supplied-by-compiler
object-files-supplied-by-compiler -lyour-runtime.a`. I decided to go the
simple route of using the already existing C runtime for file and console
I/O, random number, a few floating point functions and some other
functions. Because I'm a bit lazy and think that it's OK to do that. Like I
said earlier in this thread, I do have some plans to replace this in the
future, but it's nowhere near the top of the todo-list.

Just as a reference, the `ld` line of my compiler is:

"/usr/bin/ld" --hash-style=gnu --no-add-needed --build-id --eh-frame-hdr -m
elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o gol
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/../../../../lib64/crt1.o
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/../../../../lib64/crti.o
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/crtbegin.o -L/home/mats/src/lacsap
-L/usr/lib/gcc/x86_64-redhat-linux/4.9.2
-L/usr/lib/gcc/x86_64-redhat-linux/4.9.2/../../../../lib64
-L/usr/local/bin/../lib64 -L/lib/../lib64 -L/usr/lib/../lib64
-L/usr/lib/gcc/x86_64-redhat-linux/4.9.2/../../.. -L/usr/local/bin/../lib
-L/lib -L/usr/lib gol.o -lruntime -lm -lgcc --as-needed -lgcc_s
--no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/crtend.o
/usr/lib/gcc/x86_64-redhat-linux/4.9.2/../../../../lib64/crtn.o

Looking at that, you can probably understand why I decided to NOT write a
bit of code that finds and links to all of those things myself.
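
For comparison, a minimal sketch of the shortcut: build a short command line
for the C compiler driver and let it work out the long `ld` invocation. The
function names and parameters are hypothetical, `-lruntime` and `-lm` are
taken from the `ld` line above, and a POSIX system with the driver on the
PATH is assumed.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill argv with a driver command line. Returns the argument count,
   or 0 if argv (capacity cap) is too small. All names are hypothetical. */
size_t build_link_argv(const char *driver, const char *object,
                       const char *output, const char *runtime_dir_flag,
                       const char **argv, size_t cap)
{
    const char *args[] = {
        driver, object,
        runtime_dir_flag,   /* e.g. "-L/path/to/runtime" */
        "-lruntime",        /* the language's own runtime library */
        "-lm",
        "-o", output,
    };
    size_t n = sizeof args / sizeof args[0];
    if (cap < n + 1)
        return 0;
    memcpy(argv, args, sizeof args);
    argv[n] = NULL;         /* execvp expects a NULL-terminated array */
    return n;
}

/* Run the driver and wait for it. Returns its exit status,
   or -1 if it could not be started or did not exit normally. */
int run_link(const char **argv)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], (char *const *)argv);
        _exit(127);         /* only reached if execvp failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A compiler could call `build_link_argv("clang", "gol.o", "gol", "-L.", argv, 16)`
and then `run_link(argv)`. The driver then supplies the crt files, library
search paths and dynamic linker itself.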

--
Mats

On 20 March 2016 at 15:35, Lorenzo Laneve <lore97drk at icloud.com> wrote:

> I think I'll have to do as you did for the runtime, but will the lib's
> main function be the resulting program's real main function? And your
> __PascalMain implementation is in the object file your compiler created,
> isn't it?
> We can say that the main function a programmer writes is not the very
> first thing that is called in the final program (well obviously).
>
> Regarding your suggestion of calling the C compiler to finish the job, I
> know I can do it, but I don't want to, because I want my compiler to be
> standalone.
>
> On Mar 20, 2016, at 2:02 PM, mats petersson <mats at planetcatfish.com>
> wrote:
>
> Adding back the "all recipients" - sorry, sending message from my phone, I
> forgot...
>
> Since my runtime is for a Pascal compiler, it has to "adapt" the C startup
> into a suitable Pascal environment. This means running the init portion of
> other modules as well as discarding the argc, argv arguments. So, I have a
> C main, which calls __PascalMain, which is the "main" for the pascal
> program itself after some other setup code.
>
> The whole runtime of C is quite complex (in terms of "what code is
> executed in what order", at least), and C++ is a little worse on top of
> that, but basically there is code "before main" in C. If you don't use the
> C library, you will probably have to replace this by some other code that
> does something similar.
>
> But assuming you don't have a very good reason for doing so, I would
> certainly suggest that you make your code simply pass the .bc or .o file
> that your compiler generates to the C compiler.
>
> I would also ignore things like "is it faster to do puts than fputc" - at
> least until you have other things working reasonably well. This according
> to the principle of "avoid premature optimisation". Unless you are really
> familiar with how compilers work and the design thereof, you have "bigger
> fish to fry" than micro-optimising your string output... I have two years
> of experience in writing my own LLVM frontend, and I guarantee that
> optimising string output is dead easy to do "later". Getting the compiler
> to deal with some of the more complex parts of whatever language it is will
> not be...
>
> --
> Mats
>
> On 20 March 2016 at 12:34, Lorenzo Laneve <lore97drk at icloud.com> wrote:
>
>> My goal is a complete and independent compiler for a new, safe and
>> portable programming language.
>> I read the code you linked earlier, but I still don't get it. If I link
>> against that library, will that main function be included and started in
>> the resulting program?
>> And yeah for example I have to initialize stdout, I have to call it
>> before my main function. The runtime library does the trick but I didn't
>> get the main function in your library
>>
>> On Mar 20, 2016, at 1:11 PM, mats petersson <mats at planetcatfish.com>
>> wrote:
>>
>>
>> On 20 Mar 2016 11:13, "Lorenzo Laneve" <lore97drk at icloud.com> wrote:
>> >
>> > So won't my program have as main function the main function declared in
>> the IR?
>>
>> Depends on the linker script (which is "prepared by the compiler vendor"
>> if you use clang or gcc).
>>
>> There will be a statement of "start here" somewhere. It usually isn't
>> "main" but something that runs "global constructors" and various other
>> "needed before main" work.
>>
>> Of course, if you don't use "stdout" and "stdin", and don't need
>> "malloc", etc, etc, you can start directly at main. You have to write your
>> own linker script to do that tho'.
>>
>> What is your final goal?
>>
>> --
>> Mats
>> >
>> > On Mar 20, 2016, at 9:03 AM, mats petersson <mats at planetcatfish.com>
>> wrote:
>> >
>> >>
>> >>
>> >> On 19 March 2016 at 22:15, Lorenzo Laneve <lore97drk at icloud.com>
>> wrote:
>> >>>
>> >>> @james
>> >>> Yeah for code generation I figured out that clang doesn't actually
>> use llc, and I already started reading its code to see how it works.
>> >>> For the ld, there's not a "helper" in the llvm library that calls
>> it, is there?
>> >>> By the way, I thought about calling ld with things like execl() or
>> std::system(), I don't know if it's a good idea, I'm always afraid there
>> are better ways than mine!
>> >>>
>> >>>
>> >>> @mats
>> >>> Yea, I haven't used C's system calls in my own code yet, but I just
>> have to declare the function in the IR modules (e.g. puts or putc) and then
>> link against libc, am I right?
>> >>> Should I use basic C functions such as putc() and getc(), in my
>> runtime library or is there a more efficient way to set up my runtime
>> library?
>> >>
>> >>
>> >> That is what I mean by using C runtime calls: You are calling into C
>> functions in libc, so you need:
>> >> 1. Link to libc.
>> >> 2. run libc startup code.
>> >> 3. call your program's `main` from the libc startup code.
>> >>
>> >> To get this working, you need to either extract/construct the long
>> list of arguments to `ld` (or whatever linker is used in the system) - I
>> looked at doing that, but it gets really messy if you want to support
>> running on anything other than your very own machine, since libraries are
>> installed in different places on different systems [sure, they are
>> "findable", but it's a fairly messy process still]. And of course, you may
>> need `libm` as well, and other parts that in turn depend on the C library -
>> very few libraries written in C are completely independent, most will at
>> the very least call printf or fprintf in their "Help, something didn't
>> work" functions....
>> >>
>> >> Of course, since I'm implementing a "real" language with previously
>> defined functionality for I/O, I'm more reliant on specific behaviour than
>> if I were implementing a language with "my own definition". Having said
>> that, even doing a "simple" putc is quite a few lines of code, including
>> some inline assembler, I think.
>> >>
>> >> The alternative is to have your completely separate functions for I/O
>> and whatever else you need from the system. But if you are interested in
>> writing your own compiler, rather than interested in writing your own OS
>> interface libraries, I'd suggest that you just (more or less directly)
>> interface to the C library. It's relatively easy [as long as your compiler
>> has sane calling conventions, etc] and solves the immediate problem without
>> too much effort.
>> >>
>> >> What I did:
>> >> https://github.com/Leporacanthicus/lacsap/tree/master/runtime
>> >>
>> >> Originally, my runtime was one monolithic .c file, with a couple dozen
>> functions, but as it grew, I decided to move it into separate parts.
>> >>
>> >> Obviously, there are OTHER solutions, but I don't know of a way that
>> trivially removes the need for a libc linkage.
>> >>
>> >> --
>> >> Mats
>> >>>
>> >>>
>> >>> On Mar 19, 2016, at 9:58 PM, mats petersson <mats at planetcatfish.com>
>> wrote:
>> >>>
>> >>>> If you plan on calling C runtime library functions, you probably
>> want to do what I did:
>> >>>> Cheat, and make a libruntime.a (with C functions to do stuff your
>> compiler can't do natively) and then link that using clang or gcc.
>> >>>>
>> >>>>
>> https://github.com/Leporacanthicus/lacsap/blob/master/binary.cpp#L124
>> >>>>
>> >>>> At some point, I plan to replace my runtime library with native
>> Pascal code, at which point I will be able to generate the ELF binary
>> straight from my compiler without the runtime library linking in the C
>> runtime library, but that's not happening anytime real soon. Getting the
>> compiler to compile v5 of Wirth's original Pascal compiler is higher on the
>> list... :)
>> >>>>
>> >>>> --
>> >>>> Mats
>> >>>>
>> >>>> On 19 March 2016 at 20:51, James Molloy via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >>>>>
>> >>>>> Hi Lorenzo,
>> >>>>>
>> >>>>> Clang doesn't call llc; LLVM is compiled into Clang. Clang does
>> call the system linker though.
>> >>>>>
>> >>>>> Making your compiler generate *object* code is very simple. Making
>> it fixup that object code and execute it in memory (JIT style) is also
>> simple. Linking it properly and creating a fixed up ELF file is less
>> simple. For that, you need to compile to object (using
>> addPassesToEmitFile() - see llc.cpp) then invoke a linker. Getting that
>> command line right can be quite difficult.
>> >>>>>
>> >>>>> Rafael, This would be a good usecase for LLD as a library. I heard
>> that this is an explicit non-goal, which really surprised me. Is that
>> indeed the case?
>> >>>>>
>> >>>>> Cheers,
>> >>>>>
>> >>>>> James
>> >>>>>
>> >>>>> On Sat, 19 Mar 2016 at 13:32 Lorenzo Laneve via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >>>>>>
>> >>>>>> I'd like to make my compiler independent, just like Clang. Doesn't
>> Clang call llc and then system's ld by itself? I don't want my compiler to
>> depend on any other program.
>> >>>>>> I guess there will be a class in the llvm library that generates
>> the object files based on the system's triple and data layout, and then
>> call the system's ld?
>> >>>>>>
>> >>>>>> On Mar 19, 2016, at 11:48 AM, Bruce Hoult <bruce at hoult.org> wrote:
>> >>>>>>
>> >>>>>>> If you've created a .bc or a .ll file then the simplest thing is
>> to just give it to clang exactly the same as you would for a .c file. Clang
>> will just Do The Right Thing with it.
>> >>>>>>>
>> >>>>>>> If you don't want to link, then pass flags such as -c to clang as
>> usual.
>> >>>>>>>
>> >>>>>>> e.g.
>> >>>>>>>
>> >>>>>>> ---- hello.ll ----
>> >>>>>>> declare i32 @puts(i8*)
>> >>>>>>> @str = constant [12 x i8] c"Hello World\00"
>> >>>>>>>
>> >>>>>>> define i32 @main() {
>> >>>>>>>   %1 = call i32 @puts(i8* getelementptr inbounds ([12 x i8]* @str, i64 0, i64 0))
>> >>>>>>>   ret i32 0
>> >>>>>>> }
>> >>>>>>> ----------------
>> >>>>>>>
>> >>>>>>> $ clang hello.ll -o hello && ./hello
>> >>>>>>> warning: overriding the module target triple with x86_64-apple-macosx10.10.0
>> >>>>>>> 1 warning generated.
>> >>>>>>> Hello World
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Sat, Mar 19, 2016 at 3:03 AM, Lorenzo Laneve via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >>>>>>>>
>> >>>>>>>> I wrote my compiler and now it generates LLVM IR modules. Now
>> I'd like to go ahead and make an object file and then an executable, just like
>> clang does.
>> >>>>>>>>
>> >>>>>>>> What should I use to create the object files? And then
>> how do I call ld? (not llvm-ld, I want my compiler to work like Clang
>> and I read that Clang doesn’t use llvm-ld).
>> >>>>>>>> _______________________________________________
>> >>>>>>>> LLVM Developers mailing list
>> >>>>>>>> llvm-dev at lists.llvm.org
>> >>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>>>>>>
>> >>>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>
>>
>>
>