[LLVMdev] Hello World assembly without clib "puts"?
David.Chisnall at cl.cam.ac.uk
Sun Sep 30 11:35:40 PDT 2012
On 30 Sep 2012, at 18:30, Andrew Pennebaker wrote:
>> The more important question is: why would you want to do that? What problem are you trying to solve?
> As weird as it sounds, I'm looking for multiplatform assembly languages. I want to learn assembly, but I want my knowledge and code to carry over no matter which operating system I'm using. I regularly use Windows, Mac, and Linux, and I don't want to have to rewrite my codebase every time I boot into another operating system.
In that case, LLVM IR is a really bad choice. It has to be in static single assignment form, which makes it totally unlike any real assembly (it is designed to be easy to generate machine code from).
LLVM IR also does not abstract differences in calling conventions, nor does it have a macro language, and so LLVM IR is intrinsically not portable between architectures and often not between operating systems on the same architecture.
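A small illustration of why the IR is tied to a target, sketched in C because the front end is where the difference is baked in. The function names `long_bits` and `add` are made up for this example. It relies on the fact that `long` is 64 bits wide on 64-bit Linux, OS X, and FreeBSD but 32 bits wide on 64-bit Windows, so the same source is lowered to `i64` arithmetic for one target and `i32` for the other.

```c
#include <assert.h>
#include <limits.h>

/* Width of C's `long`, in bits, on the target this file is compiled for.
 * Hypothetical helper, only here to make the ABI difference visible. */
int long_bits(void) {
    assert(CHAR_BIT == 8);  /* assumption of this sketch: 8-bit bytes */
    return (int)(sizeof(long) * CHAR_BIT);
}

/* A C front end has to pick a concrete integer width for `long` before it
 * emits IR: i64 on LP64 targets (64-bit Linux, OS X, FreeBSD), i32 on
 * LLP64 targets (64-bit Windows). The resulting IR is therefore already
 * specific to one ABI, even though this source is not. */
long add(long a, long b) {
    return a + b;
}
```

Calling `long_bits()` on each platform you use shows which variant you get. Hand-written IR has to make the same choice up front.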
> I can do this by writing assembly code that calls C functions, but I get the distinct feeling: Why am I doing it this way? Why not just write in C? And there's only so much assembly you can learn by calling C functions, instead of writing lower level code.
Learning C will be a lot more useful to you.
> I understand that OS's have different conventions for I/O, but what I don't understand is why multiplatform assembly languages like LLVM, NASM, YASM, FASM, and Gas don't give coders a macro or instruction set that gets expanded to the actual, per-OS instructions during assembly. I guess it lowers development efforts to reuse libc rather than add multiplatform I/O assembly macros. Smaller, non-libc dependent binaries don't matter in a world with hefty hard drives.
Using libc is more sensible because libc is the official public interface for any POSIX system. On Windows, for example, the underlying kernel routines are undocumented and are not part of the public API. On OS X, there are quite limited ABI stability guarantees at the kernel level and the official way of interacting with the kernel is via libc. A non-libc-dependent (or, more accurately, a non-libSystem-dependent) binary is one that may be broken by a minor kernel upgrade (this doesn't often happen, but it's not guaranteed not to).
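A minimal sketch of the difference, assuming a POSIX system with `<unistd.h>`. The function names are hypothetical. The first function goes through the libc `write()` wrapper and should build unchanged on Linux, OS X, and FreeBSD. The second bypasses the wrapper with a Linux-specific system call number and is deliberately compiled out everywhere else.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE        /* makes glibc's <unistd.h> declare syscall() */
#endif
#include <assert.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>   /* SYS_write: the Linux system call number */
#endif

/* Portable across POSIX systems: libc's write() hides the kernel ABI.
 * Returns the number of bytes written, or -1 on error. */
long hello_via_libc(int fd) {
    static const char msg[] = "Hello, world\n";
    assert(fd >= 0);
    return (long)write(fd, msg, sizeof msg - 1);
}

/* Not portable: SYS_write is 1 on x86-64 Linux and a different number on
 * other kernels and architectures, so this path exists only on Linux.
 * Returns -1 on platforms where the sketch does not attempt it. */
long hello_via_raw_syscall(int fd) {
    assert(fd >= 0);
#if defined(__linux__)
    static const char msg[] = "Hello, world\n";
    return syscall(SYS_write, fd, msg, sizeof msg - 1);
#else
    return -1;
#endif
}
```

Calling `hello_via_libc(1)` from `main` prints the greeting to standard output without `puts`. It still depends on libc, which is the point: the wrapper is the stable interface, not the number behind it.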
Even on open source kernels, such as Linux or FreeBSD, if you look at the system call table you will see that, once you stray past the trivial things that were inherited from ancient UNIX, there are significant differences between platforms and even between versions of the same platform. FreeBSD and Linux may implement the same C library functionality in terms of wildly different APIs. It takes more than a little assembly macro to, for example, abstract the difference between epoll() and kqueue(). Once you stray to something like OS X, you may see even wider variation, such as things being implemented in terms of Mach ports.
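To give a feel for how much the two interfaces differ, here is a rough sketch in C of a single helper, `wait_readable`, that waits for one descriptor to become readable. The helper and its return convention are invented for this example. Even this toy case needs different types, a different registration step, and a different timeout representation on each side.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

#if defined(__linux__)
#include <sys/epoll.h>
#elif defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#else
#error "this sketch only covers epoll (Linux) and kqueue (FreeBSD, OS X)"
#endif

/* Hypothetical helper: returns 1 if fd is readable, 0 on timeout,
 * -1 on error. A negative timeout_ms means wait indefinitely. */
int wait_readable(int fd, int timeout_ms) {
    assert(fd >= 0);
#if defined(__linux__)
    /* epoll: create an instance, register interest, then wait. */
    int ep = epoll_create1(0);
    if (ep < 0) return -1;
    struct epoll_event ev = {0}, out;
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    int n = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0)
        n = epoll_wait(ep, &out, 1, timeout_ms);  /* timeout in ms */
    close(ep);
    return n < 0 ? -1 : (n > 0);
#else
    /* kqueue: registration and waiting happen in one kevent() call,
     * and the timeout is a struct timespec rather than milliseconds. */
    int kq = kqueue();
    if (kq < 0) return -1;
    struct kevent change, out;
    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    struct timespec ts;
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
    int n = kevent(kq, &change, 1, &out, 1, timeout_ms < 0 ? NULL : &ts);
    close(kq);
    if (n > 0 && (out.flags & EV_ERROR)) return -1;
    return n < 0 ? -1 : (n > 0);
#endif
}
```

This covers only the simplest readable-descriptor case. Edge-triggered modes, timers, signals, and process events diverge further, which is why the abstraction usually lives in a C library rather than in an assembler macro.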
If you want portable code, don't use assembly.