[LLVMdev] Hello World assembly without clib "puts"?

Joshua Cranmer pidgeot18 at gmail.com
Sun Sep 30 11:54:19 PDT 2012


On 9/30/2012 12:30 PM, Andrew Pennebaker wrote:
>
>     The more important question is: why would you want to do that?
>      What problem are you trying to solve?
>
>
> As weird as it sounds, I'm looking for multiplatform assembly 
> languages. I want to learn assembly, but I want my knowledge and code 
> to carry over no matter which operating system I'm using. I regularly 
> use Windows, Mac, and Linux, and I don't want to have to rewrite my 
> codebase every time I boot into another operating system.

LLVM IR is not an assembly language. It is a public, well-documented 
compiler intermediate representation that abstracts away several (but 
not all) details of platform ABIs.

> I can do this by writing assembly code that calls C functions, but I 
> get the distinct feeling: /Why am I doing it this way? Why not just 
> write in C?/ And there's only so much assembly you can learn by 
> calling C functions, instead of writing lower level code.

Different operating systems have _extremely_ different conventions for 
system calls, and the system calls themselves are quite different 
between operating systems. This holds even on the same architecture: 
32-bit x86 Linux traditionally routes every system call through the 
single interrupt int 0x80, for example, while DOS programs use a whole 
range of different interrupt numbers. If you want portable assembly, use 
C (which has often been called, literally, portable assembly language).
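
As a rough sketch of what "use C" buys you here, the following writes 
the greeting without puts by calling the OS-level write routine 
directly. The helper names write_stdout and hello are made up for 
illustration, and it assumes a POSIX system (write from unistd.h) or 
Win32 (GetStdHandle and WriteFile); other platforms would need their 
own branch.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Hypothetical helper: write len bytes to standard output.
 * Returns the number of bytes written, or -1 on error. */
static long write_stdout(const char *buf, size_t len)
{
#ifdef _WIN32
    DWORD written = 0;
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == INVALID_HANDLE_VALUE ||
        !WriteFile(out, buf, (DWORD)len, &written, NULL))
        return -1;
    return (long)written;
#else
    /* POSIX write(2); the C library wrapper performs the real syscall. */
    return (long)write(STDOUT_FILENO, buf, len);
#endif
}

/* Hypothetical helper: print the greeting without puts or printf. */
static long hello(void)
{
    static const char msg[] = "Hello, world!\n";
    return write_stdout(msg, sizeof msg - 1); /* exclude trailing NUL */
}
```

Calling hello() from main prints the message on either family of 
systems. The per-OS difference is confined to one #ifdef, and the 
compiler and C library supply the actual system call sequence.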

> I understand that OS's have different conventions for I/O, but what I 
> don't understand is why multiplatform assembly languages like LLVM, 
> NASM, YASM, FASM, and Gas don't give coders a macro or instruction 
> set that gets expanded to the actual, per-OS instructions during 
> assembly. I guess it lowers development efforts to reuse libc rather 
> than add multiplatform I/O assembly macros. Smaller, non-libc 
> dependent binaries don't matter in a world with hefty hard drives.

Only one of those languages is intended to be "multiplatform" in the 
sense that it can be compiled to two different platforms 
(OS/architecture combinations) reliably, and that one isn't an assembly 
language but a compiler IR. NASM, YASM, and FASM are all Intel-syntax 
x86 assemblers with varying degrees of macro support and output format 
support. Gas is pretty much a suite of assemblers that have a 
more-or-less uniform syntax.

Reliably abstracting over I/O for an actual assembler is impossible, 
since the registers, stack layout, and operands you need for actual 
syscalls differ wildly from platform to platform. It's pointless for 
LLVM IR, since it's designed mostly to handle the output of compilers, 
which are already going to use libc if possible. And libc isn't that 
large: I measured /lib/libc.so.6 (i.e., glibc) at about 1MB on my 
laptop, and that's probably the heftiest C standard library 
implementation around for *nix platforms. When you link to it as a 
shared library, all it requires is a few shared library entries that 
amount to only a few K at most.
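
To make the point about registers and operands concrete, here is a 
minimal sketch of the same "write" system call issued by hand on two 
Linux targets, using GCC/Clang extended inline assembly. The function 
name raw_write is hypothetical. It assumes Linux on x86-64 (syscall 
number 1, arguments in rdi, rsi, rdx) or 32-bit x86 (syscall number 4 
via int 0x80, arguments in ebx, ecx, edx); anywhere else it simply 
falls back to the C library's write.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: issue write(fd, buf, len) without going through
 * the libc wrapper where this sketch knows how to do so. */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64 Linux: number in rax, args in rdi/rsi/rdx.
     * The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__i386__)
    long ret;
    /* 32-bit x86 Linux: number in eax, args in ebx/ecx/edx. */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(4L), "b"(fd), "c"(buf), "d"(len)
                      : "memory");
    return ret;
#else
    /* Unknown platform: let the C library handle it. */
    return (long)write(fd, buf, len);
#endif
}
```

Even between these two closely related targets, the instruction, the 
syscall number, and every argument register differ, and none of it 
carries over to Windows or Mac OS X.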

-- 
Joshua Cranmer
News submodule owner
DXR coauthor
