[cfe-dev] preserving type signatures
Renato Golin
rengolin at systemcall.org
Thu Mar 25 03:16:35 PDT 2010
On 24 March 2010 22:05, David Chisnall <theraven at sucs.org> wrote:
> Then you need to use something other than LLVM IR. You also need to start with a source language other than C, because things like sizeof(int) change on different platforms, as do a lot of things defined in headers and a number of predefined macros. Even preprocessing the same C source on FreeBSD/x86 and Solaris/SPARC64 will give very different output for all but the most trivial programs. Compiling it to IR will add even more differences.
I thought the sizeof problem was solved by the data layout
information... But yes, headers will mess things up.
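To make the sizeof point concrete, here is a minimal C sketch. The function names are hypothetical and only for illustration; nothing here comes from an actual project. As far as I understand it, the front end folds each sizeof into a plain constant before any IR is emitted, so the data layout string describes the target the IR was built for but cannot turn those constants back into something portable.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: sizeof(int) is a compile-time constant, so the
   IR for this function typically just returns a literal that was
   chosen for one specific target. */
size_t int_width_bits(void) {
    return sizeof(int) * CHAR_BIT;
}

/* Hypothetical helper: long is 32 bits under some ABIs and 64 bits
   under others, so the multiplier baked into the IR differs too. */
size_t bytes_for_longs(size_t count) {
    /* Guard against overflow in the multiplication below. */
    assert(count <= (size_t)-1 / sizeof(long));
    return count * sizeof(long);
}
```

Calling bytes_for_longs(3) gives 3 * sizeof(long) on whatever target the file was compiled for, which is exactly the kind of value that no longer matches once the same IR is moved to a platform with a different long.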
One more thing that occurred to me: if you generate the same IR for
every target (see my inline answer below), some optimizations might
assume the wrong things and break the ABI, not just slow down the program.
> No. LLVM IR is less expressive than C.
Indeed, and that's when all my assumptions went away.
The whole point of converting to IR is to reduce the expressiveness so
the optimizations and codegen can work with something simple and avoid
herculean tasks. So having an IR that is as expressive as any language
(or the mix of all of them) is not just difficult, it's wrong. Right?
;)
> There is no way of generating 'plain and simple IR' that can be turned into native code trivially. Consider something simple like a function taking an argument that is a union of an int and a void*. On any sane ABI, this argument will be passed in a register, so the IR will use an i32 or i64. Linux/x86, however, will pass it via a pointer. How would you represent this in LLVM IR?
If you leave this to the lower levels to decide, all definitions would be:
%union.foo = type { i32, i8* } ; obvious platform-dependent problems here, solved by data layout, maybe
define void @func(%union.foo %arg)
And the codegen would change to pointers or registers.
But, as I said above, that would encourage (or discourage)
optimizations at this level that could break the logic. Also, it
would serve no purpose, since the idea of the IR is to make things
simpler, not to run opt/codegen on pure C.
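For reference, the C side of the union example discussed above might look like the sketch below. The names foo_pointer and foo_size are hypothetical, chosen only for illustration. The source is identical on every target; whether the argument travels in a register or through a pointer is decided by the platform ABI, which is why a single target-neutral IR signature is hard to write.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the example: an int overlaid with a void pointer. */
union foo {
    int   i;
    void *p;
};

/* Takes the union by value. How arg is actually passed (register,
   stack slot, or hidden pointer) is up to the target ABI, not to
   anything visible in this source. */
void *foo_pointer(union foo arg) {
    return arg.p;
}

/* The size is also target-dependent: at least as large as the
   biggest member, plus any padding the ABI requires. */
size_t foo_size(void) {
    return sizeof(union foo);
}
```

On a typical 32-bit target foo_size() would match sizeof(int), while on a typical LP64 target it would match sizeof(void *), so even this tiny type already yields different IR per platform.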
Thanks for the clarifications.
cheers,
--renato
http://systemcall.org/
Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm