[LLVMdev] LLVM IR is a compiler IR

Wed Oct 5 05:01:53 PDT 2011

Hi Oscar,

>>> There are places where compatibility with the native C ABI is taken too
>>> far. For instance, some time ago I noted that what the user sets through
>>> Module::setDataLayout is simply ignored.
>>
>> It's not ignored: it's used by the IR-level optimizers.  That way these
>> optimizers can know stuff about the target without having to be linked
>> to a target backend.
>
> Well, it is used by one layer, ignored by another. Anyway, LLVM is not
> doing what the user expects.

It's not doing what *you* expect: it doesn't match your mental model of what
it is for (or should be for).  The question is whether LLVM should be changed
or your expectations should be changed.  Just observing the mismatch between
your expectations and current reality is not in itself an argument that LLVM
should be changed.

>>> LLVM uses the data layout
>>> required by the native C ABI, which is hardcoded into LLVM's source
>>> code. So I asked: pass the value set by Module::setDataLayout to the
>>> layers that are interested in it, as any user would expect.
>>
>> There are two classes of information in datalayout: things which correspond
>> to stuff hard-wired into the target processor (for example that x86 is little
>> endian), and stuff which is not hard-wired in (for example the alignment of
>> x86 long double, which is 4 or 8 bytes on x86-32 depending on whether you are
>> on Linux, Darwin or Windows).  Hoping to have code generators override the
>> hard-wired stuff if they see something different in the data layout is just
>> too much to ask for - e.g. the x86 code generators are never going to produce big
>> endian code just because you set big-endianness in the datalayout.  Even the
>> second class of "soft" parameters is not completely flexible: for example most
>> processors enforce a minimum alignment for types, and trying to reduce it by
>> giving types a lesser alignment in the datalayout just isn't going to work.
>> So given that the ways in which codegen could adapt to various datalayout
>> settings are quite limited and constrained by the target, does it really make
>> sense to try to parametrize the code generators by the datalayout at all?
>> In any case, it might be good if the code generators produced a warning if they
>> see that the datalayout string doesn't correspond to what codegen thinks it
>> should be (I thought someone added that already?).
>
> You focus your reasoning on possible wrong uses of the data layout
> setting (endianness) when, as you say, there are other uses which are
> perfectly legit (using a specific alignment within the limits allowed by
> the processor.)  So if I need to align my data differently from
> what the C ABI requires or generate code for a platform that LLVM still
> does not know about, my only solution is to patch LLVM because the value
> set through one of its APIs is ignored in key places, as LLVM assumes
> that everybody wants full interoperability with C. This is the kind of
> logic that tells me that LLVM is a C-obsessed project: any requirement
> that falls outside the needs of a C compiler writer is seen as
> superfluous even if it does not conflict with the rest of LLVM.

You are talking to the wrong person: I pretty much only use Ada, not C, so I
don't think I'm C obsessed.  Yet I never had any problems using LLVM with Ada.
LLVM gives you several mechanisms for aligning things the way you like.  Are
they inadequate?  Do you have a specific example of something you find
problematic?
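
To make the distinction between "hard-wired" and "soft" datalayout fields
concrete, here is a minimal sketch (in Python, not LLVM code) of the kind of
check a code generator could do before warning about a mismatch.  It assumes
the usual datalayout syntax: fields separated by "-", "e"/"E" for endianness,
and "<type>:<abi>:<pref>" with alignments given in bits.  The function names
and the abbreviated layout strings are made up for illustration; they are not
an LLVM API and not the exact strings any target uses.

```python
# Sketch only: parses a small subset of the datalayout syntax.
# parse_datalayout and diff_layouts are hypothetical helper names.

def parse_datalayout(layout):
    """Return endianness and ABI alignments (in bytes) found in `layout`."""
    info = {"endian": None, "abi_align": {}}
    for field in layout.split("-"):
        if field == "e":
            info["endian"] = "little"
        elif field == "E":
            info["endian"] = "big"
        elif field and field[0] in "ifv" and ":" in field:
            # e.g. "f80:32:32" -> type "f80", ABI alignment 32 bits = 4 bytes
            parts = field.split(":")
            info["abi_align"][parts[0]] = int(parts[1]) // 8
    return info


def diff_layouts(module_layout, target_layout):
    """List (kind, field, module_value, target_value) for each mismatch.

    "hard" marks endianness, which a backend cannot change on request;
    "soft" marks alignments, which vary between platforms.
    """
    m = parse_datalayout(module_layout)
    t = parse_datalayout(target_layout)
    problems = []
    if m["endian"] != t["endian"]:
        problems.append(("hard", "endian", m["endian"], t["endian"]))
    for key in sorted(set(m["abi_align"]) | set(t["abi_align"])):
        a = m["abi_align"].get(key)
        b = t["abi_align"].get(key)
        if a != b:
            problems.append(("soft", key, a, b))
    return problems


# Illustrative, abbreviated layout strings (assumptions, not real targets).
LAYOUT_A = "e-p:32:32:32-i64:32:64-f80:32:32"
LAYOUT_B = "e-p:32:32:32-i64:32:64-f80:128:128"
LAYOUT_BIG = "E-p:32:32:32-i64:32:64-f80:32:32"

if __name__ == "__main__":
    for kind, field, got, want in diff_layouts(LAYOUT_A, LAYOUT_B):
        print("warning: %s field %s: module says %s, target says %s"
              % (kind, field, got, want))
```

A check like this only reports the difference; whether codegen should honour
a differing "soft" value is the open question above.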

Ciao, Duncan.