[LLVMdev] Re: gcc like attributes and annotations

Thu Mar 2 07:36:56 PST 2006

hi,

Chris Lattner wrote:
>> thanks for your reply.
> 
> Yes, it was added to the .ll/.bc formats:
> http://llvm.cs.uiuc.edu/docs/LangRef.html#globalvars
> http://llvm.cs.uiuc.edu/docs/BytecodeFormat.html#globalinfo

Interesting. I will check it out.
> 
>> Did you think about a mapping of common attributes on different
>> platforms. For instance DLLMain Entry point under Win32 and the
>> __attribute__((constructor)) under Linux.
> 
> 
> __attribute__((constructor)) is handled with the llvm.global_ctors
> global variable (even with llvm 1.6), try it out.
> 
Great!
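
For reference, a minimal sketch of the source-level construct being discussed. It assumes a GCC-compatible compiler (the attribute is a GNU extension); the names `initialized`, `setup` and `wasInitialized` are made up for illustration.

```cpp
// Zero-initialized at load time, before any constructor function runs.
static int initialized = 0;

// GNU extension: the function is called automatically before main().
__attribute__((constructor))
static void setup() { initialized = 1; }

// Lets the rest of the program observe that setup() has already run.
int wasInitialized() { return initialized; }
```

Calling `wasInitialized()` from `main` should then report 1 without `setup` ever being called explicitly.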

>> Okay so I am on quite the opposite attitude than the LLVM team towards
>> that issue :-)
> 
> 
> I don't follow.
>
All I wanted to say here is that while I thought Function was the first
Value to become annotatable, it turned out to be the last one remaining :-)

>>> At one point in time, Value was annotatable.  The problem with this was
>>> twofold:
>>>
>>> 1. This bloats every value in the system by adding an extra pointer.
>>> 2. These annotations would get stale and not be updated correctly.
>>>
>>> The problem is basically that adding annotations really amounts to
>>> extending the LLVM IR, and making it look like something simple doesn't
>>> make it easier to deal with.  For example, if you add an "I'm special"
>>> attribute to an instruction, then the function is cloned by some pass,
>>> is that attribute copied or not?  What if it is deleted, moved,
>>> rearranged, etc?  Further, how can annotations be serialized to .ll
>>> files and .bc files?  In llvm, we always want "opt -pass1 -pass2" to be
>>> the same as "opt -pass1 | opt -pass2", which would break if annotations
>>> can't be serialized (which they can't currently).
>>>
>>
>> I get you 100 % here. But as you say later in the mail, a lot of
>> information is kept in runtime std::map<Value*,foo> side tables. That is
>> really handy at runtime, but I *had* serialization in mind when I was
>> thinking about annotations.
> 
> Okay, if you want to serialize/deserialize, they become much more
> palatable, the implementation just gets stickier.

Hmm, I am not sure I understand you here. What I don't want is for the
spec and the implementation to intermix. If a serialization is in use, it
should have a well-known format that is easy to operate on with small
tools. It should be made of primitive types that can be composed (like
the struct type, for instance). For instance, an annotation value could
refer to a type slot, which means annotations would use the LLVM types.
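
As a rough sketch of such a format (purely hypothetical, not an existing LLVM encoding): an annotation is a name plus a list of fields, and each field is an integer, a string, or a type slot number. `Field`, `Annotation`, `serialize` and `parse` are names invented here, and the one-line text layout is an assumption chosen so that small tools can read it. Annotation names are assumed to contain no whitespace.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One primitive field. TypeSlot stores an index into a (hypothetical) type table.
struct Field {
    enum Kind { Int, Str, TypeSlot } kind;
    std::int64_t num;   // payload for Int and TypeSlot
    std::string text;   // payload for Str
};

struct Annotation {
    std::string name;           // must not contain whitespace
    std::vector<Field> fields;  // composed like the members of a struct
};

// Text form: name, then tagged fields: "i <int>", "s <len> <bytes>", "t <slot>".
std::string serialize(const Annotation& a) {
    std::ostringstream out;
    out << a.name;
    for (const Field& f : a.fields) {
        if (f.kind == Field::Int)      out << " i " << f.num;
        else if (f.kind == Field::Str) out << " s " << f.text.size() << ' ' << f.text;
        else                           out << " t " << f.num;
    }
    return out.str();
}

Annotation parse(const std::string& line) {
    std::istringstream in(line);
    Annotation a;
    in >> a.name;
    char tag;
    while (in >> tag) {
        Field f{Field::Int, 0, ""};
        if (tag == 'i') {
            in >> f.num;
        } else if (tag == 's') {
            std::size_t n = 0;
            in >> n;
            in.get();  // skip the single space before the string bytes
            f.kind = Field::Str;
            f.text.resize(n);
            if (n > 0) in.read(&f.text[0], static_cast<std::streamsize>(n));
        } else if (tag == 't') {
            f.kind = Field::TypeSlot;
            in >> f.num;
        } else {
            break;  // unknown tag: stop rather than guess
        }
        a.fields.push_back(f);
    }
    return a;
}
```

The length prefix on strings is what lets a value such as "struct X" contain spaces and still round-trip through `parse`.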

> 
>> I see annotations as a way to serialize some extra
>> information with the bytecode without having to extend/change the core
>> classes. The best way to implement them at runtime is to use some kind of
>> std::map subscripting, plus the additional benefit that you can
>> serialize it to the bytecode. Perhaps the best of both worlds.
> 
> 
> That's fine, but don't think that makes them solve all of the problems.
> Again, there is still the updating issue.
> 
Hmm. I don't have much experience here, so I think you are right in
this regard. Perhaps the update issue is really in the domain of the
annotation writer. If that person does not think about it, then the
combination of two annotations should be left out. I think one could
implement this using a callback. But see below.
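
A minimal sketch of what such a callback could look like, assuming a side table keyed by value pointer. Nothing here is LLVM API: `Value` is a stand-in struct, and `Policy`, `Note`, `AnnotationTable`, `onClone` and `onErase` are hypothetical names. The `DontCare` default mirrors the policy field proposed in the quoted text further down.

```cpp
#include <map>
#include <string>
#include <utility>

struct Value { std::string name; };  // stand-in, not llvm::Value

// What should happen to a note when its value is cloned.
enum class Policy { DontCare, CopyOnClone };

struct Note {
    std::string text;
    Policy policy;
    Note(std::string t = "", Policy p = Policy::DontCare)
        : text(std::move(t)), policy(p) {}
};

class AnnotationTable {
public:
    void set(const Value* v, Note n) { notes_[v] = std::move(n); }

    // Returns the note attached to v, or nullptr if there is none.
    const Note* get(const Value* v) const {
        auto it = notes_.find(v);
        return it == notes_.end() ? nullptr : &it->second;
    }

    // Callback a pass invokes after cloning `from` into `to`.
    // DontCare notes are not carried over, so they cannot go stale on the clone.
    void onClone(const Value* from, const Value* to) {
        auto it = notes_.find(from);
        if (it != notes_.end() && it->second.policy == Policy::CopyOnClone) {
            Note copy = it->second;
            notes_[to] = copy;
        }
    }

    // Callback a pass invokes before deleting v.
    void onErase(const Value* v) { notes_.erase(v); }

private:
    std::map<const Value*, Note> notes_;
};
```

The sketch only works if every pass actually calls the hooks, which is exactly the updating problem raised above; the policy field just makes the default outcome explicit.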

>> Two things here:
>> (1) Annotations should not be something which really changes the meaning
>> of a Value/Type. All the passes should work without the annotation.
> 
> 
> Okay, what use are they then?
> 
I think the difference is mission-critical versus useful. I think
metadata of this kind is very useful, but it should not stop unaware
passes from running. So it is optional. I put it the wrong way above.
Values should be augmented with more metadata, but that should not change
the way the other passes work.

> Note that source language types are not unique in LLVM, and they
> shouldn't be even with annotations.  For example:
> 
> struct X { int A; };
> struct Y { int B; };
> 
> Both X and Y map to the same LLVM Type.  This cannot change.

This is all right, and I am aware of that. LLVM has a type system based
on structural equivalence, and I think that is the right choice for an
LIR. But you can, for instance, tag the alloca instruction with an
annotation that adds symbolic type information. Since the alloca binds the
type to a stack location, this would be an option (and likewise for other
allocation/getelementptr instructions).
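
To illustrate the distinction, here is a small sketch with stand-in classes (none of these are the real LLVM types): `TypeTable` uniques struct types by their element list, so `struct X` and `struct Y` share one type object, while a separate, optional side table records the source-level name per alloca. `StructType`, `TypeTable`, `Alloca` and `SourceTypeNames` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a structurally defined type: just its element type names.
struct StructType { std::vector<std::string> elements; };

// Structural uniquing: equal element lists always yield the same object.
class TypeTable {
public:
    const StructType* get(const std::vector<std::string>& elems) {
        auto it = types_.find(elems);
        if (it == types_.end())
            it = types_.emplace(elems, StructType{elems}).first;
        return &it->second;  // std::map nodes are stable, so the pointer stays valid
    }
private:
    std::map<std::vector<std::string>, StructType> types_;
};

// Stand-in for an alloca instruction: it only knows the structural type.
struct Alloca { const StructType* type; };

// Optional metadata: alloca -> source-level type name. Passes may ignore it.
using SourceTypeNames = std::map<const Alloca*, std::string>;
```

Two allocas for X and Y then compare equal by type and differ only in the side table, so the type system itself is left untouched.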

> 
>> Perhaps the thing could be solved by adding policy statements to
>> annotations. I could imagine the inventor of an annotation should think
>> about how the annotation should behave during optimisation/change. So
>> the annotation should have a policy field which defaults to DontCare. In
>> that case the user of the Annotation cannot be sure that it will get
>> retained or something like that.
> 
> 
> Personally, I see annotations as a convenient way to do experiments and
> allow rapid development.  If we decide that a feature makes sense in the
> LLVM IR long term, it should be added as a first class feature of it.
> 
Hmm. In my view most annotations should be just metadata, like
the example above. But if it turns out that an annotation is more than
this, and valuable, this would be the right way to go.

>> since I saw the llvm-gcc generates code like:
>>
>> %pa = alloca %struct.A
>> %pb = alloca %struct.B
>>
>> this means that the AllocaInst must have knowledge about two types, which
>> can only be the case if it holds two different pointers, right?
> 
> 
> This is an implementation detail of the old llvm-gcc that breaks with
> the new one.  Do not depend on it.
> 

Okay. Thanks for the info.

> 
>> Hehe, I know. Certainly if someone like you says that. But *if* the
>> front end is aware of the annotation, which would be doable, and the
>> annotations are serializable in bytecode, then one would have this
>> information during LLVM bytecode processing as well.
> 
> 
> Yes.  However, there would be no way to keep isomorphic LLVM types
> separate.  This dramatically limits the usefulness of what you're trying
> to do.
> 

Maybe I was not clear.
I want to attach extra information where variables are used, for
instance. This information is optional, so I don't divide isomorphic
LLVM types. Every type is bound to a variable through a special
instruction. If you annotate these instructions, you can always find the
symbolic type of such a variable. This is just one use of annotations.
It clearly does not change the meaning of instructions; it only attaches
meta information.

For later use, you can then use the information, for instance:
-> in the JIT (easy)
-> during code generation, where you can add symbol information to memory
addresses, much like the relocation entries in ELF, so that you can use
the address of a variable to get its symbolic type.
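
The address lookup could be sketched roughly like this, assuming the code generator emits one (start, size, type name) record per variable. `Region`, `AddressTypeMap`, `add` and `lookup` are hypothetical names, and this is not how ELF relocations are actually encoded; it only shows the lookup idea.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// One emitted record: how many bytes the variable covers and its symbolic type.
struct Region {
    std::uintptr_t size;
    std::string typeName;
};

class AddressTypeMap {
public:
    void add(std::uintptr_t start, std::uintptr_t size, std::string typeName) {
        regions_[start] = Region{size, std::move(typeName)};
    }

    // Returns the symbolic type covering addr, or "" if no region contains it.
    std::string lookup(std::uintptr_t addr) const {
        auto it = regions_.upper_bound(addr);  // first region starting after addr
        if (it == regions_.begin()) return "";
        --it;                                  // last region starting at or before addr
        return (addr - it->first < it->second.size) ? it->second.typeName : "";
    }

private:
    std::map<std::uintptr_t, Region> regions_;  // keyed by start address
};
```

Because the table is pure metadata, a program or JIT that never consults it behaves exactly as before.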

>>
>>
>> See above. Every piece of information must be understood in order to be
>> usable. But I would do it as an annotation, since it is just additional meta
>> information and the program would run perfectly well without it.
>
> 
> Hopefully I made the issue more clear above.
> 
Hope so too. Please tell me what you think. But I could see many uses of
simple meta annotations.

> -Chris
>