[LLVMdev] Re: gcc like attributes and annotations

Sat Feb 25 03:41:52 PST 2006

Hi Reid,

Reid Spencer wrote:
> I have some thoughts on this too ..
>
Great!

> On Fri, 2006-02-24 at 19:56 +0100, Jakob Praher wrote:
> 
>>I get you 100% here. But as you say later in the mail, much information
>>is kept in some runtime std::map<Value*,foo> stuff. Which is really
>>handy at runtime, but I *had* serialization in mind when I was thinking
>>about Annotations. I see annotations as a way to serialize some extra
>>information with the bytecode without having to extend/change the core
>>classes. The best way to implement this at runtime is to use some kind of
>>std::map subscripting, plus the additional benefit that you can
>>serialize it to the bytecode. Perhaps the best of both worlds.
>>
...
> 
> As Chris mentioned, I would prefer that we keep annotations out of the
> core IR altogether as they are fraught with problems that are not easy
> to resolve. However, I understand where you're coming from in wanting to
> keep additional information with the bytecode. I have wanted the same
> thing for use by front end or specialized tools. For example an IDE that
> could keep track of source information or a language that needs special
> passes that can only be done at link time.
>
Yes.

> In thinking about the "right" way to do this, I came up with the idea of
> a single "blob" of data that could be appended to a Module. This single
> "annotation" would always be ignored by LLVM, would not require
> significant additional space to construct, and there is already a
> mechanism for constructing the information via the bytecode reader's
> handler interface (might need some extension).
> 
As far as locality is concerned, perhaps it would make sense to attach
such a blob to every primary object (module, function), so that
annotations that only apply to a certain function can be stored directly
in the function. That would make certain collisions easier to resolve.

> This is simply a way of making that std::map of information embeddable
> in the bytecode. It means the information is stored in one additional
> bytecode block (at the end) where it doesn't have any impact on LLVM
> (JIT/storage/etc).  The only question is: how do multiple tools avoid
> collision in this approach. Some kind of registry or partitioning of the
> data could likely solve that.
> 
Yes, that sounds like a doable approach. But I would not write arbitrary
binary data into the blob; I would use an LLVM type encoding/table
approach instead. Many annotations are simple or can be composed of simple
types, and people should be encouraged to store data in a way that makes it
possible to read it without library code. If you just serialize C++
structs, you end up relying heavily on the code that wrote them, which makes
it harder for tools to introspect annotations. Java's annotations rely
on simple types for the same reason, and I think it is the right way
for most things. There could be an opaque type for more complex
information, but its use should be discouraged.
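
To make the simple-type idea a bit more concrete, here is a minimal
sketch of what such an encoding could look like in memory. Every name in
it (SimpleKind, SimpleValue, Field, Annotation, describe) is made up for
illustration and is not an existing LLVM API. The point is only that a
generic tool can walk and print an annotation without the code that
produced it.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: none of these types exist in LLVM.
enum class SimpleKind { Bool, Int, Float, String, Opaque };

// One self-describing value. Only the member selected by `kind` is meaningful.
struct SimpleValue {
  SimpleKind kind;
  std::int64_t i;  // payload for Bool (0 or 1) and Int
  double f;        // payload for Float
  std::string s;   // payload for String, or raw bytes for Opaque
};

SimpleValue makeBool(bool v) { return SimpleValue{SimpleKind::Bool, v ? 1 : 0, 0.0, ""}; }
SimpleValue makeInt(std::int64_t v) { return SimpleValue{SimpleKind::Int, v, 0.0, ""}; }
SimpleValue makeFloat(double v) { return SimpleValue{SimpleKind::Float, 0, v, ""}; }
SimpleValue makeString(const std::string &v) { return SimpleValue{SimpleKind::String, 0, 0.0, v}; }
SimpleValue makeOpaque(const std::string &bytes) { return SimpleValue{SimpleKind::Opaque, 0, 0.0, bytes}; }

// A composite annotation is just a list of named simple values.
struct Field {
  std::string name;
  SimpleValue value;
};

struct Annotation {
  std::string typeName;
  std::vector<Field> fields;
};

// Generic introspection: prints any annotation without knowing what it means.
std::string describe(const Annotation &a) {
  std::ostringstream out;
  out << a.typeName << "(";
  for (std::size_t n = 0; n < a.fields.size(); ++n) {
    const Field &fld = a.fields[n];
    if (n != 0) out << ", ";
    out << fld.name << "=";
    switch (fld.value.kind) {
      case SimpleKind::Bool:   out << (fld.value.i ? "true" : "false"); break;
      case SimpleKind::Int:    out << fld.value.i; break;
      case SimpleKind::Float:  out << fld.value.f; break;
      case SimpleKind::String: out << '"' << fld.value.s << '"'; break;
      case SimpleKind::Opaque: out << "<opaque, " << fld.value.s.size() << " bytes>"; break;
    }
  }
  out << ")";
  return out.str();
}
```

Only the Opaque kind needs the writer's code to be interpreted; everything
else stays readable by any tool, which is why the opaque case should remain
the exception.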

This would also make it possible to use a triple of
(Value, AnnotationType, Name) to identify an annotation, which also
helps to solve the collision problem.

The lookup mechanism could look up by any element of the triple:
- Target Value
- AnnotationType
- Name

NULL values are wildcards.

So you could say:

Give me all annotations for a Value*:

/// Function-local annotations
Value *v = ...
vector<const Annotation *> &ans =
    curFunction->lookupAnnotation(v, NULL, NULL);

Or based on a specific type:

/// Module-wide annotations
AnnotationType *type = ...
vector<const Annotation *> &ans = module->lookupAnnotation(v, type, NULL);
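
A rough sketch of how the wildcard lookup could behave, under a few
assumptions of mine: Value and AnnotationType are empty stand-ins rather
than the real LLVM classes, AnnotationTable is a hypothetical container,
and the result is returned by value instead of by reference to keep the
example short.

```cpp
#include <string>
#include <vector>

// Stand-ins for illustration only; not the real LLVM classes.
struct Value {};
struct AnnotationType {
  std::string name;
};

// One annotation, identified by the (Value, AnnotationType, Name) triple.
struct Annotation {
  const Value *target;
  const AnnotationType *type;
  std::string name;
};

// Hypothetical per-function or per-module annotation table.
class AnnotationTable {
public:
  void add(const Value *target, const AnnotationType *type,
           const std::string &name) {
    entries_.push_back(Annotation{target, type, name});
  }

  // A null argument acts as a wildcard and matches every entry.
  // The returned pointers stay valid only until the next call to add().
  std::vector<const Annotation *> lookupAnnotation(const Value *target,
                                                   const AnnotationType *type,
                                                   const char *name) const {
    std::vector<const Annotation *> result;
    for (const Annotation &a : entries_) {
      if (target && a.target != target) continue;
      if (type && a.type != type) continue;
      if (name && a.name != name) continue;
      result.push_back(&a);
    }
    return result;
  }

private:
  std::vector<Annotation> entries_;
};
```

A linear scan is enough to show the matching rule; a real table would
probably index by Value* first, since that is the most common query.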

This is just a random thought, though.

> 
>>>As a historical curiosity, Function still needs to be annotatable due to
>>>the LLVM code generator relying on it.  This will be fixed in LLVM 1.8
>>>and Function will not be annotatable anymore.
>>>
>>>If you *really* just want per-pass local data, you should just use an
>>>std::map from the Value* to your data.
>>
>>Why not see Annotations as the means to serialize these maps? Maybe we
>>could add an Annotations table that maps Value types to ConstantPool
>>entries or something like that. This would make it easier for LLVM
>>libraries in other languages too.
> 
> 
> This is similar to my idea above, but I wouldn't want to restrict it to
> any particular data structure. The application can construct the data
> however it wishes and simply pass a pointer to a block of memory to the
> bytecode writer. 
>

Great that we have a similar view. I would use a public, simple type
encoding for the annotations, so that annotations are introspectable
without knowing much about the details of the annotation data. This also
helps to keep the bytecode free from language-specific data encodings.

-- Jakob
