[LLVMdev] [RFC] Attributes on Values

Philip Reames listmail at philipreames.com
Tue Sep 9 20:22:12 PDT 2014


On 09/09/2014 01:36 AM, Hal Finkel wrote:
> Hi everyone,
>
> Nick and Philip suggested something yesterday that I'd also thought about: supporting attributes on values (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140908/234323.html). The primary motivation for this is to provide a way of attaching pointer information, such as noalias, nonnull and dereferenceable(n), to pointers generated by loads. Doing this for pointers generated by inttoptr also seems potentially useful.
>
> A problem that we currently have with C++ lambda captures from Clang (and will have with other similar constructs, like outlined OpenMP loop bodies), is that we end up packing things that would be function parameters into structures to pass to the generated function. Unfortunately, this means that, in the generated anonymous function, these values are generated by loads (not true function parameters), and so we lose our ability to assert things about them (like noalias, nonnull, etc.).
>
> How this might work:
>
>   1. Instead of only CallInst/InvokeInst having an AttributeSet member variable, all instructions would have one. Accessor functions like getAttributes would be moved from CallInst/InvokeInst to Instruction. Only 'AttributeSet::ReturnIndex' would be meaningful on most instructions.
From a usability perspective, I'd want to rename the enum (or maybe 
provide an alias?).  More generally, the interface around attributes is 
an utter mess to work with.  Before attempting to replace metadata 
(either in part or completely), I'd want to invest in creating an 
attributes API that is easier to understand and extend.

To address the size question asked in a followup, I'll respond somewhat 
glibly.  We already have metadata on a Value; how is having Attributes 
in their place any different?  (This assumes we completely merge 
metadata and attributes.  No one has seriously proposed doing that yet.  
Alternatively, we could merge the storage and preserve the interface 
separation if we thought that was useful.)
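
To make the "merge the storage, keep the interfaces separate" option 
concrete, here is a minimal sketch.  It is a toy model in plain C++, 
not LLVM code: AnnotationStore, AnnotationKind, and the unsigned 
instruction IDs are all names I'm making up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy model only: none of these types exist in LLVM.
enum class AnnotationKind { Attribute, Metadata };

struct Annotation {
  AnnotationKind kind;
  std::string name; // e.g. "nonnull" or "tbaa"
};

// One side table keyed by a hypothetical instruction ID. Instructions
// without annotations have no entry, so they take no space in the table.
class AnnotationStore {
  std::unordered_map<unsigned, std::vector<Annotation>> table_;

  void add(unsigned instr, AnnotationKind kind, const std::string &name) {
    assert(!name.empty() && "annotation needs a name");
    table_[instr].push_back({kind, name});
  }

  bool has(unsigned instr, AnnotationKind kind,
           const std::string &name) const {
    auto it = table_.find(instr);
    if (it == table_.end())
      return false;
    for (const Annotation &a : it->second)
      if (a.kind == kind && a.name == name)
        return true;
    return false;
  }

public:
  // Two separate interfaces layered over the same storage.
  void addAttribute(unsigned instr, const std::string &name) {
    add(instr, AnnotationKind::Attribute, name);
  }
  void addMetadata(unsigned instr, const std::string &name) {
    add(instr, AnnotationKind::Metadata, name);
  }
  bool hasAttribute(unsigned instr, const std::string &name) const {
    return has(instr, AnnotationKind::Attribute, name);
  }
  bool hasMetadata(unsigned instr, const std::string &name) const {
    return has(instr, AnnotationKind::Metadata, name);
  }

  // Number of instructions that carry at least one annotation.
  std::size_t annotatedInstructions() const { return table_.size(); }
};
```

In this sketch the per-instruction cost depends only on how many 
annotations are attached, not on whether a given one is called an 
attribute or metadata.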
>
>   2. For the text IR format: Like with call/invoke currently, we'd optionally parse attributes in between the instruction name ('load', etc.) and the first type parameter. So load would become:
>    <result> = load [ret attrs] [volatile] <ty>* <pointer>[, align <alignment>] (...)
> allowing you to write:
>    %val = load dereferenceable(32) i32** %ptr
>
>   3. The bitcode format extension would mirror the setup for metadata. A subblock type bitc::ATTRIBUTES_ATTACHMENT_ID would be added with a record array of [Instruction #, Attribute Set #]. This will allow attaching attribute sets to any instruction without increasing the bitcode record sizes for the instructions themselves.
>
> Obviously there is some potential functionality overlap here with metadata, and an alternate approach would be to add a metadata type that mirrors each attribute we'd like to model on values. Potential downsides to using metadata are:
>   - It could be a lot of metadata (consider adding nonnull/dereferenceable(n) to every load of a C++ reference type)
I can't really comment on the implications of this.  I don't actually 
know how wasteful metadata is space-wise.  So far, I've never cared in 
practice.
>   - We already have code that handles these as return attributes on call instructions; extending that to look at all instructions seems straightforward. Adding alternate code to look at metadata as well would require additional development for each attribute.
I honestly don't see there being that much duplication.  I'd prefer to 
see us taking an approach of getting it working, then generalizing.  If 
we do find there's a lot of duplication over time and that simple 
functional abstraction doesn't abstract it well, we can revisit.
>
> One thing to keep in mind is that, like with metadata, the attributes can have control-flow dependencies, and we'd generally need to clear them when hoisting loads, etc. We already do this with metadata, at least for loads, so the places where we'd need to do this should hopefully be relatively easy to find.
Agreed.  This is non-trivial.
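
As a rough illustration of the conservative rule being described, here 
is a toy sketch in plain C++.  ToyLoad and hoistLoad are hypothetical 
names, not LLVM classes or functions, and the factsHoldAtDest flag 
stands in for whatever analysis would actually justify keeping an 
attribute at the new location.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model only: this is not an LLVM class.
struct ToyLoad {
  int block;                   // ID of the block that holds the load
  std::set<std::string> attrs; // e.g. "nonnull", "dereferenceable"
};

// Move a load to another block. Unless the caller has shown that the
// attached facts still hold at the destination, drop every attribute,
// since each one may depend on control flow that no longer guards the load.
inline void hoistLoad(ToyLoad &load, int destBlock, bool factsHoldAtDest) {
  assert(destBlock >= 0 && "block IDs are non-negative in this toy model");
  load.block = destBlock;
  if (!factsHoldAtDest)
    load.attrs.clear();
}
```

The hard part in practice is not the clearing itself but finding every 
transformation that moves an instruction across a condition.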
> What do you think?

Overall, I think supporting one unified metadata/attribute model long 
term is a good idea.  We need to support all the various use cases 
(well-known properties, prototyping, easy extensions, required vs. 
droppable, etc.), but having two mechanisms is less than ideal.  I 
don't believe the separation is an urgent problem.  I don't believe 
that work on improving our ability to optimize by providing hints using 
metadata should be held up for this ideal goal.  (Unless you're 
volunteering to do the work on this in the near future.  In which case, 
yeah!)

Philip


