[LLVMdev] [RFC] Attributes on Values

Tue Sep 9 12:44:40 PDT 2014

----- Original Message -----
> From: "Bob Wilson" <bob.wilson at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM-Dev" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, September 9, 2014 2:34:05 PM
> Subject: Re: [LLVMdev] [RFC] Attributes on Values
> 
> 
> > On Sep 9, 2014, at 1:36 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> > 
> > Hi everyone,
> > 
> > Nick and Philip suggested something yesterday that I'd also thought
> > about: supporting attributes on values
> > (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140908/234323.html).
> > The primary motivation for this is to provide a way of attaching
> > pointer information, such as noalias, nonnull and
> > dereferenceable(n), to pointers generated by loads. Doing this for
> > pointers generated by inttoptr also seems potentially useful.
> > 
> > A problem that we currently have with C++ lambda captures from
> > Clang (and will have with other similar constructs, like outlined
> > OpenMP loop bodies), is that we end up packing things that would
> > be function parameters into structures to pass to the generated
> > function. Unfortunately, this means that, in the generated
> > anonymous function, these values are generated by loads (not true
> > function parameters), and so we lose our ability to assert things
> > about them (like noalias, nonnull, etc.).
> > 
> > How this might work:
> > 
> > 1. Instead of only CallInst/InvokeInst having an AttributeSet
> > member variable, all instructions would have one. Accessor
> > functions like getAttributes would be moved from
> > CallInst/InvokeInst to Instruction. Only
> > 'AttributeSet::ReturnIndex' would be meaningful on most
> > instructions.
> 
> Do you have any idea what impact this would have on memory use? We’ve
> been seeing a lot of small regressions in unoptimized compile times,
> and they’re really starting to add up. Adding a member to all
> instructions will have a cost even when not optimizing. We need to
> be more careful about that.

I don't know, but I'll see if I can make some reasonable measurements. Another possibility would be to add the AttributeSet only to LoadInst, IntToPtrInst, etc., corresponding to the use cases I currently understand.
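
To make the trade-off concrete, here is a rough sketch of the two options. The types below are toy stand-ins, not the real llvm::Instruction or llvm::AttributeSet layouts; in particular, treating the attribute set as a pointer-sized handle is an assumption, and all of the names are hypothetical.

```cpp
#include <cstddef>

// Toy stand-in for an attribute-set handle (ASSUMPTION: pointer-sized).
struct ToyAttributeSet {
    void *impl;
};

// Simplified picture of an instruction without any attribute storage.
struct ToyInstruction {
    void *parent;
    unsigned opcode;
};

// Option 1: every instruction carries the handle, so every instruction
// grows, even at -O0 when no attributes are ever attached.
struct ToyInstructionWithAttrs {
    void *parent;
    unsigned opcode;
    ToyAttributeSet attrs;
};

// Option 2: only selected subclasses (a load here) carry the handle.
struct ToyLoad : ToyInstruction {
    ToyAttributeSet attrs;
};

// Extra bytes paid by *every* instruction under option 1.
constexpr std::size_t per_instruction_cost() {
    return sizeof(ToyInstructionWithAttrs) - sizeof(ToyInstruction);
}

static_assert(per_instruction_cost() >= sizeof(void *),
              "option 1 adds at least one pointer to each instruction");
```

Under option 2 the extra pointer is paid only by the subclasses that opt in; the real numbers would have to come from measuring actual compiles.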

> 
> > 
> > 2. For the text IR format: Like with call/invoke currently, we'd
> > optionally parse attributes in between the instruction name
> > ('load', etc.) and the first type parameter. So load would become:
> >  <result> = load [ret attrs] [volatile] <ty>* <pointer>[, align
> >  <alignment>] (...)
> > allowing you to write:
> >  %val = load dereferenceable(32) i32** %ptr
> > 
> > 3. The bitcode format extension would mirror the setup for
> > metadata. A subblock type bitc::ATTRIBUTES_ATTACHMENT_ID would be
> > added with a record array of [Instruction #, Attribute Set #].
> > This will allow attaching attribute sets to any instruction
> > without increasing the bitcode record sizes for the instructions
> > themselves.
> > 
> > Obviously there is some potential functionality overlap here with
> > metadata, and an alternate approach would be to add a metadata
> > type that mirrors each attribute we'd like to model on values.
> > Potential downsides to using metadata are:
> > - It could be a lot of metadata (consider adding
> > nonnull/dereferenceable(n) to every load of a C++ reference type)
> 
> Metadata is really expensive but we could simply omit it when not
> optimizing. The tradeoff isn’t obvious to me, but it seems like it
> would be OK if limited to C++ lambda captures and things like
> outlined OpenMP loop bodies. If you’re talking about moving the
> attributes that are currently on call/invoke instructions to
> metadata, the cost may be too high.

We could also omit the attributes when not optimizing (maybe we should already be doing that for the function parameter attributes we do emit). I'm not proposing to move the attributes currently on call/invoke to metadata (although Reid did suggest that, also noting that they could be omitted at -O0). But if we had these attributes (nonnull, dereferenceable, etc.) on loads, for example, we could certainly emit them in more places in Clang's CodeGen. That would likely provide improvements beyond lambda captures and the like, although those remain a key use case for this functionality, however we implement it.
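
As a rough illustration of the lambda-capture case, the sketch below writes out by hand the kind of closure object and outlined body a compiler might produce. The names Closure and closure_body are hypothetical, and Clang's actual lowering differs in its details; the point is only that the captured pointers reach the body through loads from the closure object rather than as function parameters.

```cpp
// Hand-written stand-in for the closure behind
//   [&total, data](int i) { total += data[i]; }
struct Closure {
    int *total;       // captured by reference: the front end knows it is non-null
    const int *data;  // captured by copy (a pointer value)
};

// The "outlined" body. Each capture is obtained by a load from *self, so
// there is no parameter on which to hang nonnull/noalias/dereferenceable.
void closure_body(const Closure *self, int i) {
    int *total = self->total;      // pointer produced by a load
    const int *data = self->data;  // likewise
    *total += data[i];
}

// The same computation written with the lambda itself.
int sum_with_lambda(const int *data, int n) {
    int total = 0;
    auto body = [&total, data](int i) { total += data[i]; };
    for (int i = 0; i < n; ++i)
        body(i);
    return total;
}

// And with the hand-written closure object.
int sum_with_closure(const int *data, int n) {
    int total = 0;
    Closure c{&total, data};
    for (int i = 0; i < n; ++i)
        closure_body(&c, i);
    return total;
}
```

Both functions compute the same sum; the difference is only in where the pointers come from inside the body.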

Thanks again,
Hal

> 
> 
> > - We already have code that handles these as return attributes on
> > call instructions; extending that to look at all instructions
> > seems straightforward. Adding alternate code to also look at
> > metadata would require additional development for each
> > attribute.
> > 
> > One thing to keep in mind is that, like with metadata, the
> > attributes can have control-flow dependencies, and we'd generally
> > need to clear them when hoisting loads, etc. We already do this
> > with metadata, at least for loads, so the places where we'd need
> > to do this should hopefully be relatively easy to find.
> > 
> > What do you think?
> > 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory