[PATCH] Implement function prefix data as an IR feature.

Chris Lattner clattner at apple.com
Tue Jul 30 22:57:30 PDT 2013


On Jul 30, 2013, at 7:10 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>> - Given that this is completely target specific, is there any value in having some targets (e.g. not darwin) emit the blob *before* the function's label?  This would avoid the need to conditional branch around the blob.
> 
> Yes.  I don't think this should be target-dependent behaviour, though.
> The way I think this should be done is through global symbol offsets
> [1].
> 
> We can't use this layout in cases where the metadata is optional,
> as a metadata load behind the symbol may segfault if it lies on
> a section boundary.  So it wouldn't work for UBSan, for example,
> as an arbitrary indirect call may be calling an uninstrumented function.

Ok.

> Now that you've brought it up, is it true that Darwin prohibits this
> kind of thing?  After Duncan brought it up I wrote a small test program
> (load.cpp) which I compiled (loadcpp.orig.s) and modified (loadcpp.s).
> The resulting program worked as I expected.  Maybe it only fails to
> work in cases which I haven't considered.

Yes, it is not valid - usually.  Darwin uses a directive (.subsection_via_symbols iirc) that tells the linker it can "atomize" the object file, splitting it at public labels.  This directive is technically optional, but not using it causes serious pessimization to the toolchain. Ask the lld folks about it.  It is best to assume that Darwin requires chunks of the object file to start with a label.

>> - What is the behavior of inlining and other IPO w.r.t. this?  If this requires disabling all inlining and other IPO, then we need to know how to model it and I'd like to know how frontends are going to use it.
> 
> I designed this feature so that a call to a function with prefix data
> has the same semantics as a call to a function without, so IPO passes
> shouldn't be affected.

Ok, but what if the serialized metadata is encoding something about an argument, and DAE removes the argument in question?  Is it safe for the IPO optimizer to assume anything about these sorts of functions?  Maybe they should be considered to be of the same class as llvm.used?

>  The prefix data only has semantics when the
> function pointer is bitcasted and loaded from.  This is because of
> the requirement on the prefix data that it must transfer control to
> the point immediately succeeding the prefix data, without performing
> any other visible action.  (This property enables a number of other
> optimisations; e.g. we can drop prefix data from functions with
> internal linkage when the prefix data is provably never accessed,
> or demote an internal linkage function with prefix data to a global
> constant if the prefix data is read but the function is provably
> never called.)

Ok, but what does it mean if inlining happens?  Does the presence of this data block inlining?  Should it?

> 
>> - In Function.h, this is really unfortunate:
>> 
>> +  bool hasPrefixData() const {
>> +    return getNumOperands() != 0;
>> +  }
>> +
>> +  Constant *getPrefixData() const {
>> +    assert(hasPrefixData());
>> +    return cast<Constant>(Op<0>());
>> +  }
>> +
>> +  void setPrefixData(Constant *PrefixData);
>> 
>> We want to keep the sizeof(Function) as small as possible, and this feature will almost never be used.  I'd much rather see this be stored with a single bit in Function::SubclassOptionalData (from Value) that indicates "has entry in an on-the-side hash table" (and also is used to implement hasPrefixData()) and store the Constant* in the hashtable.
> 
> What if someone RAUWs one of the constants?  Wouldn't we need to
> introduce a new User subclass that would be stored in the hash table
> and hold the prefix data as an operand?

We don't generally support RAUW on constants (e.g., turning "i32 4" into "i32 5" doesn't make sense).  For things that do (e.g. those derived from global values) making the value in the hashtable be a WeakVH should work fine.

>> - Should this be modeled as some kind of crazy attribute?  That gives you a *different* place to unique it into and also would avoid punishing clients that aren't using it.
> 
> I initially wanted to implement this as an attribute, but I realised
> that we'd have the same RAUW problem I mentioned above.  It may well be
> better to do the work to allow attributes to hold Values rather than
> implementing something for prefix data that can't be reused.  I also
> don't know how difficult maintaining the bit in SubclassOptionalData
> would be (or if it's worth maintaining it) if we store this as an
> attribute.  I'd like to get Bill's views on this as well.

I don't think the RAUW problem is a serious one.  BTW, if you'd like to talk about this, I'll be at the social on Thursday.

-Chris



More information about the llvm-commits mailing list