[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Sun Jan 24 10:25:48 PST 2016


On 01/19/2016 02:47 PM, Eddy B. via llvm-dev wrote:
> Hi,
>
> In the past months, several options have been presented for making byval
> (and similar attributes, such as inalloca or sret) work with opaque pointers.
>
> The main two I've seen were byval(T) and byval(N) where N is the size of T.
>
> They both have their upsides and downsides, for example: byval(T) would be
> a type-parametric attribute, which, AFAIK, does not already exist and may
> complicate the attribute system significantly, while byval(N) would be hard
> to introduce in tests as computing N from T requires LLVM's DataLayout.
>
> Also, this would have to be done for inalloca and sret as well - sret only
> needs it when targeting SPARC, although still generally useful in analysis.
>
> To sidestep some of the concerns and allow a smooth transition towards a
> byval that works with opaque pointers, I've come up with a new approach:
>
> Reuse dereferenceable(S) and align A for the size and alignment of byval.
>
> That is, a byval dereferenceable(S) align A argument is guaranteed to have
> S bytes available to read from, *and only S*, aligned to a multiple of A.
> Reading past that size is UB, as LLVM will not copy more than S bytes.
Just to make sure I understand: you are *not* planning on changing the
semantics of the existing attributes, right?  The current semantics for
"dereferenceable(N)" are "N bytes are known to be dereferenceable, more
bytes might be".  What worried me in your wording was the UB comment.
It is not currently UB to read beyond the *known* dereferenceable
bytes.  This is important to me and I would be very hesitant to change it.
>
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
>
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>
> To maintain compatibility with existing code, dereferenceable and align
> attributes are automatically injected as soon as a non-default DataLayout
> is available. The "injection" mechanism could potentially be replaced with
> a pass, although it was easier to experiment with it being guaranteed.
>
> This works out pretty well in practice, as analysis already understands
> dereferenceable and can make decisions based on it.
>
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have it)
> and that S is the exact size of the pointee type (while we still know that).
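
The size check described above can be illustrated with a small sketch. It
assumes natural alignment for the integer types (roughly what a typical
64-bit DataLayout gives); the ABI table and the function names are
hypothetical and are not part of the proposed patch. A scalar is treated
as a one-field struct.

```python
# Assumed (size, alignment) in bytes; real values come from the DataLayout.
ABI = {"i8": (1, 1), "i16": (2, 2), "i32": (4, 4), "i64": (8, 8)}

def align_to(offset, align):
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align

def struct_layout(fields):
    """Return (size, alignment) of a non-packed struct with these fields."""
    offset, max_align = 0, 1
    for field in fields:
        size, align = ABI[field]
        offset = align_to(offset, align) + size  # pad, then place the field
        max_align = max(max_align, align)
    return align_to(offset, max_align), max_align  # tail padding included

def expected_attrs(fields):
    """Attributes a byval argument of this pointee type would need."""
    size, align = struct_layout(fields)
    return f"dereferenceable({size}) align {align}"
```

Under these assumptions, {i8, i64} needs 7 bytes of padding after the i8,
which is where the 16 in the error message quoted further down comes from.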
>
> That last bit is very important, because it allows a script to do the following:
>
> 1. Find all byval arguments in tests that are missing dereferenceable, e.g.
>      ... i32* byval align 4 ...
>      .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>      ... i32* byval dereferenceable(123400001) align 4 ...
>      .... {i8, i64}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
>
> Attribute 'byval' expects 'dereferenceable(4)' for type i32*,
>      found 'dereferenceable(123400001)'
>
> Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8, i64}*,
>      found 'dereferenceable(123400002)'
>
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>      ... i32* byval dereferenceable(4) align 4 ...
>      .... {i8, i64}* byval dereferenceable(16) align 8 ...
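
Steps 2 and 4 of the script described above might look roughly like the
following sketch. It is not the author's actual script: the function names
are hypothetical, and the regular expression for the verifier output is
assumed from the two sample messages quoted in step 3.

```python
import itertools
import re

# A byval that is not already followed by dereferenceable(...).
BYVAL = re.compile(r"\bbyval\b(?! dereferenceable)")

# Assumed message shape, taken from the sample errors in step 3.
ERROR = re.compile(
    r"expects '(?P<want>[^']+)' for type .*?,"
    r"\s*found 'dereferenceable\((?P<id>\d+)\)'",
    re.S,
)

def add_placeholders(ir, start=123400001):
    """Step 2: tag each bare byval with a unique bogus dereferenceable."""
    ids = itertools.count(start)
    return BYVAL.sub(lambda m: f"byval dereferenceable({next(ids)})", ir)

def apply_fixes(ir, verifier_output):
    """Step 4: swap each bogus attribute for what the verifier expected."""
    for m in ERROR.finditer(verifier_output):
        ir = ir.replace(f"dereferenceable({m['id']})", m["want"])
    return ir
```

Because each placeholder ID is unique, a plain string replacement is enough
to map every error message back to the argument that triggered it.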
>
> For what it's worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
>
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
>
> Thank you,
>   - eddyb