[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Tue Jan 19 15:09:55 PST 2016

+Chandler, Duncan, David M, and Reid (pseudorandom selection of people who
might be interested/voiced opinions on prior threads/conversations on the
subject)

Firstly, thanks Eddy for doing a lot of work in the opaque pointer area -
it's greatly appreciated. This is a particularly thorny part of it (or, at
least I think so at the moment - sounds like you've seen thornier parts
further down the trail).

On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> In the past months, several options have been presented for making byval
> (and similar attributes, such as inalloca or sret) work with opaque
> pointers.
>
> The main two I've seen were byval(T) and byval(N) where N is the size of T.
>
> They both have their upsides and downsides, for example: byval(T) would be
> a type-parametric attribute, which, AFAIK, does not already exist and may
> complicate the attribute system significantly, while byval(N) would be hard
> to introduce in tests as computing N from T requires LLVM's DataLayout.
>

By "hard to introduce in tests" you just mean it's hard to edit the test
cases to add the correct value of N automatically? Yeah - it's certainly
not a simple sed script like my previous transformations.

> Also, this would have to be done for inalloca and sret as well - sret only
> needs it when targeting SPARC, although still generally useful in analysis.
>

Reid, Dave - bringing this to your attention in case you've got any
thoughts on inalloca/sret (was sret part of the Windows-y things, or just
inalloca?), how they should be handled, if it should be different/the same
as byval, etc?

> To sidestep some of the concerns and allow a smooth transition towards a
> byval that works with opaque pointers, I've come up with a new approach:
>
> Reuse dereferenceable(S) and align A for the size and alignment of byval.
>

Side note: align should already be what's used for the alignment of byval.
I think I remember seeing that there was one target hook and one target
using  that hook, that did some backend alignment fixes. If that is still
present, we should fix/remove it so that the align attribute is the one
source of truth for the alignment of a byval parameter. This should happen
before any changes to byval.

The dereferenceable part is the novelty here and I really don't know how
everyone feels about it. I don't much mind - I suppose I like it because it
means optimizations can, maybe, hack on byval parameters more effectively
without having to have byval-specific smarts necessarily. (also, because
you've already done a bunch of the work)

> That is, a byval dereferenceable(S) align A argument is guaranteed to have
> S bytes available to read from, *and only S*, aligned to a multiple of A.
> Reading past that size is UB, as LLVM will not copy more than S bytes.
>
> An API can be provided to add the attribute alongside dereferenceable
> and align attributes, for a given Type* and DataLayout.
>
> A preliminary implementation (w/o sret) can be found at:
> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>
> To maintain compatibility with existing code,

^ this isn't actually a goal/necessity, fwiw. So we might not need the
complexity you're outlining below. Or if we do choose to take it, we may
only want/need to keep it around for a relatively short period.

> dereferenceable and align
> attributes are automatically injected as soon as a non-default DataLayout
> is available. The "injection" mechanism could potentially be replaced with
> a pass, although it was easier to experiment with it being guaranteed.
>
> This works out pretty well in practice, as analysis already understands
> dereferenceable and can make decisions based on it.
>
> The verifier checks that for byval & friends, dereferenceable(S) and
> align A are present (clang always adds align, but not all tests have it)
>

Do all tests need it? Align is just for non-default alignment (> align(1))
so it may be OK for the tests to omit it & it's probably not right to
/require/ an align attribute on a byval parameter.

> and that S is the exact size of the pointee type (while we still know
> that).
>
> That last bit is very important, because it allows a script to do the
> following:
>
> 1. Find all byval arguments in tests that are missing dereferenceable, e.g.
>     ... i32* byval align 4 ...
>     .... {i8, i64}* byval ...
> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>     ... i32* byval dereferenceable(123400001) align 4 ...
>     .... {i8, i16}* byval dereferenceable(123400002) ...
> 3. Run the tests and record the errors, which may look like:
>
> Attribute 'byval' expects 'dereferenceable(4)' for type i32*,
>     found 'dereferenceable(123400001)'
>
> Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8,
> i64}*,
>     found 'dereferenceable(123400002)'
>
> 4. Use the verifier error messages to replace the bogus attributes
> with the proper ones, which include align A when it is missing:
>     ... i32* byval dereferenceable(4) align 4 ...
>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
>
> For what is worth, the same scheme would also work for byval(N), and
> would be entirely unnecessary for byval(T).
>
> I would love to know your thoughts on this, and more specifically:
> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
> do you think would provide the easiest transition path for front-ends?
>
> Thank you,
>  - eddyb
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160119/ac98fc59/attachment.html>