[llvm-dev] [RFC] A proposal for byval in a world with opaque pointers

Tue Jan 19 15:16:10 PST 2016

> Do all tests need it? Align is just for non-default alignment (> align(1)) so it may be OK for the tests to omit it & it's probably not right to /require/ an align attribute on a byval parameter.

I've had to fix some of the tests which checked that the right
alignment was applied,
by looking at the generated asm, but had no "align A" attributes and
relied on the
target alignment guessing, which I had taken out in my experimental
implementation.

The rest of the tests might be fine, but could behave surprisingly
upon closer inspection.

2016-01-20 1:09 GMT+02:00 David Blaikie <dblaikie at gmail.com>:
> +Chandler, Duncan, David M, and Reid (pseudorandom selection of people who
> might be interested/voiced opinions on prior threads/conversations on the
> subject)
>
> Firstly, thanks Eddy for doing a lot of work in the opaque pointer area -
> it's greatly appreciated. This is a particularly thorny part of it (or, at
> least I think so at the moment - sounds like you've seen thornier parts
> further down the trail).
>
> On Tue, Jan 19, 2016 at 2:47 PM, Eddy B. via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> In the past months, several options have been presented for making byval
>> (and similar attributes, such as inalloca or sret) work with opaque
>> pointers.
>>
>> The main two I've seen were byval(T) and byval(N) where N is the size of
>> T.
>>
>> They both have their upsides and downsides, for example: byval(T) would be
>> a type-parametric attribute, which, AFAIK, does not already exist and may
>> complicate the attribute system significantly, while byval(N) would be
>> hard
>> to introduce in tests as computing N from T requires LLVM's DataLayout.
>
>
> By "hard to introduce in tests" you just mean it's hard to edit the test
> cases to add the correct value of N automatically? Yeah - it's certainly not
> a simple sed script like my previous transformations.
>
>>
>> Also, this would have to be done for inalloca and sret as well - sret only
>> needs it when targeting SPARC, although still generally useful in
>> analysis.
>
>
> Reid, Dave - bringing this to your attention in case you've got any thoughts
> on inalloca/sret (was sret part of the Windows-y things, or just inalloca?),
> how they should be handled, if it should be different/the same as byval,
> etc?
>
>>
>> To sidestep some of the concerns and allow a smooth transition towards a
>> byval that works with opaque pointers, I've come up with a new approach:
>>
>> Reuse dereferenceable(S) and align A for the size and alignment of byval.
>
>
> Side note: align should already be what's used for the alignment of byval. I
> think I remember seeing that there was one target hook and one target using
> that hook, that did some backend alignment fixes. If that is still present,
> we should fix/remove it so that the align attribute is the one source of
> truth for the alignment of a byval parameter. This should happen before any
> changes to byval.
>
> The dereferenceable part is the novelty here and I really don't know how
> everyone feels about it. I don't much mind - I suppose I like it because it
> means optimizations can, maybe, hack on byval parameters more effectively
> without having to have byval-specific smarts necessarily. (also, because
> you've already done a bunch of the work)
>
>>
>> That is, a byval dereferenceable(S) align A argument is guaranteed to have
>> S bytes available to read from, *and only S*, aligned to a multiple of A.
>> Reading past that size is UB, as LLVM will not copy more than S bytes.
>>
>> An API can be provided to add the attribute alongside dereferenceable
>> and align attributes, for a given Type* and DataLayout.
>>
>> A preliminary implementation (w/o sret) can be found at:
>> https://github.com/eddyb/llvm/compare/2579466...65ac99b
>>
>> To maintain compatibility with existing code,
>
>
> ^ this isn't actually a goal/necessity, fwiw. So we might not need the
> complexity you're outlining below. Or if we do choose to take it, we may
> only want/need to keep it around for a relatively short period.
>
>>
>> dereferenceable and align
>> attributes are automatically injected as soon as a non-default DataLayout
>> is available. The "injection" mechanism could potentially be replaced with
>> a pass, although it was easier to experiment with it being guaranteed.
>>
>> This works out pretty well in practice, as analysis already understands
>> dereferenceable and can make decisions based on it.
>>
>> The verifier checks that for byval & friends, dereferenceable(S) and
>> align A are present (clang always adds align, but not all tests have it)
>
>
> Do all tests need it? Align is just for non-default alignment (> align(1))
> so it may be OK for the tests to omit it & it's probably not right to
> /require/ an align attribute on a byval parameter.
>
>>
>> and that S is the exact size of the pointee type (while we still know
>> that).
>>
>> That last bit is very important, because it allows a script to do the
>> following:
>>
>> 1. Find all byval arguments in tests that are missing dereferenceable,
>> e.g.
>>     ... i32* byval align 4 ...
>>     .... {i8, i64}* byval ...
>> 2. Add a bogus dereferenceable(unique ID) to each of them, i.e.
>>     ... i32* byval dereferenceable(123400001) align 4 ...
>>     .... {i8, i16}* byval dereferenceable(123400002) ...
>> 3. Run the tests and record the errors, which may look like:
>>
>> Attribute 'byval' expects 'dereferenceable(4)' for type i32*,
>>     found 'dereferenceable(123400001)'
>>
>> Attribute 'byval' expects 'dereferenceable(16) align 8' for type {i8,
>> i64}*,
>>     found 'dereferenceable(123400002)'
>>
>> 4. Use the verifier error messages to replace the bogus attributes
>> with the proper ones, which include align A when it is missing:
>>     ... i32* byval dereferenceable(4) align 4 ...
>>     .... {i8, i16}* byval dereferenceable(16) align 8 ...
>>
>> For what is worth, the same scheme would also work for byval(N), and
>> would be entirely unnecessary for byval(T).
>>
>> I would love to know your thoughts on this, and more specifically:
>> Which of the 3 (byval(T), byval(N) and byval + dereferenceable + align)
>> do you think would provide the easiest transition path for front-ends?
>>
>> Thank you,
>>  - eddyb
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>