[LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
Hal Finkel
hfinkel at anl.gov
Fri Nov 21 20:35:29 PST 2014
----- Original Message -----
> From: "Raul Silvera" <rsilvera at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chandler Carruth" <chandlerc at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Tuesday, November 18, 2014 9:09:40 PM
> Subject: Re: [LLVMdev] Upcoming Changes/Additions to Scoped-NoAlias metadata
>
> I preserve them only weakly... I don't want full barriers; in fact, I
> plan to add InstCombine code to combine calls to @llvm.noalias (it
> will also take a list of scopes, not just one, so this is possible).
> The goal is to have as few barriers as possible.
>
>
> Good.
>
> > > Going further, logically the intrinsic should return a pointer to
> > > a
> > > new object, disjoint from all other live objects. It is not
> > > aliased
> > > to A, and is well defined even if it contains &A because A is not
> > > referenced in the scope.
> >
> > This is essentially what is done, but only for accesses in the
> > scope
> > (or some sub-scope). I don't think the semantics allow for what
> > you're suggesting. The specific language from 6.7.3.1p4 says:
> >
> > [from C]
> > During each execution of B, let L be any lvalue that has &L based
> > on
> > P. If L is used to
> > access the value of the object X that it designates, ...,
> > then the following requirements apply: ... Every other lvalue
> > used to access the value of X shall also have its address based on
> > P.
> > [end from C]
> >
> > Where B is defined in 6.7.3.1p2 to be, essentially, the block in
> > which the relevant declaration appears. And we can really only
> > extrapolate from that to the other access in that block, and not to
> > the containing block.
> >
> >
> > Inside that block (the lifetime of P), it is safe to assume that X is
> > disjoint from an arbitrary live object A. If it wasn't, either:
> > - A is independently referenced inside the block, so there is UB and
> > all bets are off.
> > - A is not independently referenced inside the block, so there are no
> > pairs of accesses to incorrectly reorder, as all accesses to A in
> > the block are done through P. You just need to delimit the block
> > with dataflow barriers, summarizing the effect of the block at
> > entry/exit.
>
> Okay, I think I agree with you assuming that we put in entry/exit
> barriers to preserve the block boundaries. I'd specifically like to
> avoid that, however.
>
> I'm not proposing full code motion barriers, only punctual dataflow
> use/defs to signal entry/exit to the scope.
>
>
>
> Logically, entering the scope transfers the pointed data into a new
> unique block of memory, and puts its address on the restrict
> pointer. Exiting the scope transfers it back. Of course you do not
> want to actually allocate a new object and move the data, but you
> can use these semantics to define the scope entry/exit intrinsics.
> Their contribution to dataflow is only limited to the content of the
> address used to initialize the restricted pointer. These would be
> lighter than the proposed intrinsic as they would not have
> specialized control-flow restrictions.
Thanks for explaining, I now understand what you're proposing.
>
> This approach makes the restrict attribute effective against all live
> variables without having to examine the extent of the scope to
> collect all references, which is in general impractical.
I think you've misunderstood this. For restrict-qualified local variables, every memory access within the containing block (which is everything in the function for restrict-qualified function-argument pointers) gets tagged with the scope. This is trivial to determine.
> It also
> removes the need for scope metadata, as there would be no need to
> name the scopes.
Indeed.
>
>
> Anyway, this is just a general alternate design, since you were
> asking for one.
Yes, and thank you for doing so.
> I'm sure it would still take some time/effort to map it
> onto the LLVM framework.
That does not seem too difficult; the question is really just whether or not it gives us what we need...
>
So in this scheme, we'd have the following:

  void foo(T * restrict a, T * restrict b) {
    *a = *b;
  }

  T * x = ..., *y = ..., *z = ..., *w = ...;
  foo(x, y);
  foo(z, w);

become:

  T * x = ..., *y = ..., *z = ..., *w = ...;

  T * a1 = @llvm.noalias.start(x); // model: reads from x (with a general write control dep).
  T * b1 = @llvm.noalias.start(y);
  *a1 = *b1;
  @llvm.noalias.end(a1, x); // model: reads from a1, writes to x.
  @llvm.noalias.end(b1, y);

  T * a2 = @llvm.noalias.start(z);
  T * b2 = @llvm.noalias.start(w);
  *a2 = *b2;
  @llvm.noalias.end(a2, z);
  @llvm.noalias.end(b2, w);

This does indeed seem generally equivalent to the original proposal in the sense that the original proposal has an implicit ending barrier at the last relevant derived access, and here we have explicit ending barriers. The advantage is the lack of metadata (and associated implementation complexity). The disadvantage is that we have additional barriers to manage, and these are write barriers on the underlying pointers. It is not clear to me this would make too much difference, so long as we aggressively hoisted the ending barriers to just after the last use based on their 'noalias' operands.
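To make the "model" comments above concrete, here is a minimal C++14 sketch of the copy-in/copy-out reading Raul described: entering the scope logically moves the pointed-to data into a fresh block, and leaving it moves the data back. Everything in it (NoAliasRegion, noalias_start, noalias_end, foo_modeled) is a hypothetical name invented for illustration, not an LLVM API, and it handles a single object of type T rather than an array. A real implementation would of course not allocate or copy anything; the sketch only shows which reads and writes the barriers are assumed to represent.

```cpp
#include <cassert>
#include <memory>

// Hypothetical model only; none of these names exist in LLVM.
// Holds the "new unique block of memory" for one restrict scope.
template <typename T>
struct NoAliasRegion {
  std::unique_ptr<T> storage;
};

// Models @llvm.noalias.start(x): reads *x and returns a pointer to
// fresh storage that cannot alias any other live object.
template <typename T>
T *noalias_start(NoAliasRegion<T> &r, const T *x) {
  r.storage = std::make_unique<T>(*x);
  return r.storage.get();
}

// Models @llvm.noalias.end(a, x): reads *a and writes it back to *x.
template <typename T>
void noalias_end(NoAliasRegion<T> &r, T *a, T *x) {
  assert(a == r.storage.get());  // a must come from the matching start
  *x = *a;
  r.storage.reset();
}

// foo(T * restrict a, T * restrict b) { *a = *b; } after inlining,
// expressed with the modeled barriers (T fixed to int for brevity).
inline void foo_modeled(int *x, int *y) {
  NoAliasRegion<int> ra, rb;
  int *a1 = noalias_start(ra, x);
  int *b1 = noalias_start(rb, y);
  *a1 = *b1;
  noalias_end(ra, a1, x);
  noalias_end(rb, b1, y);
}
```

Calling foo_modeled(&x, &y) with x == 1 and y == 2 leaves both equal to 2, the same result as the original foo when its arguments do not overlap. The write performed in noalias_end is the "write barrier on the underlying pointer" mentioned above.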
So this is relatively appealing, and I think would not be a bad way to model C99 restrict (extending the scheme to handle mutually-ambiguous restrict-qualified pointers from aggregates seems straightforward). It does not, however, cover cases where the region of guaranteed disjointness (for lack of a better term) is not continuous. This will come up when implementing a scheme such as that in the current C++ alias-set proposal (N4150). To construct a quick example, imagine that our implementation of std::vector is annotated such that (assuming the standard allocator) each std::vector object's internal storage has a distinct alias set, and we have:

  std::vector<T> x, y;
  ...
  T * q = &x[0];
  for (int i = 0; i < 1600; ++i) {
    x[i] = y[i];
    *q += x[i];
  }

So here we know that the memory accesses inside the operator[] calls on x and y don't alias, but the alias-set attribute does not tell us about the relationship between those accesses and the *q. The point of dominance, however, needs to be associated with the declaration of x and y (specifically, we want to preserve the dominance over the loop). A start/end barrier scheme localized around the inlined operator[] functions would not do that, and placing start/end barriers around the entire live region of x and y would not be correct. I can, however, represent this using the metadata scheme.
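As a small runnable illustration of why *q matters here, the sketch below runs the same loop on int vectors. The function name run_example and the element type are assumptions made for the example; the alias-set annotation itself is not modeled. Because q points into x's storage, each *q += x[i] has to observe the preceding store to x[i], so treating *q as disjoint from x over the whole live region would change the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs the loop from the example over x.size()
// elements and returns the final x[0]. Requires a non-empty x and
// y.size() >= x.size().
inline int run_example(std::vector<int> &x, const std::vector<int> &y) {
  int *q = &x[0];  // q aliases x's internal storage
  for (std::size_t i = 0; i < x.size(); ++i) {
    x[i] = y[i];   // x and y storage are disjoint
    *q += x[i];    // but this access overlaps x[0]
  }
  return x[0];
}
```

With x = {0, 0, 0} and y = {1, 2, 3}, the first iteration stores 1 to x[0] and then doubles it through q, so the function returns 7 rather than the 6 one would get if the accumulation through q were kept separate from the stores to x.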
Thanks again,
Hal
>
>
>
> Regards,
>
>
>
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory