[LLVMdev] llvm.meta (was Rotated loop identification)

Hal Finkel hfinkel at anl.gov
Fri Sep 13 11:08:27 PDT 2013

----- Original Message -----
> On Thu, Sep 12, 2013 at 4:52 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> > If we only try to solve your immediate problem of
> > __builtin_assume_aligned, isn't that good enough for now?
> The thing that most concerns me about __builtin_assume_aligned and
> this scheme is the control dependencies. In gcc, it is the return
> value of the intrinsic that carries the alignment guarantee, and I
> think that this makes a lot of sense. Consider something like this:
> void foo(double *x) {
>   if (check_if_x_is_special(&global_state)) {
>     double *y = __builtin_assume_aligned(x, 16);
>     do_something(y);
>   } else
>     do_something_else(x);
> }
> with this scheme, there is never a danger that the alignment
> assumption can be lifted and incorrectly applied to x in an inlined
> do_something_else(x). If we simply have the intrinsic not return a
> value and apply its invariant back on its arguments, then I don't
> see how to guarantee correctness.
> I would like general invariants, but can I make general invariants
> return a value in the same way? Alternatively, maybe the invariant
> could take a pointer and only apply to values in a specific memory
> location at some particular point? Maybe then AA will keep
> everything safe?
> As Chandler pointed out in response to my original patch on this topic,
> making the intrinsic return a value means that everyone else needs
> to look through it. On the other hand, it would make it
> conservatively correct ;) (as I'm writing this, I'm leaning toward
> trying the pointer thing).
> Isn't alignment already annotated on loads and stores?

Yes, they get the type alignment.

> It seems like
> the frontend could lower __builtin_assume_aligned() to put the right
> alignment on all such loads and stores.

No, because a lot of the important use cases involve inlining and post-unrolling adjustments. The fundamental problem is that the LLVM load/store alignment is a property of the load or store, not of the pointer. So, for example, if I have:

double *a = __builtin_assume_aligned(x, 32);
double *b = __builtin_assume_aligned(y, 32);

for (i = 0; i < n; ++i) {
  a[i] = b[i];
}

Can we now tag the loads from b and the stores to a as 32-byte aligned? No: with 8-byte doubles, only every 4th access is 32-byte aligned. Only after unrolling, if the backend decides to perform it, can some of the loads and stores have their alignment adjusted.
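To make the arithmetic concrete, here is a minimal sketch in C, assuming sizeof(double) == 8 and base pointers that really are 32-byte aligned. The helper names (is_aligned, count_aligned32, copy_unrolled4) are hypothetical and exist only for illustration; they are not part of LLVM or of the patch under discussion. The comments in the unrolled loop show the best alignment that could be attached to each individual access once the loop body has been unrolled by four.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p sits on an `align`-byte boundary (align is a power of two). */
static int is_aligned(const void *p, size_t align) {
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Counts how many of the n elements starting at base are 32-byte aligned.
 * With 8-byte doubles and a 32-byte-aligned base this is every 4th element. */
static size_t count_aligned32(const double *base, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (size_t)is_aligned(&base[i], 32);
    return count;
}

/* Hand-unrolled version of the copy loop. Assumes a and b are 32-byte
 * aligned; only after this kind of unrolling does each access have a
 * fixed, known alignment. */
static void copy_unrolled4(double *a, const double *b, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a[i + 0] = b[i + 0]; /* offset  0 bytes: 32-byte aligned */
        a[i + 1] = b[i + 1]; /* offset  8 bytes:  8-byte aligned */
        a[i + 2] = b[i + 2]; /* offset 16 bytes: 16-byte aligned */
        a[i + 3] = b[i + 3]; /* offset 24 bytes:  8-byte aligned */
    }
    for (; i < n; ++i)       /* remainder iterations, alignment unknown */
        a[i] = b[i];
}
```

In the rolled loop a single load instruction serves every value of i, so it can only carry the alignment common to all of them, which is 8 bytes. The unrolled body has four distinct accesses per iteration, and the first can carry the full 32-byte alignment.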

Regarding inlining, it is also important to note that we want the intrinsics to stick around until LTO if we're doing LTO.

> An optimization pass could
> propagate large alignments forward to dominated memory accesses.

In my original patchset for this (going back quite a few months now), this is essentially what I did. There was a pass that ran after unrolling to propagate alignments (and the intrinsics were discarded at the end).

> If there are no memory accesses (as your example), there's probably
> nothing to optimize until after inlining, so perhaps there should be
> an LLVM intrinsic which gets eliminated after inlining, leaving
> behind no shadow uses.

I agree; if we're not doing LTO, then the intrinsics should be deleted after the last propagation pass run.

Thanks again,

Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
