[PATCH] D54966: Implement P1007R3 `std::assume_aligned`

Sat Dec 22 02:47:20 PST 2018

chandlerc added a comment.

In D54966#1319390 <https://reviews.llvm.org/D54966#1319390>, @hfinkel wrote:

> In D54966#1319368 <https://reviews.llvm.org/D54966#1319368>, @rsmith wrote:
>
> > In D54966#1317750 <https://reviews.llvm.org/D54966#1317750>, @hfinkel wrote:
> >
> > > In D54966#1317428 <https://reviews.llvm.org/D54966#1317428>, @rsmith wrote:
> > >
> > > > @chandlerc, @hfinkel: does an attribute-only implementation (with no constant evaluation enforcement) materially hurt the ability for the optimizer to use this annotation? Eg, in:
> > > >
> > > >   extern char k[16];
> > > >   void f() {
> > > >     // the initializer of p is a constant expression and might get constant-folded to &k by the frontend
> > > >     char *p = std::assume_aligned<16>(&k);
> > > >     // ...do stuff...
> > > >   }
> > > >
> > > >
> > > > the alignment assumption may well never be emitted as IR. Is that acceptable?
> > >
> > >
> > > `__attribute__((assume_aligned(N)))` and `__builtin_assume_aligned` are, at the IR level, implemented in a very-similar way. For functions with that attribute on the return type, we essentially emit an alignment assumption on the return value at every call site (in CodeGenFunction::EmitCall). Thus, from the optimizer's perspective, I don't think that it makes a big difference.
> >
> >
> > The point here is that you may well get //no IR annotation whatsoever// for the above call, because the frontend might constant-evaluate the initializer of `p` down to just `&k`, and then emit IR that just initializes `p` to `&k` with no alignment assumption. Whereas if we treated `assume_aligned<N>(p)` as non-constant in the cases where we cannot prove that `p` is suitably aligned (as `__builtin_assume_aligned` does), then we would emit IR for the alignment assumption, but the downside is that the initializer of `p` would no longer be a constant expression.
> >
> > Essentially, what I'm trying to gauge here is, is it OK that you probably don't actually get an alignment assumption in a function like the `f()` above, because it will probably be constant-evaluated away to nothing? Or do we need the constant evaluator to have some kind of side-channel by which it can communicate back to the code generator that an alignment assumption should be applied to `k`?
>
>
> I thought about this side channel option for k, but I don't think that we can because we'd need to prove that `f()` function was always executed, and that's likely not generally possible. I think that the side channel would apply only to p.

After thinking more about it, I think users of `p` in the function `f` really should be able to assume the alignment, and whether they succeed at that should not be determined by whether the initializer of `p` happens to fail to be a constant expression for some reason.

Let me lay out my reasoning, which comes from considering a few examples.

1a) The fact that the address happens to fold to a constant and thus the initializer is a constant as well is not going to be enough to reliably optimize the uses. Just because we know the address of some very hot vector data is a global and thus a constant "relocation" that we can fold does *not* mean that we will be able to reconstruct the alignment guarantees if we need to do so. This means that we would be missing real optimization opportunities here, and indeed, the exact opportunities that `std::assume_aligned` was intended to open up.

1b) Indeed, I could imagine a collection of routines which all use the same constant initialized address but which make different alignment assumptions. And I could imagine code dispatching (potentially dynamically) to the correctly aligned routine. That seems like reasonable code to expect to be able to write given this facility, and yet it would be directly undermined by what you describe.

2) Imagine well-tuned code using `std::assume_aligned` that gets refactored slightly such that the pointer happens to become suitable for constant initialization when previously it wasn't. This would in turn cause significant regressions in generated code quality which I think would be an unacceptable surprise to users. We would never be able to explain reasonably why a particular refactoring or change would *regress* performance of some code.
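
To make example 1b concrete, here is a rough sketch of the kind of dispatch I have in mind. Everything in it is made up for illustration: the `samples` global, the `sum_*` routines, and `assume_aligned_sketch`, which is only a stand-in for `std::assume_aligned` written on top of `__builtin_assume_aligned` so that it builds without C++20. The routines take the pointer as a parameter for brevity; after inlining, each call would see the same constant address but a different alignment promise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for std::assume_aligned<N>, built on the GCC/Clang builtin so the
// sketch also compiles before C++20. The name is hypothetical.
template <std::size_t N, typename T>
T *assume_aligned_sketch(T *p) {
  return static_cast<T *>(__builtin_assume_aligned(p, N));
}

// Hypothetical hot data living in a global; name, size and contents are made up.
alignas(64) float samples[1024] = {1, 1, 1, 1, 1, 1, 1, 1};

// Promises 64-byte alignment to the optimizer.
float sum_assuming_64(const float *data, std::size_t n) {
  const float *p = assume_aligned_sketch<64>(data);
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i) total += p[i];
  return total;
}

// Promises only 16-byte alignment.
float sum_assuming_16(const float *data, std::size_t n) {
  const float *p = assume_aligned_sketch<16>(data);
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i) total += p[i];
  return total;
}

// Makes no alignment promise at all.
float sum_unaligned(const float *data, std::size_t n) {
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i) total += data[i];
  return total;
}

// Dynamic dispatch to the routine whose promise actually holds.
float sum(const float *data, std::size_t n) {
  assert(data != nullptr);
  const auto addr = reinterpret_cast<std::uintptr_t>(data);
  if (addr % 64 == 0) return sum_assuming_64(data, n);
  if (addr % 16 == 0) return sum_assuming_16(data, n);
  return sum_unaligned(data, n);
}
```

With this, `sum(samples, 8)` takes the 64-byte path and `sum(samples + 4, 4)` takes the 16-byte path. If the alignment promise were dropped whenever the argument folds to a constant address, the aligned routines would have nothing to distinguish them from `sum_unaligned`.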

When considering these kinds of situations, it really seems like this needs to be propagated. And not just locally, but as far as the constant evaluation proceeds, walking past as many constant evaluated wrappers as needed. =/

I don't really think of this as needing a side-channel so much as addresses in the constant evaluator needing to track alignment in some way, such that these assumptions get propagated as one would expect.
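
A toy model of what I mean, purely as a sketch: this is not Clang's actual constant-evaluator representation, and `FoldedAddress` and all of the function names below are hypothetical. The idea is only that the folded pointer value carries the strongest alignment promised so far, and that forwarding wrappers preserve it.

```cpp
#include <cstddef>

// Toy stand-in for the value a constant evaluator produces for a pointer.
// Not Clang's real data structure; every name here is hypothetical.
struct FoldedAddress {
  const char *symbol;         // which global the pointer refers to
  std::size_t assumed_align;  // strongest alignment promised so far (1 = none)
};

// Folding a plain address-of: no alignment promise is attached.
constexpr FoldedAddress address_of(const char *symbol) {
  return {symbol, 1};
}

// Folding assume_aligned<N>(p): same address, but remember the promise.
constexpr FoldedAddress fold_assume_aligned(FoldedAddress p, std::size_t n) {
  return {p.symbol, p.assumed_align > n ? p.assumed_align : n};
}

// A constant-evaluated wrapper that simply forwards its argument; the
// recorded alignment travels along with the address.
constexpr FoldedAddress fold_identity_wrapper(FoldedAddress p) {
  return p;
}

// What a code generator could ask when emitting the folded initializer.
constexpr bool should_emit_alignment_assumption(FoldedAddress p) {
  return p.assumed_align > 1;
}

// The promise survives folding through a wrapper.
static_assert(should_emit_alignment_assumption(
                  fold_identity_wrapper(fold_assume_aligned(address_of("k"), 16))),
              "alignment is propagated through the wrapper");
```

The address itself is unchanged, so the initializer can still be a constant expression; only the extra alignment fact is carried along for whoever emits the IR.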

>
>
> > Or, indeed, should `assume_aligned<N>(p)` not be treated as a constant expression unless we can prove during constant expression evaluation that `p` is in fact suitably aligned -- as GCC and Clang currently do for `__builtin_assume_aligned`?
>
> From a C++ perspective, this seems suboptimal. I don't want people to duplicate code, some with assume_aligned, some without, if I want the same code to work both in a constexpr context and outside one. A side channel would be better. It is a tradeoff, however, and I'd need to think more about it.

Yeah, I really don't think we want to kill constant evaluation just to preserve alignment assumptions. Instead, we want to model those assumptions in the evaluator IMO (even if we don't allow them to fold away in core constant expressions or parts of the ABI).

https://reviews.llvm.org/D54966