[PATCH] D14147: Handling of aligned allocas on a target that does not align stack pointer.

Fri Nov 20 17:32:10 PST 2015

hfinkel added a comment.

In http://reviews.llvm.org/D14147#287994, @uweigand wrote:

> In http://reviews.llvm.org/D14147#287293, @hfinkel wrote:
>
> > > SystemZ maintains normal SP alignment always, and instead dynamically realigns stack objects when needed.
> >
> >
> > Before we get into the details, please explain this statement. Why can you not implement dynamic stack realignment using a base pointer like other targets?
>
>
> Given that Jonas' patch is based on a suggestion of mine, I'll jump in here :-)
>
> Of course, we *can* implement dynamic stack realignment using a new reserved hard register like other targets.  The point is rather that we don't *need* to.  On SystemZ, the only parts of the stack frame that require non-default alignment are local variables that were manually over-aligned by the programmer.  It is easily possible to implement this without any target support by just doing a bigger alloca and manually aligning a pointer within that area.  (In fact, you can do this even in the source code without any compiler support.)
>
> This is different from the situation on other platforms like Intel, where some of the "special" areas of the frame may need non-default alignment, like parameter areas, register save areas, or spill slots.  In those cases, it is not possible to implement the alignment requirement without special target support, and that's where the special prolog/epilog code using an extra base register comes in.
>
> Because of this difference, GCC supports two flavors of stack realignment: for those platforms that require it, you can use the extra base register and related code (implemented by the target back-end); but for those platforms that do *not* require it (which is actually most of them), common code simply implements alignment for local variables using generic code (no back-end changes required).
>
> This not only minimizes code changes (most back-ends require no extra code), but also results in more efficient code for targets like SystemZ, since we do not require to reserve an extra hard register.
>
> Jonas' patch is trying to implement a similar scheme for LLVM: back-ends may choose to implement realignment via extra base pointer, but for those that don't (need to), common code will still handle the local variable case via generic code.  (As Jonas said in the initial submission, this generic implementation is still not quite as efficient as it could be, but that can be improved later ...)

Uli, thanks for explaining.

> but also results in more efficient code for targets like SystemZ, since we do not require to reserve an extra hard register

This seems untrue. Even if you don't reserve a register for the base pointer, by handling these as dynamic allocations, you force yourself to keep separate pointers to each over-aligned stack allocation. In short, you trade one reserved register for N virtual ones (one for each over-aligned stack object). It is true that you might spill those virtual registers when they're not needed, but that's a current-infrastructure problem, not a theoretical one, and even so, it is unlikely to be a win.
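
To make the trade-off concrete, here is a minimal C sketch of the source-level technique Uli describes (a bigger allocation plus a manually aligned pointer into it). It is only an illustration, not what the patch generates: the names `align_up` and `use_overaligned_buffer` and the constants are hypothetical, and it assumes the usual flat mapping between pointers and `uintptr_t`. Note that `obj` is exactly the extra per-object pointer that has to stay live.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round addr up to a multiple of align.
 * Assumes align is a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
    return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/* Carves a 64-byte-aligned, 128-byte object out of an ordinary stack
 * buffer and returns its byte offset inside that buffer. */
size_t use_overaligned_buffer(void)
{
    enum { ALIGN = 64, SIZE = 128 };

    /* Over-allocate by ALIGN - 1 bytes so that an aligned window of
     * SIZE bytes always fits, wherever the buffer happens to start. */
    unsigned char raw[SIZE + ALIGN - 1];

    /* The manually aligned pointer. This value must be kept in a
     * register (or spilled and reloaded) for as long as the object
     * is in use; there is one such pointer per over-aligned object. */
    unsigned char *obj =
        raw + (align_up((uintptr_t)raw, ALIGN) - (uintptr_t)raw);

    assert((uintptr_t)obj % ALIGN == 0);      /* properly aligned   */
    assert(obj + SIZE <= raw + sizeof raw);   /* stays inside raw   */

    memset(obj, 0, SIZE);                     /* use the object     */
    return (size_t)(obj - raw);
}
```

With a base pointer, the same object would instead sit at a fixed offset from one reserved register, and no per-object pointer would be needed.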

In general, over-aligned objects are a performance feature, and should be implemented in a high-performance way. The question here seems to be: Do we want to have suboptimal, but still functionally correct, support for over-aligned stack objects? I think having this capability makes sense, so long as it comes with appropriate comments and explanations in the code about the downsides. Let's proceed.

http://reviews.llvm.org/D14147