[PATCH] D135687: [AArch64] Fix aligning the stack after calling __chkstk

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 14 13:53:44 PDT 2022


mstorsjo added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1648
+  const bool NeedsRealignment =
+      NumBytes && !IsFunclet && RegInfo->hasStackRealignment(MF);
+  int64_t RealignmentPadding =
----------------
efriedma wrote:
> mstorsjo wrote:
> > efriedma wrote:
> > > I'm not sure it actually makes sense to check NumBytes here.  If we're forcing realignment with an attribute, we want to make sure it happens even if there aren't any local variables.
> > Right, I guess that’s true. However, in the current codepath for non-windows, all alignment is being done within an `if (NumBytes)` condition, so unless there’s further local allocations, it actually won’t realign the stack.
> > 
> > Also, the alignment to apply on the stack (from `-mstack-alignment`) doesn’t seem to be taken into account for actually aligning the stack, only the alignment of the objects in the current stack frame. So this does seem like a preexisting bug.
> > 
> > It seems like the x86 backend does the right thing here (aligning to the maximum of the objects in the stack frame and the stack’s own alignment, while this code here only looks at the objects.
> Not sure I understand the different sources of alignment, but we can leave that for a followup in any case.
The two different cases are these:
```
$ cat align1.c
void other(char *buf);

void func(void) {
    char buf[100] __attribute__((aligned(64)));
    other(buf);
}
$ clang -target aarch64-linux-gnu -S -o - align1.c -O2
```
```
$ cat align2.c 
void other(void);

void func(void) {
    other();
}
$ clang -target aarch64-linux-gnu -S -o - -O2 align2.c -mstack-alignment=64 -mstackrealign
```
For the first case, we want to allocate a 64-byte aligned object on the stack (which implicitly aligns the stack entirely to 64 bytes at this point). For the second case, we want to realign the stack to 64 bytes (even though the function itself doesn't allocate anything) before calling another function, so that the second function can assume that the incoming stack is aligned to 64 bytes and not bother with more similar realignments, while maintaining 64 byte alignment.

The second case isn't taken into account at all by the aarch64 target; it optimizes the call into a tail call, and even if it tweaked so that it doesn't do that, it still doesn't align the stack to 64 bytes. For x86_64, both cases are handled as intended.

(I guess there's seldom need for much more alignment than 16 bytes on aarch64 anyway, so there probably hasn't been much need or desire to fix this yet. I guess SVE might benefit from it at some point though?)

So since the aarch64 target essentially doesn't really handle the explicit stack realignment requests properly right now, I think the patch in the current form is as good as it gets on its own - handling the alignment of stack objects correctly, and no longer blowing up on the realignment requests (but doing as little as we do for other targets about it).


================
Comment at: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1651
+      (NeedsRealignment && MFI.getMaxAlign() > Align(16))
+          ? MFI.getMaxAlign().value() - 16
+          : 0;
----------------
efriedma wrote:
> mstorsjo wrote:
> > efriedma wrote:
> > > I suggested subtracting "16" here on the assumption the stack is actually 16-byte aligned on entry.  If we're not assuming the incoming stack is 16-byte aligned, we can't do that.  (Theoretically, we could subtract 1, but we can't actually allocate in increments of less than 16.)
> > > 
> > > Normally, we shouldn't be using realignment in the first place if we're only trying to align the stack to 16 bytes, but I guess the "stackrealign" attribute forces us to realign the stack even if we think it's already aligned.  (By default, the CPU faults if the alignment of sp is less than 16, so that seems unlikely, but I guess there could be some environment which disables that flag.)
> > I don’t really think we need to worry about the case when the incoming stack isn’t aligned to 16 bytes, I guess thus is just a degenerate case - while it’s only expected to do anything if the requested alignment is higher than that.
> In that case, should we just pretend the user didn't request stack realignment if they don't request alignment higher than 16?
I guess we could. Currently this patch only needs to care about whether the alignment is higher than 16 for the windows codepath; we could extend that into the rest of the cases, but I'd leave that to a separate patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135687/new/

https://reviews.llvm.org/D135687



More information about the llvm-commits mailing list