[PATCH] D57054: [MachineOutliner][ARM][RFC] Add Machine Outliner support for ARM

Tim Northover via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 8 02:57:37 PST 2019


t.p.northover added inline comments.
Herald added a project: LLVM.


================
Comment at: lib/Target/ARM/ARMBaseInstrInfo.cpp:5921-5923
+        Save = BuildMI(MF, DebugLoc(), get(ARM::tPUSH))
+                   .add(predOps(ARMCC::AL))
+                   .addReg(ARM::LR);
----------------
t.p.northover wrote:
> yroux wrote:
> > t.p.northover wrote:
> > > I think we need to talk about stack alignment here. This sequence misaligns the stack on ARM and Thumb2.
> > > 
> > > If you take the view that the call to `OUTLINED_FUNCTION` is not a public interface (plausible, but by no means a definite) then the corresponding `push` for a nested call in `buildOutlinedFrame` will realign the stack except on Apple's watchOS ABI (armv7k, which has a 16-byte requirement).
> > > 
> > > If it is public, then this definitely has to change.
> > > 
> > > Either way, it should definitely be noted in comments.
> > Correct me Jessica if I'm wrong, my understanding of outlined functions is that they are not public interfaces and don't even respect the ABI but it is not an issue since we have the control on all the call sites, but I didn't thought of such cases of stack misalignment and targets requirement.  I need to think more about it, but I agree at least a comment is needed.
> That's a good point about the PCS, so it's almost certainly not morally a public interface.
> 
> We sort of still have to worry about what the linker might insert between the branch to the outlined function and actually getting there, but since this is all within a single linkage unit I think the worst would be a simple branch island, unlikely to rely on stack alignment. And if it did, it might rely on far more that we break.
> 
> We do still need to guarantee 16-byte alignment for nested calls on armv7k though.
I've done some more thinking about this and I believe it's actually an issue for everyone else too: Not all callsites will have the same LR-saving behaviour so the outlined function doesn't know how much it needs to adjust the stack by to preserve alignment.

So far I see three possible solutions:

  * Reserve LR so the issue never comes up.
  * Make all stack-based saves & restores preserve the proper alignment.
  * Add new outlining kinds for the different global cases and use compressed sequences where possible (mostly when the callsites are homogeneous).

I've tested the first two a little, and actually it seems that reserving LR generates smaller code! Smaller even than when the stack gets misaligned. The extra register pressure seems to be offset by the fact that outlining is significantly smaller.


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57054/new/

https://reviews.llvm.org/D57054





More information about the llvm-commits mailing list