[PATCH] D57054: [MachineOutliner][ARM][RFC] Add Machine Outliner support for ARM

Yvan Roux via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 13 23:46:58 PST 2019

yroux marked an inline comment as done.
yroux added inline comments.
Herald added a subscriber: jdoerfert.

Comment at: lib/Target/ARM/ARMBaseInstrInfo.cpp:5921-5923
+        Save = BuildMI(MF, DebugLoc(), get(ARM::tPUSH))
+                   .add(predOps(ARMCC::AL))
+                   .addReg(ARM::LR);
t.p.northover wrote:
> t.p.northover wrote:
> > yroux wrote:
> > > t.p.northover wrote:
> > > > I think we need to talk about stack alignment here. This sequence misaligns the stack on ARM and Thumb2.
> > > > 
> > > > If you take the view that the call to `OUTLINED_FUNCTION` is not a public interface (plausible, but by no means a definite) then the corresponding `push` for a nested call in `buildOutlinedFrame` will realign the stack except on Apple's watchOS ABI (armv7k, which has a 16-byte requirement).
> > > > 
> > > > If it is public, then this definitely has to change.
> > > > 
> > > > Either way, it should definitely be noted in comments.
> > > Correct me Jessica if I'm wrong, my understanding of outlined functions is that they are not public interfaces and don't even respect the ABI but it is not an issue since we have the control on all the call sites, but I didn't thought of such cases of stack misalignment and targets requirement.  I need to think more about it, but I agree at least a comment is needed.
> > That's a good point about the PCS, so it's almost certainly not morally a public interface.
> > 
> > We sort of still have to worry about what the linker might insert between the branch to the outlined function and actually getting there, but since this is all within a single linkage unit I think the worst would be a simple branch island, unlikely to rely on stack alignment. And if it did, it might rely on far more that we break.
> > 
> > We do still need to guarantee 16-byte alignment for nested calls on armv7k though.
> I've done some more thinking about this and I believe it's actually an issue for everyone else too: Not all callsites will have the same LR-saving behaviour so the outlined function doesn't know how much it needs to adjust the stack by to preserve alignment.
> So far I see three possible solutions:
>   * Reserve LR so the issue never comes up.
>   * Make all stack-based saves & restores preserve the proper alignment.
>   * Add new outlining kinds for the different global cases and use compressed sequences where possible (mostly when the callsites are homogeneous).
> I've tested the first two a little, and actually it seems that reserving LR generates smaller code! Smaller even than when the stack gets misaligned. The extra register pressure seems to be offset by the fact that outlining is significantly smaller.
Hi Tim,

I came to the same conclusion that alignment must be preserved (If some LDREXD , STREXD sequence can be reached from the outlined region for instance doubleword alignment is mandatory), I'm working on a save/restore solution which preserve the proper alignment.

I don't see the cases you described where not all callsites have the same LR-saving behaviour, do you have an example and/or a reproducer which exhibits that reserving LR generates smaller code ? (if it is too long to put it here, you can email it)




More information about the llvm-commits mailing list