[PATCH] D70821: [SystemZ] Implement the packed stack layout

Jonas Paulsson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 29 03:54:05 PST 2019


jonpa added a comment.

In D70821#1763704 <https://reviews.llvm.org/D70821#1763704>, @uweigand wrote:

> In D70821#1763645 <https://reviews.llvm.org/D70821#1763645>, @jonpa wrote:
>
> > >   I don't think this approach will really do everything -mpacked-stack is supposed to do. Fundamentally, with the packed stack layout, only some N (< 160) bytes of the incoming register save area are reserved for register save slots, instead of all 160 bytes. Everything below "(incoming SP) + 160 - N" is supposed to be freely usable for any use, including FPR/VR save slots and outgoing function args / register save area, but also just local stack variables and reload spill slots in general.
> >
> > I think it does achieve that... The local area offset is not changed, but my intention is to instead move up all the slots in the local area with the amount that was packed:
>
>
> Hmm, I missed that.  That would probably indeed work, but it does appear confusing to have the frame object offsets not really match what's being emitted in the end ...


Yes, I agree...

> 
> 
>> First, the GPRs are stored topmost and then any (anyregcc) FP arg regs below them in the reg save area. The amount used of the reg save area is tracked during this with ZFI->incRegSaveAreaUsage(). So any registers which already belong in the reg save area still go there, but with new offets. The amount used is then not 160, but ZFI->getRegSaveAreaUsage().
> 
> I still don't understand why the FP special case is necessary.  Currently, FP regs are just spilled into freely-allocatable frame objects.  Given that you move those frame object (the part I originally missed), shouldn't that simply move the FP regs just the same?  (The only difference is the anyregcc FP regs, which should not go into the fixed slot; instead, they can just be spilled into nonfixed frame objects like the others.)

As long as we return our table of fixed offsets from getCalleeSavedSpillSlots() a register included there will get that offset in assignCalleeSavedSpillSlots(), instead of a created on-fixed object. I thought about changing SystemZFrameLowering::getCalleeSavedSpillSlots(), for instance by returning a NumEntries that could just be 4 less since the FP arg regs are last... But that method does currently not have MF as an argument, and I thought maybe it wouldn't be so bad to just put them below the GPRs... This is called before spillCalleeSavedRegisters() which is where we compute the range of GPRs to save...

>> This for me seems like the most natural way to extend the current machinery while changing as little as possible...
> 
> Changing as little as possible isn't really a primary goal :-)   The primary goal is make the final result as easily understandable as possible ...

good point :-)

>>> So ideally, you'd want to set LocalAreaOffset to -N (where N depends on the number of GPRs to be saved) instead of -160 for the packed stack layout. Unfortunately, I believe the LocalAreaOffset mechanism doesn't support a variable value for this field. One option would be to find (or create) some way to instantiate a per-function FrameLowering. Another way might be to just always set the LocalAreaOffset to 0 and then force common code to still leave the register save area fully (non-packed layout) or partially (packed layout) empty by placing fixed frame objects there (cf. the logic in PEI::calculateFrameObjectOffsets).
>> 
>> Perhaps we could just set a new smaller LAO value once we know it in emitPrologue(), instead of adding the amount for each offset in SystemZFrameLowering::getFrameIndexReference()?
> 
> I don't think there is currently any way to ever change the LAO value.
> 
>> Getting rid of the LAO and creating fixed objects as needed also sounds like a nice idea. That way those offsets wouldn't have to be changed again as per this current patch. But then I think we would have to first rework all the places that assume the LAO as per the patch I first suggested...
> 
> I don't see why anything here would need to be changed.  Frame index offsets would **remain** relative to the CFA with this approach, we'd just be using a different method to keep the first 160 byte unoccupied.   (You first patch changed frame index offsets to be relative to incoming SP, which required all those changes.)

We could add a setOffsetOfLocalArea(), to set it to a new smaller value if we also could make sure to always set it correctly for each function. But I think your idea of just adding the fixed objects as needed sounds better, so maybe I should give that a try..?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70821/new/

https://reviews.llvm.org/D70821





More information about the llvm-commits mailing list