[PATCH] D88543: [AArch64] Match the windows canonical callee saved register order

Martin Storsjö via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 1 02:08:02 PDT 2020


mstorsjo added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:2018
+    if (N % 2 != 0)
+      return true;
+  } else if (Reg1 == AArch64::FP && Reg2 == AArch64::LR) {
----------------
efriedma wrote:
> mstorsjo wrote:
> > efriedma wrote:
> > > This if ends up being a little confusing to read... maybe worth duplicating the `if (Reg2 == Reg1 + 1)` check into both the integer and fp branches of the if statement.
> > Hmm, maybe...
> > 
> > I'm a bit dissatisfied with this solution, as it misses some pairing opportunities: E.g. if saving only x20 and x21, but not x19, this code wouldn't manage to pair x20 and x21 - as when iterating over them in the top-to-bottom order, when inspecting x21, it'll choose not to pair it with x20, as it expects x20 to be paired with x19.
> > 
> > But I'm a bit out of ideas of a good way of achieving that, without making a huge mess in the generic code in computeCalleeSaveRegisterPairs...
> > 
> > On the other hand, I'm not sure if that's much of a practical issue that happens often, other than in pathological cases?
> Probably rare; the standard register allocation order will use x19 before x20, so ignoring weird edge cases, the only way to get into this situation is if something explicitly clobbers x20.  Which probably means inline asm, since there isn't really any other way to trigger that reliably.
> 
> On the other hand, I think computeCalleeSaveRegisterPairs would be easier to follow if we actually iterated over the registers in the right order.  The other changes are basically working around the fact that we're iterating in the wrong order.  So maybe worth doing even if we don't really expect any significant optimization benefit.  Hopefully it's not that complicated?
That was actually my original approach, and I have a separate version of this patch that does exactly that.

The reason why I preferred this one, is that it keeps the association between stack objects (as allocated and laid out by PrologEpilogInserter's calculateFrameObjectOffsets) more straightforward.

But I can put that alternative version up for review and point out the (potentially?) problematic mismatch there.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88543/new/

https://reviews.llvm.org/D88543



More information about the llvm-commits mailing list