[PATCH] D18619: [AArch64] Combine callee-save and local stack SP adjustment instructions.

Tim Northover via llvm-commits llvm-commits at lists.llvm.org
Wed May 4 11:39:24 PDT 2016


t.p.northover added inline comments.

================
Comment at: lib/Target/AArch64/AArch64FrameLowering.cpp:336
@@ +335,3 @@
+  // For pre/post inc/dec, convert to non inc/dec and update the stack offset.
+  case AArch64::STPXpre:
+    NewOpc = AArch64::STPXi;
----------------
gberry wrote:
> t.p.northover wrote:
> > gberry wrote:
> > > Yes, that's true with the following caveat.  We would be relying on AArch64LoadStoreOptimizer to combine the first/last store/load and sp decrement/increment in the case where we decide not to combine the SP bumps (e.g. if the local stack size is 0).  I think that is generally okay, though when I have tried simplifying the code in this way in the past I did run into one phase ordering issue.  If I recall correctly, occasionally tail duplication of the return block would no longer get done because it appears to the tail merger that the ret block had 3 instructions (add, ldp, ret) instead of 2 (ldp post inc, ret), which caused it to make a different decision.  This was due to the fact that tail duplication was being done before AArch64LoadStoreOptimizer.  Do you think this simplifcation of the code is worth it in light of this extra complication?  I will also say I didn't see any performance degradations because of the tail duplication difference, I just noticed it looking over code diffs.
> > I'm not opposed to using the load/store optimizer for that if it works (and not particularly worried about the tail-duplication change -- if it's ever important we can tweak its heuristic).
> > 
> > But unless I'm missing something performing that transformation here would still be simpler than the proposed patch. We'd just have a `makeMemOpPrePost(MBBI, TotalOffset)` call if emitPrologue/emitEpilogue decide it would be useful, wouldn't we?
> Ah, I think I misunderstood your original comment.  What you are proposing is:
> - have spillCalleeSavedRegisters/restoreCalleeSavedRegisters emit stores/loads without any pre/post decrement/increment
> - have emitPrologue/emitEpilogue either insert an add/sub or change the first callee-save store/load into a pre/post decrement/increment depending on which is better
> 
> If that is correct, I agree that is cleaner.  If not, let's try again :)
Yep, that's the one. Sorry about the confusion.


http://reviews.llvm.org/D18619





More information about the llvm-commits mailing list