[PATCH] D129727: [ARM64EC 11/?] Add support for lowering variadic indirect calls.

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 19 10:44:09 PDT 2022


efriedma added a comment.

> The load of the address for __os_arm64x_dispatch_call_no_redirect comes after the memory copy in the IR version. But if we copy memory in LowerCall, the load comes before the memory copy.  This causes one more register to be used.

That's a consequence of doing the load in IR, I think.  The load is before the memcpy in the initial SelectionDAG, and nothing tries to rearrange them.  By default, scheduling happens in source order, and we don't reorder across calls after isel.  I don't see any obvious fix; maybe the call lowering code could try to find the load and mess with its chains?  But I wouldn't worry about it; if we actually care about the performance of varargs thunks, there are probably more significant improvements we could make, like trying to inline small memcpys.

> The 32 bytes for the register store should be the real bottom of the stack, but when I move the memory copy into LowerCall, the dynamic allocation is always the real bottom of the stack.

Maybe AArch64FrameLowering::hasReservedCallFrame is returning the wrong thing?  Normally, the stack space for call arguments is allocated in the prologue; the stack frame layout code needs to know if that's illegal because there is a dynamic/large/etc. allocation.
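
To illustrate why that answer matters, here is a toy model of the layout decision. It is only a sketch: ToyFrame and the helper functions are hypothetical names, not LLVM's MachineFrameInfo or the real AArch64FrameLowering code, and the model assumes a single dynamic allocation made before the call.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified frame description (NOT LLVM's MachineFrameInfo).
struct ToyFrame {
  bool HasVarSizedObjects;   // function contains a dynamic allocation
  uint64_t FixedFrameSize;   // locals/spills allocated in the prologue
  uint64_t MaxCallFrameSize; // largest outgoing-argument area of any call
  uint64_t DynamicAllocSize; // bytes allocated dynamically before the call
};

// Toy rule: the call frame may be reserved in the prologue only if SP does
// not move afterwards, i.e. there is no dynamic allocation.
bool hasReservedCallFrame(const ToyFrame &F) { return !F.HasVarSizedObjects; }

// Distance from the entry SP down to the lowest byte of the outgoing-argument
// area at the time of the call.
uint64_t callFrameDepth(const ToyFrame &F, bool Reserved) {
  if (Reserved)
    // Allocated together with the fixed frame; a later dynamic allocation
    // would end up BELOW it.
    return F.FixedFrameSize + F.MaxCallFrameSize;
  // Allocated around the call itself, below everything else.
  return F.FixedFrameSize + F.DynamicAllocSize + F.MaxCallFrameSize;
}

// Total stack depth at the call; identical in both schemes.
uint64_t stackDepthAtCall(const ToyFrame &F) {
  return F.FixedFrameSize + F.DynamicAllocSize + F.MaxCallFrameSize;
}

// True if the outgoing-argument area is the real bottom of the stack.
bool callFrameIsStackBottom(const ToyFrame &F, bool Reserved) {
  return callFrameDepth(F, Reserved) == stackDepthAtCall(F);
}
```

In this model, treating the call frame as reserved while a dynamic allocation exists leaves the dynamic allocation as the real bottom of the stack, which matches the symptom described in the quoted comment.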

Alternatively, you could just make the extra 32 bytes part of the dynamic allocation, instead of trying to do it "properly".
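
A minimal sketch of that alternative, as plain size arithmetic rather than actual lowering code. ThunkAlloc, planThunkAllocation and the constant names are hypothetical; the only assumptions taken from the thread are the 32-byte register store at the bottom of the stack, plus the 16-byte stack pointer alignment on AArch64.

```cpp
#include <cassert>
#include <cstdint>

// The "extra 32 bytes" from the discussion; the name is hypothetical.
constexpr uint64_t kRegisterStoreBytes = 32;
// AArch64 requires SP to be 16-byte aligned.
constexpr uint64_t kStackAlign = 16;

// Round V up to the next multiple of A.
constexpr uint64_t alignUp(uint64_t V, uint64_t A) {
  return (V + A - 1) / A * A;
}

// Result of planning a single dynamic allocation for the thunk.
struct ThunkAlloc {
  uint64_t TotalBytes; // size of the one dynamic allocation
  uint64_t CopyOffset; // where the copied varargs start, relative to the
                       // lowest address of the allocation
};

// Fold the register-store area into the dynamic allocation: the lowest
// 32 bytes are the register store, the copied arguments follow.
constexpr ThunkAlloc planThunkAllocation(uint64_t VarargBytes) {
  return {alignUp(kRegisterStoreBytes + VarargBytes, kStackAlign),
          kRegisterStoreBytes};
}
```

With this layout the register store is the bottom of the stack by construction, regardless of how the frame lowering code places the call frame.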


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129727/new/

https://reviews.llvm.org/D129727


