[llvm] [SystemZ] Eliminate call sequence instructions early. (PR #77812)

Ulrich Weigand via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 26 05:44:25 PST 2024


uweigand wrote:

> I think so, but it would help to better understand exactly if/why this is the way it is only for x86

I've had a bit of a closer look, and it turns out this affects more targets than just x86. In general, the callseq pseudos seem to have two main functions:
- They act as (pre-RA) scheduling barriers to attempt to keep call setup sequences closer together.   This is done on all platforms; I'm not sure if this is required for correctness anywhere, but it does appear to have performance benefits on SystemZ at least.
- They carry a "frame size" value, which is used on *some* targets to modify frame index elimination (i.e. the distance between SP and the frame varies from its default value while within a call sequence).  Where required, this is of course needed for functional correctness.

This second aspect is only used on some platforms, however. It depends on how the stack space needed to hold outgoing function arguments is allocated. On some platforms, this happens during the call sequence (e.g. via "push" instructions on x86). On others (like SystemZ), the function prolog always allocates enough space for all calls in the function ahead of time. On yet others, this decision is made on a per-function basis (e.g. depending on whether the function also contains dynamic stack allocation).
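To make the difference concrete, here is a minimal sketch of how the SP-relative offset of a stack object changes inside a call sequence under the two allocation strategies. This is a simplified model, not LLVM code: `FrameModel`, `spRelativeOffset`, and their fields are hypothetical names chosen for illustration, and the stack is assumed to grow downward.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified frame description (not an LLVM data structure).
struct FrameModel {
  int64_t StackSize;      // bytes the prolog subtracts from SP
  bool ReservedCallFrame; // true: outgoing-argument area is already part of StackSize
};

// Returns the SP-relative offset of an object located ObjectOffset bytes
// from the SP value at function entry (negative means below entry SP).
// SPAdjustment is the number of bytes SP has moved further down inside the
// current call sequence, i.e. the "frame size" carried by the callseq pseudos.
inline int64_t spRelativeOffset(const FrameModel &F, int64_t ObjectOffset,
                                int64_t SPAdjustment) {
  assert(SPAdjustment >= 0 && "SP only moves down within a call sequence");
  int64_t Offset = ObjectOffset + F.StackSize;
  // Only targets that allocate argument space during the call sequence
  // need to correct for the extra SP movement.
  if (!F.ReservedCallFrame)
    Offset += SPAdjustment;
  return Offset;
}
```

With a reserved call frame, the result is the same inside and outside a call sequence, so the adjustment value is simply ignored. Without one, every frame index resolved between call frame setup and destroy has to account for it.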

Platforms where `canSimplifyCallFramePseudos` returns true do not need (or use) that frame size value.  This is the case on SystemZ for all functions, in particular.  However, even on those platforms the machine verifier requires that this value is maintained correctly whenever a basic block is split - although all those correctly maintained values will just be ignored in the end.

The default definition of `canSimplifyCallFramePseudos` always returns true. However, that can change if the platform overrides either `canSimplifyCallFramePseudos` or `hasReservedCallFrame`. Looking at all platforms that do so, it seems the following platforms actually require the frame size: X86, AArch64/ARM, M68k, Mips, and WebAssembly.
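The interaction between the two hooks can be sketched as follows. This is a stand-alone model, not the real `TargetFrameLowering` class: it assumes the default hooks are defined as "reserved call frame unless there is an FP" and "simplifiable if reserved or FP", and `FunctionInfo`, `FrameLoweringModel`, `PushingTargetModel`, and `needsCallFrameSize` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical per-function facts; stands in for a MachineFunction query.
struct FunctionInfo {
  bool HasFP;
};

// Assumed shape of the default hooks (a model, not the LLVM class itself).
struct FrameLoweringModel {
  virtual ~FrameLoweringModel() = default;
  virtual bool hasFP(const FunctionInfo &F) const { return F.HasFP; }
  virtual bool hasReservedCallFrame(const FunctionInfo &F) const {
    return !hasFP(F);
  }
  virtual bool canSimplifyCallFramePseudos(const FunctionInfo &F) const {
    return hasReservedCallFrame(F) || hasFP(F);
  }
};

// Hypothetical target that never reserves the call frame in the prolog.
struct PushingTargetModel : FrameLoweringModel {
  bool hasReservedCallFrame(const FunctionInfo &) const override {
    return false;
  }
};

// The call frame size matters only when the pseudos cannot be simplified.
inline bool needsCallFrameSize(const FrameLoweringModel &T,
                               const FunctionInfo &F) {
  return !T.canSimplifyCallFramePseudos(F);
}

// With the unmodified defaults, the answer is "true" with or without an FP.
inline void checkDefaultIsAlwaysTrue() {
  FrameLoweringModel Default;
  assert(Default.canSimplifyCallFramePseudos(FunctionInfo{true}));
  assert(Default.canSimplifyCallFramePseudos(FunctionInfo{false}));
}
```

In this model, the overriding target needs the frame size only for functions without an FP, which matches the split described here: reserve in the prolog, or fall back to an FP, and the value becomes irrelevant.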

All other platforms either always reserve the call frame in the prolog, or else always use an FP whenever the call frame is not reserved in the prolog.  In both cases, frame index elimination does not require the call frame size.


https://github.com/llvm/llvm-project/pull/77812

