[llvm] [AArch64] Add streaming-mode stack hazards. (PR #98956)

David Green via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 16 02:21:40 PDT 2024


https://github.com/davemgreen commented:

> A couple ways to possibly save some stack space:
> 
> - For FP CSRs, would it make sense to go through integer registers? e.g. fmov x16, d8; str x16, [sp, 16]. More micro-ops, but you don't need hazard padding for the FP CSRs. Maybe depends on how expensive the micro-ops are.
> - If the hazard is related to cache-lines, realigning the stack could reduce the amount of necessary padding.

Thanks - we were hoping to get something into the LLVM-19 compiler, so the number of changes was deliberately kept minimal. In the longer term we might try to change more. Realigning the stack (or portions of it) is certainly an option.

If the SME unit is separate from the CPU and the SME unit holds the values of the vector registers, the fmov d->gpr might be a relatively expensive operation. Depending on the function, we might need to stack frames on top of one another, and with the current scheme we end up with `GPR (CPU) => GPR > hazard > FPR > FPR > hazard > GPR => GPR > hazard > FPR > ...`, so I think in the general case you need two hazard paddings. If you don't have one of the regions (or any sub-calls) then things might become simpler though.
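
As a rough illustration of the counting argument above, here is a minimal sketch that is not taken from the patch. It assumes a hazard padding is needed exactly where a GPR region sits next to an FPR region; the names `Region`, `countHazardPaddings` and `stackFrames` are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical region kinds, named for illustration only.
enum class Region { GPR, FPR };

// Counts the hazard paddings a sequence of stack regions would need, under
// the assumption that one padding goes at every GPR/FPR boundary.
std::size_t countHazardPaddings(const std::vector<Region> &Layout) {
  std::size_t Paddings = 0;
  for (std::size_t I = 1; I < Layout.size(); ++I)
    if (Layout[I] != Layout[I - 1])
      ++Paddings;
  return Paddings;
}

// Repeats one frame's region sequence NumFrames times, modelling frames
// stacked on top of one another by nested calls.
std::vector<Region> stackFrames(const std::vector<Region> &Frame,
                                std::size_t NumFrames) {
  std::vector<Region> Layout;
  for (std::size_t I = 0; I < NumFrames; ++I)
    Layout.insert(Layout.end(), Frame.begin(), Frame.end());
  return Layout;
}
```

With a frame modelled as `{GPR, FPR, FPR, GPR}`, each frame contributes two paddings and the GPR-to-GPR boundary between frames contributes none. A frame with no FPR region contributes none at all, which is the simpler case mentioned above.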

https://github.com/llvm/llvm-project/pull/98956

