[llvm] [LLVM][AArch64]Use load/store with consecutive registers in SME2 or S… (PR #77665)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 4 03:07:32 PST 2024
================
@@ -3065,19 +3074,40 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
std::swap(Reg1, Reg2);
std::swap(FrameIdxReg1, FrameIdxReg2);
}
+
+ unsigned PairRegs;
+ unsigned PnReg;
+ if (RPI.isPaired() && RPI.isScalable()) {
+ PairRegs = AArch64::Z0_Z1 + (RPI.Reg1 - AArch64::Z0);
+ if (!PtrueCreated) {
+ PtrueCreated = true;
+ // Any one of predicate-as-count will be free to use
+ // This can be replaced in the future if needed
+ PnReg = AArch64::PN8;
----------------
sdesmalen-arm wrote:
It's not correct to blindly pick PN8 (P8) here. You can only clobber P8 if it is preserved by the preceding predicate callee-saves.
For example:
```
define void @test_clobbers_3_z_regs(<vscale x 16 x i8> %v) {
call void asm sideeffect "", "~{z8},~{z9}"()
ret void
}
```
results in:
```
str x29, [sp, #-16]!
addvl sp, sp, #-2
ptrue pn8.b ; pn8 (p8) is clobbered but never saved/restored by this function, even though the AAPCS requires it to be preserved.
st1b { z8.b, z9.b }, pn8, [sp]
ld1b { z8.b, z9.b }, pn8/z, [sp]
addvl sp, sp, #2
ldr x29, [sp], #16
ret
```
One option is to check whether one of the argument registers (p0 - p3) is available and reuse that. Alternatively, you could mark p8 as clobbered by the function, so that the preceding callee-save spills include p8.
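
The two options above can be sketched as a small standalone model. This is not LLVM code: `PredRegSet`, `PnChoice`, `choosePredicateReg` and `recordChoice` are hypothetical names, and the register sets are plain bitsets rather than anything taken from `AArch64FrameLowering`. The sketch also does not model encoding constraints on which registers an instruction accepts as a predicate-as-counter operand; that would need checking separately.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical model: bit N stands for SVE predicate register pN (p0..p15).
using PredRegSet = std::bitset<16>;

struct PnChoice {
  unsigned Reg;         // index N of the chosen register pN
  bool MustAddToSpills; // true if pN still has to be added to the callee-save spills
};

// Pick a predicate register that the prologue/epilogue may clobber.
//   LiveIn:         predicate registers holding a live value on function entry
//   AlreadySpilled: predicate registers the callee-save spills already cover
PnChoice choosePredicateReg(const PredRegSet &LiveIn,
                            const PredRegSet &AlreadySpilled) {
  // Option 1: reuse an argument register (p0-p3) that carries no live value.
  // These are not callee-saved, so clobbering one needs no spill.
  for (unsigned N = 0; N <= 3; ++N)
    if (!LiveIn.test(N))
      return {N, false};

  // Option 2: fall back to p8. It is callee-saved, so it must be part of the
  // predicate callee-save spills unless it is already covered.
  constexpr unsigned Fallback = 8;
  return {Fallback, !AlreadySpilled.test(Fallback)};
}

// Record the choice so that later spill code saves and restores the register.
void recordChoice(const PnChoice &C, PredRegSet &Spilled) {
  assert(C.Reg < Spilled.size() && "predicate register index out of range");
  if (C.MustAddToSpills)
    Spilled.set(C.Reg);
}
```

In this model, a function with no live predicate arguments gets p0 without any extra spill, while a function whose p0 - p3 are all live gets p8 with `MustAddToSpills` set until p8 has been recorded in the spill set.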
https://github.com/llvm/llvm-project/pull/77665