[llvm] [LLVM][AArch64]Use load/store with consecutive registers in SME2 or S… (PR #77665)

Sander de Smalen via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 4 03:07:32 PST 2024


================
@@ -3065,19 +3074,40 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
       std::swap(Reg1, Reg2);
       std::swap(FrameIdxReg1, FrameIdxReg2);
     }
+
+    unsigned PairRegs;
+    unsigned PnReg;
+    if (RPI.isPaired() && RPI.isScalable()) {
+      PairRegs = AArch64::Z0_Z1 + (RPI.Reg1 - AArch64::Z0);
+      if (!PtrueCreated) {
+        PtrueCreated = true;
+        // Any one of predicate-as-count will be free to use
+        // This can be replaced in the future if needed
+        PnReg = AArch64::PN8;
----------------
sdesmalen-arm wrote:

It's not correct to blindly pick PN8 (P8) here. You can only clobber P8 if the preceding predicate callee-save spills already save and restore it.

For example:
```
define void @test_clobbers_3_z_regs(<vscale x 16 x i8> %v) {
  call void asm sideeffect "", "~{z8},~{z9}"()
  ret void
}
```

results in:
```
        str     x29, [sp, #-16]!
        addvl   sp, sp, #-2
        ptrue   pn8.b       ; pn8 is not preserved by this function, even though the AAPCS says that it should be.
        st1b    { z8.b, z9.b }, pn8, [sp]
        ld1b    { z8.b, z9.b }, pn8/z, [sp]
        addvl   sp, sp, #2
        ldr     x29, [sp], #16
        ret
```

One thing you could do is try to see if one of the argument registers is available (p0 - p3), so that you can reuse one of those. Alternatively, you could mark p8 as clobbered by the function so that the preceding callee-save spills will include p8.
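
The two options above can be sketched as a small, self-contained selection routine. This is only an illustrative model, not the code in `AArch64FrameLowering`: the `PRegSet` bitset, the `PredicateChoice` struct and the `choosePredicate` function are all hypothetical names, with bit `i` standing for predicate register `p<i>`. It also does not model whether the chosen register can actually be encoded as the predicate-as-counter operand of the paired load/store, which would need to be checked separately.

```cpp
#include <bitset>

// Hypothetical model: bit i stands for predicate register p<i>.
using PRegSet = std::bitset<16>;

struct PredicateChoice {
  unsigned Reg;      // index of the chosen predicate register
  bool AddedToSaves; // true if Reg had to be added to the callee-save set
};

// LiveArgs:    argument predicates (p0-p3) that hold a live value on entry.
// CalleeSaves: predicates the prologue/epilogue already spills and restores;
//              updated in place if p8 has to be added.
PredicateChoice choosePredicate(const PRegSet &LiveArgs, PRegSet &CalleeSaves) {
  // Option 1: reuse an argument register (p0-p3) that carries no live value,
  // so clobbering it needs no extra spill.
  for (unsigned Reg = 0; Reg <= 3; ++Reg)
    if (!LiveArgs.test(Reg))
      return {Reg, false};

  // Option 2: fall back to p8 and make sure it is part of the callee-saves,
  // so that the preceding predicate spills preserve it.
  constexpr unsigned P8 = 8;
  const bool Added = !CalleeSaves.test(P8);
  CalleeSaves.set(P8);
  return {P8, Added};
}
```

With this shape, the example function above (no predicate arguments) would pick `p0` and leave the callee-save set untouched, while a function with all of `p0`-`p3` live would get `p8` and have it added to the spills.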

https://github.com/llvm/llvm-project/pull/77665