[llvm] [AArch64][SME] Make getRegAllocationHints stricter for multi-vector loads (PR #123081)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 29 06:46:28 PST 2025
================
@@ -1108,25 +1114,82 @@ bool AArch64RegisterInfo::getRegAllocationHints(
// instructions over reducing the number of clobbered callee-save registers,
// so we add the strided registers as a hint.
unsigned RegID = MRI.getRegClass(VirtReg)->getID();
- // Look through uses of the register for FORM_TRANSPOSED_REG_TUPLE.
- if ((RegID == AArch64::ZPR2StridedOrContiguousRegClassID ||
- RegID == AArch64::ZPR4StridedOrContiguousRegClassID) &&
- any_of(MRI.use_nodbg_instructions(VirtReg), [](const MachineInstr &Use) {
- return Use.getOpcode() ==
- AArch64::FORM_TRANSPOSED_REG_TUPLE_X2_PSEUDO ||
- Use.getOpcode() == AArch64::FORM_TRANSPOSED_REG_TUPLE_X4_PSEUDO;
- })) {
- const TargetRegisterClass *StridedRC =
- RegID == AArch64::ZPR2StridedOrContiguousRegClassID
- ? &AArch64::ZPR2StridedRegClass
- : &AArch64::ZPR4StridedRegClass;
-
- for (MCPhysReg Reg : Order)
- if (StridedRC->contains(Reg))
- Hints.push_back(Reg);
+ if (RegID == AArch64::ZPR2StridedOrContiguousRegClassID ||
+ RegID == AArch64::ZPR4StridedOrContiguousRegClassID) {
+
+ // Look through uses of the register for FORM_TRANSPOSED_REG_TUPLE.
+ for (const MachineInstr &Use : MRI.use_nodbg_instructions(VirtReg)) {
+ if (Use.getOpcode() != AArch64::FORM_TRANSPOSED_REG_TUPLE_X2_PSEUDO &&
+ Use.getOpcode() != AArch64::FORM_TRANSPOSED_REG_TUPLE_X4_PSEUDO)
+ continue;
+
+ unsigned LdOps = Use.getNumOperands() - 1;
----------------
sdesmalen-arm wrote:
I just realised that `LdOps` is a misnomer, because it holds the number of operands of the use (form_reg_tuple), not of the load. Even if the use here has 4 operands, the load may still have only 2, e.g.
```
ld1 { z0, z8 }, p0/z, [...]
ld1 { z1, z9 }, p0/z, [...]
ld1 { z2, z10 }, p0/z, [...]
ld1 { z3, z11 }, p0/z, [...]
{z0, z1, z2, z3} = form_reg_tuple {z0, z8}:0, {z1, z9}:0, {z2, z10}:0, {z3, z11}:0
```
The uses below assume this is about the number of operands of the Use, so it seems like it's just the name that's wrong.
The same is not true for `StridedRC`, which uses the wrong register class (it should decide the strided RC based on `RegID` instead).
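To make the distinction concrete, here is a minimal standalone sketch, not the actual patch. The enums and both helper functions are hypothetical stand-ins for the LLVM register class IDs and classes, and the "from use operands" variant assumes the current code derives `StridedRC` from the operand count of the use, which is what the comment above suggests.

```cpp
#include <cassert>

// Hypothetical stand-ins for AArch64::ZPR{2,4}StridedOrContiguousRegClassID.
enum class RegClassID { ZPR2StridedOrContiguous, ZPR4StridedOrContiguous };

// Hypothetical stand-ins for AArch64::ZPR{2,4}StridedRegClass.
enum class StridedClass { ZPR2Strided, ZPR4Strided };

// Suggested behaviour: choose the strided class from the register class of
// the virtual register itself, i.e. from the width of the load result.
StridedClass stridedClassFromRegID(RegClassID RegID) {
  return RegID == RegClassID::ZPR2StridedOrContiguous
             ? StridedClass::ZPR2Strided
             : StridedClass::ZPR4Strided;
}

// Assumed problematic behaviour: choose the strided class from the number of
// operands of the FORM_TRANSPOSED_REG_TUPLE use, which need not match the
// width of the load.
StridedClass stridedClassFromUseOps(unsigned UseOps) {
  return UseOps == 2 ? StridedClass::ZPR2Strided : StridedClass::ZPR4Strided;
}
```

For the example above (two-vector loads feeding a four-operand form_reg_tuple), `stridedClassFromRegID(RegClassID::ZPR2StridedOrContiguous)` yields the x2 strided class, whereas `stridedClassFromUseOps(4)` would pick the x4 class, which does not match the load.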
https://github.com/llvm/llvm-project/pull/123081