[llvm] [AArch64][SVE] Handle consecutive Predicates in CC_AArch64_Custom_Block (PR #90122)
Zhaoshi Zheng via llvm-commits
llvm-commits at lists.llvm.org
Fri May 10 16:06:41 PDT 2024
zhaoshiz wrote:
> On lines 58-68, there is some code that suggests the calling convention has additional requirements that need to be taken into account:
>
> ```
> // we cannot allocate enough registers for the tuple we should still leave
> // any remaining registers unallocated. However, when we call the
> // CCAssignFn again we want it to behave as if all remaining registers are
> // allocated. This will force the code to pass the tuple indirectly in
> // accordance with the PCS.
> bool RegsAllocated[8];
> for (int I = 0; I < 8; I++) {
>   RegsAllocated[I] = State.isAllocated(ZRegList[I]);
>   State.AllocateReg(ZRegList[I]);
> }
> ```
>
> This applies to both Z registers and P registers. I believe this corresponds to [AAPCS (parameter passing)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing) Step `C.7` and `C.8`.
>
> Could you also add some tests for this?
I've added similar code to handle P registers for the case where a predicate argument is passed indirectly through the stack.
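The save, mask, and restore pattern in the quoted snippet can be sketched in isolation. This is a standalone model, not the LLVM code: `RegState`, `allocateBlock`, and `withAllRegsMasked` are hypothetical names, and the count of four predicate argument registers (P0-P3) is an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumption of this sketch: four predicate argument registers (P0-P3).
constexpr std::size_t NumPRegs = 4;

// Hypothetical stand-in for the allocation state of those registers.
struct RegState {
  std::array<bool, NumPRegs> Allocated{};

  // Claim Count consecutive free registers. Returns the index of the first
  // one, or -1 if no such run exists (the tuple would then go indirect).
  int allocateBlock(std::size_t Count) {
    assert(Count > 0 && "a block needs at least one register");
    for (std::size_t First = 0; First + Count <= NumPRegs; ++First) {
      bool Free = true;
      for (std::size_t I = First; I < First + Count; ++I)
        Free = Free && !Allocated[I];
      if (!Free)
        continue;
      for (std::size_t I = First; I < First + Count; ++I)
        Allocated[I] = true;
      return static_cast<int>(First);
    }
    return -1;
  }
};

// Run AssignIndirect while every register looks allocated, then restore the
// saved state so registers that were free stay available for later arguments.
template <typename Fn>
void withAllRegsMasked(RegState &State, Fn AssignIndirect) {
  std::array<bool, NumPRegs> Saved = State.Allocated; // remember real state
  State.Allocated.fill(true);                         // pretend all are taken
  AssignIndirect(State);                              // cannot pick a register
  State.Allocated = Saved;                            // undo the pretence
}
```

Inside the callback any register request fails, which models forcing the indirect path. After it returns, the free registers are visible again.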
But I ran into some issues when changing the values of M and N in predicate types [M x <vscale x N x i1>]:
1. When M=1, the argument has both the `InConsecutiveRegs` and `InConsecutiveRegsLast` flags set, which triggers assertions in the callee or the caller at:
https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L7217
https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L8214
Is checking (ValueVTs.size() > 1) before returning true from AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters() a good fix, or does more need to be done to handle M=1?
https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L26132
3. When N is not 16, DAG instruction selection fails with "cannot select" errors because the nxv8i1, nxv4i1, and nxv2i1 patterns for STR_PXI/LDR_PXI were removed from AArch64SVEInstrInfo.td (https://github.com/llvm/llvm-project/blob/4198aebc96cb0236fc63e29a92d886e6a2e3fedb/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td#L2950 https://github.com/llvm/llvm-project/blob/4198aebc96cb0236fc63e29a92d886e6a2e3fedb/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td#L2960) in https://reviews.llvm.org/D88994. @efriedma-quic, can you comment on the complications of adding them back?
> diff --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> index 93b02b2d692e..96ad0717d483 100644
> --- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> +++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> @@ -2120,9 +2120,6 @@ let Predicates = [HasSVEorStreamingSVE] in {
> }
>
> defm Pat_Store_P16 : unpred_store_predicate<nxv16i1, STR_PXI>;
> - defm Pat_Store_P8 : unpred_store_predicate<nxv8i1, STR_PXI>;
> - defm Pat_Store_P4 : unpred_store_predicate<nxv4i1, STR_PXI>;
> - defm Pat_Store_P2 : unpred_store_predicate<nxv2i1, STR_PXI>;
>
> multiclass unpred_load_predicate<ValueType Ty, Instruction Load> {
> def _fi : Pat<(Ty (load (am_sve_fi GPR64sp:$base, simm9:$offset))),
> @@ -2133,9 +2130,6 @@ let Predicates = [HasSVEorStreamingSVE] in {
> }
>
> defm Pat_Load_P16 : unpred_load_predicate<nxv16i1, LDR_PXI>;
> - defm Pat_Load_P8 : unpred_load_predicate<nxv8i1, LDR_PXI>;
> - defm Pat_Load_P4 : unpred_load_predicate<nxv4i1, LDR_PXI>;
> - defm Pat_Load_P2 : unpred_load_predicate<nxv2i1, LDR_PXI>;
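Going back to the M=1 question in item 1, the proposed guard can be sketched as a standalone model. This is not the LLVM implementation: `needsConsecutiveRegisters` is a hypothetical name, the value types are modelled as plain strings, and the "all member types are equal" condition is an assumption standing in for whatever the existing function checks before returning true.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of the guard: ValueVTs holds one type name per member
// of the array argument, e.g. {"nxv16i1", "nxv16i1"} for M=2, N=16.
bool needsConsecutiveRegisters(const std::vector<std::string> &ValueVTs) {
  // Proposed check: a single-element array (M=1) has nothing to keep
  // consecutive, so it should not be flagged at all.
  if (ValueVTs.size() <= 1)
    return false;

  // Assumed existing condition: every member has the same type.
  return std::all_of(ValueVTs.begin(), ValueVTs.end(),
                     [&](const std::string &VT) {
                       return VT == ValueVTs.front();
                     });
}
```

With this shape, an M=1 argument would never be flagged, so it would not end up with both flags set at once. Whether other code paths also need changes for M=1 is the open question above.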
https://github.com/llvm/llvm-project/pull/90122