[llvm] [AArch64][SVE] Handle consecutive Predicates in CC_AArch64_Custom_Block (PR #90122)

Zhaoshi Zheng via llvm-commits llvm-commits at lists.llvm.org
Fri May 10 16:06:41 PDT 2024


zhaoshiz wrote:

> On lines 58-68, there is some code that suggests the calling convention has additional requirements that need to be taken into account:
> 
> ```
>     // we cannot allocate enough registers for the tuple we should still leave
>     // any remaining registers unallocated. However, when we call the
>     // CCAssignFn again we want it to behave as if all remaining registers are
>     // allocated. This will force the code to pass the tuple indirectly in
>     // accordance with the PCS.
>     bool RegsAllocated[8];
>     for (int I = 0; I < 8; I++) {
>       RegsAllocated[I] = State.isAllocated(ZRegList[I]);
>       State.AllocateReg(ZRegList[I]);
>     }
> ```
> 
> This applies to both Z registers and P registers. I believe this corresponds to [AAPCS (parameter passing)](https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#parameter-passing) Step `C.7` and `C.8`.
> 
> Could you also add some tests for this?

I've added similar code to handle P registers for the case where a predicate argument is passed indirectly through the stack.
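
As a rough illustration of that idea (a self-contained toy model, not the code in this PR), the sketch below mirrors the quoted Z-register logic for the four predicate argument registers P0-P3 that AAPCS64 uses. `ToyPredState`, `assignSingle`, and `assignPredTuple` are hypothetical names; the real code works on `CCState` and a `CCAssignFn`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t NumPRegs = 4; // P0-P3 carry predicate arguments

// Toy stand-in for the register allocation state (hypothetical, not CCState).
struct ToyPredState {
  std::array<bool, NumPRegs> Allocated{}; // Allocated[I]: P<I> is taken

  bool isAllocated(std::size_t Reg) const {
    assert(Reg < NumPRegs);
    return Allocated[Reg];
  }
  void allocateReg(std::size_t Reg) {
    assert(Reg < NumPRegs);
    Allocated[Reg] = true;
  }
  void deallocateReg(std::size_t Reg) {
    assert(Reg < NumPRegs);
    Allocated[Reg] = false;
  }
};

enum class Loc { InRegs, Indirect };

// Stand-in for the assign function: take the lowest free P register,
// or fall back to indirect passing when none is free.
Loc assignSingle(ToyPredState &State) {
  for (std::size_t I = 0; I < NumPRegs; ++I) {
    if (!State.isAllocated(I)) {
      State.allocateReg(I);
      return Loc::InRegs;
    }
  }
  return Loc::Indirect;
}

// Place a tuple of NumPreds predicates: either every part gets a register,
// or the whole tuple is passed indirectly and no register is consumed.
// Registers are always taken lowest-first here, so the free ones are contiguous.
Loc assignPredTuple(ToyPredState &State, std::size_t NumPreds) {
  std::size_t Free = 0;
  for (std::size_t I = 0; I < NumPRegs; ++I)
    if (!State.isAllocated(I))
      ++Free;

  if (NumPreds <= Free) {
    for (std::size_t I = 0; I < NumPreds; ++I)
      assignSingle(State);
    return Loc::InRegs;
  }

  // Not enough registers: remember the current state, then pretend every
  // P register is taken so the assign function is forced to go indirect.
  std::array<bool, NumPRegs> WasAllocated{};
  for (std::size_t I = 0; I < NumPRegs; ++I) {
    WasAllocated[I] = State.isAllocated(I);
    State.allocateReg(I);
  }
  Loc Result = assignSingle(State);

  // Put the register state back so later arguments can still use free registers.
  for (std::size_t I = 0; I < NumPRegs; ++I)
    if (!WasAllocated[I])
      State.deallocateReg(I);
  return Result;
}
```
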
But I ran into some issues when changing the values of M and N in predicate types `[M x <vscale x N x i1>]`:

1. When M=1, the argument has both the `InConsecutiveRegs` and `InConsecutiveRegsLast` flags set, which triggers assertions for the callee or the caller at:
https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L7217
https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L8214
    Is checking `ValueVTs.size() > 1` before returning true from `AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters()` a good fix, or does more need to be done to handle M=1?
    https://github.com/llvm/llvm-project/blob/3dcd604eb1d6612fda667793dbb52c5dfaa5fc4f/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L26132

2. When N is not 16, instruction selection fails with "cannot select" errors, because the nxv8i1, nxv4i1, and nxv2i1 patterns for STR_PXI/LDR_PXI were removed from AArch64SVEInstrInfo.td (https://github.com/llvm/llvm-project/blob/4198aebc96cb0236fc63e29a92d886e6a2e3fedb/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td#L2950 and https://github.com/llvm/llvm-project/blob/4198aebc96cb0236fc63e29a92d886e6a2e3fedb/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td#L2960) in https://reviews.llvm.org/D88994. @efriedma-quic, can you comment on the complications of adding them back?

> diff --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> index 93b02b2d692e..96ad0717d483 100644
> --- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> +++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
> @@ -2120,9 +2120,6 @@ let Predicates = [HasSVEorStreamingSVE] in {
>    }
> 
>    defm Pat_Store_P16 : unpred_store_predicate<nxv16i1, STR_PXI>;
> -  defm Pat_Store_P8  : unpred_store_predicate<nxv8i1, STR_PXI>;
> -  defm Pat_Store_P4  : unpred_store_predicate<nxv4i1, STR_PXI>;
> -  defm Pat_Store_P2  : unpred_store_predicate<nxv2i1, STR_PXI>;
> 
>    multiclass unpred_load_predicate<ValueType Ty, Instruction Load> {
>      def _fi : Pat<(Ty (load (am_sve_fi GPR64sp:$base, simm9:$offset))),
> @@ -2133,9 +2130,6 @@ let Predicates = [HasSVEorStreamingSVE] in {
>    }
> 
>    defm Pat_Load_P16 : unpred_load_predicate<nxv16i1, LDR_PXI>;
> -  defm Pat_Load_P8  : unpred_load_predicate<nxv8i1, LDR_PXI>;
> -  defm Pat_Load_P4  : unpred_load_predicate<nxv4i1, LDR_PXI>;
> -  defm Pat_Load_P2  : unpred_load_predicate<nxv2i1, LDR_PXI>;
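
Returning to the M=1 case in item 1, the following is a minimal self-contained sketch of why a one-element tuple ends up with both flags set, and how a size check would avoid it. It assumes the parts of an argument are flagged in order, with every part marked as in consecutive registers and the final part also marked as last. `ToyArgFlags`, `flagParts`, and `needsConsecutiveRegs` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the two argument flags discussed above (hypothetical type).
struct ToyArgFlags {
  bool InConsecutiveRegs = false;
  bool InConsecutiveRegsLast = false;
};

// Flag the NumParts parts of one argument. When a register block is requested,
// every part is marked InConsecutiveRegs and the final part is also marked
// InConsecutiveRegsLast (assumed behaviour, see the lead-in).
std::vector<ToyArgFlags> flagParts(std::size_t NumParts, bool NeedsRegBlock) {
  std::vector<ToyArgFlags> Parts(NumParts);
  if (!NeedsRegBlock)
    return Parts;
  for (std::size_t I = 0; I < NumParts; ++I) {
    Parts[I].InConsecutiveRegs = true;
    if (I + 1 == NumParts)
      Parts[I].InConsecutiveRegsLast = true;
  }
  return Parts;
}

// Sketch of the proposed guard: only request a register block when the
// argument really splits into more than one value.
bool needsConsecutiveRegs(std::size_t NumValueVTs) {
  return NumValueVTs > 1;
}
```

Without the guard, `flagParts(1, true)` yields a single part carrying both flags, which is the combination the assertions reject. With `flagParts(1, needsConsecutiveRegs(1))` the single part carries neither flag and is treated as an ordinary argument.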

https://github.com/llvm/llvm-project/pull/90122

