[PATCH] D77251: [llvm][CodeGen] Addressing modes for SVE ldN.

Francesco Petrogalli via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 20 21:16:35 PDT 2020


fpetrogalli added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:4731
 /// EVT.
+template <unsigned NumVec>
 static EVT getPackedVectorTypeFromPredicateType(LLVMContext &Ctx, EVT PredVT) {
----------------
sdesmalen wrote:
> Why is this a template argument instead of a regular argument with default = 1?
It is a template because I don't foresee `NumVec` becoming a runtime value. I have used this in other places, for example in  `setInfoSVEStN`. I could set the default to be 1, but I'd rather call it out in the template invocation.

Your comment made me think I should have also made sure that the values of `NumVecs` are architecturally valid ==> that's why I have added a static assert in the last version of the patch.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:4768
     return cast<VTSDNode>(Root->getOperand(4))->getVT();
+  case AArch64ISD::SVE_LD2_MERGE_ZERO:
+    return getPackedVectorTypeFromPredicateType<2>(
----------------
sdesmalen wrote:
> How come that ST2..ST4 are not covered here?
We cannot do for the structured load the same we do for the stores because address optimization is done on the intrinsics for structured stores:

```
    case Intrinsic::aarch64_sve_st2: {
      if (VT == MVT::nxv16i8) {
        SelectPredicatedStore</*Scale=*/0>(Node, 2, AArch64::ST2B,
                                           AArch64::ST2B_IMM);
        return;
```

While for the loads the address optimization is done on the custom ISD nodes:

```
  case AArch64ISD::SVE_LD2_MERGE_ZERO: {
    if (VT == MVT::nxv16i8) {
      SelectPredicatedLoad</*Scale=*/0>(Node, 2, AArch64::LD2B_IMM,
                                        AArch64::LD2B);
      return;
```

This means that the method `getMemVTFromNode` determines the memory type with the following generic code for the stores:

```
  if (isa<MemIntrinsicSDNode>(Root))
    return cast<MemIntrinsicSDNode>(Root)->getMemoryVT();
```

We might do the same for the loads, but we would have to teach to `AArch64TargetLowering::getTgtMemIntrinsic` how to extract the memory type from the intrinsics call, like we do with the stores:

```
  switch (Intrinsic) {
  case Intrinsic::aarch64_sve_st2:
    return setInfoSVEStN<2>(Info, I);
  case Intrinsic::aarch64_sve_st3:
    return setInfoSVEStN<3>(Info, I);
  case Intrinsic::aarch64_sve_st4:
    return setInfoSVEStN<4>(Info, I);
```

But that would probably requires changing all the code that converts the intrinsics of the loads into the `SVE_LD*_MERGE_ZERO` ISD nodes...

If we want to do this cleanup maybe it is better to do it for the whole "structure store intrinsics/ISD nodes" in a separate patch? As far as I understand it, this would be quite an extensive change because we would have to make sure that the nodes of the structure loads are still using the intrinsic and not the custom ISD node when we enter this method, which is not what is going on now. I believe there is a reason for having converted the intrinsics nodes into custom ISD nodes before getting here in address optimization?


================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-ldN-reg+imm-addr-mode.ll:452
+; CHECK-NEXT: ret
+%base = getelementptr <vscale x 2 x i64>, <vscale x 2 x i64>* %addr, i64 16
+%base_ptr = bitcast <vscale x 2 x i64>* %base to i64 *
----------------
sdesmalen wrote:
> nit: when testing these pairs, can you at least make sure to test the min/max values?
I can do that, just making sure you didn't miss the comment on the top of file that explains why I haven't add range checks for all cases:

```
; NOTE: invalid, upper and lower bound immediate values of the regimm
; addressing mode are checked only for the byte version of each
; instruction (`ld<N>b`), as the code for detecting the immediate is
; common to all instructions, and varies only for the number of
; elements of the structure store, which is <N> = 2, 3, 4.
```

Let me know if you think this is not a sufficient explanation!



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77251/new/

https://reviews.llvm.org/D77251





More information about the llvm-commits mailing list