[PATCH] D74254: [llvm][aarch64] SVE addressing modes.

Mon Feb 17 11:17:04 PST 2020

andwar added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:700
 bool AArch64DAGToDAGISel::SelectRDVLImm(SDValue N, SDValue &Imm) {
+  N.dump();
   if (!isa<ConstantSDNode>(N))
----------------
DELETEME

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:4436
+
+  if ((MulImm % MemWidthBytes) == 0) {
+    signed Offset = MulImm / MemWidthBytes;
----------------
[nit] As per https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code, this could be simplified:

```
if ((MulImm % MemWidthBytes) != 0)
  return false;

signed Offset = MulImm / MemWidthBytes;
if ((Offset < Min) || (Offset > Max)) 
  return false;

Base = N.getOperand(0);
OffImm = CurDAG->getTargetConstant(Offset, SDLoc(N), MVT::i64);
return true;
```
This way we have fewer levels of indentation.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:4437
+  if ((MulImm % MemWidthBytes) == 0) {
+    signed Offset = MulImm / MemWidthBytes;
+    if ((Offset >= Min) && (Offset <= Max)) {
----------------
This operation mixes sizes: `unsigned` vs `int64_t`. It would be safer to be either:
* consistently explicit about the size of integers (`int64_t` and `int32_t`), or
* consistently implicit about the size of integers (`unsigned`, `unsigned long`) 

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:4469
+  // Check if the RHS is a shift node with a constant.
+  if (RHS.getOpcode() == ISD::SHL) {
+    const SDValue SRHS = RHS.getOperand(1);
----------------
[nit] As per https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code, this could be simplified  as:

```
if (RHS.getOpcode() != ISD::SHL) 
  return false;

const SDValue SRHS = RHS.getOperand(1);
auto *C = dyn_cast<ConstantSDNode>(SRHS);
if (nullptr == C) 
  return false;

const uint64_t Shift = C->getZExtValue();
if (Shift == Scale) {
   Base = LHS;
   Offset = RHS.getOperand(0);
    return true;
}
```
This way you we have fewer levels of indentation.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1213
+      def _reg_reg_z : Pat<(Ty (Load (AddrCP GPR64:$base, GPR64:$offset), (PredTy PPR:$gp), (SVEDup0Undef))),
+                         (RegRegInst PPR:$gp, GPR64:$base, GPR64:$offset)>;
+    }
----------------
[nit] Indent consistently

================
Comment at: llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-imm.ll:16
+  %base_store = getelementptr <vscale x 2 x i64>, <vscale x 2 x i64> * %base, i64 -9
+  call void @llvm.masked.store.nxv2i64(<vscale x 2 x i64> %data, <vscale x 2 x i64>* %base_store, i32 1, <vscale x 2 x i1> %mask)
+  ret void
----------------
fpetrogalli wrote:
> andwar wrote:
> > Could you format this line (and similar lines elsewhere)? E.g.: https://github.com/llvm/llvm-project/blob/318d0ede572080f18d0106dbc354e11c88329a84/llvm/test/CodeGen/AArch64/sve-intrinsics-stores.ll#L11
> > 
> > Makes it easier to parse for humans :) And will be consistent with other files too!
> I prefer keeping all parameters in one line.
> 
> The "consistency with other file" argument doesn't work, most of the tests for SVE uses the convention in this patch:
> 
> ```
> frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*)$" sve*.ll | grep -v "()" | wc -l
> 2076
> frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*,$" sve*.ll | grep -v "()" | wc -l
> 1433
> ```
> 
> The result becomes even more unbalanced if you look at the totality of tests in the folder:
> 
> ```
> frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*)$" *.ll | grep -v "()" | wc -l
> 7260
> frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*,$" *.ll | grep -v "()" | wc -l
> 1435
> ```
> 
> I run grep on master, @ `a062a3ed7fd82c277812d80fb83dc6f05b939a84`, _not_ on my dev branch :).
I think that your comparison misses the context, and that's entirely my fault because I didn't make it clear. If you compare long lines only, the split is roughly 50/50.

I kindly asked for this update because these lines are wrapped by Phabricator (they don't fit on my screen and I don't see how to make Phabricator stop doing that). This makes reviewing on Phab a bit frustrating. 

Btw, I think that `awk` would be more fitting here ;-)

```
#! /bin/evn bash

TEST_FILES=(
"llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-imm.ll"
"llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-reg.ll"
"llvm/test/CodeGen/AArch64/sve-pred-non-temporal-ldst-addressing-mode-reg-imm.ll"
"llvm/test/CodeGen/AArch64/sve-pred-non-temporal-ldst-addressing-mode-reg-reg.ll")

for test_file in "${TEST_FILES[@]}"; do
awk -F',' '{
  if ($1 ~ /call/) {
    # Counter number of character for indentation
    for (i = 1; i <= length($0); i++) {
      if (substr($0, i, 1) == "(") {
        lenght_col = i
        break
      }
    }

    # Split function call - 2 args
    if (NF == 2) {
      printf("%s,\n %*s %s\n", $1, lenght_col - 3, "", $2)
    }

    # Split function call - 3 args
    if (NF == 3) {
      printf("%s,\n %*s %s,\n %*s %s\n", $1, lenght_col - 3, "", $2, lenght_col - 3, "", $3)
    }

    # Split function call - 4 args
    if (NF == 4) {
      printf("%s,\n %*s %s,\n %*s %s,\n %*s %s\n", $1, lenght_col - 3, "", $2, lenght_col - 3, "", $3, lenght_col - 3, "", $4)
    }

  } else {
    # Not a call, not reformatting
    print $0
  }
}' ${test_file} > temp.ll

mv temp.ll ${test_file}

done
```

================
Comment at: llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-reg.ll:391
+; CHECK-LABEL: masked_trunc_store_sv8i16_to_sv8i8:
+; C HECK: st1b { z0.h }, p0, [x0, x1]
+; C HECK-NEXT: ret
----------------
FIXME

================
Comment at: llvm/test/CodeGen/AArch64/sve-vscale-combine.ll:13
+ %vscale = call i64 @llvm.vscale.i64()
+ %add = add i64 %vscale, %vscale
+ ret i64 %add
----------------
Could `C0` and `C1` be different than 1?

================
Comment at: llvm/test/CodeGen/AArch64/sve-vscale-combine.ll:28
+; Fold (mul (vscale * C0), C1) to (vscale * C0 * C1)))
+; In this test, C0 = 1, C1 = 16.
+define i64 @combine_mul_vscale_i64() nounwind {
----------------
Did you mean `C0   = 2`? And multiplication by `C0` seems to be missing - there's only multiplication by `32`. Shouldn't the IR look like this:

```
 %vscale = call i64 @llvm.vscale.i64()
 %mul_by_16 = mul i64 %vscale, 16
 %mul_by_2 = mul i64 %mul_by_16, 32
 ret i64 %mul_by_2
```

================
Comment at: llvm/test/CodeGen/AArch64/sve-vscale-combine.ll:6
+
+; Fold (add (vscale * C0), (vscale C1)) to (vscale  C0 + C1))
+define i64 @combine_add_vscale_i64() nounwind {
----------------
andwar wrote:
> `vscale * C1`? and `vscale * (C0 + C1)`?
What about `vscale * C1`?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74254/new/

https://reviews.llvm.org/D74254