[PATCH] D74254: [llvm][aarch64] SVE addressing modes.

Tue Feb 18 09:03:00 PST 2020

fpetrogalli added inline comments.

================
Comment at: llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-imm.ll:16
+  %base_store = getelementptr <vscale x 2 x i64>, <vscale x 2 x i64> * %base, i64 -9
+  call void @llvm.masked.store.nxv2i64(<vscale x 2 x i64> %data, <vscale x 2 x i64>* %base_store, i32 1, <vscale x 2 x i1> %mask)
+  ret void
----------------
andwar wrote:
> fpetrogalli wrote:
> > andwar wrote:
> > > Could you format this line (and similar lines elsewhere)? E.g.: https://github.com/llvm/llvm-project/blob/318d0ede572080f18d0106dbc354e11c88329a84/llvm/test/CodeGen/AArch64/sve-intrinsics-stores.ll#L11
> > > 
> > > Makes it easier to parse for humans :) And will be consistent with other files too!
> > I prefer keeping all parameters in one line.
> > 
> > The "consistency with other file" argument doesn't work, most of the tests for SVE uses the convention in this patch:
> > 
> > ```
> > frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*)$" sve*.ll | grep -v "()" | wc -l
> > 2076
> > frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*,$" sve*.ll | grep -v "()" | wc -l
> > 1433
> > ```
> > 
> > The result becomes even more unbalanced if you look at the totality of tests in the folder:
> > 
> > ```
> > frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*)$" *.ll | grep -v "()" | wc -l
> > 7260
> > frapet01 at man-08:~/projects/upstream-clang/llvm-project/llvm/test/CodeGen/AArch64 (master)$ grep "@llvm.*,$" *.ll | grep -v "()" | wc -l
> > 1435
> > ```
> > 
> > I run grep on master, @ `a062a3ed7fd82c277812d80fb83dc6f05b939a84`, _not_ on my dev branch :).
> I think that your comparison misses the context, and that's entirely my fault because I didn't make it clear. If you compare long lines only, the split is roughly 50/50.
> 
> I kindly asked for this update because these lines are wrapped by Phabricator (they don't fit on my screen and I don't see how to make Phabricator stop doing that). This makes reviewing on Phab a bit frustrating. 
> 
> Btw, I think that `awk` would be more fitting here ;-)
> 
> ```
> #! /bin/evn bash
> 
> TEST_FILES=(
> "llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-imm.ll"
> "llvm/test/CodeGen/AArch64/sve-pred-contiguous-ldst-addressing-mode-reg-reg.ll"
> "llvm/test/CodeGen/AArch64/sve-pred-non-temporal-ldst-addressing-mode-reg-imm.ll"
> "llvm/test/CodeGen/AArch64/sve-pred-non-temporal-ldst-addressing-mode-reg-reg.ll")
> 
> for test_file in "${TEST_FILES[@]}"; do
> awk -F',' '{
>   if ($1 ~ /call/) {
>     # Counter number of character for indentation
>     for (i = 1; i <= length($0); i++) {
>       if (substr($0, i, 1) == "(") {
>         lenght_col = i
>         break
>       }
>     }
> 
>     # Split function call - 2 args
>     if (NF == 2) {
>       printf("%s,\n %*s %s\n", $1, lenght_col - 3, "", $2)
>     }
> 
>     # Split function call - 3 args
>     if (NF == 3) {
>       printf("%s,\n %*s %s,\n %*s %s\n", $1, lenght_col - 3, "", $2, lenght_col - 3, "", $3)
>     }
> 
>     # Split function call - 4 args
>     if (NF == 4) {
>       printf("%s,\n %*s %s,\n %*s %s,\n %*s %s\n", $1, lenght_col - 3, "", $2, lenght_col - 3, "", $3, lenght_col - 3, "", $4)
>     }
> 
>   } else {
>     # Not a call, not reformatting
>     print $0
>   }
> }' ${test_file} > temp.ll
> 
> mv temp.ll ${test_file}
> 
> done
> ```
You win!

================
Comment at: llvm/test/CodeGen/AArch64/sve-vscale-combine.ll:13
+ %vscale = call i64 @llvm.vscale.i64()
+ %add = add i64 %vscale, %vscale
+ ret i64 %add
----------------
andwar wrote:
> Could `C0` and `C1` be different than 1?
Yes, but to get C0 and C1 different from 1 I'd have to use a `mul` instruction, which is already tested in the `combine_mul_vscale_*` tests.

I think it is enough to test with C0=C1=1.

================
Comment at: llvm/test/CodeGen/AArch64/sve-vscale-combine.ll:28
+; Fold (mul (vscale * C0), C1) to (vscale * C0 * C1)))
+; In this test, C0 = 1, C1 = 16.
+define i64 @combine_mul_vscale_i64() nounwind {
----------------
andwar wrote:
> Did you mean `C0   = 2`? And multiplication by `C0` seems to be missing - there's only multiplication by `32`. Shouldn't the IR look like this:
> 
> 
> ```
>  %vscale = call i64 @llvm.vscale.i64()
>  %mul_by_16 = mul i64 %vscale, 16
>  %mul_by_2 = mul i64 %mul_by_16, 32
>  ret i64 %mul_by_2
> ```
When targeting SVE, `@llvm.vscale.*()` returns the number of 16-byte chunks of and SVE register. If I multiply it by 32, it means it is returning the number of 4-bit (half-byte) elements in the SVE register. `RDVL` returns the number of 8-bit lanes in the register, hence the number of 4-bit lanes is given by `RDVL Xn, %2`.

I have updated the comment setting C1 = 32.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74254/new/

https://reviews.llvm.org/D74254