[PATCH] D70967: [AArch64][SVE] Implement element count intrinsics

Wed Dec 4 04:25:10 PST 2019

c-rhodes added inline comments.

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:937
+    : Intrinsic<[llvm_i64_ty],
+                [LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,
+                 llvm_anyvector_ty],
----------------
efriedma wrote:
> What is `LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>` supposed to do here?  Why not just LLVMMatchType?
Downstream CNTP/SADDV/UADDV share the same implementation, I agree reading the definition it's misleading (I would expect a type signature that looks like SADDV/UADDV). I'll change this to `LLVMMatchType`.

================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-counting-elems.ll:11
+; CHECK-NEXT: ret
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 2)
+  ret i64 %out
----------------
efriedma wrote:
> No operand for the MUL immediate?  I guess you could try to pattern-match it, but that seems less reliable.
These intrinsics are a reflection of the ACLE intrinsics which don't expose the multiplier for two reasons I guess: 1) at the C and C++ level it's usually simpler to write the multiply out as normal, e.g. `svcntw() * 3;`, and 2) the compiler should be able to fold a suitable multiply in.

We have a separate patch downstream I'll follow up with to support this:

```
define i64 @cntb_mul3() {
; CHECK-LABEL: cntb_mul3:
; CHECK: cntb x0, vl6, mul #3
; CHECK-NEXT: ret
  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 6)
  %out = mul i64 %cnt, 3
  ret i64 %out
}
```

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70967/new/

https://reviews.llvm.org/D70967