[PATCH] D82141: [sve][acle] Add SVE BFloat16 extensions.

Fri Jun 19 02:07:11 PDT 2020

sdesmalen added inline comments.

================
Comment at: clang/include/clang/Basic/arm_sve.td:494
+let ArchGuard = "defined(__ARM_FEATURE_SVE_BF16)" in {
+  def SVBFDOT   : SInst<"svbfdot[_{0}]",   "dd$$", "f", MergeNone, "aarch64_sve_bfdot">;
+  def SVBFMLALB : SInst<"svbfmlalb[_{0}]", "dd$$", "f", MergeNone, "aarch64_sve_bfmlalb">;
----------------
The types for these intrinsics are always `svfloat32_t` and `svbfloat16_t`, which given their semantics is unlikely to ever be extended to other types, so it's easier to make the LLVM IR non-overloaded (i.e. hardcoding `llvm_nxv4f32_ty` and `llvm_nxv8bf16_ty`) and using the `IsOverloadNone` flag for these builtins. Then you can express this builtin as:
```def SVBFDOT: SInst<"svbfdot[_{0}]",  "MMdd", "b", MergeNone, "aarch64_sve_bfdot">;```
and drop the need for the `$` modifier.

================
Comment at: clang/include/clang/Basic/arm_sve.td:498
+  def SVBFMMLA  : SInst<"svbfmmla[_{0}]",  "dd$$", "f", MergeNone, "aarch64_sve_bfmmla">;
+  def SVBFDOT_N   : SInst<"svbfdot[_n_{0}]",   "dd$~", "f", MergeNone, "aarch64_sve_bfdot">;
+  def SVBFMLAL_N  : SInst<"svbfmlalb[_n_{0}]", "dd$~", "f", MergeNone, "aarch64_sve_bfmlalb">;
----------------
Similar to the suggestion above to use `"MMdd"` for SVBFDOT, this could use `"MMda"` and you don't need the `~` modifier.

nit: add whitespace above this line.
nit: the rest of this file tries to align the columns, that makes this file a bit easier to read.

================
Comment at: clang/include/clang/Basic/arm_sve.td:1032
+  defm SVCVT_BF16_F32 : SInstCvtMXZ<"svcvt_bf16[_f32]", "ddPM", "dPM", "b",  "aarch64_sve_cvt_bf16f32">;
+  // svcvtnt_bf16_f32
+  defm SVCVTNT_BF16_F32 : SInstCvtMX<"svcvtnt_bf16[_f32]", "ddPM", "dPM", "b",  "aarch64_sve_cvtnt_bf16f32">;
----------------
nit: redundant comment (same for above)

================
Comment at: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_bfdot.c:27
+
+svfloat32_t test_bfdot_lane_1_f32(svfloat32_t x, svbfloat16_t y, svbfloat16_t z) {
+  // CHECK-LABEL: @test_bfdot_lane_1_f32(
----------------
Testing the edge cases 0 and 3 should be sufficient. (same for all other cases in this patch)

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1343

+class SVE_bfloat
+    : Intrinsic<[llvm_anyvector_ty],
----------------
nit: `SVE_bfloat` is not very descriptive, maybe use `SVE_4Vec_BF16` and `SVE_4Vec_BF16_Indexed`?

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1811

+def int_aarch64_sve_cvt_bf16f32     : Builtin_SVCVT<"svcvt_bf16_f32_m",   llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>;
+def int_aarch64_sve_cvtnt_bf16f32   : Builtin_SVCVT<"svcvtnt_bf16_f32_m", llvm_nxv8bf16_ty, llvm_nxv8i1_ty, llvm_nxv4f32_ty>;
----------------
nit:  use `fcvtbf` instead of `cvt` => `int_aarch64_sve_fcvtbf_bf16f32` ?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82141/new/

https://reviews.llvm.org/D82141