[PATCH] D81740: [AArch32]: BFloat MatMul Intrinsics&CodeGen

Mikhail Maltsev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 15 05:56:37 PDT 2020


miyuki added inline comments.


================
Comment at: clang/test/CodeGen/arm-bf16-dotprod-intrinsics.c:2
+// RUN: %clang_cc1 -triple armv8-arm-none-eabi \
+// RUN:   -O2 -target-feature +neon -target-feature +bf16 \
+// RUN:   -mfloat-abi hard \
----------------
Would it be sufficient to run the output through `opt -mem2reg -instcombine` instead of the whole `-O2` pipeline?
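
For illustration, a reduced RUN line might look roughly like the sketch below. It reuses the triple and target features from the quoted lines. The `-disable-O0-optnone`, `-emit-llvm -o - %s` and plain `FileCheck %s` parts are assumptions, since the rest of the original RUN line is not quoted here. `-disable-O0-optnone` is there so that `opt` is not blocked by the `optnone` attribute that clang adds at `-O0`.

```c
// Sketch only: same triple and features as the quoted RUN lines,
// with a reduced pass set instead of the full -O2 pipeline.
// The tail of the command (output flags, FileCheck prefixes) is assumed.
// RUN: %clang_cc1 -triple armv8-arm-none-eabi \
// RUN:   -target-feature +neon -target-feature +bf16 \
// RUN:   -mfloat-abi hard -disable-O0-optnone -emit-llvm -o - %s \
// RUN:   | opt -S -mem2reg -instcombine \
// RUN:   | FileCheck %s
```

The existing CHECK lines would probably need to be regenerated after such a change, because the IR produced by the reduced pass set can differ from the `-O2` output.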


================
Comment at: clang/test/CodeGen/arm-bf16-dotprod-intrinsics.c:12
+// CHECK-LABEL: test_vbfdot_f32
+// CHECK-BF16:  %0 = bitcast <4 x bfloat> %a to <8 x i8>
+// CHECK-BF16:  %1 = bitcast <4 x bfloat> %b to <8 x i8>
----------------
The check prefixes are misleading. How about `CHECK-FPABI-HARD` and `CHECK-FPABI-SOFT`?


================
Comment at: llvm/lib/Target/ARM/ARMInstrNEON.td:9132
+                                        VectorIndex16:$lane)))))),
+    (!cast<Instruction>(NAME) QPR:$Vd, QPR:$Vn, (EXTRACT_SUBREG QPR:$Vm, dsub_0), VectorIndex32:$lane)>;
 }
----------------
Will this work if the selected element is in the top half of the Q register (`$lane >= 4`)?
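
To make the concern concrete, here is a minimal sketch of the index arithmetic, assuming `$lane` counts 16-bit elements across the whole Q register (8 bfloat elements, 4 per D register), which is what `$lane >= 4` implies. `split_q_lane` and `struct lane_loc` are hypothetical names used only for illustration. They are not part of the patch, and the way the instruction itself numbers lanes is not confirmed here.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: says which D subregister
 * of a Q register holds a given element, and at which index within it. */
struct lane_loc {
    unsigned dsub; /* 0 selects dsub_0 (low half), 1 selects dsub_1 (high half) */
    unsigned lane; /* element index within the chosen D register */
};

static struct lane_loc split_q_lane(unsigned q_lane, unsigned lanes_per_d) {
    struct lane_loc loc;
    assert(lanes_per_d != 0);
    assert(q_lane < 2 * lanes_per_d); /* a Q register is two D registers */
    loc.dsub = q_lane / lanes_per_d;
    loc.lane = q_lane % lanes_per_d;
    return loc;
}
```

Under that assumption, `split_q_lane(5, 4)` gives `dsub == 1` and `lane == 1`, so a pattern that always extracts `dsub_0` would not reach that element.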


================
Comment at: llvm/test/CodeGen/ARM/arm-bf16-dotprod-intrinsics.ll:1
+; RUN: llc -mtriple armv8.6a-arm-none-eabi  -mattr=+bf16 -float-abi=hard %s -o - | FileCheck %s --check-prefix=CHECK
+
----------------
`--check-prefix=CHECK` is redundant: `CHECK` is FileCheck's default prefix.


================
Comment at: llvm/test/CodeGen/ARM/arm-bf16-dotprod-intrinsics.ll:5
+; CHECK: vdot.bf16       d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}
+define arm_aapcs_vfpcc <2 x float> @test_vbfdot_f32(<2 x float> %r, <4 x bfloat> %a, <4 x bfloat> %b) local_unnamed_addr #0 {
+entry:
----------------
Could you please get rid of `local_unnamed_addr #0`? `#0` refers to an attribute group that is not defined anywhere.


================
Comment at: llvm/test/CodeGen/ARM/arm-bf16-dotprod-intrinsics.ll:9
+  %1 = bitcast <4 x bfloat> %b to <8 x i8>
+  %vbfdot1.i = tail call <2 x float> @llvm.arm.neon.bfdot.v2f32.v8i8(<2 x float> %r, <8 x i8> %0, <8 x i8> %1) #3
+  ret <2 x float> %vbfdot1.i
----------------
The same applies to `#3`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81740/new/

https://reviews.llvm.org/D81740




