[PATCH] D81411: [ARM][BFloat] Lowering of create/get/set/dup intrinsics

Mon Jun 8 13:18:36 PDT 2020

dmgreen added a comment.

Yeah, it's hard to tell which parts of the tests come from bad calling conventions and which do not. If you need this sooner than the other part is available (and loads/stores work), you could load/store the bfloat, as opposed to directly returning the value.
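
For example, a get-lane test could take its input from memory and write its result back to memory, so that no bfloat value crosses the call boundary. This is only a rough, untested sketch: the function and value names are made up and not taken from the patch, and it assumes bfloat vector loads and scalar stores already lower correctly.

```llvm
; Hypothetical test: all names here are illustrative, not from the patch.
; The vector is loaded and the extracted lane is stored, so the bfloat
; values are never passed or returned through the calling convention.
define void @test_get_lane_via_memory(<4 x bfloat>* %vec.ptr, bfloat* %out.ptr) {
entry:
  %vec = load <4 x bfloat>, <4 x bfloat>* %vec.ptr, align 8
  %lane = extractelement <4 x bfloat> %vec, i32 1
  store bfloat %lane, bfloat* %out.ptr, align 2
  ret void
}
```

The checks would then cover the lane extraction plus a load and a store, rather than whatever the current argument and return handling happens to produce.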

================
Comment at: llvm/lib/Target/ARM/ARMInstrNEON.td:6501
+                                        (DSubReg_i16_reg imm:$lane))),
+                              (VMOVRH $src2), (SubReg_i16_lane imm:$lane))),
+                    (DSubReg_i16_reg imm:$lane)))>;
----------------
Does VMOVRH require fullfp16? Am I right in saying that bfloat doesn't require the set of instructions we put into HasFPRegs16? That sounds like a pain.

================
Comment at: llvm/test/CodeGen/ARM/bf16-create-get-set-dup.ll:89
+entry:
+  %shuffle.i = shufflevector <4 x bfloat> %low, <4 x bfloat> %high, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  ret <8 x bfloat> %shuffle.i
----------------
Can you switch the operands here to show it doing something?
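
Something along these lines, perhaps (untested; the function name is made up, and it assumes %low and %high are the two <4 x bfloat> arguments as in the existing test). With the operands in their current order the concatenation may need no instructions at all if the halves already sit in the right registers, whereas swapping them should force some actual data movement:

```llvm
; Hypothetical variant of the test with the shuffle operands swapped,
; so %high ends up in the low half of the result and %low in the high half.
define <8 x bfloat> @test_vcombine_swapped(<4 x bfloat> %low, <4 x bfloat> %high) {
entry:
  %shuffle.i = shufflevector <4 x bfloat> %high, <4 x bfloat> %low, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  ret <8 x bfloat> %shuffle.i
}
```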

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81411/new/

https://reviews.llvm.org/D81411