[PATCH] D55059: [ARM] FP16: constant initialised v4f16 and v8f16 vectors

Thu Nov 29 09:10:48 PST 2018

SjoerdMeijer created this revision.
SjoerdMeijer added reviewers: olista01, samparker, efriedma.
Herald added subscribers: kristof.beyls, javed.absar.

Compilation and autovectorisation of an fp16 reduction kernel like this:

  _Float16 sum = .0F16;
  for (unsigned i = 0; i < N; i++)
    sum += A[i];
  return sum;

fails with an instruction selection 'cannot match' error. A BUILD_VECTOR node is created to hold the 'sum' vector, which gets initialised with VMOVIMM. The problem was that BUILD_VECTOR nodes for v4f16 and v8f16 were assigned the wrong type, so the VMOVIMM could not be lowered.

There are different ways to initialise vectors with constants, e.g. constant pool loads or vmov with immediates. This BUILD_VECTOR node is yet another case: it gets created for constant-initialised phi nodes, which we were not handling either.
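
To illustrate where the constant-initialised vector comes from, here is a minimal C sketch of the kernel next to a hand-written model of a vectorised form. It is not the vectoriser's actual output: the names half_t, LANES, reduce_scalar and reduce_vector_model are hypothetical, the 4-lane width only mimics v4f16, and the fallback to float is an assumption so the sketch builds on hosts without _Float16.

```c
#include <stddef.h>

/* Assumption: use _Float16 where the compiler advertises it via
   __FLT16_MAX__, otherwise fall back to float so the sketch still builds. */
#ifdef __FLT16_MAX__
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

#define LANES 4 /* models a v4f16 vector; v8f16 would use 8 */

/* The scalar reduction kernel from the description above. */
half_t reduce_scalar(const half_t *A, unsigned N) {
  half_t sum = (half_t)0;
  for (unsigned i = 0; i < N; i++)
    sum += A[i];
  return sum;
}

/* Hand-written model of a vectorised reduction. */
half_t reduce_vector_model(const half_t *A, unsigned N) {
  /* Loop-carried accumulator whose initial value is an all-zero constant
     vector. This plays the role of the constant-initialised phi node. */
  half_t acc[LANES] = {0, 0, 0, 0};
  unsigned i = 0;

  /* Main loop: one lane-wise vector add per iteration. */
  for (; i + LANES <= N; i += LANES)
    for (unsigned l = 0; l < LANES; l++)
      acc[l] += A[i + l];

  /* Horizontal reduction: read each lane out of the vector and add it. */
  half_t sum = (half_t)0;
  for (unsigned l = 0; l < LANES; l++)
    sum += acc[l];

  /* Scalar tail for the elements that did not fill a whole vector. */
  for (; i < N; i++)
    sum += A[i];
  return sum;
}
```

The all-zero initial value of acc corresponds to the constant vector discussed above. Floating-point addition is not associative, so the two functions may round differently for general inputs.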

In a follow-up commit, I will add support for 'extractelt' from v4f16 and v8f16 vectors, which is the last step to get this fully working.

https://reviews.llvm.org/D55059

Files:
  lib/Target/ARM/ARMISelDAGToDAG.cpp
  lib/Target/ARM/ARMISelLowering.cpp
  lib/Target/ARM/ARMInstrNEON.td
  test/CodeGen/ARM/fp16-reduction.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D55059.175879.patch
Type: text/x-patch
Size: 16580 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20181129/01edd6a5/attachment.bin>