[all-commits] [llvm/llvm-project] a41ecf: [ARM, MVE] Add ACLE intrinsics for VQMOV[U]N family.

Mon Mar 2 02:33:45 PST 2020

  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: a41ecf0eb05190c8597f98b8d41d7a6e678aec0b
      https://github.com/llvm/llvm-project/commit/a41ecf0eb05190c8597f98b8d41d7a6e678aec0b
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2020-03-02 (Mon, 02 Mar 2020)

  Changed paths:
    M clang/include/clang/Basic/arm_mve.td
    M clang/include/clang/Basic/arm_mve_defs.td
    A clang/test/CodeGen/arm-mve-intrinsics/vqmovn.c
    M llvm/include/llvm/IR/IntrinsicsARM.td
    M llvm/lib/Target/ARM/ARMInstrMVE.td
    A llvm/test/CodeGen/Thumb2/mve-intrinsics/vqmovn.ll

  Log Message:
  -----------
  [ARM,MVE] Add ACLE intrinsics for VQMOV[U]N family.

Summary:
These instructions work like VMOVN (narrowing a vector of wide values
to half size, and overwriting every other lane of an output register
with the result), except that the narrowing conversion is saturating.
They come in three signedness flavours: signed to signed, unsigned to
unsigned, and signed to unsigned. All are represented in IR by a
target-specific intrinsic that takes two separate 'unsigned' flags.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75252

  Commit: 69441e53c9f45c58674bbe2fd788baf9b9fe1392
      https://github.com/llvm/llvm-project/commit/69441e53c9f45c58674bbe2fd788baf9b9fe1392
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2020-03-02 (Mon, 02 Mar 2020)

  Changed paths:
    M llvm/lib/Target/ARM/ARMInstrMVE.td

  Log Message:
  -----------
  [ARM,MVE] Correct MC operands in VCVT.F32.F16. (NFC)

Summary:
The two MVE instructions that convert between v4f32 and v8f16 were
implemented as instances of the same class, with the same MC operand
list.

But that's not really appropriate, because the narrowing conversion
only partially overwrites its output register (it only has 4 f16
values to write into a vector of 8), so even when unpredicated, it
needs a $Qd_src input, a constraint tying that to the $Qd output, and
a vpred_n.

The widening conversion is better represented like any other
instruction that completely replaces its output when unpredicated: it
should have no $Qd_src operand, and instead, a vpred_r containing a
$inactive parameter. That's a better match to other similar
instructions, such as its integer analogue, the VMOVL instruction that
makes a v4i32 by sign- or zero-extending every other lane of a v8i16.

This commit brings the widening VCVT.F32.F16 into line with the other
instructions that behave like it. That means you can write isel
patterns that use it unpredicated, without having to add a pointless
undefined $QdSrc operand.

No existing code generation uses that instruction yet, so there should
be no functional change from this fix.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75253

  Commit: b08d2ddd69b4a2209930b31fe456b4d7c1ce148f
      https://github.com/llvm/llvm-project/commit/b08d2ddd69b4a2209930b31fe456b4d7c1ce148f
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2020-03-02 (Mon, 02 Mar 2020)

  Changed paths:
    M clang/include/clang/Basic/arm_mve.td
    M clang/test/CodeGen/arm-mve-intrinsics/vcvt.c
    M llvm/include/llvm/IR/IntrinsicsARM.td
    M llvm/lib/Target/ARM/ARMInstrMVE.td
    M llvm/test/CodeGen/Thumb2/mve-intrinsics/vcvt.ll

  Log Message:
  -----------
  [ARM,MVE] Add ACLE intrinsics for VCVT.F32.F16 family.

Summary:
These instructions make a vector of `<4 x float>` by widening every
other lane of a vector of `<8 x half>`.

I wondered about representing these using standard IR, along the lines
of a shufflevector to extract elements of the input into a `<4 x half>`
followed by an `fpext` to turn that into `<4 x float>`. But it looks as
if that would take a lot of work in isel lowering to make it match any
pattern I could sensibly write in Tablegen, and also I haven't been
able to think of any other case where that pattern might be generated
in IR, so there wouldn't be any extra code generation win from doing
it that way.

Therefore, I've just used another target-specific intrinsic. We can
always change it to the other way later if anyone thinks of a good
reason.

(In order to put the intrinsic definition near similar things in
`IntrinsicsARM.td`, I've also lifted the definition of the
`MVEMXPredicated` multiclass higher up the file, without changing it.)

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75254

  Commit: 1a8cbfa514ff83ac62c20deec0d9ea2c6606bbdf
      https://github.com/llvm/llvm-project/commit/1a8cbfa514ff83ac62c20deec0d9ea2c6606bbdf
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2020-03-02 (Mon, 02 Mar 2020)

  Changed paths:
    M clang/include/clang/Basic/arm_mve.td
    A clang/test/CodeGen/arm-mve-intrinsics/vcvt_anpm.c
    M llvm/include/llvm/IR/IntrinsicsARM.td
    M llvm/lib/Target/ARM/ARMInstrMVE.td
    A llvm/test/CodeGen/Thumb2/mve-intrinsics/vcvt_anpm.ll

  Log Message:
  -----------
  [ARM,MVE] Add ACLE intrinsics for VCVT[ANPM] family.

Summary:
These instructions convert a vector of floats to a vector of integers
of the same size, with assorted non-default rounding modes.
Implemented in IR as target-specific intrinsics, because as far as I
can see there are no matches for that functionality in the standard IR
intrinsics list.

Reviewers: MarkMurrayARM, dmgreen, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D75255

Compare: https://github.com/llvm/llvm-project/compare/12048a9182fc...1a8cbfa514ff