[all-commits] [llvm/llvm-project] 4a4dd8: [ARM, MVE] Add intrinsics for vector comparisons.

Simon Tatham via All-commits all-commits at lists.llvm.org
Mon Nov 18 02:39:51 PST 2019


  Branch: refs/heads/master
  Home:   https://github.com/llvm/llvm-project
  Commit: 4a4dd85e5ab51aa8c01c690cd14205af157178e7
      https://github.com/llvm/llvm-project/commit/4a4dd85e5ab51aa8c01c690cd14205af157178e7
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2019-11-18 (Mon, 18 Nov 2019)

  Changed paths:
    M clang/include/clang/Basic/arm_mve.td
    M clang/include/clang/Basic/arm_mve_defs.td
    M clang/lib/CodeGen/CGBuiltin.cpp
    A clang/test/CodeGen/arm-mve-intrinsics/compare.c

  Log Message:
  -----------
  [ARM,MVE] Add intrinsics for vector comparisons.

This adds the `vcmp` family of ACLE MVE intrinsics: vector/vector,
vector/scalar, and the predicated forms of both. All are represented
using standard existing IR: vector/scalar comparisons are represented
by making a vector out of the scalar first, and predicated forms are
represented by taking the bitwise AND of the input predicate and the
output of the comparison. Existing LLVM-side tests demonstrate that
ISel will pattern-match all of that back down to single MVE VCMPs.
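
As a rough illustration of the predicated representation described
above, the following plain-C model treats a compare result as one
boolean per lane and forms the predicated variant by ANDing it with
the input predicate. The type and function names (`vec_u16x8`,
`mask_x8`, `model_cmpeq`, `model_cmpeq_m`) are hypothetical and exist
only for this sketch; they are not ACLE or LLVM definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* eight 16-bit lanes fill one 128-bit vector */

/* Hypothetical model types, not ACLE or LLVM definitions. */
typedef struct { uint16_t lane[LANES]; } vec_u16x8;
typedef struct { bool lane[LANES]; } mask_x8; /* stands in for <8 x i1> */

/* Unpredicated compare: one boolean per lane, in the role of icmp eq. */
mask_x8 model_cmpeq(vec_u16x8 a, vec_u16x8 b) {
    mask_x8 m;
    for (size_t i = 0; i < LANES; i++)
        m.lane[i] = (a.lane[i] == b.lane[i]);
    return m;
}

/* Predicated compare: AND of the input predicate and the compare result. */
mask_x8 model_cmpeq_m(vec_u16x8 a, vec_u16x8 b, mask_x8 pred) {
    mask_x8 m = model_cmpeq(a, b);
    for (size_t i = 0; i < LANES; i++)
        m.lane[i] = m.lane[i] && pred.lane[i];
    return m;
}

/* Small self-check showing the expected lane results. */
void check_predicated_compare(void) {
    vec_u16x8 a = {{1, 2, 3, 4, 5, 6, 7, 8}};
    vec_u16x8 b = {{1, 0, 3, 0, 5, 0, 7, 0}};
    mask_x8 pred = {{true, true, false, false, true, true, false, false}};
    mask_x8 m = model_cmpeq_m(a, b, pred);
    /* Lanes 0, 2, 4, 6 compare equal; only 0 and 4 are also enabled. */
    assert(m.lane[0] && m.lane[4]);
    assert(!m.lane[1] && !m.lane[2] && !m.lane[5] && !m.lane[6]);
}
```

Lanes that the input predicate disables come out false regardless of
the data, which is what makes the plain AND a faithful stand-in.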

The idiom of handling a vector/scalar operation by generating IR to
expand the scalar into a second vector is going to be needed for a lot
of MVE intrinsics, so to make that easy, I've provided a helper
function that automatically works out the element count.
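
The scalar-to-vector idiom can be sketched in plain C as follows. This
is only a model, under the assumption that the element count follows
from dividing the 128-bit vector width by the element size; `vec128`,
`lane_count` and `splat` are hypothetical names for this sketch and
are not the helper added to CGBuiltin.cpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VECTOR_BYTES 16 /* a 128-bit vector */

/* Hypothetical model of a 128-bit vector as raw bytes. */
typedef struct { uint8_t bytes[VECTOR_BYTES]; } vec128;

/* Derive the lane count from the element size, so callers never pass it.
 * Assumption: only 8-, 16- and 32-bit elements are modelled here. */
size_t lane_count(size_t elem_bytes) {
    assert(elem_bytes == 1 || elem_bytes == 2 || elem_bytes == 4);
    return VECTOR_BYTES / elem_bytes;
}

/* Copy one scalar, given as raw bytes, into every lane of the vector. */
vec128 splat(const void *scalar, size_t elem_bytes) {
    vec128 v;
    size_t n = lane_count(elem_bytes);
    for (size_t i = 0; i < n; i++)
        memcpy(&v.bytes[i * elem_bytes], scalar, elem_bytes);
    return v;
}
```

With the splat in hand, a vector/scalar operation reduces to the
vector/vector one applied to the original vector and the splat.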

The comparison intrinsics are the first ones that have to //return// a
predicate, in the user-facing `mve_pred16_t` format. This means we
have to use the `arm_mve_pred_v2i` low-level intrinsic to convert it
back from the logical `<n x i1>` form used in IR. I've done that
explicitly in the code gen specification for the builtins, because it
happens much more rarely in the ACLE API than passing a Predicate as
input, so it didn't seem worth automating in MveEmitter.
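
To make the two predicate forms concrete, here is a plain-C model of
the vector-to-integer direction. It assumes that the 16-bit predicate
carries one bit per byte of the 128-bit vector, with lane 0 in the
least significant bits, so each of `n` lanes owns `16/n` adjacent
bits. `model_pred_v2i` is a hypothetical name for this sketch, not the
LLVM intrinsic itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack n lane booleans (the logical <n x i1> form) into a 16-bit value.
 * Assumption: one predicate bit per vector byte, lane 0 in the low bits,
 * so a true lane sets all 16/n of its bits. */
uint16_t model_pred_v2i(const bool *lanes, size_t n) {
    assert(n == 4 || n == 8 || n == 16);
    unsigned bits_per_lane = 16u / (unsigned)n;
    unsigned lane_mask = (1u << bits_per_lane) - 1u;
    unsigned packed = 0;
    for (size_t i = 0; i < n; i++)
        if (lanes[i])
            packed |= lane_mask << (i * bits_per_lane);
    return (uint16_t)packed;
}
```

Under that assumption, an 8-lane result with only lanes 0 and 2 true
packs to 0x0033, and a 4-lane result with only lane 1 true packs to
0x00F0.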

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70297


  Commit: f4f77aa53e5b872bd8a93c3a193714d8eba9578c
      https://github.com/llvm/llvm-project/commit/f4f77aa53e5b872bd8a93c3a193714d8eba9578c
  Author: Simon Tatham <simon.tatham at arm.com>
  Date:   2019-11-18 (Mon, 18 Nov 2019)

  Changed paths:
    M llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
    A llvm/test/CodeGen/Thumb2/mve-vpt-from-intrinsics.ll
    A llvm/test/Transforms/InstCombine/ARM/mve-v2i2v.ll

  Log Message:
  -----------
  [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.

If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.

  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);

then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that moves the predicate value pointlessly out of `p0` and back again.

To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:

  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4
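
Why the round trips are safe to remove can be seen in a small plain-C
model. The sketch below covers only the 16-lane case, where each lane
owns exactly one predicate bit, and checks that converting in either
direction and back is the identity. `mask_x16`, `model_v2i` and
`model_i2v` are hypothetical names; this is not the InstCombine code,
and other lane counts are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* sixteen 8-bit lanes: one predicate bit per lane */

typedef struct { bool lane[LANES]; } mask_x16; /* stands in for <16 x i1> */

/* Vector form to integer form: lane i becomes bit i. */
uint16_t model_v2i(mask_x16 m) {
    unsigned p = 0;
    for (size_t i = 0; i < LANES; i++)
        if (m.lane[i])
            p |= 1u << i;
    return (uint16_t)p;
}

/* Integer form to vector form: bit i becomes lane i. */
mask_x16 model_i2v(uint16_t p) {
    mask_x16 m;
    for (size_t i = 0; i < LANES; i++)
        m.lane[i] = ((p >> i) & 1u) != 0;
    return m;
}

/* Exhaustively check both round trips over all 16-bit predicates. */
void check_roundtrips(void) {
    for (uint32_t p = 0; p <= 0xFFFFu; p++) {
        mask_x16 m = model_i2v((uint16_t)p);
        assert(model_v2i(m) == p);                 /* v2i(i2v(p)) == p */
        mask_x16 back = model_i2v(model_v2i(m));   /* i2v(v2i(m)) == m */
        for (size_t i = 0; i < LANES; i++)
            assert(back.lane[i] == m.lane[i]);
    }
}
```

Since neither direction loses information in this model, a conversion
followed immediately by its inverse can be replaced by the original
value.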

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70313


Compare: https://github.com/llvm/llvm-project/compare/23a766dcad47...f4f77aa53e5b

