[PATCH] D70313: [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.

Fri Nov 15 07:33:45 PST 2019

simon_tatham created this revision.
simon_tatham added reviewers: ostannard, MarkMurrayARM, dmgreen.
Herald added subscribers: llvm-commits, hiraditya, kristof.beyls.
Herald added a project: LLVM.
simon_tatham updated this revision to Diff 229550.
simon_tatham added a comment.

D'oh, moved `mve-vpt-from-intrinsics.ll` into here from D70297 <https://reviews.llvm.org/D70297>.

If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.

  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);

then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that moves the predicate value pointlessly out of `p0` and back into it again.

To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:

  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4
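
To illustrate why those round trips are safe to remove, here is a minimal
sketch in plain C that models the two conversions for the 16-lane case only.
It is not the code from the patch: `pred_vec`, `model_i2v`, `model_v2i`,
`pred_vec_equal` and `check_round_trips` are hypothetical names invented for
this example, and the one-bit-per-lane layout is an assumption that ignores
the wider-lane predicate types. Under that assumption, `i2v(v2i(v))` gives
back `v` unchanged, and `v2i(i2v(x))` gives back the low 16 bits of `x`,
which is also what the known-bits and demanded-bits reasoning relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 16

/* Hypothetical stand-in for a 16-lane vector of i1 predicate bits. */
typedef struct {
    bool lane[LANES];
} pred_vec;

/* Model of the integer-to-vector conversion: in this model only the
 * low 16 bits of the integer are ever read ("demanded"). */
static pred_vec model_i2v(uint32_t x) {
    pred_vec v;
    for (int i = 0; i < LANES; i++)
        v.lane[i] = ((x >> i) & 1u) != 0;
    return v;
}

/* Model of the vector-to-integer conversion: in this model the top
 * 16 bits of the result are always zero ("known"). */
static uint32_t model_v2i(pred_vec v) {
    uint32_t x = 0;
    for (int i = 0; i < LANES; i++)
        if (v.lane[i])
            x |= (uint32_t)1 << i;
    return x;
}

/* Lane-by-lane equality of two modelled predicate vectors. */
static bool pred_vec_equal(pred_vec a, pred_vec b) {
    for (int i = 0; i < LANES; i++)
        if (a.lane[i] != b.lane[i])
            return false;
    return true;
}

/* Assert the properties the folds depend on, for one input value. */
static void check_round_trips(uint32_t x) {
    pred_vec v = model_i2v(x);

    /* i2v(v2i(v)) is the identity on predicate vectors. */
    assert(pred_vec_equal(model_i2v(model_v2i(v)), v));

    /* v2i(i2v(x)) keeps exactly the low 16 bits of x. */
    assert(model_v2i(model_i2v(x)) == (x & 0xFFFFu));

    /* Known bits: the result of v2i never has bits 16..31 set. */
    assert((model_v2i(v) >> 16) == 0);

    /* Demanded bits: the high half of x cannot affect i2v's result. */
    assert(pred_vec_equal(model_i2v(x), model_i2v(x & 0xFFFFu)));
}
```

Calling `check_round_trips` on a few values (for example `0`, `0xFFFF` and
`0xDEADBEEF`) exercises all four properties in this model. It says nothing
about how the real InstCombine rules are implemented.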

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D70313

Files:
  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
  llvm/test/CodeGen/Thumb2/mve-vpt-from-intrinsics.ll
  llvm/test/Transforms/InstCombine/ARM/mve-v2i2v.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D70313.229550.patch
Type: text/x-patch
Size: 10611 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20191115/24d42249/attachment-0001.bin>