[PATCH] D70313: [ARM,MVE] Add InstCombine rules for pred_i2v / pred_v2i.
Simon Tatham via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Nov 15 07:33:45 PST 2019
simon_tatham created this revision.
simon_tatham added reviewers: ostannard, MarkMurrayARM, dmgreen.
Herald added subscribers: llvm-commits, hiraditya, kristof.beyls.
Herald added a project: LLVM.
simon_tatham updated this revision to Diff 229550.
simon_tatham added a comment.
D'oh, moved `mve-vpt-from-intrinsics.ll` into here from D70297 <https://reviews.llvm.org/D70297>.
If you're writing C code using the ACLE MVE intrinsics that passes the
result of a vcmp as input to a predicated intrinsic, e.g.
  mve_pred16_t pred = vcmpeqq(v1, v2);
  v_out = vaddq_m(v_inactive, v3, v4, pred);
then clang's codegen for the compare intrinsic will create calls to
`@llvm.arm.mve.pred.v2i` to convert the output of `icmp` into an
`mve_pred16_t` integer representation, and then the next intrinsic
will call `@llvm.arm.mve.pred.i2v` to convert it straight back again.
This will be visible in the generated code as a `vmrs`/`vmsr` pair
that moves the predicate value pointlessly out of `p0` and back into it again.
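To make the round trip concrete, here is a minimal C model of the two conversions. It is only a sketch and not the LLVM implementation: the names `pred_vec16`, `model_v2i` and `model_i2v` are hypothetical, and it assumes the 16-lane case (one predicate bit per byte lane) with only the low 16 bits of the integer being meaningful, as the 16-bit `mve_pred16_t` type suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16 byte-sized lanes, one predicate bit per lane. */
#define MODEL_LANES 16

/* Hypothetical stand-in for a <16 x i1> vector value. */
typedef struct {
    bool lane[MODEL_LANES];
} pred_vec16;

/* Model of pred.v2i: pack the lane flags into the low 16 bits. */
static uint32_t model_v2i(pred_vec16 v) {
    uint32_t bits = 0;
    for (int i = 0; i < MODEL_LANES; i++) {
        if (v.lane[i])
            bits |= (uint32_t)1 << i;
    }
    return bits;
}

/* Model of pred.i2v: unpack the low 16 bits into lane flags.
   Bits 16 and above are ignored in this model. */
static pred_vec16 model_i2v(uint32_t bits) {
    pred_vec16 v;
    for (int i = 0; i < MODEL_LANES; i++)
        v.lane[i] = ((bits >> i) & 1u) != 0;
    return v;
}
```

In this model, `model_v2i(model_i2v(x))` returns `x` whenever the top 16 bits of `x` are clear, and converting a vector to an integer and back yields the same lanes, which is why the pair of conversions does no useful work.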
To prevent that, I've added InstCombine rules to remove round trips of
the form `v2i(i2v(x))` and `i2v(v2i(x))`. Also I've taught InstCombine
about the known and demanded bits of those intrinsics. As a result,
you now get just the generated code you wanted:
  vpt.u16 eq, q1, q2
  vaddt.u16 q0, q3, q4
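The round-trip folds can be illustrated with a toy peephole over a tiny expression tree. This is a sketch under stated assumptions, not the code in InstCombineCalls.cpp: `node`, `node_kind` and `fold_round_trip` are hypothetical names, and the model omits checks a real rule would need (for example, that the vector type produced by `i2v` matches the type that was fed into `v2i`).

```c
#include <stddef.h>

/* Hypothetical node kinds for a toy expression tree; not LLVM IR classes. */
typedef enum { NODE_LEAF, NODE_I2V, NODE_V2I } node_kind;

typedef struct node {
    node_kind kind;
    const struct node *operand; /* NULL for leaves */
} node;

/* Return the node to use in place of n, peeling off every
   i2v(v2i(x)) or v2i(i2v(x)) pair found at the top of the tree. */
static const node *fold_round_trip(const node *n) {
    while (n->operand != NULL) {
        const node *inner = n->operand;
        int is_pair =
            (n->kind == NODE_I2V && inner->kind == NODE_V2I) ||
            (n->kind == NODE_V2I && inner->kind == NODE_I2V);
        if (!is_pair)
            break;
        /* inner is a conversion node, so it has an operand to fall back to. */
        n = inner->operand;
    }
    return n;
}
```

Given a leaf `x`, folding the tree for `i2v(v2i(x))` returns `x` itself, while `v2i(i2v(v2i(x)))` folds to the single remaining `v2i(x)`.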
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D70313
Files:
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
llvm/test/CodeGen/Thumb2/mve-vpt-from-intrinsics.ll
llvm/test/Transforms/InstCombine/ARM/mve-v2i2v.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D70313.229550.patch
Type: text/x-patch
Size: 10611 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20191115/24d42249/attachment-0001.bin>