[PATCH] D94230: [AArch64][SVE] Add SVE IR pass to coalesce ptrue intrinsic calls
Joe Ellis via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 11 02:14:37 PST 2021
joechrisellis added a comment.
Hi @bsmith,
The poor codegen in that example happens because we factor out a ptrue that is immediately converted to a 'sparse' predicate via a sequence of SVE reinterpret intrinsics. E.g.:
%1 = call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
; <1, 1, 1, 1>
%2 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %1)
; <1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0>
%3 = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %2)
; <1, 0, 1, 0, 1, 0, 1, 0> ('sparse' predicate)
In these specific circumstances it doesn't make sense to eliminate the ptrue, because doing so just creates an even longer conversion chain that the SVEIntrinsicOpts.cpp pass cannot reduce (see also D94074 <https://reviews.llvm.org/D94074>, which extends that pass to reduce long conversion chains). I've modified this pass to account for these situations, and the codegen we get now is identical to before. I've also added this case to the tests.
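To make the bit patterns in the example above easier to follow, here is a rough Python model of the two reinterprets. It is only a sketch and not taken from the patch: it assumes vscale = 1 (so svbool has 16 one-bit lanes), and the helper names `convert_to_svbool` and `convert_from_svbool` are hypothetical stand-ins for the intrinsics, modelling only the lane layout described in the comments above.

```python
# Rough model of SVE predicate reinterpretation, ASSUMING vscale == 1,
# i.e. <vscale x 16 x i1> has exactly 16 lanes. The helpers below are
# illustrative stand-ins for the intrinsics, not LLVM APIs.

SVBOOL_LANES = 16  # lanes in svbool when vscale == 1 (assumption)


def convert_to_svbool(pred):
    """Widen an N-lane predicate to svbool.

    Each source lane owns 16/N svbool lanes; only the lowest lane of each
    group carries the value, the remaining lanes are zero.
    """
    stride = SVBOOL_LANES // len(pred)
    out = [0] * SVBOOL_LANES
    for i, bit in enumerate(pred):
        out[i * stride] = bit
    return out


def convert_from_svbool(svbool, lanes):
    """Narrow svbool to `lanes` lanes by reading the lowest lane of each group."""
    stride = SVBOOL_LANES // lanes
    return [svbool[i * stride] for i in range(lanes)]


ptrue_nxv4i1 = [1, 1, 1, 1]                         # %1
as_svbool = convert_to_svbool(ptrue_nxv4i1)         # %2
sparse_nxv8i1 = convert_from_svbool(as_svbool, 8)   # %3

print(as_svbool)      # [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
print(sparse_nxv8i1)  # [1, 0, 1, 0, 1, 0, 1, 0]
```

Under this model the result differs from an all-true 8-lane predicate (`[1] * 8`), which is why the converted value cannot simply be swapped for a plain 8-lane ptrue.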
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94230/new/
https://reviews.llvm.org/D94230