[PATCH] D94230: [AArch64][SVE] Add SVE IR pass to coalesce ptrue instrinsic calls

Joe Ellis via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 11 02:14:37 PST 2021


joechrisellis added a comment.

Hi @bsmith,

The poor codegen in that example happens because we factor out a ptrue that is immediately converted to a 'sparse' predicate via a sequence of SVE reinterpret intrinsics. E.g.:

  %1 = call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
  ; <1, 1, 1, 1>
  %2 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %1)
  ; <1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0>
  %3 = call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> %2)
  ; <1, 0, 1, 0, 1, 0, 1, 0> ('sparse' predicate)

In these specific circumstances it doesn't make sense to eliminate the ptrue, because doing so just creates an even longer conversion chain that the SVEIntrinsicOpts.cpp pass cannot reduce (see also D94074 <https://reviews.llvm.org/D94074>, which extends this pass to reduce long conversion chains). I've modified this pass to account for these situations, so the codegen we get now is identical to before. I've also added this case to the tests.
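
For anyone who wants to check the bit patterns in the example by hand, here is a rough toy model in Python. It is only a sketch and not part of the patch: it assumes a single 128-bit granule (vscale = 1), and the helper names (ptrue, convert_to_svbool, convert_from_svbool) are hypothetical stand-ins for the intrinsics, modelling just the predicate layout.

```python
# Toy model of SVE predicate layout for one 128-bit granule (vscale = 1).
# Assumption: an svbool (nxv16i1) holds 16 predicate bits, one per vector byte.
# These helpers are illustrative only; they are not LLVM APIs.
SVBOOL_BITS = 16


def ptrue(num_elts):
    """All-true predicate with num_elts lanes (4 models nxv4i1)."""
    return [1] * num_elts


def convert_to_svbool(pred):
    """Place each lane's flag on the lowest bit of its slot; zero the rest."""
    stride = SVBOOL_BITS // len(pred)
    out = [0] * SVBOOL_BITS
    for lane, bit in enumerate(pred):
        out[lane * stride] = bit
    return out


def convert_from_svbool(svbool, num_elts):
    """Read back the lowest bit of each slot for a num_elts-lane predicate."""
    stride = SVBOOL_BITS // num_elts
    return [svbool[lane * stride] for lane in range(num_elts)]


# Mirrors %1, %2, %3 from the IR example.
p4 = ptrue(4)
as_svbool = convert_to_svbool(p4)
sparse = convert_from_svbool(as_svbool, 8)

# A genuine 8-lane ptrue, for comparison.
dense = convert_from_svbool(convert_to_svbool(ptrue(8)), 8)

print(as_svbool)
print(sparse)
print(dense)
```

In this model the 8-lane view of a 4-lane ptrue has every other lane cleared, whereas a real 8-lane ptrue is all ones, which is why the two cannot simply be swapped for each other.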


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94230/new/

https://reviews.llvm.org/D94230


